Subcode — CE807

Title — Text Analytics

Due Dates — 01 March 2021 (Assignment 1)

27 April 2021 (Assignment 2)

Format — Arial, 12 pt; one page only, including references, figures, tables, etc.

References — Harvard style, following the examples given

Assignment 1 (Weightage – 25%)

  1. Which papers or research works have studied the problem of determining the number of topics? (Assignment 1)

Summarise the existing literature.

Cite at least six works that address how to find the number of topics.

For each work, clearly state how the authors determined the number of topics.

Any appendix must still fit within the one page.


Assignment 2 (Weightage – 75%)

  1. How can you determine the ideal number of topics for the dataset? (Assignment 2)
    1. Select any available text dataset that you think is sufficient for topic modelling.
    2. Do all the necessary pre-processing.
    3. Conduct topic modelling and analyse the results.
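
As a rough illustration of the three steps above, the sketch below pre-processes a toy corpus, fits LDA for several candidate numbers of topics, and compares them with UMass topic coherence. Everything in it is an assumption made for illustration and not a prescribed method: the toy documents, the stopword list, the hyperparameters (alpha, beta, iters), the candidate values of K, and the choice of UMass coherence as the selection criterion. The LDA fit is a small collapsed Gibbs sampler written with the standard library only; a real submission would more likely use an established library and a much larger dataset.

```python
import math
import random
import re

# Hypothetical toy corpus; replace with the dataset you select.
DOCS = [
    "The cat sat on the mat with another cat",
    "Dogs and cats are popular pets",
    "My dog chased the cat around the garden",
    "Stock markets fell as investors sold shares",
    "Investors bought shares when the market rallied",
    "The bank raised interest rates and markets reacted",
]
# Hypothetical, deliberately tiny stopword list.
STOPWORDS = {"the", "on", "with", "and", "are", "my", "as", "when", "a",
             "around", "another"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def fit_lda(docs, k, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return topic-word counts and vocab."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    idx = {w: i for i, w in enumerate(vocab)}
    v = len(vocab)
    n_dk = [[0] * k for _ in docs]      # document-topic counts
    n_kw = [[0] * v for _ in range(k)]  # topic-word counts
    n_k = [0] * k                       # tokens assigned to each topic
    z = []                              # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([rng.randrange(k) for _ in doc])
        for w, t in zip(doc, z[d]):
            n_dk[d][t] += 1
            n_kw[t][idx[w]] += 1
            n_k[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, t = idx[w], z[d][i]
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                n_dk[d][t] -= 1
                n_kw[t][wi] -= 1
                n_k[t] -= 1
                weights = [(n_dk[d][j] + alpha) * (n_kw[j][wi] + beta)
                           / (n_k[j] + v * beta) for j in range(k)]
                t = rng.choices(range(k), weights=weights)[0]
                z[d][i] = t
                n_dk[d][t] += 1
                n_kw[t][wi] += 1
                n_k[t] += 1
    return n_kw, vocab


def top_words(n_kw, vocab, n=5):
    """Return the n highest-count words of each topic."""
    order = range(len(vocab))
    return [[vocab[i] for i in sorted(order, key=lambda i: -row[i])[:n]]
            for row in n_kw]


def umass_coherence(topics, docs):
    """Mean UMass coherence over topics, using document co-occurrence counts."""
    doc_sets = [set(d) for d in docs]

    def df(*words):
        return sum(all(w in s for w in words) for s in doc_sets)

    scores = []
    for words in topics:
        s = 0.0
        for m in range(1, len(words)):
            for l in range(m):
                s += math.log((df(words[m], words[l]) + 1) / df(words[l]))
        scores.append(s)
    return sum(scores) / len(scores)


tokenised = [preprocess(d) for d in DOCS]
scores = {}
for k in (2, 3, 4):  # candidate numbers of topics (an arbitrary choice)
    n_kw, vocab = fit_lda(tokenised, k)
    scores[k] = umass_coherence(top_words(n_kw, vocab), tokenised)
best_k = max(scores, key=scores.get)  # higher coherence is treated as better
print(scores, best_k)
```

On a corpus this small the scores are not meaningful; the point is only the shape of the workflow: fit one model per candidate K, score each model, and justify the chosen K in the analysis.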

Write your report on one page.

Submit your document as a PDF.

Include at least two references in each of the three categories mentioned.

Use only the Latent Dirichlet Allocation (LDA) model in Assignment 2.

