
When you run a topic model (say LDA), you get a set of topics, each with its keywords and their weights. As I understand it, people usually print the top 10 or top 20 keywords for each topic. Each keyword has a weight that represents how important it is for that topic.

For example, if I print the top 10 keywords for each topic, the output looks like this:

topic 0: 0.2*keyword1 + 0.15*keyword2 + 0.09*keyword3 +... + 0.005*keyword10

topic 1: ... ...

topic n

I'm not sure what the maximum number of keywords I can print for each topic is, but do the weights for each topic add up to 1?

desertnaut
Todd

1 Answer


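Yes, the weights add up to 1, provided you sum over every word in the vocabulary. In LDA each topic's word distribution is a Dirichlet random variable, so its components are non-negative and sum to 1. The top 10 or 20 keywords are only the largest entries of that distribution, so their weights alone will generally sum to less than 1, and you can list at most as many keywords per topic as there are words in the vocabulary. For background, see the scipy documentation on the Dirichlet distribution: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.dirichlet.html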
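To make the point concrete, here is a minimal sketch that does not fit a real LDA model. It simply draws one topic-word distribution from a symmetric Dirichlet with NumPy and compares the full sum with the sum of the ten largest weights. The vocabulary size and concentration value are made-up assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical settings, not taken from any fitted model.
vocab_size = 1000      # number of distinct words in the vocabulary
concentration = 0.5    # symmetric Dirichlet parameter

# One draw from the Dirichlet: a probability vector over the vocabulary,
# standing in for a single topic's word weights.
topic_word = rng.dirichlet(np.full(vocab_size, concentration))

# Sum over the whole vocabulary (1 up to floating-point error).
total = topic_word.sum()

# The ten largest weights, i.e. what a "top 10 keywords" listing would show.
top10 = np.sort(topic_word)[::-1][:10]

print("sum over full vocabulary:", total)
print("sum over top 10 keywords:", top10.sum())
```

The first number should be 1 up to rounding, while the second is strictly smaller, because the remaining words still carry some probability mass.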

Jayaram Iyer