Hierarchical Topic Models and the Nested Chinese Restaurant Process

This paper presents a method for placing priors over hierarchical topic structures when the number of possible topics is unbounded.
The paper gives a generative description of how topic hierarchies arise and then uses Gibbs sampling to find the hierarchy that best explains the data.
The generative process produces topic hierarchy trees in which each node is a "topic", and a document is defined by a path from the root to a leaf node. The topics aren't necessarily semantically meaningful things, but rather distributions over words that will be found in any document whose path passes through the node. Since every document's path starts at the root, the root's "topic" will tend to consist of words common to all documents, such as "a", "the", "or", etc. So the distribution isn't a "topic" in the LSI sense, but rather a generalization over the words common to all of a node's children.
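The path-based generative picture might be sketched as follows. This is a minimal illustration with a hypothetical hand-built three-level tree; in the actual model the tree itself is sampled via the nested CRP and words are allocated to levels by per-document proportions, neither of which is shown here.

```python
import random

# Hypothetical 3-level topic tree: each node holds a small word list
# standing in for its word distribution. Node names and words are
# illustrative only, not taken from the paper.
tree = {
    "root":    {"words": ["the", "a", "of"],      "children": ["stats", "vision"]},
    "stats":   {"words": ["prior", "posterior"],  "children": ["mcmc"]},
    "vision":  {"words": ["pixel", "edge"],       "children": ["segment"]},
    "mcmc":    {"words": ["gibbs", "sampler"],    "children": []},
    "segment": {"words": ["region", "contour"],   "children": []},
}

def generate_document(tree, n_words, seed=0):
    rng = random.Random(seed)
    # A document is a root-to-leaf path through the tree.
    path = ["root"]
    while tree[path[-1]]["children"]:
        path.append(rng.choice(tree[path[-1]]["children"]))
    # Each word is drawn from one of the topics along the path
    # (chosen uniformly here; the paper mixes levels more carefully).
    return [rng.choice(tree[rng.choice(path)]["words"]) for _ in range(n_words)]

doc = generate_document(tree, 8)
```

Note how the root's words ("the", "a", "of") can appear in every document, while leaf words only appear in documents whose path reaches that leaf.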
It relies on the Chinese Restaurant Process, which is defined elsewhere and is extended here (the nested CRP) to generate trees rather than just partitions. The CRP has the following property: each new customer sits at an existing table with probability proportional to how many customers are already there (so popular tables get more popular), or starts a new table with probability proportional to a gamma parameter.
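The seating rule just described can be sketched directly; this is a plain CRP sampler, not the nested tree version, with the function name and parameters my own.

```python
import random

def crp_assignments(n_customers, gamma, seed=0):
    """Chinese Restaurant Process: each customer joins an existing table
    with probability proportional to its occupancy, or starts a new table
    with probability proportional to gamma."""
    rng = random.Random(seed)
    tables = []   # occupancy count per table
    seating = []  # table index assigned to each customer
    for i in range(n_customers):
        # Weights: one per existing table, plus gamma for a fresh table.
        weights = tables + [gamma]
        total = i + gamma  # occupancies sum to i (customers seated so far)
        r = rng.random() * total
        cum = 0.0
        for t, w in enumerate(weights):
            cum += w
            if r < cum:
                break
        if t == len(tables):
            tables.append(1)   # new table opened
        else:
            tables[t] += 1     # joined an existing table
        seating.append(t)
    return seating, tables

seating, tables = crp_assignments(100, gamma=1.0)
```

Larger gamma yields more tables; in the nested CRP this process is run at every node of the tree to choose which child (restaurant) each document visits next.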

It was a tough paper, but it showed pretty good results when it learned a hierarchy over NIPS abstracts. (There is something weird going on when your research is directed toward the conference in which it is published, but...)
I have the following unanswered questions:
 What is a conjugate prior?
 Why is a Dirichlet prior a conjugate prior over a multinomial?
 How does a Latent Dirichlet Prior differ from a Dirichlet Prior?
Posted by djp3 at July 8, 2004 10:15 AM