LDA++
#include <UnsupervisedEStep.hpp>
Public Member Functions | |
UnsupervisedEStep (size_t e_step_iterations=10, Scalar e_step_tolerance=1e-2, Scalar compute_likelihood=1.0, int random_state=0) | |
virtual std::shared_ptr< parameters::Parameters > | doc_e_step (const std::shared_ptr< corpus::Document > doc, const std::shared_ptr< parameters::Parameters > parameters) override |
AbstractEStep (int random_state) | |
virtual void | e_step () override |
std::shared_ptr< EventDispatcherInterface > | get_event_dispatcher () |
void | set_event_dispatcher (std::shared_ptr< EventDispatcherInterface > dispatcher) |
Additional Inherited Members | |
bool | converged (const Eigen::Matrix< Scalar, Eigen::Dynamic, 1 > &gamma_old, const Eigen::Matrix< Scalar, Eigen::Dynamic, 1 > &gamma, Scalar tolerance) |
PRNG & | get_prng () |
UnsupervisedEStep implements the classic LDA expectation step.
For each document passed to UnsupervisedEStep::doc_e_step, a factorized variational distribution is computed with Dirichlet parameter \(\gamma\) and multinomial parameters \(\phi\). The distribution is chosen to maximize a lower bound on the probability of generating the document given the model parameters (that is, the topics).
See UnsupervisedEStep::doc_e_step for the mathematics.
[1] Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet allocation." Journal of Machine Learning Research 3 (Jan 2003): 993-1022.
ldaplusplus::em::UnsupervisedEStep< Scalar >::UnsupervisedEStep (size_t e_step_iterations = 10, Scalar e_step_tolerance = 1e-2, Scalar compute_likelihood = 1.0, int random_state = 0)
e_step_iterations | The maximum number of times to alternate between maximizing for \(\gamma\) and for \(\phi\). |
e_step_tolerance | The minimum relative change in the variational parameter \(\gamma\); the alternation stops once the change falls below this value. |
compute_likelihood | The fraction of documents to compute the likelihood for (1.0 means compute it for every document). |
random_state | An initial seed value for any random numbers needed. |
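virtual std::shared_ptr< parameters::Parameters > ldaplusplus::em::UnsupervisedEStep< Scalar >::doc_e_step (const std::shared_ptr< corpus::Document > doc, const std::shared_ptr< parameters::Parameters > parameters)
override virtual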
Maximize the ELBO w.r.t. \(\phi\) and \(\gamma\).
The implemented mathematics uses the following notation: \(\beta\) are the topics, \(i\) is the topic subscript, \(n\) is the word subscript, \(w_n\) is the vocabulary index of the n-th word, \(\alpha\) is the Dirichlet prior and \(\Psi(\cdot)\) is the first derivative of the \(\log \Gamma\) function (the digamma function).
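With this notation, the coordinate-ascent updates from Blei et al. [1] can be written as below. This is a sketch that assumes the implementation follows the paper's updates; \(N\), the number of words in the document, is a symbol introduced here for illustration.

\[
\begin{aligned}
\phi_{ni} &\propto \beta_{i w_n} \exp\left(\Psi(\gamma_i)\right), \qquad \sum_i \phi_{ni} = 1, \\
\gamma_i &= \alpha_i + \sum_{n=1}^{N} \phi_{ni}.
\end{aligned}
\]

The two updates are alternated until \(\gamma\) converges or the iteration limit is reached.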
doc | A single document |
parameters | An instance of class Parameters, which contains all model parameters that the E-step implementation needs |
Implements ldaplusplus::em::EStepInterface< Scalar >.
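The alternation between \(\phi\) and \(\gamma\) that doc_e_step performs can be illustrated with a small standalone sketch. It is not the library's code: it uses plain std::vector instead of Eigen, and the names digamma, VariationalResult and doc_e_step_sketch, the initialization \(\gamma_i = \alpha_i + N/K\), and the L1-based relative-change test are all assumptions made for illustration. It applies the standard updates \(\phi_{ni} \propto \beta_{i w_n} \exp(\Psi(\gamma_i))\) and \(\gamma_i = \alpha_i + \sum_n \phi_{ni}\) from [1].

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Digamma Psi(x) for x > 0: shift x upwards with Psi(x) = Psi(x + 1) - 1/x,
// then apply the asymptotic series ln(x) - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6).
double digamma(double x) {
    double result = 0.0;
    while (x < 6.0) {
        result -= 1.0 / x;
        x += 1.0;
    }
    const double inv = 1.0 / x;
    const double inv2 = inv * inv;
    result += std::log(x) - 0.5 * inv
            - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0));
    return result;
}

// Hypothetical result type: variational parameters for one document.
struct VariationalResult {
    std::vector<double> gamma;             // K Dirichlet parameters
    std::vector<std::vector<double>> phi;  // N x K multinomial parameters
    std::size_t iterations = 0;            // number of alternations performed
};

// Hypothetical helper. words[n] is the vocabulary index w_n, beta is K x V
// (assumed strictly positive), alpha holds the K Dirichlet prior values.
VariationalResult doc_e_step_sketch(
    const std::vector<std::size_t>& words,
    const std::vector<std::vector<double>>& beta,
    const std::vector<double>& alpha,
    std::size_t e_step_iterations = 10,
    double e_step_tolerance = 1e-2)
{
    const std::size_t K = alpha.size();
    const std::size_t N = words.size();
    assert(K > 0 && beta.size() == K);

    VariationalResult r;
    r.phi.assign(N, std::vector<double>(K, 0.0));
    r.gamma.resize(K);
    // Assumed initialization: spread the N words evenly over the topics.
    for (std::size_t i = 0; i < K; ++i)
        r.gamma[i] = alpha[i] + static_cast<double>(N) / K;

    while (r.iterations < e_step_iterations) {
        const std::vector<double> gamma_old = r.gamma;
        r.gamma = alpha;  // gamma_i = alpha_i + sum_n phi_ni, accumulated below
        for (std::size_t n = 0; n < N; ++n) {
            double norm = 0.0;
            for (std::size_t i = 0; i < K; ++i) {
                r.phi[n][i] = beta[i][words[n]] * std::exp(digamma(gamma_old[i]));
                norm += r.phi[n][i];
            }
            for (std::size_t i = 0; i < K; ++i) {
                r.phi[n][i] /= norm;  // normalize so that sum_i phi_ni = 1
                r.gamma[i] += r.phi[n][i];
            }
        }
        ++r.iterations;

        // Assumed convergence test: relative L1 change of gamma.
        double diff = 0.0, scale = 0.0;
        for (std::size_t i = 0; i < K; ++i) {
            diff += std::fabs(r.gamma[i] - gamma_old[i]);
            scale += std::fabs(gamma_old[i]);
        }
        if (diff / scale < e_step_tolerance) break;
    }
    return r;
}
```

Because every row of \(\phi\) sums to one, the entries of \(\gamma\) always sum to \(\sum_i \alpha_i + N\), which gives a cheap sanity check on any implementation of these updates.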