LDA++
|
#include <UnsupervisedMStep.hpp>
Public Member Functions | |
virtual void | m_step (std::shared_ptr< parameters::Parameters > parameters) override |
virtual void | doc_m_step (const std::shared_ptr< corpus::Document > doc, const std::shared_ptr< parameters::Parameters > v_parameters, std::shared_ptr< parameters::Parameters > m_parameters) override |
Public Member Functions inherited from ldaplusplus::events::EventDispatcherComposition | |
std::shared_ptr< EventDispatcherInterface > | get_event_dispatcher () |
void | set_event_dispatcher (std::shared_ptr< EventDispatcherInterface > dispatcher) |
Implement the M step for the traditional unsupervised LDA.
The following equations are used to maximize the lower bound of the log likelihood ( \( \mathcal{L} \)). \(D\) is the number of documents, \(N_d\) is the number of words in the \(d\)-th document and \(K\) is the number of topics.
\begin{eqnarray*} \log p(w \mid \alpha, \beta) \geq \mathcal{L}(\gamma, \phi \mid \alpha, \beta) &=& \mathbb{E}_q[\log p(\theta \mid \alpha)] + \mathbb{E}_q[\log p(z \mid \theta)] + \mathbb{E}_q[\log p(w \mid z, \beta)] + H(q) \\ \mathcal{L}_{\beta} &=& \mathbb{E}_q[\log p(w \mid \beta)] = \sum_d^D \sum_n^{N_d} \sum_i^K \phi_{dni} \log \beta_{iw_n} \\ \beta_{ij} &\propto& \sum_d^D \sum_n^{N_d} \begin{cases} \phi_{dni} & w_n = j \\ 0 & \text{otherwise} \end{cases} \end{eqnarray*}
Since we are using the bag of words count vector in the implementation, the exact equation implemented is the following if \(X_{dj}\) is the number of occurences of the vocabulary word \(j\) in the document \(d\).
\[ \beta_{ij} \propto \sum_d^D \phi_{dji} X_{dj} \]
|
overridevirtual |
Compute the \( \sum_{d=1}^{\hat{d}}\phi_{dji} X_{dj}\), where \( \hat{d}\) is this document and save its value to a temporary variable.
doc | A single document |
v_parameters | The variational parameters used in m-step in order to maximize model parameters |
m_parameters | Model parameters, used as output in case of online methods |
Implements ldaplusplus::em::MStepInterface< Scalar >.
Reimplemented in ldaplusplus::em::FastSupervisedMStep< Scalar >, ldaplusplus::em::SupervisedMStep< Scalar >, and ldaplusplus::em::SemiSupervisedMStep< Scalar >.
|
overridevirtual |
Normalize the temporary variable aggregated in doc_m_step() and set it to the model parameters.
parameters | Model parameters (changed after this method) |
Implements ldaplusplus::em::MStepInterface< Scalar >.
Reimplemented in ldaplusplus::em::FastSupervisedMStep< Scalar >, and ldaplusplus::em::SupervisedMStep< Scalar >.