LDA++
#include <LDA.hpp>
Public Member Functions | |
| LDA (std::shared_ptr< parameters::Parameters > model_parameters, std::shared_ptr< em::EStepInterface< Scalar > > e_step, std::shared_ptr< em::MStepInterface< Scalar > > m_step, size_t iterations=20, size_t workers=1) | |
| LDA (LDA &&lda) | |
| void | fit (const Eigen::MatrixXi &X, const Eigen::VectorXi &y) |
| void | fit (const Eigen::MatrixXi &X) |
| void | partial_fit (const Eigen::MatrixXi &X, const Eigen::VectorXi &y) |
| void | partial_fit (std::shared_ptr< corpus::Corpus > corpus) |
| MatrixX | transform (const Eigen::MatrixXi &X) |
| MatrixX | decision_function (const Eigen::MatrixXi &X) |
| Eigen::VectorXi | predict (const Eigen::MatrixXi &X) |
| std::tuple< MatrixX, Eigen::VectorXi > | transform_predict (const Eigen::MatrixXi &X) |
| std::shared_ptr< events::EventDispatcherInterface > | get_event_dispatcher () |
| const std::shared_ptr< parameters::Parameters > | model_parameters () |
| template<typename P > | |
| const std::shared_ptr< P > | model_parameters () |
Protected Member Functions | |
| std::shared_ptr< corpus::Corpus > | get_corpus (const Eigen::MatrixXi &X, const Eigen::VectorXi &y) |
| std::shared_ptr< corpus::Corpus > | get_corpus (const Eigen::MatrixXi &X) |
| void | create_worker_pool () |
| void | destroy_worker_pool () |
| void | process_worker_events () |
| std::tuple< std::shared_ptr< parameters::Parameters >, size_t > | extract_vp_from_queue () |
| void | doc_e_step_worker () |
| MatrixX | decision_function (const MatrixX &X) |
| Eigen::VectorXi | predict (const MatrixX &scores) |
LDA contains the logic that combines an expectation step, a maximization step, and a set of model parameters to train and use an LDA model.
| ldaplusplus::LDA< Scalar >::LDA | ( | std::shared_ptr< parameters::Parameters > | model_parameters, |
| std::shared_ptr< em::EStepInterface< Scalar > > | e_step, | ||
| std::shared_ptr< em::MStepInterface< Scalar > > | m_step, | ||
| size_t | iterations = 20, | ||
| size_t | workers = 1 | ||
| ) |
Create an LDA with the given model parameters, expectation and maximization steps, number of iterations, and number of worker threads. The iterations and workers arguments default to 20 and 1 respectively.
| model_parameters | A pointer to a struct containing the model parameters (for instance ModelParameters and SupervisedModelParameters) |
| e_step | A pointer to an expectation step implementation |
| m_step | A pointer to a maximization step implementation |
| iterations | The number of epochs to run when using LDA::fit |
| workers | The number of worker threads to create for computing the expectation step |
| ldaplusplus::LDA< Scalar >::LDA | ( | LDA< Scalar > && | lda | ) |
Move constructor that does not try to copy or move the mutexes.
| void ldaplusplus::LDA< Scalar >::create_worker_pool | ( | ) | protected |
Create a worker thread pool.
| LDA< Scalar >::MatrixX ldaplusplus::LDA< Scalar >::decision_function | ( | const Eigen::MatrixXi & | X | ) |
Treat SupervisedModelParameters::eta as a linear model and compute the distances of the documents from its planes in the topic space.
Use LDA::transform to obtain the \(\gamma\) for every document and then treat the \(\eta\) parameters of the SupervisedModelParameters as a linear model. Compute the dot product between the normal vectors and the normalized topic mixtures of each document. The more positive the value for a given class, the more confident the model is that the document belongs to that class.
| X | The word counts in column-major order |
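The computation described above can be sketched without the library. The following is a minimal illustration, not the LDA++ implementation: it assumes plain std::vector containers instead of Eigen matrices, and the names Vec, normalize and decision_scores are hypothetical. Each \(\gamma\) is normalized to a topic mixture and then multiplied with every class's normal vector from \(\eta\).

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for the library's Eigen types: one Vec holds one value
// per topic, either a document's gamma or a class's normal vector from eta.
using Vec = std::vector<double>;

// Normalize a document's gamma so that it sums to one (its topic mixture).
Vec normalize(const Vec &gamma) {
    const double total = std::accumulate(gamma.begin(), gamma.end(), 0.0);
    assert(total > 0.0);
    Vec mixture(gamma.size());
    for (std::size_t k = 0; k < gamma.size(); ++k) {
        mixture[k] = gamma[k] / total;
    }
    return mixture;
}

// scores[d][c] is the dot product of class c's normal vector with the
// normalized topic mixture of document d.
std::vector<Vec> decision_scores(const std::vector<Vec> &eta,
                                 const std::vector<Vec> &gammas) {
    std::vector<Vec> scores;
    for (const Vec &gamma : gammas) {
        const Vec mixture = normalize(gamma);
        Vec row;
        for (const Vec &normal : eta) {
            assert(normal.size() == mixture.size());
            row.push_back(std::inner_product(normal.begin(), normal.end(),
                                             mixture.begin(), 0.0));
        }
        scores.push_back(row);
    }
    return scores;
}
```

In this sketch a document with \(\gamma = (3, 1)\) has the topic mixture (0.75, 0.25), so with one class per topic axis its scores are 0.75 and 0.25.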
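| LDA< Scalar >::MatrixX ldaplusplus::LDA< Scalar >::decision_function | ( | const MatrixX & | X | ) | protected |
Implement the decision function using already transformed data, that is, topic representations instead of bag-of-words (BOW) counts.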
| X | The \(\gamma\) variational parameter for each document in a column-major ordered matrix. |
| void ldaplusplus::LDA< Scalar >::destroy_worker_pool | ( | ) | protected |
Destroy the worker thread pool.
| void ldaplusplus::LDA< Scalar >::doc_e_step_worker | ( | ) | protected |
A doc_e_step worker thread.
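| std::tuple< std::shared_ptr< parameters::Parameters >, size_t > ldaplusplus::LDA< Scalar >::extract_vp_from_queue | ( | ) | protected |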
Extract the variational parameters and the document index from the worker queue.
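The reason a document index travels together with each result is that worker threads finish in an unpredictable order. The sketch below illustrates that idea only; it is an assumption about how such a pool can be organized, not the library's code. The names toy_e_step and run_pool are hypothetical, and a double stands in for the variational parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <tuple>
#include <vector>

// Hypothetical per-document work standing in for a document expectation step.
double toy_e_step(std::size_t doc) { return 2.0 * static_cast<double>(doc); }

std::vector<double> run_pool(std::size_t n_docs, std::size_t n_workers) {
    assert(n_workers > 0);
    std::mutex in_mutex, out_mutex;
    std::queue<std::size_t> in;                       // document indices to process
    std::queue<std::tuple<double, std::size_t>> out;  // (result, document index)
    for (std::size_t d = 0; d < n_docs; ++d) in.push(d);

    // Each worker pops an index, computes its result and queues both together.
    auto worker = [&]() {
        for (;;) {
            std::size_t doc;
            {
                std::lock_guard<std::mutex> lock(in_mutex);
                if (in.empty()) return;
                doc = in.front();
                in.pop();
            }
            const double result = toy_e_step(doc);
            std::lock_guard<std::mutex> lock(out_mutex);
            out.emplace(result, doc);
        }
    };

    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < n_workers; ++w) pool.emplace_back(worker);
    for (std::thread &t : pool) t.join();

    // Results arrive in arbitrary order; the index puts each one in its place.
    std::vector<double> results(n_docs, 0.0);
    while (!out.empty()) {
        double value;
        std::size_t doc;
        std::tie(value, doc) = out.front();
        out.pop();
        results[doc] = value;
    }
    return results;
}
```

Whatever order the threads finish in, results[d] ends up holding the value computed for document d.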
| void ldaplusplus::LDA< Scalar >::fit | ( | const Eigen::MatrixXi & | X, |
| const Eigen::VectorXi & | y | ||
| ) |
Compute a supervised topic model for word counts X and classes y.
Perform as many EM iterations as configured and stop when reaching max_iter_ or any other stopping criterion.
An EigenClassificationCorpus will be created from the passed parameters.
| X | The word counts in column-major order |
| y | The classes as integers |
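The relationship between fit and partial_fit can be pictured as a loop. The toy class below is a hypothetical illustration, not the library class: it assumes that fit simply repeats a single EM iteration the configured number of times, and it ignores any additional stopping criterion.

```cpp
#include <cassert>
#include <cstddef>

// Toy model (hypothetical) that only counts how many EM iterations were run.
class ToyModel {
public:
    explicit ToyModel(std::size_t iterations = 20) : iterations_(iterations) {
        assert(iterations > 0);
    }

    // One EM iteration: an expectation step followed by a maximization step.
    void partial_fit() { ++em_passes_; }

    // Run the configured number of EM iterations.
    void fit() {
        for (std::size_t i = 0; i < iterations_; ++i) {
            partial_fit();
        }
    }

    std::size_t em_passes() const { return em_passes_; }

private:
    std::size_t iterations_;
    std::size_t em_passes_ = 0;
};
```

Under this assumption, calling fit on a model configured with 20 iterations performs the same work as 20 calls to partial_fit on the same data.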
| void ldaplusplus::LDA< Scalar >::fit | ( | const Eigen::MatrixXi & | X | ) |
Compute an unsupervised topic model for word counts X.
Perform as many EM iterations as configured and stop when reaching max_iter_ or any other stopping criterion.
An EigenCorpus will be created from the passed parameters.
| X | The word counts in column-major order |
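| std::shared_ptr< corpus::Corpus > ldaplusplus::LDA< Scalar >::get_corpus | ( | const Eigen::MatrixXi & | X, |
| const Eigen::VectorXi & | y | ||
| ) | protected |
Generate a Corpus from a word count matrix X and a class vector y.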
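| std::shared_ptr< corpus::Corpus > ldaplusplus::LDA< Scalar >::get_corpus | ( | const Eigen::MatrixXi & | X | ) | protected |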
Generate a Corpus from just the word count matrix.
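| std::shared_ptr< events::EventDispatcherInterface > ldaplusplus::LDA< Scalar >::get_event_dispatcher | ( | ) | inline |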
Get the event dispatcher for this LDA instance.
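| const std::shared_ptr< parameters::Parameters > ldaplusplus::LDA< Scalar >::model_parameters | ( | ) | inline |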
Get a constant reference to the model's parameters.
| void ldaplusplus::LDA< Scalar >::partial_fit | ( | const Eigen::MatrixXi & | X, |
| const Eigen::VectorXi & | y | ||
| ) |
Perform a single EM iteration.
An EigenClassificationCorpus will be created from the passed parameters.
| X | The word counts in column-major order |
| y | The classes as integers |
| void ldaplusplus::LDA< Scalar >::partial_fit | ( | std::shared_ptr< corpus::Corpus > | corpus | ) |
Perform a single EM iteration.
| corpus | The implementation of Corpus that contains the observed variables. |
| Eigen::VectorXi ldaplusplus::LDA< Scalar >::predict | ( | const Eigen::MatrixXi & | X | ) |
Use the model to predict the class indexes for the word counts X.
Use LDA::decision_function to get class scores and then compute the argmax for every document.
| X | The word counts in column-major order |
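The argmax step mentioned above can be sketched as follows. This is a minimal illustration under assumptions, not the library's implementation: scores are held in nested std::vector containers with one row per document, and argmax_per_document is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return, for every document, the index of the class with the highest score.
// scores[d][c] is the score of class c for document d (hypothetical layout).
std::vector<int> argmax_per_document(
        const std::vector<std::vector<double>> &scores) {
    std::vector<int> classes;
    classes.reserve(scores.size());
    for (const std::vector<double> &row : scores) {
        assert(!row.empty());
        std::size_t best = 0;
        for (std::size_t c = 1; c < row.size(); ++c) {
            if (row[c] > row[best]) best = c;  // ties keep the lowest index
        }
        classes.push_back(static_cast<int>(best));
    }
    return classes;
}
```

The tie-breaking rule (lowest class index wins) is a choice made for this sketch only.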
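| Eigen::VectorXi ldaplusplus::LDA< Scalar >::predict | ( | const MatrixX & | scores | ) | protected |
Transform the output of the decision function into class predictions.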
| void ldaplusplus::LDA< Scalar >::process_worker_events | ( | ) | inline protected |
Forward the events generated in the worker threads to this instance's event dispatcher, from within the calling thread.
| LDA< Scalar >::MatrixX ldaplusplus::LDA< Scalar >::transform | ( | const Eigen::MatrixXi & | X | ) |
Run the expectation step and return the topic mixtures for the documents defined by the word counts X.
| X | The word counts in column-major order |
| std::tuple< typename LDA< Scalar >::MatrixX, Eigen::VectorXi > ldaplusplus::LDA< Scalar >::transform_predict | ( | const Eigen::MatrixXi & | X | ) |
Return both the transformed data and the class predictions, in that order in the tuple, using a single LDA expectation step.
| X | The word counts in column-major order |