STK++ 1.0
A LinearAAModelMixture estimates a mixture of linear auto-associative models (LAAM).
#include <STK_LinearAAModelMixture.h>
Public Member Functions | |
| LinearAAModelMixture (ILinearReduct *p_index) | |
| virtual | ~LinearAAModelMixture () |
| void | run (Integer const &maxIter=Arithmetic< Integer >::max()) |
| void | initialize (Array1D< Integer > const *p_dim) |
| void | initialize (Matrix const &weights, Array1D< Integer > const *p_dim) |
| const ILinearReduct & | index () const |
| Real const & | logLikehhod () const |
| const Array1D< IAAModel * > & | autoAssocModels () const |
| const IAAModel & | autoAssocModel (Integer const &k) const |
| const Array1D< Integer > & | dim () const |
| const Matrix & | weights () const |
| const Vector & | prop () const |
| void | EM (Integer const &maxIter=Arithmetic< Integer >::max()) |
| void | EStep () |
| void | MStep () |
| Real | computeLogLikehood () |
Static Public Member Functions | |
| static void | simul (const Law::ILawReal &law, Vector const &prop, const Array1D< Vector > &mu, Vector const &std, Array1D< Matrix > &proj, Array1D< Integer > &label, Matrix &data) |
Protected Member Functions | |
| bool | convergenceEM () |
Protected Attributes | |
| ILinearReduct * | p_index_ |
| Matrix const * | p_data_ |
| Array1D< Integer > const * | p_dim_ |
| Real | logLikehood_ |
| Array1D< IAAModel * > | autoAssoc_ |
| Matrix | weights_ |
| Array1D< Matrix * > | axis_ |
| Vector | prop_ |
Private Member Functions | |
| virtual void | allocateAutoAssoc ()=0 |
| void | create () |
| void | remove () |
| Real | normalPdf (const Point &x, Real const &sigma2) |
A LinearAAModelMixture estimates a mixture of linear auto-associative models (LAAM).
The LinearAAModelMixture class estimates the parameters of an auto-associative (AA) mixture model by maximizing the likelihood with the EM algorithm.
Definition at line 53 of file STK_LinearAAModelMixture.h.
| STK::LinearAAModelMixture::LinearAAModelMixture | ( | ILinearReduct * | p_index | ) |
| STK::LinearAAModelMixture::~LinearAAModelMixture | ( | ) | [virtual] |
| void STK::LinearAAModelMixture::simul | ( | const Law::ILawReal & | law, |
| Vector const & | prop, | ||
| const Array1D< Vector > & | mu, | ||
| Vector const & | std, | ||
| Array1D< Matrix > & | proj, | ||
| Array1D< Integer > & | label, | ||
| Matrix & | data | ||
| ) | [static] |
Simulate a mixture of centered linear auto-associative models in $\mathbb{R}^p$ of the form
\[ X = X P P' + \varepsilon \]
with $P'P = I_d$ and d < p.
| | law | the law used to simulate the data |
| | prop | the proportion of individuals in each model |
| | mu | the position parameter of each model |
| | std | the standard deviation of the Gaussian noise of each model |
| [out] | proj | the simulated projection matrices. The dimensions of each matrix give the dimension of the AA model. |
| [out] | label | the cluster number of each individual |
| [out] | data | the data to simulate. The dimensions of the container give the number of individuals and variables. |
Definition at line 69 of file STK_LinearAAModelMixture.cpp.
References STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), and STK::IContainer2D::sizeVe().
{
// get dimensions
const Integer first_cluster = prop.first(), last_cluster = prop.last();
const Integer first_ind = data.firstRow(), last_ind = data.lastRow();
const Integer nb_ind = data.sizeVe();
// simulate each clusters
Integer first, last = first_ind-1;
for (Integer k=first_cluster; k<= last_cluster; k++)
{
// compute range of the cluster
first = last+1;
if (k == last_cluster)
{ last = last_ind;}
else
{ last += Integer( nb_ind * prop[k]) +1;}
// create reference container in the range of the cluster
Matrix sub_data(data(Range(first, last)), true);
// simulate data
LinearAAModel::simul(law, mu[k], std[k], proj[k], sub_data);
// save label
label[Range(first, last)] = k;
}
}
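The range computation in the listing above can be isolated as a small standalone function. The following is only a sketch under stated assumptions: `clusterRanges` is a hypothetical helper (not part of STK++), rows are numbered from 0, and plain `std::vector` stands in for the STK++ containers. It follows the same rule as the listing: every cluster except the last receives `Integer(nb_ind * prop[k]) + 1` rows, and the last cluster takes whatever remains.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not an STK++ function: returns one inclusive
// row range [first, last] per cluster, using the rule of the listing.
std::vector<std::pair<long, long>> clusterRanges(long n, const std::vector<double>& prop)
{
  std::vector<std::pair<long, long>> ranges;
  long last = -1;  // plays the role of first_ind - 1 with first_ind = 0
  for (std::size_t k = 0; k < prop.size(); ++k)
  {
    const long first = last + 1;
    if (k + 1 == prop.size())
      last = n - 1;  // the last cluster absorbs the remaining rows
    else
      last += static_cast<long>(n * prop[k]) + 1;  // truncation, then +1 as in the listing
    ranges.emplace_back(first, last);
  }
  return ranges;
}
```

As in the listing, nothing clamps an intermediate `last` to the final row, so the proportions are assumed to leave room for the last cluster.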
| void STK::LinearAAModelMixture::run | ( | Integer const & | maxIter = Arithmetic<Integer>::max() | ) |
compute the model with the given dimension for each component of the mixture.
| maxIter | maximum number of iterations |
Definition at line 118 of file STK_LinearAAModelMixture.cpp.
References EM().
Referenced by STK::LinearAAModelMixtureManager::run().
{
// call EM algorithm
EM(maxIter);
}
| void STK::LinearAAModelMixture::initialize | ( | Array1D< Integer > const * | p_dim | ) |
compute initial values of the EM algorithm using an ad-hoc method. The number of clusters is given by the size of p_dim.
| p_dim | dimension of each auto-associative model |
Definition at line 125 of file STK_LinearAAModelMixture.cpp.
References computeLogLikehood(), create(), STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::heapSort(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), logLikehood_, STK::LocalVariance::minimal_distance_, MStep(), p_data_, p_dim_, prop_, STK::AAModelFactory::run(), STK::IContainer1D::size(), STK::IContainer2D::sizeVe(), and weights_.
Referenced by STK::LinearAAModelMixtureManager::run().
{
// set dimensions
p_dim_ = p_dim;
// create weights_ vectors
create();
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
// compute constants
const Real inv_nclust = 1./Real(p_dim_->size());
// number of individuals in each cluster
const Integer prop_ind = p_data_->sizeVe()/ p_dim_->size();
// initialize proportions
prop_ = inv_nclust;
// create an Index with local variance for the data set
LocalVariance init_ind(p_data_, LocalVariance::minimal_distance_);
// create a temporary linear AA model
LinearAAModel init_aamm(&init_ind);
init_aamm.run(1); // compute the main axis
// sort the projected
Array1D<Integer> index_res(init_aamm.reduced().rangeVe());
heapSort(index_res, init_aamm.reduced()[1]);
Integer first_ind_k, last_ind_k = first_ind-1;
for (Integer k= first_cluster; k <= last_cluster; k++)
{
first_ind_k = last_ind_k + 1;
last_ind_k += prop_ind;
if (k == last_cluster) last_ind_k = last_ind;
for (Integer i = first_ind_k; i<= last_ind_k; i++)
{
weights_(index_res[i]) = 0.0;
weights_(index_res[i], k) = 1.0;
}
}
MStep();
// // create weights vector with initial value 1/n
// Vector weight(p_data_->rangeVe(), inv_nobs);
// Array1D<Integer> index_res(p_data_->rangeVe());
// Vector res2(p_data_->rangeVe());
// for (Integer k= first_cluster; k <= last_cluster; k++)
// {
// // find kth axis
// autoAssoc_[k]->run(&weight, (*p_dim_)[k]);
// // compute the distance to the model
// for (Integer i = first_ind; i<= last_ind; i++)
// res2[i] = normTwo(autoAssoc_[k]->residuals()(i));
// // sort residuals
// heapSort(index_res, res2);
// // look at the nearest individuals and set weights 0.
// for (Integer i=last_ind; i>last_ind-prop_ind; i--)
// weight[index_res[i]] = 0.0;
// // renormalize weights
// weight /= sum(weight);
// }
// compute log likehood
logLikehood_ = computeLogLikehood();
}
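The ad-hoc initialization in the listing above sorts the individuals along the first axis and gives each cluster a contiguous block of the sorted order, with weights 0 or 1. A minimal standalone sketch of that idea follows. `initialHardWeights` is a hypothetical name, `std::sort` is used in place of `STK::heapSort`, and the score vector stands in for the first reduced coordinate.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper, not an STK++ function: rank the individuals by a
// one-dimensional score, then assign blocks of n / K consecutive ranks to
// each of the K clusters. The last cluster also takes the remainder.
std::vector<std::vector<double>> initialHardWeights(const std::vector<double>& score,
                                                    std::size_t K)
{
  const std::size_t n = score.size();
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&score](std::size_t a, std::size_t b) { return score[a] < score[b]; });
  // individuals per cluster; at least 1 to avoid a division by zero below
  const std::size_t block = std::max<std::size_t>(1, n / K);
  std::vector<std::vector<double>> w(n, std::vector<double>(K, 0.0));
  for (std::size_t r = 0; r < n; ++r)
  {
    const std::size_t k = std::min(r / block, K - 1);  // cluster of rank r
    w[order[r]][k] = 1.0;                              // hard assignment
  }
  return w;
}
```

With five scores and K = 2, the two smallest scores go to the first cluster and the three largest to the second.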
| void STK::LinearAAModelMixture::initialize | ( | Matrix const & | weights, |
| Array1D< Integer > const * | p_dim | ||
| ) |
set initial values of the EM algorithm using the given weights. The number of clusters is given by the size of p_dim.
| p_dim | dimension of each auto-associative model |
| weights | initial weights |
Definition at line 208 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, axis_, computeLogLikehood(), create(), STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), logLikehood_, MStep(), STK::normTwo2(), p_data_, p_dim_, prop_, STK::sum(), STK::Stat::variance(), weights(), and weights_.
{
// set dimensions
p_dim_ = p_dim;
// create weights_ vectors
create();
// copy weights
weights_ = weights;
// evaluate parameters
MStep();
// compute likehood
logLikehood_ = computeLogLikehood();
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
for (Integer k=first_cluster; k<= last_cluster; k++)
{
std::cout << "k= " << k << "\n";
std::cout << "prop = " << prop_[k] << "\n";
// std::cout << "mean = " << autoAssoc_[k]->mean() << "\n";
std::cout << "residual variance = " << STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals()))) << "\n";
std::cout << "axis =\n" << *(axis_[k]) << "\n";
std::cout << "residuals =\n";
for (Integer i = first_ind; i<= last_ind; i++)
{
std::cout << normTwo2(*(autoAssoc_[k]->p_residuals())(i)) << "\n";
}
}
}
| const ILinearReduct& STK::LinearAAModelMixture::index | ( | ) | const [inline] |
get the index
Definition at line 134 of file STK_LinearAAModelMixture.h.
Referenced by STK::LinearAAModelMixtureManager::save().
{ return *p_index_;}
| Real const& STK::LinearAAModelMixture::logLikehhod | ( | ) | const [inline] |
get the log-likelihood
Definition at line 138 of file STK_LinearAAModelMixture.h.
{ return logLikehood_;}
| const Array1D<IAAModel*>& STK::LinearAAModelMixture::autoAssocModels | ( | ) | const [inline] |
get the auto-associative models
Definition at line 142 of file STK_LinearAAModelMixture.h.
{ return autoAssoc_;}
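| const IAAModel& STK::LinearAAModelMixture::autoAssocModel | ( | Integer const & | k | ) | const [inline] |
get the kth auto-associative model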
Definition at line 146 of file STK_LinearAAModelMixture.h.
Referenced by STK::LinearAAModelMixtureManager::save().
{ return *(autoAssoc_[k]);}
| const Array1D<Integer>& STK::LinearAAModelMixture::dim | ( | ) | const [inline] |
get the dimension of each model
Definition at line 150 of file STK_LinearAAModelMixture.h.
{ return *p_dim_;}
| const Matrix& STK::LinearAAModelMixture::weights | ( | ) | const [inline] |
get the weights of the individuals for each model
Definition at line 154 of file STK_LinearAAModelMixture.h.
Referenced by initialize().
{ return weights_;}
| const Vector& STK::LinearAAModelMixture::prop | ( | ) | const [inline] |
get the proportions of each model
Definition at line 158 of file STK_LinearAAModelMixture.h.
{ return prop_;}
| void STK::LinearAAModelMixture::EM | ( | Integer const & | maxIter = Arithmetic<Integer>::max() | ) |
compute the model using the EM algorithm.
| maxIter | maximum number of iterations |
Definition at line 241 of file STK_LinearAAModelMixture.cpp.
References convergenceEM(), EStep(), logLikehood_, MStep(), prop_, and weights_.
Referenced by run().
{
Integer iter = 0;
std::cout << "maxIter = " << maxIter << "\n";
std::cout << "iter = " << iter << ", log likehood = " << logLikehood_ << "\n";
std::cout << "prop =" << prop_ << "\n";
std::cout << "weights =\n" << weights_ << "\n";
if (maxIter <= 0) return;
// main steps
do
{
EStep();
MStep();
iter++;
std::cout << "iter = " << iter << ", log likehood = " << logLikehood_ << "\n";
std::cout << "prop =" << prop_ << "\n";
std::cout << "weights =\n" << weights_ << "\n";
}
while (!convergenceEM() && iter < maxIter);
}
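Stripped of its console output, the control flow of the listing above is a do/while loop that always performs at least one E step and one M step when maxIter is positive. The sketch below reproduces only that control flow. `runEM` and its callable parameters are hypothetical names introduced for illustration, and it returns the number of iterations performed, which the original method does not.

```cpp
#include <functional>

// Hypothetical driver, not an STK++ function: same loop structure as EM(),
// with the steps and the convergence test passed in as callables.
int runEM(const std::function<void()>& eStep,
          const std::function<void()>& mStep,
          const std::function<bool()>& converged,
          int maxIter)
{
  int iter = 0;
  if (maxIter <= 0) return iter;  // nothing to do
  do
  {
    eStep();  // update the weights
    mStep();  // update the parameters
    ++iter;
  }
  while (!converged() && iter < maxIter);
  return iter;
}
```

Because the convergence test is evaluated after each iteration, a run that converges immediately still performs one full E/M pass.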
| void STK::LinearAAModelMixture::EStep | ( | ) |
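compute the E step of the EM algorithm. The weights are computed using the formula
\[ w_{ik} = \frac{\pi_k \, f(r_{ik}; \sigma^2)}{\sum_{j=1}^{K} \pi_j \, f(r_{ij}; \sigma^2)} \]
where $\pi_k$ is the proportion of model k (prop_), $r_{ik}$ is the residual of individual i under model k, $f(\cdot\,; \sigma^2)$ is the density computed by normalPdf(), and $\sigma^2$ is the residual variance averaged over the K models.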
Definition at line 297 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), normalPdf(), p_data_, p_dim_, prop_, STK::IContainer1D::size(), STK::sum(), STK::Stat::variance(), and weights_.
Referenced by EM().
{
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
// compute common variance
Real variance = 0.0;
for (Integer k=first_cluster; k<=last_cluster; k++)
{
variance += STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals())));
}
variance /= p_dim_->size();
// compute unnormalized weights
for (Integer k=first_cluster; k<=last_cluster; k++)
{
// get residual variance of the kth model
//Real variance = autoAssoc_[k]->residualVariance();
// compute w(i, k)
for (Integer i = first_ind; i<= last_ind; i++)
{
// compute weight
weights_(i, k) = prop_[k]
* normalPdf(*(autoAssoc_[k]->p_residuals())(i), variance );
}
}
// normalize weights
for (Integer i = first_ind; i <= last_ind; i++)
{
// normalize weights
weights_(i) /= sum(weights_(i));
}
}
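The weight update of the listing above can be sketched without the STK++ containers. In this sketch `eStepWeights` and the `Table` alias are hypothetical names, and the input is assumed to be the squared norm of each residual rather than the residual itself. Because the variance is common to all models, the constant factors of the density cancel in the row normalization, so only the exponential term is kept.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<double>>;  // hypothetical n x K table

// Hypothetical helper, not an STK++ function.
// res2[i][k]: squared norm of the residual of individual i under model k.
// prop[k]:    proportion of model k.
// sigma2:     common residual variance.
Table eStepWeights(const Table& res2, const std::vector<double>& prop, double sigma2)
{
  Table w(res2.size(), std::vector<double>(prop.size(), 0.0));
  for (std::size_t i = 0; i < res2.size(); ++i)
  {
    double rowSum = 0.0;
    for (std::size_t k = 0; k < prop.size(); ++k)
    {
      // unnormalized weight: proportion times the Gaussian kernel
      w[i][k] = prop[k] * std::exp(-0.5 * res2[i][k] / sigma2);
      rowSum += w[i][k];
    }
    // normalize so that the weights of individual i sum to one
    for (std::size_t k = 0; k < prop.size(); ++k) w[i][k] /= rowSum;
  }
  return w;
}
```

The sketch assumes that at least one kernel value per row is nonzero. If every term underflows, the division yields NaN, as it would in the original loop.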
| void STK::LinearAAModelMixture::MStep | ( | ) |
compute the M step of the EM algorithm.
Definition at line 333 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, STK::ILinearReduct::axis(), axis_, STK::IContainer1D::first(), STK::IContainer1D::last(), p_data_, p_dim_, p_index_, prop_, STK::IContainer2D::sizeVe(), STK::sum(), and weights_.
Referenced by EM(), and initialize().
{
// get dimensions
const Integer first_cluster = p_dim_->first();
const Integer last_cluster = p_dim_->last();
const Real inv_nobs = 1./Real(p_data_->sizeVe());
// perform M running each AA model with the current weights
for (Integer k=first_cluster; k<= last_cluster; k++)
{
// compute proportions
prop_[k] = sum(weights_[k])*inv_nobs;
// run the kth AAM
Vector weights_k(weights_[k], true);
autoAssoc_[k]->run(&weights_k, (*p_dim_)[k]);
// save the kth axis of projection
*(axis_[k]) = p_index_->axis();
}
}
| Real STK::LinearAAModelMixture::computeLogLikehood | ( | ) |
compute the log-likelihood of the mixture
Definition at line 272 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), normalPdf(), p_data_, p_dim_, prop_, STK::sum(), and STK::Stat::variance().
Referenced by convergenceEM(), and initialize().
{
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
Real sum1 = 0.0;
for (Integer i = first_ind; i<= last_ind; i++)
{
Real sum2 = 0.0;
for (Integer k=first_cluster; k<= last_cluster; k++)
{
// get residual variance of the kth model
Real variance = STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals())));
sum2 += prop_[k]
*normalPdf(*(autoAssoc_[k]->p_residuals())(i), variance );
}
sum1 += log(double(sum2));
}
return sum1;
}
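The double loop above evaluates $\ell = \sum_i \log \sum_k \pi_k f_{ik}$, where $f_{ik}$ is the density of individual i under model k. The sketch below computes the same sum from a precomputed table of densities. `mixtureLogLikelihood` and `Table` are hypothetical names, and how the densities are obtained is left to the caller.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Table = std::vector<std::vector<double>>;  // hypothetical n x K table

// Hypothetical helper, not an STK++ function.
// dens[i][k]: density of individual i under model k.
// prop[k]:    proportion of model k.
double mixtureLogLikelihood(const Table& dens, const std::vector<double>& prop)
{
  double total = 0.0;
  for (const auto& row : dens)
  {
    double mix = 0.0;  // mixture density of the current individual
    for (std::size_t k = 0; k < prop.size(); ++k) mix += prop[k] * row[k];
    total += std::log(mix);
  }
  return total;
}
```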
| bool STK::LinearAAModelMixture::convergenceEM | ( | ) | [protected] |
check convergence of the EM algorithm.
Returns: true if the algorithm has converged, false otherwise.
Definition at line 263 of file STK_LinearAAModelMixture.cpp.
References STK::abs(), computeLogLikehood(), and logLikehood_.
Referenced by EM().
{
Real new_l = computeLogLikehood();
bool res = (abs((new_l - logLikehood_)/logLikehood_) < Arithmetic<Real>::epsilon());
logLikehood_ = new_l;
return res;
}
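The stopping rule above compares the relative change of the log-likelihood to the machine epsilon. It can be written as a free function, sketched here with a hypothetical name (`hasConverged`) and with the tolerance exposed as a parameter, which the original method does not offer.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper, not an STK++ function: relative-change test on the
// log-likelihood, with the machine epsilon as the default tolerance.
bool hasConverged(double oldLogLik, double newLogLik,
                  double tol = std::numeric_limits<double>::epsilon())
{
  return std::fabs((newLogLik - oldLogLik) / oldLogLik) < tol;
}
```

As in the listing, the ratio is undefined when the previous log-likelihood is exactly zero, and the default tolerance is strict enough that the iteration cap will often be what stops the loop.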
| void STK::LinearAAModelMixture::allocateAutoAssoc | ( | ) | [private, pure virtual] |
allocate the AA models wanted by the user.
Definition at line 102 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, STK::IContainer1D::first(), STK::IContainer1D::last(), STK::IReduct::p_data(), p_dim_, and p_index_.
Referenced by create().
{
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
// create weights_with same value and Axis container
for (Integer k=first_cluster; k<= last_cluster; k++)
{
autoAssoc_[k] = new LinearAAModel(p_index_->p_data());
autoAssoc_[k]->setReductor(p_index_);
}
}
| void STK::LinearAAModelMixture::create | ( | ) | [private] |
Set dimensions and allocate the containers
Definition at line 354 of file STK_LinearAAModelMixture.cpp.
References allocateAutoAssoc(), autoAssoc_, axis_, STK::IContainer1D::first(), STK::IContainer1D::last(), p_data_, p_dim_, prop_, STK::IContainer1D::range(), STK::IContainer2D::rangeHo(), STK::IContainer2D::rangeVe(), STK::IContainer2D::resize(), STK::IContainer1D::resize(), and weights_.
Referenced by initialize().
{
// remove any existing weights_
remove();
// resize proportion vector
prop_.resize(p_dim_->range());
// resize weights vector
weights_.resize(p_data_->rangeVe(), p_dim_->range());
// resize axis vector
axis_.resize(p_dim_->range());
// create AutoAssociative models
autoAssoc_.resize(p_dim_->range());
allocateAutoAssoc();
// get dimensions
const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
const Range range_var = p_data_->rangeHo();
// create weights_with same value and Axis container
for (Integer k=first_cluster; k<= last_cluster; k++)
{
axis_[k] = new Matrix(range_var, Range((*p_dim_)[k]));
}
}
| void STK::LinearAAModelMixture::remove | ( | ) | [private] |
remove containers
Definition at line 380 of file STK_LinearAAModelMixture.cpp.
References autoAssoc_, axis_, STK::IContainer1D::first(), and STK::IContainer1D::last().
{
// get dimensions
Integer first = axis_.first(), last = axis_.last();
// remove each axis
for (Integer k = first; k<= last; k++)
{
if (axis_[k]) delete axis_[k];
axis_[k] = 0;
}
// get dimensions
first = autoAssoc_.first(); last = autoAssoc_.last();
// remove each cluster
for (Integer k = first; k<= last; k++)
{
if (autoAssoc_[k]) delete autoAssoc_[k];
autoAssoc_[k] = 0;
}
}
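| Real STK::LinearAAModelMixture::normalPdf | ( | const Point & | x, |
| Real const & | sigma2 | ||
| ) | [private] |
compute the pdf of the centered multivariate normal distribution. As coded in the listing below, the value returned for a point x is
\[ f(x) = \frac{c}{\sqrt{p\,\sigma^2}} \exp\left(-\frac{\|x\|^2}{2\sigma^2}\right) \]
with c = Const::ONE_SQRTPI2 and p = x.size().
| x | the Point at which to compute the normal pdf |
| sigma2 | the variance of the pdf |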
Definition at line 407 of file STK_LinearAAModelMixture.cpp.
References STK::normTwo2(), STK::Const::ONE_SQRTPI2, and STK::IContainer1D::size().
Referenced by computeLogLikehood(), and EStep().
{
return Const::ONE_SQRTPI2 * exp(-0.5 * normTwo2(x)/ sigma2) / sqrt(sigma2 * x.size());
}
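The one-line body above can be checked numerically with a standalone version. This is only a sketch: `normalPdfSketch` is a hypothetical name, `std::vector<double>` replaces `Point`, and `Const::ONE_SQRTPI2` is assumed to equal $1/\sqrt{2\pi}$, which this page does not confirm.

```cpp
#include <cmath>
#include <vector>

// Hypothetical stand-in for normalPdf(), not an STK++ function.
double normalPdfSketch(const std::vector<double>& x, double sigma2)
{
  // assumed value of Const::ONE_SQRTPI2: 1 / sqrt(2 * pi)
  const double oneSqrt2Pi = 1.0 / std::sqrt(2.0 * std::acos(-1.0));
  double norm2 = 0.0;  // squared Euclidean norm of x
  for (double v : x) norm2 += v * v;
  return oneSqrt2Pi * std::exp(-0.5 * norm2 / sigma2)
         / std::sqrt(sigma2 * static_cast<double>(x.size()));
}
```

Under that assumption, a point of size 1 gives the usual univariate normal density, for example about 0.3989 at x = 0 with sigma2 = 1.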
ILinearReduct* STK::LinearAAModelMixture::p_index_ [protected] |
The input index used to find the axes. This index is shared by all the IAAModel objects, so its axes and values are overwritten on each run unless they are saved.
Definition at line 61 of file STK_LinearAAModelMixture.h.
Referenced by allocateAutoAssoc(), and MStep().
Matrix const* STK::LinearAAModelMixture::p_data_ [protected] |
A pointer to the input data set associated with the index.
Definition at line 64 of file STK_LinearAAModelMixture.h.
Referenced by computeLogLikehood(), create(), EStep(), initialize(), and MStep().
Array1D<Integer> const* STK::LinearAAModelMixture::p_dim_ [protected] |
The dimension of each model. The size of this container also gives the number of clusters.
Definition at line 69 of file STK_LinearAAModelMixture.h.
Referenced by allocateAutoAssoc(), computeLogLikehood(), create(), EStep(), initialize(), and MStep().
Real STK::LinearAAModelMixture::logLikehood_ [protected] |
value of the log-likelihood
Definition at line 186 of file STK_LinearAAModelMixture.h.
Referenced by convergenceEM(), EM(), and initialize().
Array1D<IAAModel*> STK::LinearAAModelMixture::autoAssoc_ [protected] |
Array of the auto-associative models
Definition at line 189 of file STK_LinearAAModelMixture.h.
Referenced by allocateAutoAssoc(), computeLogLikehood(), create(), EStep(), initialize(), MStep(), and remove().
Matrix STK::LinearAAModelMixture::weights_ [protected] |
Matrix of the weights. An array of size (n, K), where n is the number of individuals and K is the number of clusters, given by the size of p_dim_.
Definition at line 192 of file STK_LinearAAModelMixture.h.
Referenced by create(), EM(), EStep(), initialize(), and MStep().
Array1D<Matrix*> STK::LinearAAModelMixture::axis_ [protected] |
Array of the axes. This array contains a physical copy of the axes computed for each model.
Definition at line 195 of file STK_LinearAAModelMixture.h.
Referenced by create(), initialize(), MStep(), and remove().
Vector STK::LinearAAModelMixture::prop_ [protected] |
Vector of the proportions
Definition at line 197 of file STK_LinearAAModelMixture.h.
Referenced by computeLogLikehood(), create(), EM(), EStep(), initialize(), and MStep().