STK++ 1.0

STK::LinearAAModelMixture Class Reference

A LinearAAModelMixture estimates a mixture of linear auto-associative models (LAAM).

#include <STK_LinearAAModelMixture.h>

Public Member Functions

 LinearAAModelMixture (ILinearReduct *p_index)
virtual ~LinearAAModelMixture ()
void run (Integer const &maxIter=Arithmetic< Integer >::max())
void initialize (Array1D< Integer > const *p_dim)
void initialize (Matrix const &weights, Array1D< Integer > const *p_dim)
const ILinearReduct & index () const
Real const & logLikehhod () const
const Array1D< IAAModel * > & autoAssocModels () const
const IAAModel & autoAssocModel (Integer const &k) const
const Array1D< Integer > & dim () const
const Matrix & weights () const
const Vector & prop () const
void EM (Integer const &maxIter=Arithmetic< Integer >::max())
void EStep ()
void MStep ()
Real computeLogLikehood ()

Static Public Member Functions

static void simul (const Law::ILawReal &law, Vector const &prop, const Array1D< Vector > &mu, Vector const &std, Array1D< Matrix > &proj, Array1D< Integer > &label, Matrix &data)

Protected Member Functions

bool convergenceEM ()

Protected Attributes

ILinearReduct * p_index_
Matrix const * p_data_
Array1D< Integer > const * p_dim_
Real logLikehood_
Array1D< IAAModel * > autoAssoc_
Matrix weights_
Array1D< Matrix * > axis_
Vector prop_

Private Member Functions

virtual void allocateAutoAssoc ()=0
void create ()
void remove ()
Real normalPdf (const Point &x, Real const &sigma2)

Detailed Description

A LinearAAModelMixture estimates a mixture of linear auto-associative models (LAAM).

The LinearAAModelMixture class estimates the parameters of an AA mixture model by maximizing the likelihood with the EM algorithm.

Definition at line 53 of file STK_LinearAAModelMixture.h.


Constructor & Destructor Documentation

STK::LinearAAModelMixture::LinearAAModelMixture ( ILinearReduct *  p_index)

Constructor. Computes a mixture of linear AA models of the data matrix, using an Index as the criterion for choosing the axes.

Parameters:
p_index  the Index to maximize

Definition at line 44 of file STK_LinearAAModelMixture.cpp.

                                          : p_index_(p_index)
                                          , p_data_(p_index_->p_data())
                                          , p_dim_(0)
{ }
STK::LinearAAModelMixture::~LinearAAModelMixture ( ) [virtual]

Destructor. Releases the axis matrices and the AA models through remove().

Definition at line 50 of file STK_LinearAAModelMixture.cpp.

{   remove();
}

Member Function Documentation

void STK::LinearAAModelMixture::simul ( const Law::ILawReal &  law,
Vector const &  prop,
const Array1D< Vector > &  mu,
Vector const &  std,
Array1D< Matrix > &  proj,
Array1D< Integer > &  label,
Matrix &  data
) [static]

Simulate a mixture of centered linear auto-associative models in $ \mathbb{R}^p $ of the form

\[ X = X.P.P' + \epsilon \]

with $ P'P = I_d $ and d < p.

Parameters:
law  the law used to simulate the data
prop  the proportion of individuals in each model
mu  the position parameters of each model
std  the standard deviations of the Gaussian noise of each model
[out] proj  the simulated projection matrices. The dimensions of each matrix give the dimension of the AA model.
[out] label  the cluster number of each individual
[out] data  the simulated data. The dimensions of the container give the number of individuals and variables.

Definition at line 69 of file STK_LinearAAModelMixture.cpp.

References STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), and STK::IContainer2D::sizeVe().

{
  // get dimensions
  const Integer first_cluster = prop.first(), last_cluster = prop.last();
  const Integer first_ind = data.firstRow(), last_ind = data.lastRow();
  const Integer nb_ind = data.sizeVe();
  // simulate each clusters
  Integer first, last = first_ind-1;
  for (Integer k=first_cluster; k<= last_cluster; k++)
  {
    // compute range of the cluster
    first = last+1;
    if (k == last_cluster)
    { last = last_ind;}
    else
    { last += Integer( nb_ind * prop[k]) +1;}
    // create reference container in the range of the cluster
    Matrix sub_data(data(Range(first, last)), true);
    // simulate data
    LinearAAModel::simul(law, mu[k], std[k], proj[k], sub_data);
    // save label
    label[Range(first, last)] = k;
  }
}
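
The way the listing above cuts the rows into one contiguous range per cluster can be reproduced outside STK++. The following is a minimal standalone sketch, assuming plain `std::vector` containers and a hypothetical helper name `clusterRanges`; it is not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Standalone sketch (not STK++ code): split the rows first_ind..last_ind into
// consecutive inclusive ranges, one per cluster, with the same arithmetic as
// the simul() listing.
std::vector<std::pair<int, int>> clusterRanges(const std::vector<double>& prop,
                                               int first_ind, int last_ind)
{
  assert(!prop.empty() && first_ind <= last_ind);
  const int nb_ind = last_ind - first_ind + 1;
  std::vector<std::pair<int, int>> ranges;
  int last = first_ind - 1;
  for (std::size_t k = 0; k < prop.size(); ++k)
  {
    const int first = last + 1;
    if (k + 1 == prop.size())
      last = last_ind;                                 // last cluster takes the remaining rows
    else
      last += static_cast<int>(nb_ind * prop[k]) + 1;  // truncated share, plus one row
    ranges.emplace_back(first, last);
  }
  return ranges;
}
```

With proportions {0.5, 0.5} and rows 1 to 10, the sketch returns the ranges [1, 6] and [7, 10]: the `+ 1` gives every cluster except the last one row more than its truncated share.
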
void STK::LinearAAModelMixture::run ( Integer const &  maxIter = Arithmetic<Integer>::max())

Compute the model with the given dimension for each mixture component.

Parameters:
maxIter  maximal number of iterations

Definition at line 118 of file STK_LinearAAModelMixture.cpp.

References EM().

Referenced by STK::LinearAAModelMixtureManager::run().

{
  // call EM algorithm
  EM(maxIter);
}
void STK::LinearAAModelMixture::initialize ( Array1D< Integer > const *  p_dim)

Compute the initial values of the EM algorithm using an ad hoc method. The number of clusters is given by the size of p_dim.

Parameters:
p_dim  dimension of each auto-associative model

Definition at line 125 of file STK_LinearAAModelMixture.cpp.

References computeLogLikehood(), create(), STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::heapSort(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), logLikehood_, STK::LocalVariance::minimal_distance_, MStep(), p_data_, p_dim_, prop_, STK::AAModelFactory::run(), STK::IContainer1D::size(), STK::IContainer2D::sizeVe(), and weights_.

Referenced by STK::LinearAAModelMixtureManager::run().

{
  // set dimensions
  p_dim_ = p_dim;
  // create weights_ vectors
  create();
  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
  // compute constants
  const Real inv_nclust = 1./Real(p_dim_->size());
  // number of individuals in each cluster
  const Integer prop_ind = p_data_->sizeVe()/ p_dim_->size();
  // initialize proportions
  prop_ = inv_nclust;

  // create an Index with local variance for the data set
  LocalVariance init_ind(p_data_, LocalVariance::minimal_distance_);
  // create a temporary linear AA model
  LinearAAModel init_aamm(&init_ind);
  init_aamm.run(1); // compute the main axis
  // sort the projected
  Array1D<Integer> index_res(init_aamm.reduced().rangeVe());
  heapSort(index_res, init_aamm.reduced()[1]);
  Integer first_ind_k, last_ind_k = first_ind-1;
  for (Integer k= first_cluster; k <= last_cluster; k++)
  {
    first_ind_k = last_ind_k + 1;
    last_ind_k += prop_ind;
    if (k == last_cluster)  last_ind_k = last_ind;
    for (Integer i = first_ind_k; i<= last_ind_k; i++)
    {
      weights_(index_res[i]) = 0.0;
      weights_(index_res[i], k) = 1.0;
    }
  }
  MStep();

//  // create weights vector with initial value 1/n
//  Vector weight(p_data_->rangeVe(), inv_nobs);
//  Array1D<Integer> index_res(p_data_->rangeVe());
//  Vector res2(p_data_->rangeVe());
//  for (Integer k= first_cluster; k <= last_cluster; k++)
//  {
//    // find kth axis
//    autoAssoc_[k]->run(&weight, (*p_dim_)[k]);
//    // compute the distance to the model
//    for (Integer i = first_ind; i<= last_ind; i++)
//      res2[i] = normTwo(autoAssoc_[k]->residuals()(i));
//    // sort residuals
//    heapSort(index_res, res2);
//    // look at the nearest individuals and set weights 0.
//    for (Integer i=last_ind; i>last_ind-prop_ind; i--)
//      weight[index_res[i]] = 0.0;
//    // renormalize weights
//    weight /= sum(weight);
//  }
  // compute log likehood
  logLikehood_ = computeLogLikehood();
}
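
The ad hoc initialization above sorts the individuals along the first axis of a temporary linear AA model and assigns equal-sized contiguous blocks of the sorted order to the clusters. A minimal standalone sketch of that block assignment follows, assuming the projected coordinate is already available as a `std::vector<double>` and that the sort is ascending; `hardWeights` is a hypothetical name, not an STK++ function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Standalone sketch (not STK++ code): build hard 0/1 weights by sorting the
// individuals along one projected coordinate and cutting the sorted order
// into K contiguous blocks of n / K individuals (the last block takes the rest).
std::vector<std::vector<double>> hardWeights(const std::vector<double>& projected,
                                             std::size_t K)
{
  const std::size_t n = projected.size();
  assert(K > 0 && K <= n);
  // order[j] is the index of the j-th smallest projected value
  std::vector<std::size_t> order(n);
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::sort(order.begin(), order.end(),
            [&](std::size_t a, std::size_t b) { return projected[a] < projected[b]; });
  const std::size_t block = n / K;
  std::vector<std::vector<double>> w(n, std::vector<double>(K, 0.0));
  for (std::size_t j = 0; j < n; ++j)
  {
    const std::size_t k = std::min(j / block, K - 1);  // cluster of the j-th sorted individual
    w[order[j]][k] = 1.0;
  }
  return w;
}
```

Each row of the result contains a single 1, so the first M step that follows behaves like a fit on a hard partition of the data.
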
void STK::LinearAAModelMixture::initialize ( Matrix const &  weights,
Array1D< Integer > const *  p_dim 
)

Set the initial values of the EM algorithm using weights. The number of clusters is given by the size of p_dim.

Parameters:
weights  initial weights
p_dim  dimension of each auto-associative model

Definition at line 208 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, axis_, computeLogLikehood(), create(), STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), logLikehood_, MStep(), STK::normTwo2(), p_data_, p_dim_, prop_, STK::sum(), STK::Stat::variance(), weights(), and weights_.

{
  // set dimensions
  p_dim_ = p_dim;
  // create weights_ vectors
  create();
  // copy weights
  weights_ = weights;
  // evaluate parameters
  MStep();
  // compute likehood
  logLikehood_ = computeLogLikehood();
  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
  for (Integer k=first_cluster; k<= last_cluster; k++)
  {
    std::cout << "k= " << k << "\n";
    std::cout << "prop = " << prop_[k] << "\n";
//    std::cout << "mean = " << autoAssoc_[k]->mean() << "\n";
    std::cout << "residual variance = " << STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals()))) << "\n";
    std::cout << "axis =\n" << *(axis_[k]) << "\n";
    std::cout << "residuals =\n";
    for (Integer i = first_ind; i<= last_ind; i++)
    {
      std::cout << normTwo2(*(autoAssoc_[k]->p_residuals())(i)) << "\n";
    }
  }
}
const ILinearReduct& STK::LinearAAModelMixture::index ( ) const [inline]

get the index

Definition at line 134 of file STK_LinearAAModelMixture.h.

Referenced by STK::LinearAAModelMixtureManager::save().

    { return *p_index_;}
Real const& STK::LinearAAModelMixture::logLikehhod ( ) const [inline]

Get the log-likelihood.

Definition at line 138 of file STK_LinearAAModelMixture.h.

    { return logLikehood_;}
const Array1D<IAAModel*>& STK::LinearAAModelMixture::autoAssocModels ( ) const [inline]

get the auto-associative models

Definition at line 142 of file STK_LinearAAModelMixture.h.

    { return autoAssoc_;}
const IAAModel& STK::LinearAAModelMixture::autoAssocModel ( Integer const &  k) const [inline]

get the kth auto-associative model

Definition at line 146 of file STK_LinearAAModelMixture.h.

Referenced by STK::LinearAAModelMixtureManager::save().

    { return *(autoAssoc_[k]);}
const Array1D<Integer>& STK::LinearAAModelMixture::dim ( ) const [inline]

Get the dimension of each model.

Definition at line 150 of file STK_LinearAAModelMixture.h.

    { return *p_dim_;}
const Matrix& STK::LinearAAModelMixture::weights ( ) const [inline]

get the weights of the individuals for each model

Definition at line 154 of file STK_LinearAAModelMixture.h.

Referenced by initialize().

    { return weights_;}
const Vector& STK::LinearAAModelMixture::prop ( ) const [inline]

get the proportions of each model

Definition at line 158 of file STK_LinearAAModelMixture.h.

    { return prop_;}
void STK::LinearAAModelMixture::EM ( Integer const &  maxIter = Arithmetic<Integer>::max())

Compute the model using the EM algorithm.

Parameters:
maxIter  maximal number of iterations

Definition at line 241 of file STK_LinearAAModelMixture.cpp.

References convergenceEM(), EStep(), logLikehood_, MStep(), prop_, and weights_.

Referenced by run().

{
  Integer iter = 0;
  std::cout << "maxIter = " << maxIter << "\n";
  std::cout << "iter = " << iter << ", log likehood = " << logLikehood_ << "\n";
  std::cout << "prop =" << prop_ << "\n";
  std::cout << "weights =\n" << weights_ << "\n";
  if (maxIter <= 0) return;
  // main steps
  do
  {
    EStep();
    MStep();
    iter++;
    std::cout << "iter = " << iter << ", log likehood = " << logLikehood_ << "\n";
    std::cout << "prop =" << prop_ << "\n";
    std::cout << "weights =\n" << weights_ << "\n";
  }
  while (!convergenceEM() && iter < maxIter);
}
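
Stripped of its logging, the loop above is a do-while that always performs one full E/M iteration when maxIter is positive and then stops on convergence or on the iteration cap. The sketch below isolates that control flow; it is a standalone illustration in which the three steps are passed as callbacks, and `runEM` is a hypothetical name.

```cpp
#include <cassert>
#include <functional>

// Standalone sketch (not STK++ code): the control flow of the EM loop.
// Returns the number of iterations actually performed.
int runEM(int maxIter,
          const std::function<void()>& eStep,
          const std::function<void()>& mStep,
          const std::function<bool()>& converged)
{
  assert(eStep && mStep && converged);  // all three callbacks must be set
  int iter = 0;
  if (maxIter <= 0) return iter;        // nothing to do
  do
  {
    eStep();
    mStep();
    ++iter;
  }
  while (!converged() && iter < maxIter);
  return iter;
}
```

Because the convergence test is evaluated after each iteration, it is never called when maxIter is zero or negative.
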
void STK::LinearAAModelMixture::EStep ( )

compute the E step of the EM algorithm. The weights are computed using the formula

\[ \omega_{ik} =\mathbf{E}[z_{ik}] = \frac{\hat{\pi}_k \phi(\mathbf{y}_i;\hat{\mu}_k+\hat{\mu}_{ik}, \hat{\sigma}^2_k I_p)} {\sum_{k=1}^K \hat{\pi}_k \phi(\mathbf{y}_i;\hat{\mu}_k+\hat{\mu}_{ik}, \hat{\sigma}^2_k I_p)} \]

Definition at line 297 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), normalPdf(), p_data_, p_dim_, prop_, STK::IContainer1D::size(), STK::sum(), STK::Stat::variance(), and weights_.

Referenced by EM().

{
  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();
  // compute common variance
  Real variance = 0.0;
  for (Integer k=first_cluster; k<=last_cluster; k++)
  {
    variance += STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals())));
  }
  variance /= p_dim_->size();
  // compute unnormalized weights
  for (Integer k=first_cluster; k<=last_cluster; k++)
  {
    // get residual variance of the kth model
    //Real variance = autoAssoc_[k]->residualVariance();
    // compute w(i, k)
    for (Integer i = first_ind; i<= last_ind; i++)
    {
      // compute weight
      weights_(i, k) = prop_[k]
                          * normalPdf(*(autoAssoc_[k]->p_residuals())(i), variance );
    }
  }
  // normalize weights
  for (Integer i = first_ind; i <= last_ind; i++)
  {
    // normalize weights
    weights_(i) /= sum(weights_(i));
  }
}
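
The two passes of the E step (unnormalized weights, then row-wise normalization) can be sketched independently of the STK++ containers. The version below takes the density values as input, so it does not depend on how the variance is chosen (the listing above uses a single variance averaged over the clusters). `eStepWeights` and its argument layout are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone sketch (not STK++ code). density[i][k] holds the value of the
// k-th component density at individual i; prop[k] is the k-th proportion.
// Returns the posterior weights w[i][k], each row summing to one.
std::vector<std::vector<double>> eStepWeights(const std::vector<std::vector<double>>& density,
                                              const std::vector<double>& prop)
{
  std::vector<std::vector<double>> w(density.size(),
                                     std::vector<double>(prop.size(), 0.0));
  for (std::size_t i = 0; i < density.size(); ++i)
  {
    assert(density[i].size() == prop.size());
    double total = 0.0;
    for (std::size_t k = 0; k < prop.size(); ++k)
    {
      w[i][k] = prop[k] * density[i][k];  // unnormalized weight
      total += w[i][k];
    }
    assert(total > 0.0);                  // otherwise the row cannot be normalized
    for (std::size_t k = 0; k < prop.size(); ++k)
      w[i][k] /= total;                   // normalize the row
  }
  return w;
}
```
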
void STK::LinearAAModelMixture::MStep ( )

compute the M step of the EM algorithm.

Definition at line 333 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, STK::ILinearReduct::axis(), axis_, STK::IContainer1D::first(), STK::IContainer1D::last(), p_data_, p_dim_, p_index_, prop_, STK::IContainer2D::sizeVe(), STK::sum(), and weights_.

Referenced by EM(), and initialize().

{
  // get dimensions
  const Integer first_cluster = p_dim_->first();
  const Integer last_cluster = p_dim_->last();

  const Real inv_nobs = 1./Real(p_data_->sizeVe());
  // perform M running each AA model with the current weights
  for (Integer k=first_cluster; k<= last_cluster; k++)
  {
    // compute proportions
    prop_[k] = sum(weights_[k])*inv_nobs;
    // run the kth AAM
    Vector weights_k(weights_[k], true);
    autoAssoc_[k]->run(&weights_k, (*p_dim_)[k]);
    // save the kth axis of projection
    *(axis_[k]) = p_index_->axis();
  }
}
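
The proportion update in the M step is the column sum of the weights divided by the number of individuals. A minimal standalone sketch, assuming the weights are stored row by row in nested `std::vector`s and using the hypothetical name `mStepProportions`:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone sketch (not STK++ code): w[i][k] is the weight of individual i
// in cluster k. Returns prop[k] = (sum over i of w[i][k]) / n.
std::vector<double> mStepProportions(const std::vector<std::vector<double>>& w)
{
  assert(!w.empty());
  const std::size_t K = w.front().size();
  const double inv_nobs = 1.0 / static_cast<double>(w.size());
  std::vector<double> prop(K, 0.0);
  for (const std::vector<double>& row : w)
  {
    assert(row.size() == K);
    for (std::size_t k = 0; k < K; ++k)
      prop[k] += row[k];                 // accumulate the column sums
  }
  for (double& p : prop) p *= inv_nobs;  // divide by the number of individuals
  return prop;
}
```

The sketch covers the proportions only; refitting each AA model with its column of weights is left to the library.
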
Real STK::LinearAAModelMixture::computeLogLikehood ( )

Compute the log-likelihood of the mixture.

Definition at line 272 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, STK::IContainer1D::first(), STK::IContainer2D::firstRow(), STK::IContainer1D::last(), STK::IContainer2D::lastRow(), normalPdf(), p_data_, p_dim_, prop_, STK::sum(), and STK::Stat::variance().

Referenced by convergenceEM(), and initialize().

{
  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  const Integer first_ind = p_data_->firstRow(), last_ind = p_data_->lastRow();

  Real sum1 = 0.0;
  for (Integer i = first_ind; i<= last_ind; i++)
  {
    Real sum2 = 0.0;
    for (Integer k=first_cluster; k<= last_cluster; k++)
    {
      // get residual variance of the kth model
      Real variance = STK::sum(Stat::variance(*(autoAssoc_[k]->p_residuals())));
      sum2 += prop_[k]
           *normalPdf(*(autoAssoc_[k]->p_residuals())(i), variance );
    }
    sum1 += log(double(sum2));
  }
  return sum1;
}
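
The double loop above accumulates, for each individual, the logarithm of the proportion-weighted sum of the component densities. A minimal standalone sketch of that accumulation follows, with the density values supplied by the caller; `mixtureLogLikelihood` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standalone sketch (not STK++ code). density[i][k] is the value of the k-th
// component density at individual i; prop[k] is the k-th proportion.
// Returns sum over i of log( sum over k of prop[k] * density[i][k] ).
double mixtureLogLikelihood(const std::vector<std::vector<double>>& density,
                            const std::vector<double>& prop)
{
  double logLik = 0.0;
  for (const std::vector<double>& row : density)
  {
    assert(row.size() == prop.size());
    double mix = 0.0;
    for (std::size_t k = 0; k < prop.size(); ++k)
      mix += prop[k] * row[k];  // mixture density at this individual
    assert(mix > 0.0);          // the logarithm is undefined otherwise
    logLik += std::log(mix);
  }
  return logLik;
}
```
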
bool STK::LinearAAModelMixture::convergenceEM ( ) [protected]

check convergence of the EM algorithm.

Returns:
true if the algorithm has converged, false otherwise

Definition at line 263 of file STK_LinearAAModelMixture.cpp.

References STK::abs(), computeLogLikehood(), and logLikehood_.

Referenced by EM().

{
  Real new_l = computeLogLikehood();
  bool res = (abs((new_l - logLikehood_)/logLikehood_) < Arithmetic<Real>::epsilon());
  logLikehood_ = new_l;
  return res;
}
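
The stopping rule above compares the relative change of the log-likelihood with a tolerance. The sketch below restates it as a free function, assuming that `Arithmetic<Real>::epsilon()` plays the role of the machine epsilon of the floating-point type; `hasConverged` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Standalone sketch (not STK++ code): relative-change test on the
// log-likelihood. The default tolerance is the machine epsilon of double,
// which is an assumption about the value used by the library.
bool hasConverged(double oldLogLik, double newLogLik,
                  double tol = std::numeric_limits<double>::epsilon())
{
  assert(oldLogLik != 0.0);  // the relative change is undefined otherwise
  return std::fabs((newLogLik - oldLogLik) / oldLogLik) < tol;
}
```

With a tolerance this small the test is very strict, so in practice the iteration cap passed to EM() may be what ends the loop.
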
void STK::LinearAAModelMixture::allocateAutoAssoc ( ) [private, pure virtual]

allocate the AA models wanted by the user.

Definition at line 102 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, STK::IContainer1D::first(), STK::IContainer1D::last(), STK::IReduct::p_data(), p_dim_, and p_index_.

Referenced by create().

{
  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  // create weights_with same value and Axis container
  for (Integer k=first_cluster; k<= last_cluster; k++)
  {
    autoAssoc_[k] = new LinearAAModel(p_index_->p_data());
    autoAssoc_[k]->setReductor(p_index_);
  }
}
void STK::LinearAAModelMixture::create ( ) [private]

Set dimensions and allocate the containers

Definition at line 354 of file STK_LinearAAModelMixture.cpp.

References allocateAutoAssoc(), autoAssoc_, axis_, STK::IContainer1D::first(), STK::IContainer1D::last(), p_data_, p_dim_, prop_, STK::IContainer1D::range(), STK::IContainer2D::rangeHo(), STK::IContainer2D::rangeVe(), STK::IContainer2D::resize(), STK::IContainer1D::resize(), and weights_.

Referenced by initialize().

{
  // remove any existing weights_
  remove();
  // resize proportion vector
  prop_.resize(p_dim_->range());
  // resize weights vector
  weights_.resize(p_data_->rangeVe(), p_dim_->range());
  // resize axis vector
  axis_.resize(p_dim_->range());
  // create AutoAssociative models
  autoAssoc_.resize(p_dim_->range());
  allocateAutoAssoc();

  // get dimensions
  const Integer first_cluster = p_dim_->first(), last_cluster = p_dim_->last();
  const Range range_var = p_data_->rangeHo();

  // create weights_with same value and Axis container
  for (Integer k=first_cluster; k<= last_cluster; k++)
  {
    axis_[k] = new Matrix(range_var, Range((*p_dim_)[k]));
  }
}
void STK::LinearAAModelMixture::remove ( ) [private]

remove containers

Definition at line 380 of file STK_LinearAAModelMixture.cpp.

References autoAssoc_, axis_, STK::IContainer1D::first(), and STK::IContainer1D::last().

{
  // get dimensions
  Integer first = axis_.first(), last = axis_.last();
  // remove each axis
  for (Integer k = first; k<= last; k++)
  {
    if (axis_[k]) delete axis_[k];
    axis_[k] = 0;
  }
  // get dimensions
  first = autoAssoc_.first(); last = autoAssoc_.last();
  // remove each cluster
  for (Integer k = first; k<= last; k++)
  {
    if (autoAssoc_[k]) delete autoAssoc_[k];
    autoAssoc_[k] = 0;
  }
}
Real STK::LinearAAModelMixture::normalPdf ( const Point &  x,
Real const &  sigma2
) [private]

compute the pdf of the centered multivariate normal distribution

\[ f(x; \sigma^2 I_d) = \frac{1}{\sigma\sqrt{2 d \pi}} \exp\left(-\frac{\left\| x \right\|^2}{2\sigma^2} \right) \]

with $ x\in \mathbb{R}^d $

Parameters:
x  the Point at which to compute the normal pdf
sigma2  the variance of the pdf
Returns:
the value of the pdf at the point x

Definition at line 407 of file STK_LinearAAModelMixture.cpp.

References STK::normTwo2(), STK::Const::ONE_SQRTPI2, and STK::IContainer1D::size().

Referenced by computeLogLikehood(), and EStep().

{
  return Const::ONE_SQRTPI2 * exp(-0.5 * normTwo2(x)/ sigma2) / sqrt(sigma2 * x.size());
}
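
The one-line body above can be checked numerically outside STK++. The sketch below mirrors it term by term, assuming that `Const::ONE_SQRTPI2` stands for 1/sqrt(2 pi) and that `normTwo2` returns the squared Euclidean norm; `normalPdfSketch` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Standalone sketch (not STK++ code): same expression as the listing, with
// 1/sqrt(2*pi) written out (assumed meaning of Const::ONE_SQRTPI2).
double normalPdfSketch(const std::vector<double>& x, double sigma2)
{
  assert(!x.empty() && sigma2 > 0.0);
  const double pi = std::acos(-1.0);
  double norm2 = 0.0;
  for (double xi : x) norm2 += xi * xi;  // squared Euclidean norm of x
  return std::exp(-0.5 * norm2 / sigma2)
       / (std::sqrt(2.0 * pi) * std::sqrt(sigma2 * static_cast<double>(x.size())));
}
```

For a one-dimensional point at the origin with unit variance, the expression reduces to 1/sqrt(2 pi).
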

Member Data Documentation

ILinearReduct* STK::LinearAAModelMixture::p_index_ [protected]

The input Index used to find the axes. This Index is shared by all the IAAModel objects, so the axes and the values of the Index are overwritten unless they are saved.

Definition at line 61 of file STK_LinearAAModelMixture.h.

Referenced by allocateAutoAssoc(), and MStep().

Matrix const* STK::LinearAAModelMixture::p_data_ [protected]

A pointer to the input data set associated with the index.

Definition at line 64 of file STK_LinearAAModelMixture.h.

Referenced by computeLogLikehood(), create(), EStep(), initialize(), and MStep().

Array1D<Integer> const* STK::LinearAAModelMixture::p_dim_ [protected]

The dimension of each model. The size of this container also gives the number of clusters.

Definition at line 69 of file STK_LinearAAModelMixture.h.

Referenced by allocateAutoAssoc(), computeLogLikehood(), create(), EStep(), initialize(), and MStep().

Real STK::LinearAAModelMixture::logLikehood_ [protected]

Value of the log-likelihood.

Definition at line 186 of file STK_LinearAAModelMixture.h.

Referenced by convergenceEM(), EM(), and initialize().

Array1D<IAAModel*> STK::LinearAAModelMixture::autoAssoc_ [protected]

Array of the auto-associative models.

Definition at line 189 of file STK_LinearAAModelMixture.h.

Referenced by allocateAutoAssoc(), computeLogLikehood(), create(), EStep(), initialize(), MStep(), and remove().

Matrix STK::LinearAAModelMixture::weights_ [protected]

Matrix of the weights: an array of size (n, K), where n is the number of individuals and K is the number of clusters, given by the size of p_dim_.

Definition at line 192 of file STK_LinearAAModelMixture.h.

Referenced by create(), EM(), EStep(), initialize(), and MStep().

Array1D<Matrix*> STK::LinearAAModelMixture::axis_ [protected]

Array of the axis matrices. It holds a physical copy of the axes computed for each model.

Definition at line 195 of file STK_LinearAAModelMixture.h.

Referenced by create(), initialize(), MStep(), and remove().

Vector STK::LinearAAModelMixture::prop_ [protected]

Vector of the proportions.

Definition at line 197 of file STK_LinearAAModelMixture.h.

Referenced by computeLogLikehood(), create(), EM(), EStep(), initialize(), and MStep().


The documentation for this class was generated from the following files: