---
title: Use morpheus package

output:
  pdf_document:
    number_sections: true
    toc_depth: 1
---

```{r setup, results="hide", include=FALSE}
knitr::opts_chunk$set(echo = TRUE, include = TRUE,
  cache = TRUE, comment="", cache.lazy = FALSE,
  out.width = "100%", fig.align = "center")
```

## Introduction
<!--Tell that we try to learn classification parameters in a non-EM way, using algebraic manipulations.-->

*morpheus* is a contributed R package which attempts to find the parameters of a
mixture of logistic classifiers.
When the data under study come from several groups that have different characteristics,
using mixture models is a very popular way to handle heterogeneity.
Thus, many algorithms were developed to deal with various mixture models.
Most of them use likelihood methods or Bayesian methods that are likelihood-dependent.
*flexmix* is an R package which implements these kinds of algorithms.

However, one problem of such methods is that they can converge to local maxima,
so several starting points must be explored.
Recently, spectral methods were developed to bypass EM algorithms, and they were proved
able to recover the directions of the regression parameter
in models with known link function and random covariates (see [XX]).
Our package extends such moment methods using least squares to get estimators of
all the parameters (with theoretical guarantees, see [XX]).
Currently it can handle only binary output, which is a common case.

## Model

Let $X\in \R^{d}$ be the vector of covariates and $Y\in \{0,1\}$ be the binary output.
A binary regression model assumes that for some link function $g$, the probability that
$Y=1$ conditionally on $X=x$ is given by $g(\langle \beta, x \rangle +b)$, where
$\beta\in \R^{d}$ is the vector of regression coefficients and $b\in\R$ is the intercept.
Popular examples of link functions are the logit link function where for any real $z$,
$g(z)=e^z/(1+e^z)$ and the probit link function where $g(z)=\Phi(z)$, with $\Phi$
the cumulative distribution function of the standard normal ${\cal N}(0,1)$.
Both are implemented in the package.

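As an illustration of the two link functions above, here is a minimal base R sketch.
It does not use the package's own functions; the names `logit_link` and `probit_link`
and the values of `beta`, `b` and `x` are hypothetical, chosen only for the example.

```r
# Link functions exactly as written in the text (illustrative names).
logit_link  <- function(z) exp(z) / (1 + exp(z))  # same value as stats::plogis(z)
probit_link <- function(z) pnorm(z)               # standard normal CDF, Phi(z)

# Hypothetical parameters and covariate vector, with d = 2.
beta <- c(1, -2)
b    <- 0.5
x    <- c(0.3, 0.1)

# Linear predictor <beta, x> + b.
z <- sum(beta * x) + b

# P(Y = 1 | X = x) under each link.
p_logit  <- logit_link(z)
p_probit <- probit_link(z)
```

Both probabilities lie strictly between 0 and 1; for very large `z`, `plogis(z)` is
numerically safer than the explicit formula because `exp(z)` can overflow.
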
To model heterogeneous populations, let $K$ be the number of
populations and $\omega=(\omega_1,\ldots,\omega_K)$ their weights, such that
$\omega_{j}\geq 0$, $j=1,\ldots,K$ and $\sum_{j=1}^{K}\omega_{j}=1$.
Define, for $j=1,\ldots,K$, the regression coefficients in the $j$-th population
by $\beta_{j}\in\R^{d}$ and the intercept in the $j$-th population by
$b_{j}\in\R$. Let $b=(b_1,\ldots,b_K)$, $\beta=[\beta_{1} \vert \cdots \vert \beta_K]$ the $d\times K$
matrix of regression coefficients and denote $\theta=(\omega,\beta,b)$.
The mixture model of binary regressions is given by:

\begin{equation}
\label{mixturemodel1}
\PP_{\theta}(Y=1\vert X=x)=\sum^{K}_{k=1}\omega_k g(\langle \beta_k, x \rangle + b_k).
\end{equation}

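The mixture model can be made concrete with a short simulation. The sketch below uses
base R only and is not the package's own sample generator: it assumes standard Gaussian
covariates and the logit link, and the parameter values as well as the helper name
`mixture_prob` are hypothetical.

```r
set.seed(1)
n <- 1000; d <- 2; K <- 2

# Hypothetical parameters theta = (omega, beta, b).
omega <- c(0.3, 0.7)                                  # weights, sum to 1
beta  <- matrix(c(1, -1, 2, 0.5), nrow = d, ncol = K) # columns are beta_1, beta_2
b     <- c(0, -0.5)                                   # intercepts
g     <- plogis                                       # logit link

# P(Y = 1 | X = x) = sum_k omega_k * g(<beta_k, x> + b_k)
mixture_prob <- function(x, omega, beta, b, g) {
  sum(omega * g(as.vector(crossprod(beta, x)) + b))
}

# Assumed covariate distribution: X ~ N(0, I_d).
X <- matrix(rnorm(n * d), nrow = n, ncol = d)

# Draw a latent population label for each row, then Y given that label.
labels <- sample.int(K, n, replace = TRUE, prob = omega)
lin    <- rowSums(X * t(beta[, labels, drop = FALSE])) + b[labels]
Y      <- rbinom(n, size = 1, prob = g(lin))
```

Drawing the label first and then $Y$ yields the same conditional law of $Y$ given $X$
as the displayed formula, since averaging over the label gives the weighted sum.
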
## Algorithm, theoretical guarantees

The algorithm uses spectral properties of some tensor matrices to estimate the model
parameters $\theta = (\omega, \beta, b)$. Under rather mild conditions it can be
proved that the algorithm converges to the correct values (its convergence rate is known too).
For more information on that subject, please refer to our article [XX].
In this vignette we focus instead on package usage.

## Usage
<!--We assume that the random variable $X$ has a Gaussian distribution.
We now focus on the situation where $X\sim \mathcal{N}(0,I_d)$, $I_d$ being the
identity $d\times d$ matrix. All results may be easily extended to the situation
where $X\sim \mathcal{N}(m,\Sigma)$, $m\in \R^{d}$, $\Sigma$ a positive and
symmetric $d\times d$ matrix. ***** TODO: take this into account? -->

TODO

3) Experiments: show package usage

\subsection{Experiments}
In this section, we first evaluate our algorithm using the mean squared error (MSE). We then compare experimentally our moment method (morpheus package \cite{Loum_Auder}) with the likelihood method (flexmix package \cite{bg-papers:Gruen+Leisch:2007a}).

TODO