# Archive for category Linear algebra

### Linear algebra in C/C++ using Eigen

I’ve recently started using the free, open-source library Eigen for linear algebra routines in C/C++. The library is absolutely fantastic in terms of performance and functionality; however, as is sometimes the case with open-source software, the API documentation is somewhat lacking. If I get some time in the future I hope to contribute a few articles to the online documentation. Here is an example of what I mean.

Suppose you want to do a QR decomposition of a matrix X. There are three classes that can do this for you, one of which is ColPivHouseholderQR, and you may write something like

```cpp
// perform a QR decomposition of the [n x p] matrix X, where n > p
Eigen::ColPivHouseholderQR<Eigen::MatrixXd> qr = X.colPivHouseholderQr();
```

Seems fairly straightforward, right? But how do we get Q and R? If we look at the online documentation for ColPivHouseholderQR, we find a public method called matrixQR() which returns a reference to a matrix related to Q and R in some fashion. The documentation doesn’t state how to get Q and R from the returned matrix. It turns out that the matrix returned by matrixQR() contains both factors: R occupies the upper triangular part, while the entries below the diagonal store the Householder vectors that implicitly represent Q (Q itself can be retrieved with the householderQ() method). Thus, to get, say, R we need to use something like

```cpp
// perform a QR decomposition of the [n x p] matrix X, where n > p
Eigen::ColPivHouseholderQR<Eigen::MatrixXd> qr = X.colPivHouseholderQr();
Eigen::MatrixXd R;
// extract the [p x p] upper triangular R factor
R = qr.matrixQR().topRightCorner(p, p).triangularView<Eigen::Upper>();
```

Here, topRightCorner() extracts the matrix of interest and triangularView() ensures that R is stored as an upper triangular dense matrix.
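The same idea can be illustrated in Python with NumPy; this is not Eigen’s API, just a sketch of the factorization being extracted (note that `np.linalg.qr` does not do column pivoting, unlike ColPivHouseholderQR):

```python
import numpy as np

# thin QR of an [n x p] matrix with n > p, as in the Eigen example above
rng = np.random.default_rng(0)
n, p = 8, 3
X = rng.standard_normal((n, p))

# Q is [n x p] with orthonormal columns, R is [p x p] upper triangular
Q, R = np.linalg.qr(X)

assert np.allclose(Q @ R, X)       # the factors reconstruct X
assert np.allclose(R, np.triu(R))  # R is upper triangular
```

Unlike Eigen, NumPy hands back Q and R as separate dense matrices, so no block extraction is needed; Eigen’s packed storage avoids materialising Q unless you ask for it.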

It didn’t take too long to figure out how QR works in Eigen. After discovering the solution I found a post on the Eigen forum that describes a similar approach. I urge anyone who uses Eigen to contribute code and documentation to the project, if possible. It really is a fantastic open-source library that rivals or surpasses many commercial linear algebra packages in both performance and quality.

### Fisher Information in Single Factor Analysis

The single factor analysis model for data $\{ {\bf x}_1, {\bf x}_2, \ldots, {\bf x}_n \}$ is ${\bf x}_i = {\bf m} + v_i {\bf a} + \boldsymbol{\epsilon}_i \quad \; (i=1,2,\ldots,n),$

where ${\bf x}_i \in \mathbb{R}^k$ and the $\boldsymbol{\epsilon}_i$ are i.i.d. zero-mean multivariate Gaussian random vectors with a symmetric, positive definite variance-covariance matrix $\boldsymbol{\Sigma} \in \mathbb{R}^{k \times k}$; that is, $\boldsymbol{\epsilon}_i \sim {\rm N}_k({\bf 0}, \boldsymbol{\Sigma})$ for all $i=1,2,\ldots,n$. It is thus assumed that each data vector ${\bf x}_i$ follows a $k$-variate Gaussian distribution with moments ${\rm E} \{ {\bf x}_i \} = {\bf m} + v_i {\bf a},$ ${\rm Var} \{ {\bf x}_i \} = \boldsymbol{\Sigma}.$

The aim of single factor analysis is to estimate the unknown factor scores ${\bf v} = (v_1, v_2, \ldots, v_n)^\prime \in \mathbb{R}^n$, the mean ${\bf m} = (m_1,m_2, \ldots, m_k)^\prime \in \mathbb{R}^k$, the variance-covariance matrix $\boldsymbol{\Sigma}$ and the factor loadings ${\bf a} = (a_1, a_2, \ldots, a_k)^\prime \in \mathbb{R}^k$.
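The model and its moments can be checked with a quick simulation; the NumPy sketch below uses illustrative sizes and parameter values (all names are my own, not from the note):

```python
import numpy as np

# toy simulation of the single factor model x_i = m + v_i * a + eps_i,
# with eps_i ~ N_k(0, Sigma)
rng = np.random.default_rng(1)
k, n = 4, 50_000
m = np.arange(1.0, k + 1)                 # mean vector, [k]
a = np.ones(k)                            # factor loadings, [k]
v = rng.standard_normal(n)                # factor scores, [n]
Sigma = 0.1 * np.eye(k)                   # error covariance, [k x k]

eps = rng.multivariate_normal(np.zeros(k), Sigma, size=n)
X = m + np.outer(v, a) + eps              # [n x k] data matrix, row i is x_i

# with v_i treated as fixed, E{x_i} = m + v_i a and Var{x_i} = Sigma;
# the residuals x_i - (m + v_i a) should have sample covariance near Sigma
resid = X - (m + np.outer(v, a))
print(np.allclose(np.cov(resid.T), Sigma, atol=0.01))
```

Estimating $({\bf v}, {\bf m}, {\bf a}, \boldsymbol{\Sigma})$ from such data is exactly the inference problem the Fisher information characterises.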

I’ve written this note to show how one may derive the Fisher information for the single factor model. This turns out to be a fairly involved task requiring a lot of linear algebra and matrix differential calculus identities. I’ve uploaded a document describing all the necessary mathematical steps to the Publications section. Anyone interested in matrix differential calculus and linear algebra should find it useful.

### Matrix inversion lemma

The other day at work I came across an interesting problem while trying to optimise some MATLAB MCMC sampling code. The major bottleneck in the code was the inversion of a [p x p] matrix M, where p can be quite large (on the order of thousands). Now, I noticed that M can be written as ${\bf M} = {\bf A} + {\bf X} {\bf G} {\bf X}^{T}$

where A is a diagonal [p x p] matrix, X is [p x n] and G is a full rank diagonal [n x n] matrix. In my setting, p could be much larger than n and speed is important since this particular code is executed numerous times within a loop. I immediately thought about using the matrix inversion lemma (or the Sherman–Morrison–Woodbury formula) to speed up the inversion when p >> n. However, it turns out that in my case the matrix A is ${\bf A} = {\rm diag}(0, a_1, \ldots, a_{p-1})$

which is of rank (p – 1) and singular, so the matrix inversion lemma cannot be applied in a straightforward manner. After talking to a colleague about this issue, he suggested a nice trick to make A full rank by replacing the top-left zero element with a non-zero entry, and then changing X and G to correct for this modification. If we apply this trick, we can write M as ${\bf M} = \underbrace{\left({\bf A} - {\bf e} {\bf e}^T\right)}_{{\bf A}_*} + \left({\bf e} \; \; {\bf X}\right) \left( \begin{array}{cc} 1 & {\bf 0}^T \\ {\bf 0} & {\bf G} \end{array}\right) \left( \begin{array}{cc} {\bf e}^T\\ {\bf X}^T \end{array} \right)$

where ${\bf e} = (1, 0, \ldots, 0)^\prime$ is a [p x 1] vector. Application of the matrix inversion lemma is now straightforward and reduces the dominant computational cost of inverting M from O(p^3) to O(n^3). I did some rough timing of the new code and it is (unsurprisingly) significantly faster than the previous version when p / n gets large. I’ve updated my Bayesian LASSO code for logistic regression (see Publications) to include this neat trick.
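For reference, the identity being applied is $\left({\bf A}_* + {\bf U} {\bf G}_* {\bf U}^T\right)^{-1} = {\bf A}_*^{-1} - {\bf A}_*^{-1} {\bf U} \left({\bf G}_*^{-1} + {\bf U}^T {\bf A}_*^{-1} {\bf U}\right)^{-1} {\bf U}^T {\bf A}_*^{-1},$ with ${\bf U} = ({\bf e} \; \; {\bf X})$ and ${\bf G}_* = {\rm diag}(1, {\bf G})$, which only requires inverting the diagonal ${\bf A}_*$ and an [(n+1) x (n+1)] matrix. The original code is MATLAB, but the trick can be sanity-checked with a NumPy sketch (illustrative sizes; all variable names are my own):

```python
import numpy as np

# numerical check of the rank-correction trick for M = A + X G X^T
rng = np.random.default_rng(2)
p, n = 200, 5
a_diag = np.concatenate(([0.0], rng.uniform(1.0, 2.0, p - 1)))  # singular A
A = np.diag(a_diag)
Xmat = rng.standard_normal((p, n))
G = np.diag(rng.uniform(0.5, 1.5, n))
M = A + Xmat @ G @ Xmat.T

# trick: A_* = A - e e^T is full rank (its top-left entry becomes -1),
# and M = A_* + U G_* U^T with U = [e X], G_* = diag(1, G)
e = np.zeros(p); e[0] = 1.0
A_star_inv = np.diag(1.0 / (a_diag - (np.arange(p) == 0)))  # (A - ee^T)^{-1}
U = np.column_stack([e, Xmat])
G_star = np.block([[np.ones((1, 1)), np.zeros((1, n))],
                   [np.zeros((n, 1)), G]])

# Woodbury: only an [(n+1) x (n+1)] dense inverse is needed
small = np.linalg.inv(G_star) + U.T @ A_star_inv @ U
M_inv = A_star_inv - A_star_inv @ U @ np.linalg.inv(small) @ U.T @ A_star_inv

print(np.allclose(M_inv, np.linalg.inv(M)))
```

The only dense inversion on the right-hand side is of the small [(n+1) x (n+1)] matrix, which is where the O(p^3) to O(n^3) saving comes from.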