Matrix inversion lemma

The other day at work I came across an interesting problem while trying to optimise some MATLAB MCMC sampling code. The major bottleneck in the code was the inversion of a [p x p] matrix M, where p can be quite large (in the order of thousands). Now, I noticed that M can be written as

${\bf M} = {\bf A} + {\bf X} {\bf G} {\bf X}^{T}$

where A is a diagonal [p x p] matrix, X is [p x n] and G is a full rank diagonal [n x n] matrix. In my setting, p could be much larger than n and speed is important since this particular code is executed numerous times within a loop. I immediately thought about using the matrix inversion lemma (or the Sherman–Morrison–Woodbury formula) to speed up the inversion when p >> n. However, it turns out that in my case the matrix A is

${\bf A} = {\rm diag}(0, a_1, \ldots, a_{p-1})$

which is of rank (p – 1) and singular, so the matrix inversion lemma cannot be applied in a straightforward manner. After talking to a colleague about this issue, he suggested a nice trick to make A full rank by replacing the top-left zero element with a non-zero entry, and then changing X and G to correct for this modification. If we apply this trick, we can write M as

${\bf M} = \underbrace{\left({\bf A} - {\bf e} {\bf e}^T\right)}_{{\bf A}_*} + \left({\bf e} \; \; {\bf X}\right) \left( \begin{array}{cc} 1 & {\bf 0}^T \\ {\bf 0} & {\bf G} \end{array}\right) \left( \begin{array}{cc} {\bf e}^T\\ {\bf X}^T \end{array} \right)$

where e = (1, 0, …, 0) is a [p x 1] vector. Application of the matrix inversion lemma is now straightforward and reduces the computational cost of inverting M from O(p^3) to O(n^3). I did some rough timing of the new code and it is (unsurprisingly) significantly faster than the previous version when p / n gets large. I’ve updated my Bayesian LASSO code for logistic regression (see Publications) to include this neat trick.

1. No comments yet.