Using principal component analysis to invert a matrix

Published:

I came across a problem during my research where I needed to quickly invert a large matrix in my Gaussian process code. My PhD supervisor suggested I could use a neural network and principal component analysis to quickly invert the large matrix. In the end, I managed to find other ways to speed up my matrix inversion and did not pursue this idea further. However, I am still curious whether this idea actually works, so I decided to spend some time on it.

In this post, I will provide a short theoretical background on principal component analysis and then dive right into trying to invert large matrices.

Theoretical background

Principal component analysis (PCA) is a dimensionality reduction technique where we reduce the number of dimensions or features in the data. This is useful when we want to compress data and extract only the most informative features of a dataset.

Singular value decomposition

Let \(A\) be an \(m \times n\) matrix. A singular value decomposition (SVD) of \(A\) is the following factorisation

\[\begin{align} A = U\Sigma V^T \end{align}\]

where \(U\) is an \(m \times m\) orthogonal matrix, \(\Sigma\) is an \(m \times n\) diagonal matrix with nonnegative entries on the diagonal (the singular values), and \(V\) is an orthogonal \(n \times n\) matrix.
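
As a quick sanity check, here is a minimal sketch of this factorisation using NumPy's `np.linalg.svd`; the matrix and its dimensions are purely illustrative.

```python
import numpy as np

# Illustrative matrix with m = 4 rows and n = 3 columns
A = np.random.default_rng(0).normal(size=(4, 3))

# full_matrices=True gives U (m x m), the singular values s, and V^T (n x n)
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m x n diagonal matrix Sigma from the singular values
Sigma = np.zeros((4, 3))
Sigma[:3, :3] = np.diag(s)

# Check that U @ Sigma @ V^T reconstructs A
print(np.allclose(A, U @ Sigma @ Vt))  # True
```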

Principal component analysis
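
PCA is closely related to the SVD. If we centre a data matrix \(X\) and compute its SVD \(X = U\Sigma V^T\), the columns of \(V\) are the principal components (the eigenvectors of the covariance matrix of the data), and the squared singular values are proportional to the variance explained by each component. Keeping only the components with the largest singular values gives a lower-dimensional representation that retains most of the variance. A minimal sketch of this with NumPy, on an illustrative toy data matrix, might look like the following.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data matrix: 100 samples with 5 features (illustrative only)
X = rng.normal(size=(100, 5))

# Centre the data, then take the SVD of the centred matrix
X_centred = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centred, full_matrices=False)

# Rows of Vt are the principal components; s**2 / (n - 1) are their variances
explained_variance = s**2 / (X.shape[0] - 1)

# Keep the k components with the greatest variance and project onto them
k = 2
X_reduced = X_centred @ Vt[:k].T  # shape (100, k)
print(X_reduced.shape, explained_variance[:k])
```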

Inverting the matrix

Now that we understand PCA, we can code up a neural network to invert a large matrix. We use PCA by decomposing the matrix into its principal components, i.e. its eigenvectors and the associated eigenvalues. We then choose the most informative principal components, those with the greatest variance, and train the neural network on the eigenvalues. In other words, we train the neural network on the weights of these informative principal components. A sketch of this setup is given below.
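
As a rough sketch of how this could look, assume the matrices are symmetric positive-definite (as Gaussian process covariance matrices are), so that the principal components are simply the eigenvectors of the matrix itself. The network below maps the eigenvalues of the most informative components of \(A\) to the corresponding eigenvalues of \(A^{-1}\). Everything here (the matrix size, the number of components kept, the use of scikit-learn's MLPRegressor, and the helper `random_spd`) is an illustrative assumption rather than a definitive implementation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n, k, n_train = 20, 10, 500  # matrix size, components kept, training matrices

def random_spd(n):
    # Random symmetric positive-definite matrix (stand-in for a GP covariance)
    B = rng.normal(size=(n, n))
    return B @ B.T + n * np.eye(n)

# Build a training set: top-k eigenvalues of A -> top-k eigenvalues of A^{-1}
X_train, y_train = [], []
for _ in range(n_train):
    A = random_spd(n)
    eigvals, eigvecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    top = eigvals[-k:]                    # k components with greatest variance
    X_train.append(top)
    y_train.append(1.0 / top)             # corresponding eigenvalues of A^{-1}

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(np.array(X_train), np.array(y_train))

# Approximate the inverse of a new matrix from the predicted eigenvalues,
# keeping only the top-k eigenvectors
A = random_spd(n)
eigvals, eigvecs = np.linalg.eigh(A)
pred = model.predict(eigvals[-k:].reshape(1, -1))[0]
A_inv_approx = eigvecs[:, -k:] @ np.diag(pred) @ eigvecs[:, -k:].T
print(np.linalg.norm(A_inv_approx - np.linalg.inv(A)))
```

Note that this truncated reconstruction only uses the top-k components, so how good the approximation is will depend heavily on how quickly the eigenvalues of the matrix decay.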