Neural networks speed up estimates of matrix condition numbers for large sparse systems
This paper introduces a fast, data-driven way to estimate the condition number of large sparse matrices using graph neural networks (GNNs). The condition number is a single scalar, κ(A) = ‖A‖·‖A⁻¹‖, that measures how sensitive the solution of a linear system is to small perturbations in the input. Computing it exactly is costly for large problems, so the authors train a neural model to predict it quickly from compact features of the matrix.
The researchers build a feature extractor that turns a sparse matrix into a fixed-length vector of descriptors. These descriptors capture structure (size and sparsity), diagonal entries, several matrix norms, measures of diagonal dominance, row sparsity patterns, statistics of the nonzero values, and simple eigenvalue bounds based on Gershgorin disks. The extractor runs in O(nnz + n) time, proportional to the number of nonzeros plus the matrix dimension, so it is cheap for typical sparse matrices.
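To make the idea concrete, here is a minimal sketch of such an extractor in Python with SciPy. The specific descriptors chosen below are illustrative stand-ins for the categories the paper lists (norms, diagonal dominance, Gershgorin bounds, value statistics), not the authors' exact feature set; every pass below touches each nonzero or each row once, matching the O(nnz + n) cost.

```python
import numpy as np
from scipy import sparse

def extract_features(A):
    """Turn a sparse matrix into a fixed-length descriptor vector.

    Illustrative feature set only; the paper's exact descriptors may differ.
    All quantities below cost O(nnz) or O(n) to compute.
    """
    A = sparse.csr_matrix(A)
    n = A.shape[0]
    d = A.diagonal()
    absA = abs(A)
    row_abs = np.asarray(absA.sum(axis=1)).ravel()   # row absolute sums, O(nnz)
    col_abs = np.asarray(absA.sum(axis=0)).ravel()   # column absolute sums, O(nnz)
    radii = row_abs - np.abs(d)                      # Gershgorin disk radii
    row_nnz = np.diff(A.indptr)                      # nonzeros per row, O(n)
    return np.array([
        n,                                   # size
        A.nnz / n,                           # average nonzeros per row (sparsity)
        row_abs.max(),                       # infinity-norm (max row abs sum)
        col_abs.max(),                       # 1-norm (max column abs sum)
        np.sqrt((A.data ** 2).sum()),        # Frobenius norm
        np.abs(d).min(), np.abs(d).max(),    # diagonal-entry statistics
        (np.abs(d) > radii).mean(),          # fraction of strictly diagonally dominant rows
        (d - radii).min(),                   # Gershgorin lower eigenvalue bound
        (d + radii).max(),                   # Gershgorin upper eigenvalue bound
        A.data.mean(), A.data.std(),         # nonzero-value statistics
        row_nnz.std(),                       # variability of row sparsity pattern
    ])
```

A model trained on such vectors never sees the full matrix, which is what makes inference cheap regardless of problem size.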
On top of these features they train a neural model to predict the base-ten logarithm of either the norm of the matrix inverse or the full condition number. The log transform stabilizes learning when the target values span many orders of magnitude. At inference time the model outputs the log estimate, which is exponentiated and, when the target is the inverse norm, combined with an exactly computable matrix norm to recover the condition number. The paper presents two prediction schemes and reports experiments for both 1-norm and 2-norm condition-number estimation.
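The inverse-norm scheme can be sketched as follows. Since κ₁(A) = ‖A‖₁·‖A⁻¹‖₁ and ‖A‖₁ (the maximum column absolute sum) is exact and cheap to compute, only the hard factor ‖A⁻¹‖₁ needs to be learned. The `predict_log10_inv_norm` callable below is a hypothetical stand-in for the trained model; its name and interface are assumptions, not the paper's API.

```python
import numpy as np
from scipy import sparse

def estimate_cond1(A, predict_log10_inv_norm):
    # Sketch of the inverse-norm scheme for the 1-norm condition number.
    # `predict_log10_inv_norm` is a placeholder for the trained model, which
    # returns an estimate of log10(||A^{-1}||_1) from matrix features.
    A = sparse.csc_matrix(A)
    norm1 = np.asarray(abs(A).sum(axis=0)).max()   # exact ||A||_1, O(nnz)
    inv_norm = 10.0 ** predict_log10_inv_norm(A)   # undo the log-10 transform
    return norm1 * inv_norm                        # kappa_1(A) = ||A||_1 * ||A^{-1}||_1
```

Learning the log of the inverse norm rather than κ itself also lets the exact, cheaply computed factor ‖A‖₁ carry part of the answer, so the model only has to absorb the error in one term.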
According to the authors, this approach gives large speedups compared with classical iterative estimators such as the Hager–Higham method for the 1-norm and Lanczos-type methods for the 2-norm. They report sub-millisecond inference times and claim acceptable relative error in their tests. The method is presented as the first use of graph learning techniques for condition-number estimation and is positioned as a fast, approximate alternative for large sparse systems.