We present and analyze a new randomized algorithm called rand-cholQR
for computing tall-and-skinny QR factorizations.
Using one or two random sketch matrices, we prove that, with
high probability, its orthogonality error is bounded by a constant
on the order of the unit roundoff for any numerically full-rank matrix.
An evaluation of the performance of rand-cholQR on an NVIDIA A100...
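As a rough illustration of the one-sketch variant, here is a minimal NumPy sketch; the function name `rand_cholqr`, the Gaussian sketch, and the sketch size $2n$ are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def rand_cholqr(A, sketch_rows=None, rng=None):
    # Minimal sketch of a randomized Cholesky QR: sketch, use the sketched
    # R factor to precondition A, then one CholeskyQR pass.
    m, n = A.shape
    k = sketch_rows if sketch_rows is not None else 2 * n
    rng = np.random.default_rng() if rng is None else rng
    S = rng.standard_normal((k, m)) / np.sqrt(k)   # Gaussian sketch (illustrative)
    R1 = np.linalg.qr(S @ A, mode='r')             # R factor of the sketched matrix
    Q1 = np.linalg.solve(R1.T, A.T).T              # Q1 = A R1^{-1}, well conditioned w.h.p.
    R2 = np.linalg.cholesky(Q1.T @ Q1).T           # CholeskyQR on the preconditioned matrix
    Q = np.linalg.solve(R2.T, Q1.T).T
    return Q, R2 @ R1                              # A = Q (R2 R1)
```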
In this talk, we explore the use of randomized sketching for solving least-squares problems arising from inverse problems, which may require regularization. We propose the use of sketching inside a flexible LSQR scheme and discuss different options suitable for various situations (for instance, small or large residual). We show how sketching can help overcome practical issues that...
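The flexible scheme itself is beyond a short example, but the basic mechanism of sketch-based preconditioning for LSQR can be sketched as follows: a Blendenpik-style fixed right preconditioner, where the function name and sketch size are illustrative assumptions.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, lsqr

def sketch_preconditioned_lsqr(A, b, sketch_rows=None, rng=None):
    # Build a right preconditioner R from a sketched QR, then run standard
    # (not flexible) LSQR on A R^{-1} and undo the change of variables.
    m, n = A.shape
    k = sketch_rows if sketch_rows is not None else 4 * n
    rng = np.random.default_rng() if rng is None else rng
    S = rng.standard_normal((k, m)) / np.sqrt(k)
    R = np.linalg.qr(S @ A, mode='r')              # sketch-based preconditioner
    Aop = LinearOperator((m, n),
                         matvec=lambda y: A @ np.linalg.solve(R, y),
                         rmatvec=lambda z: np.linalg.solve(R.T, A.T @ z))
    y = lsqr(Aop, b)[0]
    return np.linalg.solve(R, y)                   # x = R^{-1} y
```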
We consider the column subset selection problem (CSSP), that is, the problem of selecting a submatrix $C$ made of $k$ columns of a matrix $A \in \mathbb{R}^{n \times n}$ such that $\|A - CC^\dagger A\|$ is small, where $\dagger$ denotes the pseudoinverse.
We present a randomized algorithm for CSSP, based on an adaptive leverage score sampling strategy, which guarantees, in expectation, an...
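A minimal non-adaptive variant of leverage score column sampling, for illustration only; the talk's adaptive strategy and its guarantees are not reproduced here.

```python
import numpy as np

def leverage_score_columns(A, k, rng=None):
    # Sample k columns with probabilities proportional to the rank-k
    # leverage scores, i.e. the squared column norms of the top-k right
    # singular vector matrix (a simple, non-adaptive illustration).
    rng = np.random.default_rng() if rng is None else rng
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    scores = np.sum(Vt[:k, :] ** 2, axis=0)
    cols = rng.choice(A.shape[1], size=k, replace=False, p=scores / scores.sum())
    return A[:, cols], cols
```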
We consider damped vibrational systems of the form $M\ddot{q}(t)+D(\nu)\dot{q}(t)+Kq(t)=0$, where $M$ and $K$ are positive
definite and $D=D_{\text{int}}+D_{\text{ext}}(\nu)$ with $D_{\text{int}}$ representing some Rayleigh damping and $D_{\text{ext}}(\nu)= \sum_{i=1}^{k}\nu_i\, d_i d_i^T$ representing some external damping caused by $k$ dampers. Optimal damping consists of determining a...
We describe a framework based on Riemannian optimization to solve certain nearness problems on matrices of the form
$$
\min_{X \in \mathcal{Q}} \|X-A\|_F^2 \tag{1}
$$
where $\mathcal{Q}$ is a certain subset of $\mathbb{C}^{n\times n}$ that enforces both a linear structure (e.g., a fixed sparsity pattern, Toeplitz/Hankel structure...) and a constraint on the eigenvalues (e.g., $X$ is...
We are concerned with the solution of linear operator equations with a compact operator. Such operators do not have a bounded inverse, so these equations have to be regularized before solution. The Arnoldi process provides a convenient way to reduce a compact operator to a nearby operator of finite rank, and regularization is achieved with
Tikhonov's method. We investigate...
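A minimal finite-dimensional sketch of the Arnoldi–Tikhonov idea, assuming a square matrix $A$, data vector $b$, and a fixed regularization parameter $\mu$; all names are illustrative.

```python
import numpy as np

def arnoldi_tikhonov(A, b, k, mu):
    # k Arnoldi steps with starting vector b, then Tikhonov regularization
    # of the small projected least-squares problem.
    n = b.size
    Q = np.zeros((n, k + 1)); H = np.zeros((k + 1, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ Q[:, j]
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        Q[:, j + 1] = w / H[j + 1, j]
    # min_y || H y - ||b|| e1 ||^2 + mu ||y||^2 via the normal equations
    rhs = np.zeros(k + 1); rhs[0] = np.linalg.norm(b)
    y = np.linalg.solve(H.T @ H + mu * np.eye(k), H.T @ rhs)
    return Q[:, :k] @ y
```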
In this talk, we present an extension of the Majorization-Minimization Generalized Krylov Subspace (MM-GKS) method for solving $\ell_p$-$\ell_q$ minimization problems, as proposed in [1], by introducing a right preconditioner aimed at accelerating convergence without compromising the quality of the computed solution. The original MM-GKS approach relies on iterative reweighting and projection onto...
Multilayer networks are often used to model systems where entities are connected in multiple ways simultaneously, capturing the complexity of real-world relationships better than traditional single-layer networks [1]. The dynamical evolution of a complex system over time can be described through a particular interlayer structure. The extent to which properties of evolving networks can be...
The Hadamard factorization is a powerful technique for data analysis and matrix compression, which decomposes a given matrix $A$ into the element-wise product of two low-rank matrices $W$ and $H$ such that $A\approx W\circ H$. Unlike the well-known SVD, this decomposition makes it possible to represent higher-rank matrices with the same number of variables. We present some new theoretical results which...
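A minimal sketch of one way to compute such a factorization, via plain gradient descent on the Frobenius-norm misfit; the parameterization by four rank-$r$ factors and the step size are our illustrative assumptions, not the talk's algorithm.

```python
import numpy as np

def hadamard_factorization(A, r, steps=5000, lr=1e-3, rng=None):
    # Gradient descent on 0.5 * || (W1 W2^T) o (H1 H2^T) - A ||_F^2, where
    # o is the Hadamard product and each factor has rank r. The step size
    # lr may need tuning for a given A.
    m, n = A.shape
    rng = np.random.default_rng(0) if rng is None else rng
    W1, W2 = rng.standard_normal((m, r)), rng.standard_normal((n, r))
    H1, H2 = rng.standard_normal((m, r)), rng.standard_normal((n, r))
    for _ in range(steps):
        R = (W1 @ W2.T) * (H1 @ H2.T) - A          # residual
        GW, GH = R * (H1 @ H2.T), R * (W1 @ W2.T)  # gradients w.r.t. W and H
        W1, W2 = W1 - lr * GW @ W2, W2 - lr * GW.T @ W1
        H1, H2 = H1 - lr * GH @ H2, H2 - lr * GH.T @ H1
    return W1 @ W2.T, H1 @ H2.T
```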
We consider a class of differential problems set in a Banach space, with integral boundary conditions:
\begin{equation}
\frac{dv}{dt} = Av, \qquad 0<t<T,\qquad \frac{1}{T}\int_0^T v(t) dt = f,
\end{equation}
where $A$ is a linear, closed, possibly unbounded operator (e.g., second derivative in space). Note that the finite-dimensional version of this problem, where $A$ is a matrix, is...
Given a matrix $A\in \mathbb{R}^{s\times s}$ and a vector $\mathbf{f} \in \mathbb{R}^s$, under mild assumptions the non-local boundary value problem
\begin{eqnarray}
&&\frac{\mathrm{d}\mathbf{u}}{\mathrm{d}\tau} = A \mathbf{u}, \quad 0<\tau<1, \label{l1} \\
&&\int_0^1 \mathbf{u}(\tau) \,\mathrm{d}\tau = \mathbf{f}, \label{l2}
\end{eqnarray}
admits as unique solution
\[
...
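Working from the two equations above, the solution is $\mathbf{u}(\tau) = e^{\tau A}\mathbf{u}_0$ with $\mathbf{u}_0 = A(e^A - I)^{-1}\mathbf{f}$, assuming $e^A - I$ is invertible; a quick NumPy check of this closed form (random data, our assumption):

```python
import numpy as np
from scipy.linalg import expm

# Check u(tau) = expm(tau*A) @ u0 with u0 = A (expm(A) - I)^{-1} f.
rng = np.random.default_rng(0)
s = 5
A = rng.standard_normal((s, s))
f = rng.standard_normal(s)
u0 = A @ np.linalg.solve(expm(A) - np.eye(s), f)
# Integral condition: int_0^1 expm(tau*A) dtau = A^{-1} (expm(A) - I)
integral = np.linalg.solve(A, (expm(A) - np.eye(s)) @ u0)
print(np.allclose(integral, f))   # True
```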
Quantum block encoding (QBE) is a crucial step in the development of many quantum algorithms, as it embeds a given matrix into a suitable larger unitary matrix. Historically, efficient techniques for QBE have primarily focused on sparse matrices, with less attention given to data-sparse matrices, such as rank-structured matrices. In this work, we examine a specific case of rank structure:...
Given two matrices $X,Y\in\mathbb{R}^{n\times m}$ with $m<n$...
The randomized singular value decomposition (SVD) proposed in [1] has become one of the most well-established randomization-based algorithms in numerical linear algebra. Its core idea is the computation of a subspace that approximates the column space of a target matrix $\mathbf{A}$ with high probabilistic confidence. In our work [2], we introduce a modification to the standard randomized SVD...
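For reference, a minimal implementation of the standard randomized SVD of [1] (not the modification introduced in [2]); the oversampling parameter is an illustrative choice.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, rng=None):
    # Sample the range of A, project onto it, and recover the leading
    # k singular triplets from the small projected matrix.
    rng = np.random.default_rng() if rng is None else rng
    Omega = rng.standard_normal((A.shape[1], k + oversample))
    Q, _ = np.linalg.qr(A @ Omega)                 # approximate column space
    U, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ U[:, :k], s[:k], Vt[:k, :]
```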
Implicit Runge-Kutta (IRK) methods [2] are highly effective for solving stiff ordinary differential equations (ODEs)
$$
\begin{array}{ll}
M y'(t)=f(y(t),t), & t \in [0,T], \\
y(0) = y_0.
\end{array}
$$
However, their usage can be computationally expensive for large-scale problems due to the need to solve coupled algebraic equations at each step. This study improves IRK...
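For a flavor of the coupled stage equations in the simplest setting, here is one step of the 1-stage Gauss method (implicit midpoint) for a linear problem $M y' = L y$, where the stage system collapses to a single linear solve; illustrative only.

```python
import numpy as np

def gauss1_step(M, L, y, h):
    # One implicit-midpoint step for M y' = L y: the stage equation
    # M k = L (y + h/2 k) reduces to (M - h/2 L) k = L y, then y <- y + h k.
    k = np.linalg.solve(M - 0.5 * h * L, L @ y)
    return y + h * k
```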
Kemeny's constant is a global measure of how close a graph is to being disconnected [4], [5]. Its variation after removing an edge $e$ has been used in [1] to define the centrality of the edge $e$ in an undirected graph. This measure, suitably modified, has been applied to cut-edges, and an explicit expression has been provided for certain classes of graphs in [2]. Here, we propose a sort of...
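For reference, a dense-matrix computation of Kemeny's constant via the standard eigenvalue formula (this is the baseline quantity, not the edge-centrality measure of the talk):

```python
import numpy as np

def kemeny_constant(A):
    # A: adjacency matrix of a connected undirected graph. Kemeny's
    # constant is sum_i 1/(1 - lambda_i) over the non-Perron eigenvalues
    # of the random-walk transition matrix P = D^{-1} A.
    P = A / A.sum(axis=1, keepdims=True)
    lam = np.sort(np.linalg.eigvals(P).real)
    return float(np.sum(1.0 / (1.0 - lam[:-1])))   # drop the eigenvalue 1
```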
Many data sets and mathematical models involve higher-order, or beyond-pairwise, interactions. Such group-level connections, or hyperedges, arise when we study human activities; for example, coauthorships of documents, memberships of clubs or the use of office spaces. They also occur naturally in many biological and digital settings. Recent studies of disease propagation, message passing and...
In this talk, we propose an approach to perform multi-label segmentation. Starting from an image, we construct the associated weighted graph and assign a label to a small number of pixels (seeds). The aim is to assign each unlabeled pixel to a label to identify different regions. Our algorithm uses the notion of communicability from complex networks theory to compute the ease for an unlabeled...
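A simplified sketch of the kind of rule described above, assigning each node to the seed it communicates with best; the talk's actual algorithm may differ in how communicability is used.

```python
import numpy as np
from scipy.linalg import expm

def label_by_communicability(A, seeds):
    # seeds: dict mapping a few node indices to labels. Each unlabeled
    # node receives the label of the seed it communicates with best,
    # measured by the communicability matrix exp(A).
    C = expm(A)
    labels = dict(seeds)
    for i in range(A.shape[0]):
        if i not in labels:
            best_seed = max(seeds, key=lambda s: C[i, s])
            labels[i] = seeds[best_seed]
    return labels
```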
The nodes in a network can be grouped into equivalence classes according to their connection patterns with other nodes, whether in the same group or different ones. This process, known as role extraction or block modelling, aims to produce a simplified, coarse-grained representation of a large, complex network. The first step in this task is to define a similarity criterion between pairs of...
Fundamental properties of weighted graphs, such as connectivity and centrality, are characterized by eigenvalues or eigenvectors of either the weighted adjacency matrix or of matrices related to it, such as the graph Laplacian. Here we study exemplary problems of robustness of partitioning/ranking, which can be viewed as matrix nearness problems that refer to eigenvalues or eigenvectors of...
In this talk we analyze an algorithm for identifying spectral gaps of a real symmetric matrix $A$ by simultaneously approximating the traces of spectral projectors associated with multiple different spectral slices. Our method utilizes Hutchinson's stochastic trace estimator together with the Lanczos algorithm to approximate quadratic forms involving spectral projectors. This approach has been...
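A toy version of the estimator, where the projector quadratic forms are evaluated exactly via a dense eigendecomposition; the talk's method replaces this step with Lanczos quadrature to avoid forming eigenpairs.

```python
import numpy as np

def slice_counts(A, edges, num_vecs=50, rng=None):
    # Hutchinson's estimator for tr(P_I), the eigenvalue count of the
    # symmetric matrix A in each slice I = [edges[i], edges[i+1]).
    rng = np.random.default_rng() if rng is None else rng
    lam, U = np.linalg.eigh(A)
    counts = np.zeros(len(edges) - 1)
    for _ in range(num_vecs):
        w = U.T @ rng.choice([-1.0, 1.0], size=A.shape[0])  # Rademacher probe
        for i in range(len(edges) - 1):
            mask = (lam >= edges[i]) & (lam < edges[i + 1])
            counts[i] += np.sum(w[mask] ** 2)   # v^T P_I v in the eigenbasis
    return counts / num_vecs
```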
We investigate rank revealing factorizations of $m \times n$ polynomial matrices $P(\lambda)$ into products of three, $P(\lambda) = L(\lambda) E(\lambda) R(\lambda)$, or two, $P(\lambda) = L(\lambda) R(\lambda)$, polynomial matrices. Among all possible factorizations of these types, we focus on those for which $L(\lambda)$ and/or $R(\lambda)$ is a minimal basis, since they have favorable...
We study the set $\Pi^*_{2^m}$, which is the set of polynomials that can be evaluated as a matrix polynomial (in the sense of [Higham, 2008]) using exactly $m$ matrix-matrix multiplications, with no restrictions on the number of scalar multiplications or matrix additions. Based on a tabular parameterization of this set, we develop algebraic transformations that preserve the computed...
We will analyze the generic change of the Weierstraß Canonical Form of regular complex structured matrix pencils under
generic structure-preserving additive low-rank perturbations. Several different symmetry structures are considered
and it is shown that, for most of the structures, the generic change in the eigenvalues is analogous
to the case of generic perturbations that ignore the...
Rank-structured matrices are matrices with low-rank subblocks, typically off-diagonal blocks. A significant benefit of rank structures in contrast with traditional structures is that the former type is well preserved by usual matrix operations like summation, multiplication, diagonal scaling and shifting, etc. Sometimes, even factorization and inversion preserve the structures. It is well...
The idea of Generalized Locally Toeplitz (GLT) sequences has been introduced as a generalization both of classical Toeplitz sequences and of variable-coefficient differential operators; for every sequence of the class, it has been demonstrated that it is possible to give a rigorous description of the asymptotic spectrum in terms of a function (the symbol) that can be easily identified....
Finding the unique stabilizing solution $X = X^H$ of a large-scale continuous-time algebraic Riccati equation (CARE) $A^HX + XA + C^HC - XBB^HX = 0$ with a large, sparse $n \times n$ matrix $A$, an $n \times m$ matrix $B$, and a $p \times n$ matrix $C$ is of interest in a number of applications. Here, $B$ and $C^H$ are assumed to have full column and row rank, respectively, with $m, p \ll n.$ The unique...
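For small problems the CARE can be solved directly, which is useful as a reference when testing large-scale methods; a minimal SciPy illustration on random stable data (our assumption):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# The residual of A^T X + X A + C^T C - X B B^T X = 0 should be tiny.
rng = np.random.default_rng(0)
n, m, p = 8, 2, 2
A = -np.eye(n) + 0.1 * rng.standard_normal((n, n))   # stable A
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
X = solve_continuous_are(A, B, C.T @ C, np.eye(m))
print(np.linalg.norm(A.T @ X + X @ A + C.T @ C - X @ B @ B.T @ X))
```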
This work considers large-scale Lyapunov matrix equations of the form $AX+XA = cc^T$
with a symmetric positive definite matrix $A$ and a vector ${c}$. Motivated by the need for solving such equations in a broad range of applications, various numerical methods have been developed to compute a low-rank approximation to the solution
matrix $X$. In this work, we consider the Lanczos method,...
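A minimal sketch of the Krylov projection idea behind such methods; the monomial basis and dense QR below stand in for the Lanczos recurrence, which builds the same space more stably.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def krylov_lyapunov(A, c, k):
    # Project A X + X A = c c^T onto span{c, Ac, ..., A^{k-1} c}, solve
    # the small projected Lyapunov equation, and return low-rank factors.
    V = [c]
    for _ in range(k - 1):
        V.append(A @ V[-1])
    Q, _ = np.linalg.qr(np.column_stack(V))
    qc = Q.T @ c
    Y = solve_continuous_lyapunov(Q.T @ A @ Q, np.outer(qc, qc))
    return Q, Y                                   # X ~ Q @ Y @ Q.T
```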
Large-scale multiterm linear matrix equations of the form
$$
A_1 X B_1 + \cdots + A_\ell X B_\ell = C,
$$ arise as the algebraic formulation in various application problems, such as discretized multivariable PDEs, stochastic, parameterized, or space-time PDEs and inverse problems. While effective methods exist for two-term equations ($\ell=2$), limited options are available for $\ell > 2$....
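For tiny sizes, such equations can be checked against their Kronecker form $(\sum_i B_i^T \otimes A_i)\,\mathrm{vec}(X) = \mathrm{vec}(C)$, which is also what makes the $\ell > 2$ case hard at scale:

```python
import numpy as np

# Tiny-scale check of the Kronecker form with column-major vec.
rng = np.random.default_rng(0)
n, ell = 4, 3
As = [rng.standard_normal((n, n)) for _ in range(ell)]
Bs = [rng.standard_normal((n, n)) for _ in range(ell)]
C = rng.standard_normal((n, n))
K = sum(np.kron(B.T, A) for A, B in zip(As, Bs))
X = np.linalg.solve(K, C.flatten(order='F')).reshape((n, n), order='F')
print(np.linalg.norm(sum(A @ X @ B for A, B in zip(As, Bs)) - C))
```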
Polynomial Krylov subspace methods are among the most widely used methods for approximating $f(A)b$, the action of a matrix function on a vector, in particular when $A$ is large and sparse. When $A$ is Hermitian positive definite, the Lanczos method is the standard choice of Krylov method and, despite its simplicity, it often outperforms other, more sophisticated methods. In...
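A minimal Lanczos implementation of $f(A)b \approx \|b\|\, Q_k f(T_k) e_1$ for Hermitian $A$ (no reorthogonalization; illustrative only):

```python
import numpy as np

def lanczos_fAb(A, b, k, f=np.exp):
    # k Lanczos steps give f(A) b ~ ||b|| Q f(T) e1, with f applied to
    # the eigenvalues of the small tridiagonal matrix T.
    n = b.size
    Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k)
    q, q_prev = b / np.linalg.norm(b), np.zeros(n)
    for j in range(k):
        Q[:, j] = q
        w = A @ q - (beta[j - 1] * q_prev if j > 0 else 0.0)
        alpha[j] = q @ w
        w -= alpha[j] * q
        beta[j] = np.linalg.norm(w)
        q_prev, q = q, w / beta[j]
    T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
    theta, S = np.linalg.eigh(T)
    return np.linalg.norm(b) * (Q @ (S @ (f(theta) * S[0, :])))
```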
In this talk, we investigate preconditioning strategies for the Riesz operator $-(-\Delta)^{\frac{\alpha}{2}}$, $\alpha \in (1, 2]$, commonly used in fractional models such as anomalous diffusion. For $\alpha$ close to 2, it is well known that the Laplacian itself serves as an effective preconditioner with linear computational cost. However, as $\alpha$ decreases toward 1, its performance...
We investigate explicit expressions for the error associated with the block rational Krylov approximation of matrix functions. Two formulas are proposed, both derived from characterizations of the block FOM residual. The first formula employs a block generalization of the residual polynomial, while the second leverages the block collinearity of the residuals. A posteriori error bounds based on...
The canonical polyadic decomposition (CPD) of a higher-order tensor is its basic decomposition in a minimal sum of rank-1 terms. CPD plays a major role in data analysis and signal processing by allowing for unique recovery of underlying factors. However, it is well known that the low-rank approximation problem is ill-posed in general. That is, a tensor may fail to have a best rank-$R$...
Tensor decompositions have become a central tool in data science, with applications in areas such as data analysis, signal processing, and machine learning. A key property of many tensor decompositions, such as the canonical polyadic decomposition, is identifiability, that is, the factors are unique, up to trivial scaling and permutation ambiguities. This allows one to recover the ground truth...
We introduce a mesh-free two-level hybrid Tucker tensor format for approximation of multivariate functions, which combines the product Chebyshev interpolation with the ALS-based Tucker decomposition of the tensor of Chebyshev coefficients. It avoids the expense of the rank-structured approximation of function-related tensors defined on large spatial grids, while benefiting from the...
The ParaTuck-2 decomposition (PT2D) is a two-layer generalization of the well-known canonical polyadic decomposition (CPD) of third-order tensors. It jointly factorizes a set of matrices $T_k \in \mathbb{R}^{I \times J}, k =1,\ldots,K$, as $T_k = {A}D_{G}^{(k)}FD_{H}^{(k)}{B}^{\top}$, with common factors ${F} \in \mathbb{R}^{R \times S}, A \in \mathbb{R}^{I \times S}, B \in \mathbb{R}^{J...
Gaussian processes (GPs) are a versatile tool in machine learning and computational science. In this talk we consider the case of multi-output Gaussian processes (MOGPs) and present low-rank approaches for efficiently computing the posterior mean of such a MOGP. Starting from low-rank spatio-temporal data, we consider a structured covariance function, assuming separability across space and time....
We propose a non-intrusive model order reduction technique for stochastic differential equations with additive Gaussian noise. The method extends the operator inference framework and focuses on inferring reduced-order drift and diffusion coefficients by formulating and solving suitable least-squares problems based on observational data. Various subspace constructions based on the available...
We discuss an evolving low-rank approximation of the vorticity solution of the Euler equations describing the flow of a two-dimensional incompressible ideal fluid on the sphere. Such a problem can be approximated by the so-called Zeitlin model, an isospectral Lie-Poisson flow on the Lie algebra of traceless skew-Hermitian matrices. We propose an approximation of Zeitlin's model based on a...
In this talk, I will describe several recent results that use low-rank approximation and machine learning together. There are two stories that I will discuss:
- Using low-rank approximation to speed up machine learning applications (hybrid models)
- Using machine learning algorithms to compute certain classes of low-rank decomposition algorithms (including AlphaEvolve and AlphaTensor...
The CUR approximation of a matrix is attractive in that once the column and row indices are chosen, one can obtain a low-rank approximation without even looking at the whole matrix.
Its computation had previously been unattractive, often starting with the SVD to get reliable pivots. A remarkable paper by Osinsky shows that this is unnecessary, rendering CUR a practical tool in terms of...
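Once the indices are chosen, assembling the approximation only touches the selected rows and columns; a minimal sketch with the cross (intersection-block) choice of the middle factor:

```python
import numpy as np

def cur_from_indices(A, rows, cols):
    # Cross approximation: A ~ C U R with U the pseudoinverse of the
    # intersection block, so only the chosen rows/columns of A are read.
    # Choosing good indices is the hard part addressed by the talk.
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(A[np.ix_(rows, cols)])
    return C, U, R
```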
Sketching techniques have gained popularity in numerical linear algebra to accelerate the solution of least squares problems. The so-called $\varepsilon$-subspace embedding property of a sketching matrix $S$ has been largely used to characterize the problem residual norm, since the procedure is no longer optimal in terms of the (classical) Frobenius or Euclidean norm. By building on available...
Multiple orthogonal polynomials (MOPs) are a generalization of orthogonal polynomials.
They originally appeared in Hermite-Padé approximation (simultaneous
rational approximation) and number theory. Recently, they turned out to be very useful in random matrix theory, combinatorics and Markov chains.
Given $r$ weight functions $w^{(i)}(x) \ge 0$, with support $\Delta^{(i)}$, ...
Multiple orthogonal polynomials (MOPs) arise in various applications, including approximation theory, random matrix theory, and numerical integration. To define MOPs, one needs multiple inner products, leading to two types of MOPs, which are mutually biorthogonal. These MOPs satisfy recurrence relations, which can be linked to linear algebra via discretization. As a result we get an...
The deep connection between Krylov methods, scalar orthogonal polynomials, and moment matrices is well established, especially for Hermitian and unitary matrices. In this talk, we consider the extension of this framework to block Krylov methods and orthogonal matrix polynomials.
By representing elements of a block Krylov subspace through matrix polynomials, we consider the matrix-valued...
Reduced rank extrapolation (RRE) [1,2] can be used to accelerate convergent vector sequences. These sequences are often generated by an iterative process to solve algebraic equations.
In this presentation, I discuss the generalization of this extrapolation framework to sequences of low-rank matrices which are generated by iterative methods for large-scale matrix equations, such as ...
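For vector sequences, a compact closed form of RRE reads $s = x_0 - \Delta X\,(\Delta^2 X)^\dagger\,\Delta x_0$; a minimal sketch of this baseline (the low-rank matrix-sequence generalization of the talk is not shown):

```python
import numpy as np

def rre(X):
    # Columns of X are the iterates x_0, ..., x_{k+1}. The extrapolated
    # limit uses the first and second difference matrices.
    dX = np.diff(X, axis=1)                        # [dx_0, ..., dx_k]
    d2X = np.diff(X, n=2, axis=1)                  # [d2x_0, ..., d2x_{k-1}]
    xi, *_ = np.linalg.lstsq(d2X, -dX[:, 0], rcond=None)
    return X[:, 0] + dX[:, :-1] @ xi
```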
In this talk, we address the exact D-optimal experimental design problem by proposing an efficient algorithm that rapidly identifies the support of its continuous relaxation. Our method leverages a column generation framework to solve such a continuous relaxation, where each restricted master problem is tackled using a Primal-Dual Interior-Point-based Semidefinite Programming solver. This...