Numerical (multi-)linear algebra is central to many computational methods for complex networks, stochastic processes, machine learning, and numerical solution of PDEs. The matrices (or tensors) encountered in applications are often rank-structured: approximately low-rank, or with low-rank blocks, or low-rank modifications of “simpler” matrices. Identifying and exploiting rank structure is crucial for achieving optimal performance and for making data interpretations feasible by means of the most insightful multi-array representation.
A common example is given by the low-rank approximation of a given matrix, for which the singular value decomposition (SVD) is classically employed. For large dimensions, this approach is unfeasible both in terms of computational costs and memory requirements. This problem is particularly crucial when dealing with the huge amount of data currently processed by data science algorithms. Alternatives have been designed to overcome the SVD drawbacks. These include variants of the Lanczos method and adaptive cross approximation.
Recently, randomized methods have gained traction, as they are more robust than the aforementioned approaches. They require only a few matrix-vector products and the acceptance of a failure probability, which can be made arbitrarily small by slightly increasing the cost of the method. The analogous problem for tensors (arrays with more than two indices) is much harder. No explicit solution is available for the best approximant (when it exists), as no direct analogue of the SVD is known. For this reason, compression strategies have been proposed over the years.
If a computational problem can be reduced to an easier subproblem by a low-rank correction, we can often solve the simple problem first, and then reconstruct only the difference with the actual solution in the form of a low-rank update. Array structures, in the form of matrix and tensor equations, also naturally stem from the computational treatment of many application problems governed by steady-state or time-dependent partial differential equations, where the array form makes it possible to preserve important structural properties of solutions, such as symmetry and definiteness, even at low accuracies.
Low-rank structure plays a crucial role also in many other areas. Network science problems, such as clustering and community detection and the construction of preconditioners for solving network-related linear systems, can also be tackled using low-rank approximation techniques, but little work has been done so far in this direction.
The workshop will take place from the morning of Monday, September 1st, to the afternoon of Friday, September 5th. Participants are expected to register and arrive on the afternoon of Sunday, August 31st.
Registration on Sunday is not mandatory; the registration desk will be open from 2.30 to 7.30 pm. It is also possible to register on Monday from 8.30 am.
We present and analyze a new randomized algorithm, called rand-cholQR, for computing tall-and-skinny QR factorizations. Using one or two random sketch matrices, we prove that, with high probability, its orthogonality error is bounded by a constant of the order of unit roundoff for any numerically full-rank matrix. An evaluation of the performance of rand-cholQR on an NVIDIA A100 GPU demonstrates that, for tall-and-skinny matrices, rand-cholQR with multiple sketch matrices is nearly as fast as, or in some cases faster than, the state-of-the-art CholeskyQR2. Hence, compared to CholeskyQR2, rand-cholQR is more stable with almost no extra computational or memory cost, and is therefore a superior algorithm both in theory and in practice.
Joint work with Andrew J. Higgins, Erik Boman, and Ichitaro Yamazaki.
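For a rough feel of this class of methods, the sketch below implements a single-sketch randomized Cholesky QR in NumPy. It is only a generic baseline under assumed parameters (Gaussian sketch of size $4n$, one Cholesky refinement step); it is not the rand-cholQR algorithm of the talk, which can use one or two sketch matrices and comes with the stability analysis described above.

```python
import numpy as np

def rand_cholqr_sketch(A, s=None, rng=None):
    """Single-sketch randomized CholeskyQR for a tall-skinny A (m x n, m >> n).
    Generic baseline; sketch size s = 4n and the Gaussian sketch are assumptions."""
    rng = np.random.default_rng(rng)
    m, n = A.shape
    s = s or 4 * n
    Omega = rng.standard_normal((s, m)) / np.sqrt(s)   # random sketch (SRHT/sparse also possible)
    R1 = np.linalg.qr(Omega @ A, mode='r')             # R-factor of the small sketched matrix
    Q1 = np.linalg.solve(R1.T, A.T).T                  # "whitened" A = A @ inv(R1), well conditioned w.h.p.
    R2 = np.linalg.cholesky(Q1.T @ Q1).T               # one CholeskyQR step to restore orthogonality
    Q = np.linalg.solve(R2.T, Q1.T).T                  # Q = Q1 @ inv(R2)
    return Q, R2 @ R1                                  # A = Q @ (R2 @ R1)

A = np.random.default_rng(0).standard_normal((20000, 50))
Q, R = rand_cholqr_sketch(A, rng=1)
print(np.linalg.norm(Q.T @ Q - np.eye(50)))            # orthogonality error, O(unit roundoff)
```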
In this talk, we explore the use of randomized sketching for solving least-squares problems arising from inverse problems that may need regularization. We propose the use of sketching inside a flexible LSQR scheme, and discuss different options that are suitable for various situations (for instance, small or large residual). We show how the use of sketching can help overcome practical issues that may arise in regularization problems such as computed tomography, for example the availability of only approximate or inexact transposed operators.
We consider the column subset selection problem (CSSP), that is, the problem of selecting a submatrix $C$ made of $k$ columns of a matrix $A \in \mathbb{R}^{n \times n}$ such that $\|A - CC^\dagger A\|$ is small, where $\dagger$ denotes the pseudoinverse.
We present a randomized algorithm for CSSP, based on an adaptive leverage score sampling strategy, which guarantees, in expectation, an approximation error that matches the optimal existence result in the Frobenius norm. Our method, called Adaptive Randomized Pivoting (ARP), can be seen as a randomized counterpart to a recent deterministic approach proposed by Osinsky. Although the same theoretical guarantee can be achieved with other methods, such as volume sampling, ARP is simpler and less expensive.
We will highlight how this simple yet powerful strategy can be adapted to a range of other tasks related to low-rank approximation, including the Discrete Empirical Interpolation Method, cross/skeleton approximation, and the Nyström approximation of positive semidefinite matrices. We will also briefly describe a new deterministic variant for Nyström approximation that inherits strong error guarantees.
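To make the adaptive-sampling idea concrete, here is a minimal residual-norm-based column sampler for the CSSP in NumPy. It is only a generic adaptive-sampling baseline under assumed defaults: ARP's leverage-score sampling and its expectation-optimal guarantee are not reproduced, and the function name is ours.

```python
import numpy as np

def adaptive_column_sampling(A, k, rng=None):
    """Pick k columns of A, sampling each one with probability proportional to its
    squared norm in the current residual, then deflating. A simplified stand-in
    for adaptive sampling strategies such as ARP (not the ARP algorithm itself)."""
    rng = np.random.default_rng(rng)
    R = A.astype(float).copy()
    idx = []
    for _ in range(k):
        p = (R ** 2).sum(axis=0)
        j = rng.choice(A.shape[1], p=p / p.sum())       # adaptive sampling step
        idx.append(j)
        q = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(q, q @ R)                         # deflate the selected direction
    C = A[:, idx]
    return idx, np.linalg.norm(A - C @ np.linalg.pinv(C) @ A, 'fro')
```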
We consider damped vibrational systems of the form $M\ddot{q}(t)+D(\nu)\dot{q}(t)+Kq(t)=0$, where $M$ and $K$ are positive
definite and $D=D_{\text{int}}+D_{\text{ext}}(\nu)$ with $D_{\text{int}}$ representing some Rayleigh damping and $D_{\text{ext}}(\nu)= \sum_{i=1}^{k}\nu_i^{}\; d_i d_i^T$ representing some external damping caused by $k$ dampers. Optimal damping consists of determining a viscosity vector $\nu\in\mathbb{R}^k_+$ that maximizes the rate of decay of the energy of the system as $t$ tends to infinity.
Several algorithms have been proposed to solve this problem, but without either the stability constraint on the vibrating system or the nonnegativity constraint on $\nu$. We present a new approach that addresses both constraints. We start with a test that checks a priori the stability of the system for all $\nu\ge 0$. Assuming that the system is stable, we derive the Karush-Kuhn-Tucker (KKT) conditions associated with the optimisation problem. We show that the linear independence constraint qualification (LICQ) holds, which is a crucial requirement for the validity of the KKT conditions at a feasible point. We also derive second-order sufficient conditions.
We solve the KKT system with a residual minimization algorithm combined with a Barzilai-Borwein stepsize.
This is joint work with Qingna Li (Beijing Institute of Technology).
We describe a framework based on Riemannian optimization to solve certain nearness problems on matrices, of the form
$$
\min_{X \in \mathcal{Q}} \|X-A\|_F^2 \tag{1}
$$
where $\mathcal{Q}$ is a certain subset of $\mathbb{C}^{n\times n}$ that enforces both a linear structure (e.g., a fixed sparsity pattern, Toeplitz/Hankel structure...) and a constraint on the eigenvalues (e.g., $X$ is singular, or has $k$ eigenvalues inside a given subset $\Omega \subseteq \mathbb{C}$).
This framework reduces (1) to a Riemannian optimization problem, which is then solved using algorithms from the Matlab toolbox Manopt [1].
We describe how several problems that have been studied independently in the past can be reduced to this framework and solved with similar techniques. The numerical results obtained with this technique are close to, or better than, those of state-of-the-art algorithms.
## References
1. Boumal, Nicolas; Mishra, Bamdev; Absil, P.-A.; Sepulchre, Rodolphe. Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res. 15, 1455-1459 (2014).
2. Gnazzo, Miryam; Noferini, Vanni; Nyman, Lauri; Poloni, Federico. Riemann-Oracle: A general-purpose Riemannian optimizer to solve nearness problems in matrix theory. Preprint, arXiv:2407.03957 [math.NA] (2024).
3. Noferini, Vanni; Poloni, Federico. Nearest $\Omega$-stable matrix via Riemannian optimization. Numer. Math. 148, No. 4, 817-851 (2021).
An ergodic Markov chain, with transition probability matrix $P$ and stationary vector $\mathbf{\pi}$, is called reversible if $D_{\mathbf{\pi}} P = P^\top D_{\mathbf{\pi}}$, where $D_{\mathbf{\pi}}$ is the diagonal matrix with the components of $\mathbf{\pi}$ on its diagonal. Reversibility is a desirable property in Markov chain theory. Indeed, in many algorithms and applications, reversibility can be exploited in order to derive stability characteristics. Sets of possible applications include computational biology, statistical physics, and Markov Chain Monte Carlo methods.
Nevertheless, in several frameworks, the transition matrix $P$ of a Markov chain may be estimated from noisy measurements and data. Then, even in settings where the reversibility property should be satisfied, a numerical discrepancy may lead to a non-reversible Markov chain. This issue can be solved by computing the nearest reversible Markov chain to a given one.
In this talk, given a Markov chain, we present a method for the numerical approximation of the reversible Markov chain closest to it. The closeness between two Markov chains is measured by means of the distance, induced by the Frobenius norm, between the corresponding transition matrices.
The approach we propose is based on a Riemannian optimization procedure over the manifold of the reversible stochastic matrices, sharing the same stationary distribution $\mathbf{\pi}$ [1].
The proposed algorithm automatically detects any transient states and restricts its analysis to the remaining recurrent subchain, and in the presence of multiple ergodic classes, our approach processes each one
independently, by isolating and solving smaller, decoupled subchains.
The applicability of our approach has been tested by comparing it with the quadratic programming based method in [2]. Numerical tests show that, for large-sized transition matrices, the Riemannian-based method is often faster and more accurate than the QP competitor.
References
We are concerned with the solution of linear operator equations with a compact operator. Such operators do not have a bounded inverse, and therefore these kinds of equations have to be regularized before solution. The Arnoldi process provides a convenient way to reduce a compact operator to a nearby operator of finite rank, and regularization is achieved with Tikhonov's method. We investigate properties of this simple solution approach. Computed examples illustrate the theory presented.
In this talk we introduce some randomized iterative methods for efficiently computing regularized solutions to large-scale discrete linear inverse problems. Specifically, we consider methods based on both the recently-developed randomized Arnoldi algorithm and a new strategy that can be regarded as a randomized Golub-Kahan algorithm. Building on such algorithms, both purely iterative and hybrid projection regularization methods are developed; the latter support automatic regularization parameter selection strategies through modifications of existing regularization parameter choice rules. Theoretical results are provided, and numerical tests from image deblurring and tomography show the benefits of these approaches.
In this talk, we present an extension of the Majorization-Minimization Generalized Krylov Subspace (MM-GKS) method for solving $\ell_p$-$\ell_q$ minimization problems, as proposed in [1], by introducing a right preconditioner aimed at accelerating convergence without compromising the quality of the computed solution. The original MM-GKS approach relies on iterative reweighting and projection onto subspaces of increasing dimension, enabling the efficient solution of the minimization problem. Our enhanced method leverages a carefully designed regularizing preconditioner, inspired by Iterated Tikhonov regularization, to address the inherent ill-conditioning of the problem. We demonstrate that our preconditioned MM-GKS method preserves the stability and accuracy of the original MM-GKS method, as validated by numerical results in image deblurring, showing significant reductions in CPU time.
Reference:
[1] A. Lanza, S. Morigi, L. Reichel, F. Sgallari, A generalized Krylov subspace method for $\ell_p-\ell_q$ minimization. SIAM Journal on Scientific Computing, (2015), 37(5), S30-S50.
The idea of Generalized Locally Toeplitz (GLT) sequences has been introduced as a generalization both of classical Toeplitz sequences and of variable coefficient differential operators and, for every sequence of the class, it has been demonstrated that it is possible to give a rigorous description of the asymptotic spectrum in terms of a function (the symbol) that can be easily identified.
This generalizes the notion of a symbol for differential operators (discrete and continuous) or for Toeplitz sequences, where for the latter it is identified through the Fourier coefficients and is related to classical Fourier analysis.
For every $r,d\ge 1$ the $r$-block $d$-level GLT class has nice $*$-algebra features and indeed it has been proven that it is stable under linear combinations, products, and inversion when the sequence which is inverted shows a sparsely vanishing symbol (a sparsely vanishing symbol is a symbol whose minimal singular value vanishes at most on a set of zero Lebesgue measure). Furthermore, the GLT $*$-algebras virtually include any approximation of partial differential equations (PDEs), fractional differential equations (FDEs), and integro-differential equations (IDEs) by local methods (Finite Differences, Finite Elements, Isogeometric Analysis, etc.) and, based on this, we demonstrate that our results on GLT sequences can be used in a PDE/FDE/IDE setting in various directions, including preconditioning, multigrid, spectral detection of branches, fast 'matrix-less' computation of eigenvalues, stability issues, asymptotic low-rank structures, and challenges such as the use of GLT in tensor, stochastic, and machine learning algorithms.
We will also discuss the impact and the further potential of the theory, with special attention to new tools and new directions, such as those based on symmetrization tricks, on the extra-dimensional approach, and on blocking structures/operations.
Bibliography
1. A. Adriani, A.J.A. Schiavoni-Piazza, S. Serra-Capizzano. Blocking structures, g.a.c.s. approximation, and distributions. Bol. Soc. Mat. Mex. (3) 31-2 (2025), Paper No. 41.
2. N. Barakitis, M. Donatelli, S. Ferri, V. Loi, S. Serra-Capizzano, R. Sormani. Blocking structures, approximation, and preconditioning. arXiv:2501.14874, Num. Alg. (2025), to appear.
3. G. Barbarino, C. Garoni, S. Serra-Capizzano. Block generalized locally Toeplitz sequences: theory and applications in the unidimensional case. Electron. Trans. Numer. Anal. 53 (2020), 28-112.
4. G. Barbarino, C. Garoni, S. Serra-Capizzano. Block generalized locally Toeplitz sequences: theory and applications in the multidimensional case. Electron. Trans. Numer. Anal. 53 (2020), 113-216.
5. P. Benedusi, P. Ferrari, E. Rognes, S. Serra-Capizzano. Modeling excitable cells with the EMI equations: spectral analysis and fast solution strategy. J. Sci. Comput. 98-3 (2024), Paper No. 58, 23 pp.
6. P. Ferrari, I. Furci, S. Hon, M. Mursaleen, S. Serra-Capizzano. The eigenvalue distribution of special 2-by-2 block matrix-sequences with applications to the case of symmetrized Toeplitz structures. SIAM J. Matrix Anal. Appl. 40-3 (2019), 1066-1086.
7. I. Dravins, S. Serra-Capizzano, M. Neytcheva. Spectral analysis of preconditioned matrices arising from stage-parallel implicit Runge-Kutta methods of arbitrarily high order. SIAM J. Matrix Anal. Appl. 45-2 (2024), 1007-1034.
8. S. Serra-Capizzano, P. Tilli. On unitarily invariant norms of matrix valued linear positive operators. J. Inequal. Appl. 7-3 (2002), 309-330.
Multilayer networks are often used to model systems where entities are connected in multiple ways simultaneously, capturing the complexity of real-world relationships better than traditional single-layer networks [1]. The dynamical evolution of a complex system over time can be described through a particular interlayer structure. The extent to which properties of evolving networks can be represented by a multilayer has been thoroughly examined in the literature ([2,3]).
The key idea of our project stems from the application of complex network theory to the
analysis of a zonal water system, in this case the water distribution network
of the Emilia Romagna region. The ability of water networked systems to
continue to supply water when components are damaged or fail is a central
topic in the water system design and analysis. This feature of water systems is
related to the connectivity of the internal components of the system (pipes and
reserves). In this particular case, we assume the network to be undirected, which implies that its adjacency matrix is symmetric.
One important feature of our data set is a detailed account of all the pipe
breakages in the water system over the course of time, from a few hours up to many months. To appropriately track the phenomenon, we propose to model breakages as low-rank modifications of the edge adjacency matrix of a water network within a time-aware block influence matrix.
Let $\{G_\ell\}_{\ell=1,\dots, L}$ be a sequence of discrete-time evolving graphs, $G_\ell=(V,E_\ell)$, where, for each $\ell$, $V$ is the set of vertices and $E_\ell$ the set of edges, while $L$ is the number of time units. The composition of edges changes with time. Let $A_\ell$ be the edge adjacency matrix of each $G_\ell$ and let $N$ be the number of edges of the full network, so that $A_\ell$ is $N\times N$. To strengthen the impact of the edge modifications in time, we also keep track of the network change in the upper-diagonal blocks of the following time-aware block matrix:
$$
{\cal M}=
\begin{pmatrix}
A_1&W_1&0&\dots&0\\
0& A_2&W_2&\dots&0 \\
0 & 0&A_3 & \ddots&\vdots \\
\vdots & \vdots&\dots & \ddots&W_{L-1} \\
0 & 0 &\dots&\dots& A_L
\end{pmatrix}
\in \mathbb{R}^{NL\times NL} ,
$$
where $W_\ell=A_{\ell+1}-A_\ell$, $\ell=1, \ldots, L-1$. If the graph loses one or more edges at time $\ell$ (i.e. one or more pipes break), the corresponding nodes in the edge adjacency matrix will be removed, and thus $W_\ell$ will represent the rows and columns associated with the removed nodes. The chosen centrality measure is the Katz centrality $c=(I-\alpha {\cal M})^{-1}{1}$, with $\alpha<\frac{1}{\max_{\ell}\rho(A_\ell)}$ and ${1}\in\mathbb{R}^{NL}$ the vector of all ones [4]. Reshape the vector $c$ as a matrix $C$ of size $N\times L$, and consider the index $i$ at time $\ell$, that is
$$
C(i,\ell)=e_{(\ell-1)N+i}^T c = e_{(\ell-1)N+i}^T (I-\alpha {\cal M})^{-1} 1,
$$
where $e_k$ is the $k$th column of the identity matrix of size $NL$; $C(i,\ell)$ is the Katz index related to the node $i$ at time $\ell$.
The $\textit{Katz marginal centrality}$ is the mean of all Katz centrality values of edge $i$ over all time layers, that is
$$
KMN(i)=\frac{1}{L}\sum_{\ell=1}^L { C}(i,\ell).
$$
In order to recover the marginal Katz centralities, the computation of the vector $ c$ is required.
For very large networks, for each edge $i$ the KMN index could be obtained by only *approximately* solving the large linear system of size $NL\times NL$. To do so, we could employ well-established computational methods such as BiCGSTAB [5]. However, we can make use of the special structure of the inverse matrix to gain some computational advantage.
Let $K(A_\ell)=(I-\alpha A_\ell)^{-1}$, for $\ell=1,\dots,L$.
For instance, for $L=4$, the block structure of the matrix $K(\mathcal{M})=(I-\alpha \mathcal{M})^{-1}$ is:
$$
{\scriptsize
K({\mathcal{M}})=
\begin{bmatrix}
K(A_1)&K(A_2)-K(A_1)&K(A_1)W_1(K(A_3)-K(A_2))&(K(A_2)-K(A_1))W_1(K(A_4)-K(A_3))\\
0&K(A_2)&K(A_3)-K(A_2)&K(A_2)W_2(K(A_4)-K(A_3)) \\
0&0&K(A_3) & K(A_4)-K(A_3)\\
0&0&0&K(A_4)
\end{bmatrix}.}
$$
By truncating the matrix at the first super-diagonal block, we can solve the linear system by solving $L-1$ linear systems in parallel on the diagonal blocks of the matrix. We provide theoretical results on the decay of the off-diagonal entries that justify this truncation.
1. Kivelä, M., Arenas, A., Barthelemy, M., Gleeson, J. P., Moreno, Y., & Porter, M. A. (2014). Multilayer networks. Journal of Complex Networks, 2(3), 203–271. https://doi.org/10.1093/comnet/cnu016
2. Fenu, C., & Higham, D. J. (2017). Block Matrix Formulations for Evolving Networks. SIAM Journal on Matrix Analysis and Applications, 38(2), 343–360. https://doi.org/10.1137/16m1076988
3. Bergermann, K., & Stoll, M. (2022). Fast computation of matrix function-based centrality measures for layer-coupled multiplex networks. Physical Review E, 105(3). https://doi.org/10.1103/physreve.105.034305
4. Katz, L. (1953). A New Status Index Derived from Sociometric Analysis. Psychometrika, 18(1), 39–43. https://doi.org/10.1007/bf02289026
5. van der Vorst, H. A. (1992). Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical Computing, 13(2), 631–644. https://doi.org/10.1137/0913035
6. Taylor, D., Myers, S. A., Clauset, A., Porter, M. A., & Mucha, P. J. (2017). Eigenvector-Based Centrality Measures for Temporal Networks. Multiscale Modeling & Simulation, 15(1), 537–574. https://doi.org/10.1137/16m1066142
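A compact NumPy/SciPy sketch of the construction described in the abstract above: it assembles $\mathcal{M}$ from the layer matrices $A_\ell$ and returns the Katz marginal centralities with a direct sparse solve (the talk instead uses BiCGSTAB or the block-triangular structure of the inverse). The default $\alpha$ below is an assumption based on a cheap row-sum bound for $\rho(A_\ell)$, and the function name is ours.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def katz_marginal_centrality(A_list, alpha=None):
    """Katz marginal centralities KMN(i) from the time-aware block matrix M
    with diagonal blocks A_l and super-diagonal blocks W_l = A_{l+1} - A_l."""
    L, N = len(A_list), A_list[0].shape[0]
    if alpha is None:
        rho_bound = max(abs(sp.csr_matrix(A)).sum(axis=1).max() for A in A_list)
        alpha = 0.5 / rho_bound                      # alpha < 1 / max_l rho(A_l)
    blocks = [[None] * L for _ in range(L)]
    for l in range(L):
        blocks[l][l] = sp.csr_matrix(A_list[l])
        if l < L - 1:
            blocks[l][l + 1] = sp.csr_matrix(A_list[l + 1] - A_list[l])
    M = sp.bmat(blocks, format='csc')
    c = spsolve(sp.eye(N * L, format='csc') - alpha * M, np.ones(N * L))
    C = c.reshape(L, N).T                            # C[i, l]: Katz index of edge i at time l
    return C.mean(axis=1)                            # KMN(i), the mean over the time layers
```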
The Hadamard factorization is a powerful technique for data analysis and matrix compression, which decomposes a given matrix $A$ into the element-wise product of two low-rank matrices $W$ and $H$ such that $A\approx W\circ H$. Unlike the well-known SVD, this decomposition makes it possible to represent higher-rank matrices with the same number of variables. We present some new theoretical results which show the features of this factorization and when it is possible. Based on these facts, we derive some new initialization guesses for the two factors of the problem. After that, we develop a new algorithm for computing $W$ and $H$ by taking into account a special manifold structure of these matrices. We implement a block gradient descent on the two manifolds to which $W$ and $H$ belong and we integrate the associated gradient system. We compare our results with an existing approach in the literature [Wertz et al. 2025] and we show that the new initializations can also improve this algorithm.
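For intuition, the following NumPy sketch fits $A\approx (UV^T)\circ(PQ^T)$ by plain Euclidean gradient descent on the four factors. The ranks, step size, and random initialization are placeholders, and the manifold-aware block gradient flow and initialization strategy of the talk are not reproduced here.

```python
import numpy as np

def hadamard_factorization_gd(A, r1, r2, steps=5000, lr=1e-3, seed=0):
    """Minimize 0.5 * ||(U V^T) o (P Q^T) - A||_F^2 by gradient descent.
    Plain Euclidean baseline; the step size lr may need tuning per problem."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U, V = rng.standard_normal((m, r1)), rng.standard_normal((n, r1))
    P, Q = rng.standard_normal((m, r2)), rng.standard_normal((n, r2))
    for _ in range(steps):
        W, H = U @ V.T, P @ Q.T
        E = W * H - A                  # residual of the Hadamard (elementwise) product
        GW, GH = E * H, E * W          # gradients w.r.t. W and H
        gU, gV = GW @ V, GW.T @ U      # chain rule through W = U V^T
        gP, gQ = GH @ Q, GH.T @ P      # chain rule through H = P Q^T
        U -= lr * gU; V -= lr * gV; P -= lr * gP; Q -= lr * gQ
    return U, V, P, Q
```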
We consider a class of differential problems set in a Banach space, with integral boundary conditions:
\begin{equation}
\frac{dv}{dt} = Av, \qquad 0<t<T,\qquad \frac{1}{T}\int_0^T v(t) dt = f,
\end{equation}
where $A$ is a linear, closed, possibly unbounded operator (e.g., second derivative in space). Note that the finite-dimensional version of this problem, where $A$ is a matrix, is closely related to the task of computing matrix functions $\psi_{\ell}(A)$, where $\psi_{\ell}$ denotes reciprocals of the $\varphi_{\ell}$-functions used in exponential integrators.
We prove the existence and uniqueness of the solution $v(t)$ and characterize it via a family of mixed polynomial-rational expansions w.r.t. the operator $A$. From this result we design a general numerical procedure for computing an approximation of $v(t)$ up to a given tolerance. An interesting feature of this approach is the fact that successive rational terms can be computed independently: this allows us to fine-tune the accuracy of the approximation by adding further terms as needed, without having to recompute the whole approximation.
Numerical tests focus on a model problem involving a parabolic equation and highlight the effectiveness of this approach.
Given a matrix $A\in \mathbb{R}^{s\times s}$ and a vector $\mathbf{f} \in \mathbb{R}^s$, under mild assumptions the non-local boundary value problem
\begin{eqnarray}
&&\frac{\mathrm{d}\mathbf{u}}{\mathrm{d}\tau} = A \mathbf{u}, \quad 0<\tau<1, \label{l1} \\
&&\displaystyle \int_0^1 \mathbf{u}(\tau) \,\mathrm{d}\tau = \mathbf{f}, \label{l2}
\end{eqnarray}
admits as unique solution
$$
\mathbf{u}(\tau)= q(\tau,A) \mathbf{f}, \quad q(\tau,w)= \frac{w e^{w\tau}}{e^w -1}.
$$
This talk deals with efficient numerical methods for computing the action
of $q(\tau,A)$ on a vector, when $A$ is a large and
sparse matrix. Methods based on the Fourier expansion of $q(\tau,w)$
are considered. First, we place
these methods in the classical framework of Krylov-Lanczos
(polynomial-rational) techniques for accelerating Fourier series.
This allows us to apply the convergence results developed in this
context to our function. Second, we design some new acceleration schemes for computing $q(\tau,A) \mathbf {f}$. Numerical results are presented to show the effectiveness of
the proposed algorithms.
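For small dense problems the solution formula above can be evaluated directly, which is handy for checking accelerated Fourier-based schemes. The brute-force reference below assumes $e^A - I$ is nonsingular (no eigenvalue of $A$ in $2\pi i\mathbb{Z}$) and is of course not the large-sparse method of the talk; names and the test matrix are illustrative.

```python
import numpy as np
from scipy.linalg import expm, solve

def q_action_dense(A, f, tau):
    """u(tau) = q(tau, A) f with q(tau, w) = w e^{w tau} / (e^w - 1),
    evaluated densely as q(tau, A) = A expm(tau A) (expm(A) - I)^{-1}."""
    n = A.shape[0]
    return A @ (expm(tau * A) @ solve(expm(A) - np.eye(n), f))

# sanity check of the non-local condition: the integral of u over [0,1] returns f
A = np.array([[-2.0, 1.0], [1.0, -3.0]])
f = np.array([1.0, 2.0])
taus = np.linspace(0, 1, 2001)
u = np.array([q_action_dense(A, f, t) for t in taus])
avg = (0.5 * (u[0] + u[-1]) + u[1:-1].sum(axis=0)) / (len(taus) - 1)   # trapezoidal rule
print(avg)   # ~ f
```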
Quantum block encoding (QBE) is a crucial step in the development of many quantum algorithms, as it embeds a given matrix into a suitable larger unitary matrix. Historically, efficient techniques for QBE have primarily focused on sparse matrices, with less attention given to data-sparse matrices, such as rank-structured matrices. In this work, we examine a specific case of rank structure: one-pair semiseparable matrices. We present a novel block encoding approach that utilizes a suitable factorization of the given matrix into the product of triangular and diagonal factors. Our algorithm requires $O(\mathrm{polylog}(N))$ qubits and allows us to compute matrix-vector products in $O(\mathrm{polylog}(N))$ quantum operations, where $N$ is the size of the matrix.
Given two matrices $X,Y\in\mathbb{R}^{n\times m}$ with $m < n$, the two-sided Gram-Schmidt process computes bases $Q$ and $P$ of their column spaces satisfying the biorthogonality condition $Q^T P = D$, with $D$ diagonal. In this talk, we present a randomized version of the algorithm, which computes two matrices $Q$ and $P$ that satisfy the sketched biorthogonality condition $(\Omega Q)^T \Omega P = D$, where $\Omega \in\mathbb{R}^{s\times n}$ is a sketching matrix satisfying an oblivious $\varepsilon$-embedding property, such as a subsampled randomized Hadamard transform or a sparse sign matrix. We show how this approach can improve the stability of the algorithm and the condition number of the computed bases $Q$ and $P$. As an application, we consider the computation of approximate eigenvalues and both right and left eigenvectors, where our randomized two-sided Gram-Schmidt orthogonalization process can be implemented within the non-symmetric Lanczos algorithm.
The randomized singular value decomposition (SVD) proposed in [1] has become one of the most well-established randomization-based algorithms in numerical linear algebra. Its core idea is the computation of a subspace that approximates the column space of a target matrix $\mathbf{A}$ with high probabilistic confidence. In our work [2], we introduce a modification to the standard randomized SVD that, in general, yields better approximations to $\text{Range}(\mathbf{A})$ at comparable computational cost. This is achieved by explicitly incorporating information from the row space of $\mathbf{A}$, thus enhancing the approximation quality. Furthermore, we observe that only a limited amount of information from $\text{Range}(\mathbf{A}^T)$ is needed. Exploiting this, we show both theoretically and through numerical experiments that a variant of the algorithm equipped with a subsampling step can significantly improve efficiency while retaining competitive accuracy.
We expand our method to the high-dimensional context of tensor decompositions, building on these matrix-based randomized techniques. With applications ranging from deep learning to signal processing, the Tucker decomposition is a popular technique for compressing and analyzing multidimensional data while maintaining its intrinsic multiway structure. The deterministic SVD-based factorizations of tensor matricizations used in classical Tucker decomposition algorithms are frequently computationally costly. Motivated by our recent developments for the matrix case, we suggest a novel Tucker decomposition framework that combines randomized sketching and tensor fiber sampling techniques to overcome these drawbacks. This strategy avoids the costly formation of tensor matricizations and improves the approximation of their column spaces, reducing both computational time and memory requirements. Under a low-rank assumption, the approximation errors remain comparable to those of existing randomized Tucker methods.
References
1. N. Halko, P. G. Martinsson, and J. A. Tropp. Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions. SIAM Review 53.2, 217–288 (2011). DOI: 10.1137/090771806
2. D. Palitta and S. Portaro. Row-aware Randomized SVD with applications. arXiv:2408.04503 [math.NA], https://arxiv.org/abs/2408.04503 (2024).
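For reference, a minimal NumPy version of the standard randomized SVD of [1] above is sketched below, with assumed oversampling and power-iteration parameters. The row-aware modification of [2], which additionally injects information from $\text{Range}(\mathbf{A}^T)$, and the Tucker extension are not reproduced here.

```python
import numpy as np

def randomized_svd(A, k, p=10, q=1, seed=0):
    """Halko-Martinsson-Tropp randomized SVD: range finder + small SVD.
    p: oversampling, q: power iterations (both assumed defaults)."""
    rng = np.random.default_rng(seed)
    Omega = rng.standard_normal((A.shape[1], k + p))
    Q, _ = np.linalg.qr(A @ Omega)            # approximate basis for Range(A)
    for _ in range(q):                        # optional power iterations for sharper subspaces
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    return Q @ Ub[:, :k], s[:k], Vt[:k, :]
```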
Implicit Runge-Kutta (IRK) methods [2] are highly effective for solving stiff ordinary differential equations (ODEs)
$$
\begin{array}{ll}
M {y}'(t)=f({y}(t),t), & t \in [0,T], \\
{y}(0) = {y}_0.
\end{array}
$$
However, their usage can be computationally expensive for large-scale problems due to the need to solve coupled algebraic equations at each step. This study improves IRK efficiency by leveraging parallelism to decouple stage computations and reduce communication overhead [1]. In the linear case, the generic RK method takes the form
$$
\begin{array}{l}
M {y}'(t) = - L {y}(t) + \hat{{f}}(t), \\
{y}(0) = {y}_0,
\end{array} \qquad M,L \in \mathbb{R}^{N \times N},
$$
for which the computationally expensive step is the computation of the stages $K$ from
$$ M K= - L {y}_n {1}_s^\top - h L K A^\top + F.$$
If the stage matrix $A$ were diagonalizable as $A = X \Lambda X^{-1}$, this could be solved by computing
$$ \begin{array}{l} {r} = - ( X^{-1} {1}_s \otimes L ) {y}_n + \left(X^{-1} \otimes I\right)\textrm{vec}\left( F\right),\\ \left(I_s \otimes M + h \Lambda \otimes L\right) {z} = {r},\\ \textrm{vec} \left( K \right) = \left(X \otimes I\right) {z}. \end{array} $$
This is not the case in general. Nevertheless, for two well-known IRK families, symmetric methods and collocation methods, we write $A$ as the sum of a diagonalizable matrix plus a low-rank one. With this we devise a strategy using the parallel block-diagonal solution of the previous equation followed by a sequential matrix-equation solution to advance the method. We illustrate the idea with several numerical examples and discuss the extension to the more general nonlinear setting [3].
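The diagonalized stage solve in the display above is easy to prototype. The sketch below assumes the RK coefficient matrix is diagonalizable, so it covers only the classical case and not the diagonalizable-plus-low-rank splitting that is the subject of the talk; the function name and the 2-stage Gauss example are illustrative choices.

```python
import numpy as np

def irk_stages_diagonalized(M, L, y, F, h, A_rk):
    """Solve M K = -L y 1_s^T - h L K A_rk^T + F via A_rk = X diag(lam) X^{-1}:
    the s shifted systems (M + h lam_j L) z_j = r_j are independent (parallel)."""
    N, s = M.shape[0], A_rk.shape[0]
    lam, X = np.linalg.eig(A_rk)                              # complex in general
    Xinv = np.linalg.inv(X)
    R = F @ Xinv.T - np.outer(L @ y, Xinv @ np.ones(s))       # transformed right-hand sides
    Z = np.empty((N, s), dtype=complex)
    for j in range(s):                                        # embarrassingly parallel stage solves
        Z[:, j] = np.linalg.solve(M + h * lam[j] * L, R[:, j])
    return (Z @ X.T).real                                     # back-transform; imaginary parts cancel

# 2-stage Gauss-Legendre coefficients (order 4), diagonalizable over C
A_gauss2 = np.array([[1/4, 1/4 - np.sqrt(3)/6],
                     [1/4 + np.sqrt(3)/6, 1/4]])
```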
References
Many data sets and mathematical models involve higher-order, or beyond-pairwise, interactions. Such group-level connections, or hyperedges, arise when we study human activities; for example, coauthorships of documents, memberships of clubs or the use of office spaces. They also occur naturally in many biological and digital settings. Recent studies of disease propagation, message passing and the dissemination of ideas and opinions have shown that higher-order effects can play an important role. From a linear algebra perspective, moving from graphs to hypergraphs takes us from matrices to tensors. A geometric model for graphs [1,2] or hypergraphs [3,4,5] starts by placing elements in Euclidean space and then creates interactions according to "nearness." We consider the inverse problem, where location in Euclidean space is to be inferred from the connectivity structure. In the graph context, addressing this problem has proved useful for visualisation, dimension reduction and other data processing tasks. Here, we will describe and illustrate a new embedding algorithm that is designed for hypergraphs.
References
Complex systems of interacting components are often modeled by a graph consisting of a set of nodes and edges, where each edge is equipped with a positive weight. Examples of such graphs include transportation networks, such as road and airline networks; see, e.g., [1,2]. Assume that the network is undirected and connected, which implies that the associated adjacency matrix $A$ is symmetric, irreducible, and nonnegative, while the associated graph Laplacian matrix $L=D-A$, with $D$ being the degree matrix, is symmetric, positive semidefinite, and has a one-dimensional null space. Two spectral quantities are particularly significant in network analysis: the spectral radius $\rho$ of $A$, known as the Perron value, and the second smallest eigenvalue $\mu$ of $L$, referred to as the Fiedler value. Specifically, the unique eigenvector of unit Euclidean norm with positive entries associated with $\rho$, i.e., the Perron vector, is used for node ranking, while the eigenvector of unit norm associated with $\mu$, i.e., the Fiedler vector, is used to infer a natural bipartition of the graph based on the signs of its entries; see, e.g., [3,4,5,6]. One can estimate the change of the Perron and Fiedler values for a connected network when the weight of an edge is perturbed by analyzing relevant entries of the Perron and Fiedler vectors. This is helpful for identifying edges whose weight perturbation causes the largest change in $\rho$ and $\mu$. For such structured perturbation analysis to be reliable, it is also important that the Perron and Fiedler vectors are not overly sensitive to perturbations. Applications of the perturbation analysis include the identification of edges that are critical for the structural robustness of the network.
1. Estrada, E.: The Structure of Complex Networks: Theory and Applications, https://doi.org/10.1093/acprof:oso/9780199591756.001.0001, (2011).
2. Newman, M.E.J.: Networks: An Introduction, https://doi.org/10.1093/acprof:oso/9780199206650.001.0001, (2010).
3. Berman, A., Plemmons, R.J.: Nonnegative Matrices in the Mathematical Sciences, https://epubs.siam.org/doi/book/10.1137/1.9781611971262, (1994).
4. Bonacich, P.: Power and centrality: a family of measures, https://doi.org/10.1086/228631, (1987).
5. Fiedler, M.: Algebraic connectivity of graphs, https://eudml.org/doc/12723, (1973).
6. Fiedler, M.: A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory, https://eudml.org/doc/12900, (1975).
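A small SciPy sketch for extracting the two spectral objects discussed in the abstract above (the Perron pair of $A$ and the Fiedler pair of $L=D-A$); the edge-perturbation analysis itself is not included, the function name is ours, and shift-invert would be advisable for large graphs.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def perron_and_fiedler(A):
    """Perron value/vector of a symmetric nonnegative adjacency matrix A and
    Fiedler value/vector of the graph Laplacian L = D - A (A assumed connected)."""
    A = sp.csr_matrix(A).astype(float)
    rho, x = eigsh(A, k=1, which='LA')                 # largest eigenvalue of A
    perron = np.abs(x[:, 0])                           # Perron vector: one-signed for irreducible A
    Lap = sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A
    mu, v = eigsh(Lap, k=2, which='SA')                # two smallest Laplacian eigenvalues
    return rho[0], perron, mu[1], v[:, 1]              # Perron value/vector, Fiedler value/vector
```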
In this talk, we propose an approach to perform multi-label segmentation. Starting from an image, we construct the associated weighted graph and assign a label to a small number of pixels (seeds). The aim is to assign each unlabeled pixel to a label to identify different regions. Our algorithm uses the notion of communicability from complex networks theory to compute the ease for an unlabeled pixel to reach a labeled one. We can perform a good image segmentation by assigning each pixel to the label for which the largest communicability is calculated. Moreover, an extension to coloured images is proposed.
Kemeny's constant is a global measure of closeness to non-connectivity of a graph [4], [5]. Its variation, after removing an edge $e$, has been used in [1] to define the centrality of the edge $e$ in an undirected graph. This measure, suitably modified, has been applied to cut-edges and an explicit expression has been provided for certain classes of graphs in [2]. Here, we propose a sort of directional derivative of the Kemeny constant as a measure of the edge centrality. We prove the positivity of this measure, provide an explicit expression in terms of the inverse of the modified graph Laplacian, show that it can also be applied to cut-edges with no modification, relate it to the measure of [1] and [2], and provide algorithms for its computation. Advantages with respect to the previous measures are discussed, and an application to the analysis of the road map of Pisa is shown.
1. D. Altafini, D. Bini, V. Cutini, B. Meini, F. Poloni, An edge centrality measure based on the Kemeny constant, SIAM J. Matrix Anal. Appl. 44, 2023.
2. D. Bini, S. Kirkland, G. Latouche, B. Meini, Cut-edge centralities in an undirected graph, Numer. Algorithms, 2025.
3. D. Bini, B. Meini, F. Poloni, The derivative of Kemeny's constant as centrality measure in undirected graphs. In preparation.
4. J. Breen, E. Crisostomi, S. Kim, Kemeny's constant for a graph with bridges, Discrete Appl. Math., 2022.
5. E. Crisostomi, S. Kirkland, R. Shorten, A Google-like model of road network dynamics and its application to regulation and control, Internat. J. Control, 2011.
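For a quick numerical check, Kemeny's constant of an ergodic chain can be obtained from the eigenvalues of $P$ via the classical formula $K=\sum_{i\ge 2} 1/(1-\lambda_i)$ (here with the convention that omits the additive $+1$). The directional-derivative centrality of the talk, expressed through the modified graph Laplacian, is not reproduced in this sketch.

```python
import numpy as np

def kemeny_constant(P):
    """Kemeny's constant of an ergodic transition matrix P from its eigenvalues,
    dropping the Perron eigenvalue 1."""
    lam = np.linalg.eigvals(P)
    lam = np.delete(lam, np.argmin(np.abs(lam - 1.0)))   # remove the eigenvalue closest to 1
    return float(np.sum(1.0 / (1.0 - lam)).real)

# tiny example: a 3-state ergodic chain
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.4, 0.5]])
print(kemeny_constant(P))
```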
The nodes in a network can be grouped into equivalence classes according to their connection patterns with other nodes, whether in the same group or different ones. This process, known as role extraction or block modelling, aims to produce a simplified, coarse-grained representation of a large, complex network. The first step in this task is to define a similarity criterion between pairs of nodes. This is achieved by constructing a similarity matrix, the entries of which quantify the similarity between all node pairs. Node groups are then identified by clustering the rows and columns of this matrix.
This work presents a novel node similarity measure designed specifically for directed networks. Our approach is based on random walks that proceed in both directions along the links to unveil network structures that are not apparent in the original data. The resulting similarity matrix is derived as the solution to a generalised Stein matrix equation, for which we present a provably convergent iterative method. To enhance scalability, we also develop a low-rank approximation of this iteration, which significantly reduces memory and computational demands. Theoretical analysis and experimental results on synthetic and real-world directed networks suggest that our similarity matrix achieves superior performance when solving block modelling tasks in directed networks where node degrees exhibit significant heterogeneity.
This work was supported by the Italian Ministry of University and Research through the PRIN Project 20227PCCKZ “Low Rank Structures and Numerical Methods in Matrix and Tensor Computations and their Applications”. The author is also affiliated with the INdAM-GNCS.
Fundamental properties of weighted graphs, such as connectivity and centrality, are characterized by eigenvalues or eigenvectors of either the weighted adjacency matrix or of matrices related to it, such as the graph Laplacian. Here we study exemplary problems of robustness of partitioning/ranking, which can be viewed as matrix nearness problems that refer to eigenvalues or eigenvectors of structured symmetric real matrices. We propose algorithms that do eigenvalue/eigenvector optimization without eigenvalues/eigenvectors but work directly with the quadratic form (Rayleigh quotients).
This requires only matrix-vector products and vector inner products.
Our method is based on reformulating the problem and minimizing a suitable functional in the eigenvalues/eigenvectors.
After determining the gradient system associated with this functional, we introduce a low-rank projected system, suggested by the underlying low-rank structure of the extremizers of the functional.
Integrating this low-rank system requires only moderate computational effort and memory, as shown in some illustrative numerical examples.
This is inspired by joint works with M. Benzi (SNS, Pisa), with C. Lubich (Universität Tübingen), and with S. Sicilia (University of Mons).
We investigate rank revealing factorizations of $m \times n$ polynomial matrices $P(\lambda)$ into products of three, $P(\lambda) = L(\lambda) E(\lambda) R(\lambda)$, or two, $P(\lambda) = L(\lambda) R(\lambda)$, polynomial matrices. Among all possible factorizations of these types, we focus on those for which $L(\lambda)$ and/or $R(\lambda)$ is a minimal basis, since they have favorable properties from the point of view of data compression and allow us to relate easily the degree of $P(\lambda)$ with some degree properties of the factors. We call these factorizations "minimal rank factorizations". Motivated by the well-known fact that, generically, rank deficient polynomial matrices over the complex field do not have eigenvalues, we pay particular attention to the properties of the minimal rank factorizations of polynomial matrices without eigenvalues. We carefully analyze the degree properties of generic minimal rank factorizations in the set of complex $m \times n$ polynomial matrices with normal rank at most $r< \min \{m,n\}$ and degree at most $d$, and we prove that there are only $rd+1$ different classes of generic factorizations according to the degree properties of the factors and that all of them are of the form $L(\lambda) R(\lambda)$, where the degrees of the $r$ columns of $L(\lambda)$ differ at most by one, the degrees of the $r$ rows of $R(\lambda)$ differ at most by one, and, for each $i=1, \ldots, r$, the sum of the degrees of the $i$th column of $L(\lambda)$ and of the $i$th row of $R(\lambda)$ is equal to $d$. Finally, we study structured versions of these factorizations in the case of symmetric, Hermitian, skew-symmetric, and skew-Hermitian polynomial matrices.
We study the set $\Pi^*_{2^m}$, which is the set of polynomials that can be evaluated as a matrix polynomial (in the sense of [Higham, 2008]) using exactly $m$ matrix-matrix multiplications, with no restrictions on the number of scalar multiplications or matrix additions. Based on a tabular parameterization of this set we develop algebraic transformations that preserve the computed polynomial while simplifying the parameterization. These transformations lead to a new upper bound $\dim(\Pi^*_{2^m}) \leq m^2$, which numerical evidence suggests is sharp. Our results provide, to our knowledge, the first minimal parameterization of this type of polynomial evaluation scheme. Additionally, we use computational tools to investigate the inclusion problem: for a given $m$, determine the largest degree $d$ such that all degree-$d$ polynomials are in $\Pi^*_{2^m}$. We report new inclusion bounds and provide reproducible numerical experiments demonstrating, for example, that using only 6 multiplications we can compute the Taylor expansion of the exponential to degree 30, improving on known methods. Our approach is constructive in the sense that we provide a concrete iterative method with a given initialization to compute the matrix polynomial for a given polynomial.
Joint work with Gustaf Lorentzon, KTH Royal Institute of Technology
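As a classical point of comparison for such evaluation schemes, the Paterson-Stockmeyer method evaluates a degree-$d$ matrix polynomial with roughly $s-1+\lfloor d/s\rfloor$ matrix-matrix multiplications. The sketch below is this standard baseline (which generally needs more multiplications than the schemes of the talk), not the new parameterized schemes themselves; names and the small test are illustrative.

```python
import math
import numpy as np

def paterson_stockmeyer(coeffs, A, s):
    """Evaluate p(A) = sum_k coeffs[k] * A^k with s - 1 + floor(d/s) matrix
    multiplications by grouping p(A) = sum_j B_j (A^s)^j and Horner in A^s."""
    n, d = A.shape[0], len(coeffs) - 1
    pows = [np.eye(n), A]
    for _ in range(2, s + 1):                 # A^2, ..., A^s: s - 1 multiplications
        pows.append(pows[-1] @ A)
    r = d // s
    def block(j):                             # B_j = sum_i coeffs[j*s + i] * A^i
        top = s if j < r else d - r * s + 1
        return sum(coeffs[j * s + i] * pows[i] for i in range(top))
    P = block(r)
    for j in range(r - 1, -1, -1):            # Horner in A^s: floor(d/s) multiplications
        P = P @ pows[s] + block(j)
    return P

# degree-6 Taylor polynomial of the exponential at a random 4x4 matrix, with s = 3
A = np.random.default_rng(0).standard_normal((4, 4)) / 4
coeffs = [1 / math.factorial(k) for k in range(7)]
ref = sum(c * np.linalg.matrix_power(A, k) for k, c in enumerate(coeffs))
print(np.linalg.norm(paterson_stockmeyer(coeffs, A, 3) - ref))   # ~ machine precision
```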
We will analyze the generic change of the Weierstra\ss\ Canonical Form of regular complex structured matrix pencils under
generic structure-preserving additive low-rank perturbations. Several different symmetry structures are considered
and it is shown that, for most of the structures, the generic change in the eigenvalues is analogous
to the case of generic perturbations that ignore the structure.
However, for some odd/even and palindromic structures, there is a different behavior for the eigenvalues $0$ and $\infty$,
respectively $+1$ and $-1$. The differences arise in those cases where the parity of the partial multiplicities in the perturbed
matrix pencil provided by the generic behavior in the general structure-ignoring case is not in accordance with the restrictions
imposed by the structure. The new results extend results for the rank-$1$ and rank-$2$ cases that were obtained in
previous works for the case of special structure-preserving perturbations. As the main tool, we use decompositions of
matrix pencils with symmetry structure into sums of rank-one matrix pencils, as those allow a parametrization of the set of matrix pencils
with a given symmetry structure and a given rank.
Rank-structured matrices are matrices with low-rank subblocks, typically off-diagonal blocks. A significant benefit of rank structures in contrast with traditional structures is that the former type is well preserved by usual matrix operations like summation, multiplication, diagonal scaling and shifting, etc. Sometimes, even factorization and inversion preserve the structures. It is well known that traditional structured matrices like Toeplitz and Cauchy-like matrices are rank structured in Fourier space. However, the original matrices themselves may not be rank structured.
We show that, in many important applications, the relevant traditional structured matrices themselves are also rank structured. Examples include Toeplitz and circulant matrices appearing in some differential equation solutions and image processing methods. Those matrices are often scaled and/or added to other matrices, which destroys the circulant/Toeplitz structures. Thus, relevant computations like linear solutions are typically done by iterative solvers. However, for some challenging situations like differential equations with sharp variations and discontinuities in coefficients, relevant condition numbers are very large, which makes them challenging for iterative solvers. Here, we show that many circulant/Toeplitz matrices also have rank structures which are preserved under usual matrix operations. This makes it feasible to design fast direct solvers for the relevant matrices. Many circulant matrices are often generated by diagonal matrices (for the eigenvalues) related to some discretized functions. We prove that, for many different types of functions, including those with discontinuities, the circulant matrices have small off-diagonal numerical ranks. Thus, this paves the path to design fast and robust direct structured solvers for relevant linear systems. The study can also be extended to show rank structures for some other problems that are previously only proven in the asymptotic sense.
In this talk we analyze an algorithm for identifying spectral gaps of a real symmetric matrix $A$ by simultaneously approximating the traces of spectral projectors associated with multiple different spectral slices. Our method utilizes Hutchinson's stochastic trace estimator together with the Lanczos algorithm to approximate quadratic forms involving spectral projectors. This approach has been already used in the literature for related problems, such as estimating the spectral density of $A$ or the number of eigenvalues in an interval; however, the currently available theoretical results are somewhat lacking in the context of detecting spectral gaps.
In this work, we focus on finding all gaps that are wider than a specified threshold and examine the problem from this perspective. By thoroughly analyzing the Hutchinson and Lanczos components of the algorithm, we obtain error bounds that allow us to determine the numbers of Hutchinson's sample vectors and Lanczos iterations needed to ensure the detection of all gaps above the target width with high probability. In particular, we conclude that the most efficient strategy is to always use a single random sample vector for Hutchinson's estimator and concentrate all computational effort in the Lanczos algorithm. Our numerical experiments demonstrate the efficiency and reliability of this approach.
This is a joint work with Michele Benzi and Michele Rinelli.
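The estimator underlying this approach can be sketched in a few lines: Hutchinson probes combined with Lanczos (stochastic Lanczos quadrature) approximate the trace of the spectral projector of a slice, and scanning the slice boundary reveals gaps. The code below is a bare-bones version with assumed parameters; the gap-width criterion and the error analysis of the talk are omitted.

```python
import numpy as np

def estimate_eig_count(A, t, k=50, n_vec=1, seed=0):
    """Estimate #{eigenvalues of symmetric A that are <= t}, i.e. the trace of the
    spectral projector, via Hutchinson probes and k-step Lanczos quadrature."""
    rng = np.random.default_rng(seed)
    n, total = A.shape[0], 0.0
    for _ in range(n_vec):
        z = rng.choice([-1.0, 1.0], size=n)                    # Rademacher probe, ||z||^2 = n
        Q = np.zeros((n, k)); alpha = np.zeros(k); beta = np.zeros(k)
        q, q_prev, b = z / np.sqrt(n), np.zeros(n), 0.0
        for j in range(k):                                     # Lanczos, full reorthogonalization
            Q[:, j] = q
            w = A @ q - b * q_prev
            alpha[j] = q @ w
            w -= alpha[j] * q
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
            b = np.linalg.norm(w); beta[j] = b
            q_prev, q = q, w / b
        T = np.diag(alpha) + np.diag(beta[:k - 1], 1) + np.diag(beta[:k - 1], -1)
        theta, S = np.linalg.eigh(T)
        total += n * np.sum(S[0, theta <= t] ** 2)             # ||z||^2 * Gauss-quadrature weights
    return total / n_vec
```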
Finding the unique stabilizing solution $X = X^H$ of a large-scale continuous-time algebraic Riccati equation (CARE) $A^HX + XA + C^HC - XBB^HX = 0$ with a large, sparse $n \times n$ matrix $A$, an $n \times m$ matrix $B$ and a $p \times n$ matrix $C$ is of interest in a number of applications. Here, $B$ and $C^H$ are assumed to have full column and row rank, resp., with $m, p \ll n.$ The unique stabilizing solution $X = X^H$ is positive semidefinite and makes the closed-loop matrix $A-BB^HX$ stable. Even though $A$ is large and sparse, the solution $X$ will still be a dense matrix in general. But our assumptions on $B$ and $C$ often imply that the sought-after solution $X$ will have a low numerical rank (that is, its rank is $\ll n$). This allows for the construction of iterative methods that approximate $X$ with a series of low-rank matrices $X_i$ stored in low-rank factored form. That is, the Hermitian low-rank approximations $X_j$ to $X$ are of the form $X_j = Z_jY_jZ_j^H,$ where $Z_j$ is an $n \times k_j$ matrix with only few columns and $Y_j$ is a small square $k_j \times k_j$ Hermitian matrix. There are several methods which produce such a low-rank approximation.
In this talk, we consider a class of (block) rational Krylov-subspace-based projection methods for solving CAREs. The CARE is projected onto a block rational Krylov subspace $\mathcal{K}_j$ spanned by blocks of the form $(A^H - s_kI)^{-1}C^H$ for some shifts $s_k,\ k = 1, \ldots, j.$ The considered projections do not need to be orthogonal and are built from the matrices appearing in the block rational Arnoldi decomposition associated to $\mathcal{K}_j.$ The resulting projected Riccati equation is solved for the small square Hermitian $Y_j.$ Then the Hermitian low-rank approximation $X_j = Z_jY_jZ_j^H$ to $X$ is set up where the columns of $Z_j$ span $\mathcal{K}_j.$ The residual norm $\|R(X_j )\|_F$ can be computed efficiently via the norm of a readily available $2p \times 2p$ matrix. We suggest reducing the rank of the approximate solution $X_j$ even further by truncating small eigenvalues from $X_j.$ This truncated approximate solution can be interpreted as the solution of the Riccati residual projected to a subspace of $\mathcal{K}_j.$ This gives us a way to efficiently evaluate the norm of the resulting residual. Numerical examples are presented.
This work considers large-scale Lyapunov matrix equations of the form $AX+XA = cc^T$
with a symmetric positive definite matrix $A$ and a vector ${c}$. Motivated by the need for solving such equations in a broad range of applications, various numerical methods have been developed to compute a low-rank approximation to the solution
matrix $X$. In this work, we consider the Lanczos method, which has the distinct advantage that it only requires matrix-vector products with $A$, making it broadly applicable. At the same time, the Lanczos method may suffer from slow convergence when $A$ is ill-conditioned, leading to excessive memory requirements for storing the Krylov subspace basis generated by Lanczos. In this work, we alleviate this problem by developing a novel compression strategy for the Krylov subspace basis, which drastically reduces the memory requirement without impeding convergence. This is confirmed by both numerical experiments and convergence analysis.
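For orientation, a plain Krylov-Galerkin (Lanczos-type) Lyapunov solver fits in a few lines: project onto $\mathcal{K}_k(A,c)$, solve the small projected equation, and lift back a low-rank factor. The basis-compression strategy that is the contribution of this work is not included; the function name and the choice of a dense solver for the small equation are ours.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def krylov_lyapunov(A, c, k):
    """Galerkin projection of A X + X A = c c^T onto K_k(A, c).
    Returns Z with X ~= Z Z^T (rank <= k)."""
    n = c.shape[0]
    Q = np.zeros((n, k))
    Q[:, 0] = c / np.linalg.norm(c)
    for j in range(1, k):                                # orthonormal Krylov basis (Gram-Schmidt)
        w = A @ Q[:, j - 1]
        w -= Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    T = Q.T @ (A @ Q)                                    # projected matrix (tridiagonal up to rounding)
    rhs = np.zeros((k, k)); rhs[0, 0] = np.linalg.norm(c) ** 2
    Y = solve_continuous_lyapunov(T, rhs)                # small equation T Y + Y T = ||c||^2 e1 e1^T
    w_eig, V = np.linalg.eigh((Y + Y.T) / 2)
    return Q @ (V * np.sqrt(np.clip(w_eig, 0.0, None)))  # low-rank factor Z
```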
Large-scale multiterm linear matrix equations of the form
$$
A_1 X B_1 + \cdots + A_\ell X B_\ell = C,
$$ arise as the algebraic formulation in various application problems, such as discretized multivariable PDEs, stochastic, parameterized, or space-time PDEs and inverse problems. While effective methods exist for two-term equations ($\ell=2$), limited options are available for $\ell > 2$. Thus, efficiently solving multiterm matrix equations remains an open problem in numerical linear algebra.
In this talk, we present a new iterative scheme called the Subspace-Conjugate Gradient (Ss-CG) method for the efficient solution of large-scale multiterm linear matrix equations. This method relies on the matrix-oriented CG scheme but makes better use of the underlying (low-rank) matrix structure.
By imposing a peculiar orthogonality condition, the CG scalar coefficients for the iterative solution and the descent direction are replaced by low-dimensional matrices in Ss-CG. We employ truncation strategies to keep the computed matrix iterates at low rank, since limiting memory consumption becomes essential, especially when the number of terms $\ell$ is large. To this end, an additional ad-hoc randomized range-finding strategy is developed to further speed up computations.
The features of the Ss-CG method lead to remarkable computational gains, as demonstrated by several computational experiments. In particular, we compare Ss-CG with existing algorithms for solving Sylvester and Lyapunov multiterm equations to highlight its potential.
Polynomial Krylov subspace methods are among the most widely used methods for approximating $f(A)b$, the action of a matrix function on a vector, in particular when $A$ is large and sparse. When $A$ is Hermitian positive definite, the Lanczos method is the standard choice of Krylov method, and despite being very simplistic in nature, it often outperforms other, more sophisticated methods. In fact, one often observes that the error of the Lanczos method behaves almost exactly as the error of the best possible approximation from the Krylov space (which is in general not efficiently computable). However, theoretical guarantees for the deviation of the Lanczos error from the optimal error are mostly lacking so far (except for linear systems and a few other special cases). We prove a rigorous bound for this deviation when $f$ belongs to the important class of Stieltjes functions (which, e.g., includes inverse fractional powers as special cases) and a related class (which contains, e.g., the square root and the shifted logarithm), thus providing a near instance optimality guarantee.
The proof of this result crucially relies on exploiting low-rank update relations between the tridiagonal matrices generated by the Lanczos process.
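For completeness, the plain Lanczos approximation $f(A)b \approx \|b\|\, Q_k f(T_k) e_1$ that the result above concerns can be sketched as follows; full reorthogonalization and the eigendecomposition of the small matrix are convenience choices rather than part of the analysis.

```python
import numpy as np

def lanczos_fAb(A, b, f, k):
    """Lanczos approximation of f(A) b for symmetric (positive definite) A:
    build Q_k and T_k, then return ||b|| * Q_k f(T_k) e_1."""
    n = b.shape[0]
    Q = np.zeros((n, k))
    Q[:, 0] = b / np.linalg.norm(b)
    for j in range(1, k):                               # Krylov basis with reorthogonalization
        w = A @ Q[:, j - 1]
        w -= Q[:, :j] @ (Q[:, :j].T @ w)
        Q[:, j] = w / np.linalg.norm(w)
    T = Q.T @ (A @ Q)                                   # tridiagonal up to rounding errors
    theta, S = np.linalg.eigh(T)
    fT_e1 = S @ (f(theta) * S[0, :])                    # f(T_k) e_1 via eigendecomposition
    return np.linalg.norm(b) * (Q @ fT_e1)

# example with a Stieltjes function: f(t) = t^{-1/2}
# x = lanczos_fAb(A, b, lambda t: t ** -0.5, 40)
```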
In this talk, we investigate preconditioning strategies for the Riesz operator $-(-\Delta)^{\frac{\alpha}{2}}$, $\alpha \in (1, 2]$, commonly used in fractional models such as anomalous diffusion. For $\alpha$ close to 2, it is well known that the Laplacian itself serves as an effective preconditioner with linear computational cost. However, as $\alpha$ decreases toward 1, its performance deteriorates, requiring more specialized approaches.
Our approach consists of the following steps:
As a result, we define and analyze preconditioners for the Riesz operator, expressed as a sum of $ m$ inverses of shifted Laplacian matrices, where $ m$ corresponds to or is closely related to the number of quadrature points.
Using a Gauss-Jacobi quadrature approach, we show that while the resulting preconditioner performs comparably to the Laplacian for $\alpha$ close to 2, it yields significantly better results for $\alpha$ close to 1, even with a reasonably small $ m $, still retaining the same computational complexity as the Laplacian. To further enhance these results, we investigate the feasibility of using exponentially convergent quadrature rules, like sinc and Gauss-Laguerre, to ensure the number of inverses remains small while achieving numerical optimality.
We investigate explicit expressions for the error associated with the block rational Krylov approximation of matrix functions. Two formulas are proposed, both derived from characterizations of the block FOM residual. The first formula employs a block generalization of the residual polynomial, while the second leverages the block collinearity of the residuals. A posteriori error bounds based on the knowledge of spectral information of the argument are derived and tested on a set of examples. Notably, both error formulas and their corresponding upper bounds do not require the evaluation of contour integrals.
The canonical polyadic decomposition (CPD) of a higher-order tensor is its basic decomposition in a minimal sum of rank-1 terms. CPD plays a major role in data analysis and signal processing by allowing for unique recovery of underlying factors. However, it is well known that the low-rank approximation problem is ill-posed in general. That is, a tensor may fail to have a best rank-$R$ approximation when $R > 1$.
In this talk, we switch to the opposite point of view, namely that the CP approximation problem is well-posed, unless the tensor is “too far” from the tensors with rank at most $R$. This is more an engineering perspective: the problem is well-posed unless the data are too noisy for a proper analysis. We give deterministic bounds under which the existence of a best low-rank approximation is guaranteed [1, 4]. In the derivation of the bounds, a tensor is viewed as a linear but multi-matrix pencil. We develop a computational strategy in which from each pencil only numerically reliable information is obtained (i.e., invariant subspaces rather than eigenvectors if the latter cannot be estimated reliably), and the results combined. We propose new CP estimation schemes that rely on such eigenspace computations [2, 3].
1. Eric Evert, Lieven De Lathauwer, Guarantees for Existence of a Best Canonical Polyadic Approximation of a Noisy Low-Rank Tensor, SIAM Journal on Matrix Analysis and Applications, 43 (1), 2022, pp. 328–369, https://doi.org/10.1137/20M1381046.
2. Eric Evert, Michiel Vandecappelle, Lieven De Lathauwer, A Recursive Eigenspace Computation for the Canonical Polyadic Decomposition, SIAM Journal on Matrix Analysis and Applications, 43 (1), 2022, pp. 274–300, https://doi.org/10.1137/21M1423026.
3. Eric Evert, Michiel Vandecappelle, Lieven De Lathauwer, Canonical Polyadic Decomposition via the Generalized Schur Decomposition, IEEE Signal Processing Letters, 29, 2022, pp. 937–941, https://doi.org/10.1109/LSP.2022.3156870.
4. Eric Evert, Lieven De Lathauwer, On Best Low Rank Approximation of Positive Definite Tensors, SIAM Journal on Matrix Analysis and Applications, 44 (2), 2023, pp. 867–893, https://doi.org/10.1137/22M1494178.
The nonnegative rank of an $m \times n$ nonnegative matrix $N \geq 0$ is the smallest integer $r$ such that a nonnegative matrix factorization (NMF) $N=L R^T$ exists, where $L \in \mathbb{R}^{m \times r}, R \in \mathbb{R}^{n \times r}$ are both nonnegative [1]. Denoting the usual linear-algebraic rank by $\mathrm{rank}(N)$, clearly $r \geq \mathrm{rank}(N)$; the inequality may be strict when $\mathrm{rank}(N) \geq 3$. A related task is computing a rank-$r$ nonnegative matrix approximation (NMA) of $N$, i.e., a rank-$r$ NMF $N_r=LR^T$ that minimizes $\| N - N_r \|_F$. Computing a rank-$r$ NMA is NP-hard if $r \geq 3$, and solvable in $O(mn)$ flops if $r=1$; to the best of my knowledge, it is an open question whether the problem is NP-hard when $r=2$.
The borderline case of computing rank-$2$ NMAs is particularly interesting: it is more accessible to mathematical analysis because the two notions of rank coincide, and it can be solved efficiently in some special cases, but it remains computationally challenging for a general input. Furthermore, it arises in many applications, including medical imaging, network analysis, and signal processing. Currently, some of the best general-purpose algorithms are based on Alternating Nonnegative Least Squares (ANLS) [1], though this method can be sensitive to initialization and may converge to suboptimal solutions.
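As a point of reference for the discussion below, here is a minimal ANLS sketch for the rank-2 case, with the random initialization that the parametrization-based starting point is meant to replace; all sizes and the column-wise use of `scipy.optimize.nnls` are illustrative choices, not the setup of the talk.

```python
import numpy as np
from scipy.optimize import nnls

# Minimal ANLS sketch for a rank-2 NMA N ≈ L R^T: alternate nonnegative
# least-squares updates of R and L, each solved column/row-wise with nnls.
def anls_rank2(N, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    m, n = N.shape
    L = rng.random((m, 2))
    for _ in range(iters):
        R = np.array([nnls(L, N[:, j])[0] for j in range(n)])   # fix L, update R
        L = np.array([nnls(R, N[i, :])[0] for i in range(m)])   # fix R, update L
    return L, R

rng = np.random.default_rng(1)
N = rng.random((50, 2)) @ rng.random((2, 40))   # an exactly rank-2 nonnegative matrix
L, R = anls_rank2(N)
print(np.linalg.norm(N - L @ R.T) / np.linalg.norm(N))
```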
In my talk, I will present a full parametrization of all possible NMFs of a rank-$2$ nonnegative matrix $N$. To my knowledge, this result was previously only available [2] in the special case where $N$ is completely positive, i.e., it admits a symmetric NMF of the form $N=LL^T \in \mathbb{R}^{n \times n}, \ L \geq 0$. (Note: when $n \leq 4$, complete positivity is equivalent to being both nonnegative and positive semidefinite, but this fails when $n \geq 5$.)
Based on this parametrization, I propose a simple algorithm to compute a possibly suboptimal rank-$2$ NMA of an arbitrary input $N \geq 0$. Being very inexpensive, this method can serve as an effective initialization strategy for ANLS. Extensive numerical experiments illustrate that this approach can improve both the efficiency and the accuracy of NMA computations via ANLS. For small input sizes, we also compare with the guaranteed exact NMA computed by an $O(2^{mn})$ algorithm based on algebraic geometry [3]; the outcome of this experiment suggests that even the probability that the output of ANLS is the correct answer (up to noise due to floating-point computations) can be improved by using our newly proposed starting point.
The talk is based on joint research with Etna Lindy and Paul Van Dooren; a preprint is in preparation at the time of writing.
Tensor decompositions have become a central tool in data science, with applications in areas such as data analysis, signal processing, and machine learning. A key property of many tensor decompositions, such as the canonical polyadic decomposition, is identifiability, that is, the factors are unique, up to trivial scaling and permutation ambiguities. This allows one to recover the ground-truth sources that generated the data. The Tucker decomposition (TD) is a central and widely used tensor decomposition model. However, it is in general not identifiable. In this talk, we first introduce and motivate matrix and tensor decomposition models, with a focus on nonnegative matrix factorization (NMF) and nonnegative Tucker decomposition (NTD). Then, we study the identifiability of NTD. For order-2 tensors, that is, matrices, NTD is equivalent to a nonnegative tri-factorization model. By adapting and extending identifiability results for NMF, we provide uniqueness results for order-2 NTD. The conditions require the nonnegative matrix factors to have some degree of sparsity (namely, to satisfy the sufficiently scattered condition), while the core matrix only needs to be full rank. We extend the result to order-3 tensors, which requires the nonnegative matrix factors to satisfy the same sufficiently scattered condition, while the core tensor only needs to have some slices (or linear combinations of them) or unfoldings with full column rank. We also discuss how this result can be extended to higher-order tensors. Finally, we propose an efficient algorithm to compute these unique NTDs, which we illustrate on synthetic and real data.
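As a purely illustrative companion to the above, the snippet below computes an NTD of a synthetic nonnegative order-3 tensor with the tensorly library (assumed to be available; API details may vary between versions); the identifiability results of the talk concern when such a decomposition is essentially unique, which a random tensor of this kind will generally not be.

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import non_negative_tucker

# Illustrative nonnegative Tucker decomposition of a random nonnegative tensor;
# the multilinear ranks [4, 4, 4] are arbitrary.
rng = np.random.default_rng(0)
T = tl.tensor(rng.random((20, 20, 20)))

core, factors = non_negative_tucker(T, rank=[4, 4, 4], n_iter_max=200)
T_hat = tl.tucker_to_tensor((core, factors))
print(tl.norm(T - T_hat) / tl.norm(T))
```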
We introduce a mesh-free two-level hybrid Tucker tensor format for the approximation of multivariate functions, which combines product Chebyshev interpolation with an ALS-based Tucker decomposition of the tensor of Chebyshev coefficients. It allows one to avoid the expense of rank-structured approximation of function-related tensors defined on large spatial grids, while benefiting from the Tucker decomposition of the rather small core tensor of Chebyshev coefficients. This leads to nearly optimal Tucker rank parameters, close to those obtained with the well-established Tucker-ALS algorithm applied to the large grid-based tensors. The rank parameters inherited from the Tucker-ALS decomposition of the coefficient tensor can be much smaller than the polynomial degrees of the initial Chebyshev interpolant obtained via a function-independent basis set. Furthermore, the tensor-product Chebyshev polynomials discretized on a tensor grid lead to a low-rank two-level orthogonal algebraic Tucker tensor that approximates the initial function with controllable accuracy. It is shown that our techniques can be gainfully applied to the long-range part of the electrostatic potential of multi-particle systems approximated in the range-separated tensor format. Error and complexity estimates of the proposed methods are presented. We demonstrate the efficiency of the suggested method numerically on examples of the long-range components of multi-particle interaction potentials generated by the 3D Newton kernel for large bio-molecule systems and lattice-type compounds.
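A rough numpy rendering of the two-level idea, under illustrative choices (a trivariate test function, equal degrees in each variable, and a truncated HOSVD standing in for the Tucker-ALS compression used in the talk): first compute the tensor of Chebyshev coefficients, then compress that small coefficient tensor in Tucker format.

```python
import numpy as np

# Sketch of the two-level construction (illustrative sizes; HOSVD is used here in
# place of the Tucker-ALS compression of the Chebyshev coefficient tensor).
def cheb_points(n):
    # Chebyshev points of the first kind on [-1, 1]
    return np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n))

def hosvd(C, ranks):
    # truncated higher-order SVD: factor matrices from mode unfoldings, then a core
    factors = [np.linalg.svd(np.moveaxis(C, k, 0).reshape(C.shape[k], -1),
                             full_matrices=False)[0][:, :r]
               for k, r in enumerate(ranks)]
    G = C
    for k, U in enumerate(factors):
        G = np.moveaxis(np.tensordot(U.T, np.moveaxis(G, k, 0), axes=1), 0, k)
    return G, factors

n, ranks = 24, (6, 6, 6)
x = cheb_points(n)
V = np.polynomial.chebyshev.chebvander(x, n - 1)        # V[i, k] = T_k(x_i)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
F = np.exp(-np.sqrt(X**2 + Y**2 + Z**2 + 0.1))          # sample trivariate function

# level 1: Chebyshev coefficient tensor C such that F = C x_1 V x_2 V x_3 V
C = F
for k in range(3):
    C = np.moveaxis(np.linalg.solve(V, np.moveaxis(C, k, 0).reshape(n, -1)).reshape(n, n, n),
                    0, k)

# level 2: Tucker compression of the (already small) coefficient tensor
G, (U1, U2, U3) = hosvd(C, ranks)
C_hat = np.einsum("abc,ia,jb,kc->ijk", G, U1, U2, U3)
print(np.linalg.norm(C - C_hat) / np.linalg.norm(C))    # coefficient compression error
```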
Sketching techniques have gained popularity in numerical linear algebra to accelerate the solution of least squares problems. The so-called $\varepsilon$-subspace embedding property of a sketching matrix $S$ has been largely used to characterize the problem residual norm, since the procedure is no longer optimal in terms of the (classical) Frobenius or Euclidean norm. By building on available results on the SVD of the sketched matrix $SA$ derived by Gilbert, Park, and Wakin (Proc. of SPARS-2013), a novel decomposition of $A$, the $S^TS$-SVD, is proposed, which is exact with high probability, and in which the left singular vectors are orthonormal with respect to a (semi-)norm defined by the sketching matrix $S$. The new decomposition is less expensive to compute, while preserving the singular values with probabilistic confidence. In addition to presenting this new decomposition, in this talk we will show that the $S^TS$-SVD is the right tool for analyzing the quality of several sketching techniques in the literature, for which examples are reported. For instance, it is possible to simply bound the distance from (standard) orthogonality of sketched orthogonal matrices in state-of-the-art randomized QR algorithms. If time allows, as a further application, the classical problem of the nearest orthogonal matrix will be generalized to the new $S^TS$-orthogonality, and the $S^TS$-SVD will be adopted to solve it.
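The construction can be illustrated in a few lines (my own rendering of the description above, with an illustrative Gaussian sketch): take the thin SVD of $SA$ and pull the right factor back through $A$, which yields left vectors that are orthonormal in the semi-inner product induced by $S^TS$ and, with high probability, an exact decomposition of $A$.

```python
import numpy as np

# Sketch of the S^T S-SVD: A = W Σ V^T with (SW)^T (SW) = I, built from the SVD of SA.
rng = np.random.default_rng(0)
m, n, s = 2000, 50, 200
A = rng.standard_normal((m, n))
S = rng.standard_normal((s, m)) / np.sqrt(s)          # illustrative Gaussian sketch

W_s, sig, Vt = np.linalg.svd(S @ A, full_matrices=False)
W = A @ Vt.T @ np.diag(1.0 / sig)                     # "left singular vectors" of A

print(np.linalg.norm((S @ W).T @ (S @ W) - np.eye(n)))               # S^T S-orthonormality
print(np.linalg.norm(A - W @ np.diag(sig) @ Vt) / np.linalg.norm(A))  # exactness
```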
Gaussian processes (GPs) are a versatile tool in machine learning and computational science. In this talk we consider the case of multi-output Gaussian processes (MOGPs) and present low-rank approaches for efficiently computing the posterior mean of such a MOGP. Starting from low-rank spatio-temporal data, we consider a structured covariance function, assuming separability across space and time. This separability, in turn, gives a decomposition of the covariance matrix into a Kronecker product of individual covariance matrices. Incorporating the typical noise term into the model then requires the solution of a large-scale Stein equation to compute the posterior mean. For this, we propose efficient low-rank methods based on a combination of an LRPCG method with the Sylvester equation solver KPIK, adjusted for solving Stein equations. We test the developed method on real-world street network graphs, using graph filters as covariance matrices.
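To make the role of the Stein equation concrete, here is a small dense sketch for a separable space-time covariance: the posterior-mean system $(K_t \otimes K_s + \sigma^2 I)\,\mathrm{vec}(X) = \mathrm{vec}(Y)$ is the Stein equation $K_s X K_t + \sigma^2 X = Y$, solved below by eigendecompositions of the two factors. The talk's low-rank LRPCG/KPIK machinery replaces this dense solve at scale, and all kernels and sizes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
ns, nt, sigma2 = 300, 200, 1e-2

def rbf(x, ell):
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

xs, xt = np.linspace(0, 1, ns), np.linspace(0, 1, nt)
Ks, Kt = rbf(xs, 0.1), rbf(xt, 0.2)          # separable space / time covariances
Y = rng.standard_normal((ns, nt))            # observations on the space-time grid

# Ks X Kt + sigma2 X = Y, solved via eigendecompositions of Ks and Kt
ls, Us = np.linalg.eigh(Ks)
lt, Ut = np.linalg.eigh(Kt)
X = Us @ ((Us.T @ Y @ Ut) / (np.outer(ls, lt) + sigma2)) @ Ut.T

print(np.linalg.norm(Ks @ X @ Kt + sigma2 * X - Y) / np.linalg.norm(Y))
```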
We propose a non-intrusive model order reduction technique for stochastic differential equations with additive Gaussian noise. The method extends the operator inference framework and focuses on inferring reduced-order drift and diffusion coefficients by formulating and solving suitable least-squares problems based on observational data. Various subspace constructions based on the available data are compared.
We demonstrate that the reduced order model produced by the proposed non-intrusive approach closely approximates the intrusive ROM generated using proper orthogonal decomposition. Numerical experiments illustrate the performance by analyzing errors in expectation and covariance. This is joint work with Martin Nicolaus (Potsdam) and Martin Redmann (Rostock).
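For orientation, the sketch below shows the deterministic core of operator inference in its simplest linear form: simulate data, build a POD basis, and fit a reduced drift operator by least squares. The stochastic extension of the talk (inferring the diffusion as well, and the various subspace constructions) is not reproduced here; names and sizes are illustrative.

```python
import numpy as np

# Minimal operator-inference-style sketch, drift part only: snapshots of a linear
# system with a small noise term, a POD basis, and a least-squares fit of the
# reduced drift operator from finite-difference time derivatives.
rng = np.random.default_rng(0)
n, r, m, dt = 400, 10, 2000, 1e-3

A = -np.eye(n) + 0.1 * rng.standard_normal((n, n)) / np.sqrt(n)   # full-order drift
U = np.empty((n, m)); U[:, 0] = rng.standard_normal(n)
for k in range(m - 1):                                            # Euler-Maruyama-type data
    U[:, k + 1] = U[:, k] + dt * A @ U[:, k] + np.sqrt(dt) * 1e-3 * rng.standard_normal(n)

V = np.linalg.svd(U, full_matrices=False)[0][:, :r]               # POD basis
Z = V.T @ U                                                       # reduced snapshots
dZ = (Z[:, 1:] - Z[:, :-1]) / dt                                  # finite-difference derivatives
A_hat = np.linalg.lstsq(Z[:, :-1].T, dZ.T, rcond=None)[0].T       # inferred reduced drift

# compare with the intrusive reduced drift V^T A V
print(np.linalg.norm(A_hat - V.T @ A @ V) / np.linalg.norm(V.T @ A @ V))
```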
We discuss an evolving low-rank approximation of the vorticity solution of the Euler equations describing the flow of a two-dimensional incompressible ideal fluid on the sphere. This problem can be approximated by the so-called Zeitlin model, an isospectral Lie-Poisson flow on the Lie algebra of traceless skew-Hermitian matrices. We propose an approximation of Zeitlin's model based on a time-dependent low-rank factorization of the vorticity matrix and evolve a basis of eigenvectors according to the Euler equations. In particular, we show that the approximate flow remains isospectral and Lie-Poisson, and that the errors in the solution, in the Hamiltonian, and in the Casimir functions depend only on the approximation of the vorticity matrix at the initial time. After a suitable time discretization, the computational complexity of solving the approximate model is shown to scale quadratically with the order of the vorticity matrix, and linearly if a further approximation of the stream function is introduced.
We study the numerical solution of non-autonomous linear ODEs of the form
$$ \frac{d}{dt} \tilde{u}(t) = \tilde{A}(t)\tilde{u}(t), \quad \tilde{u}(a) = v,$$
where $\tilde{A}(t) \in \mathbb{C}^{N \times N}$ is analytic and often takes the form
$$ \tilde{A}(t) = \sum_{j=1}^k A_j f_j(t),$$
with large, sparse constant matrices $A_j$ and scalar analytic functions $f_j(t)$. Such equations commonly arise in quantum chemistry, particularly in spin dynamics.
In general, no closed-form solution exists, so we proposed a spectral method, called the **$\star$-approach**, which expands the solution in Legendre polynomials and approximates the coefficients by solving a structured matrix equation:
$$ X - F_1 X A_1^T - \dots - F_k X A_k^T = \phi v^T,$$
where the matrices $F_j$ encode the functions $f_j(t)$ in a structured algebra. This equation has favorable properties: (i) banded $F_j$, (ii) Kronecker-structured $A_j$, and (iii) a rank-1 right-hand side. These allow efficient iterative and low-rank methods.
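For very small sizes the structured matrix equation can be written out and solved directly by vectorization, which is a convenient way to check implementations; the sketch below does exactly that with random placeholder data, whereas the actual $\star$-method exploits the bandedness, the Kronecker structure, and the rank-1 right-hand side instead of forming Kronecker products.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, k = 30, 20, 2                          # illustrative sizes

def banded(sz, bw=2, scale=0.05):
    # random banded stand-in for the structured matrices F_j
    X = rng.standard_normal((sz, sz))
    return scale * np.triu(np.tril(X, bw), -bw)

F = [banded(M) for _ in range(k)]
A = [0.05 * rng.standard_normal((N, N)) for _ in range(k)]
phi, v = rng.standard_normal((M, 1)), rng.standard_normal((N, 1))

# vec(F X A^T) = (A ⊗ F) vec(X), so the equation becomes one linear system
Lmat = np.eye(M * N) - sum(np.kron(Aj, Fj) for Fj, Aj in zip(F, A))
X = np.linalg.solve(Lmat, (phi @ v.T).reshape(-1, order="F")).reshape((M, N), order="F")

R = X - sum(Fj @ X @ Aj.T for Fj, Aj in zip(F, A)) - phi @ v.T
print(np.linalg.norm(R))                     # residual of the matrix equation
```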
We implemented this strategy in the so-called $\star$-method, shown to be highly efficient for the generalized Rosen-Zener model, with linear scaling in system size and strong performance on large time intervals (collaboration with Christian Bonhomme, Sorbonne University, and Niel Van Buggenhout, Universidad Carlos III). Preliminary results also show promise in extending the method to fractional ODEs, where similar matrix equations arise and exhibit low-rank structure (collaboration with Fabio Durastante, University of Pisa, and Pierre-Louis Giscard, ULCO).
In this talk, I will describe several recent results that use low-rank approximation and machine learning together. There are two stories that I will discuss.
The CUR approximation of a matrix is attractive in that once the column and row indices are chosen, one can obtain a low-rank approximation without even looking at the whole matrix.
Its computation had previously been unattractive, often starting with the SVD to get reliable pivots. A remarkable paper by Osinsky shows that this is unnecessary, rendering CUR a practical tool in terms of computational efficiency and near-optimal accuracy.
In this talk I will first discuss the computational aspects of CUR, then highlight its power in practical applications, hopefully including parameter-dependent matrices, approximation of functions, model order reduction, and PCA.
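For readers who want to experiment, here is a small CUR sketch (emphatically not Osinsky's algorithm): column indices from a pivoted QR of a Gaussian sketch of $A$, row indices from a pivoted QR of the selected columns, and the cross approximation $A \approx A_{:,J}\, A_{I,J}^{+}\, A_{I,:}$, which indeed uses only the selected rows and columns once $I$ and $J$ are fixed.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
m, n, r = 1000, 800, 15
A = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # exactly rank r (illustrative)

k = r
SA = rng.standard_normal((4 * k, m)) @ A                        # Gaussian row sketch of A
J = qr(SA, pivoting=True, mode="economic")[2][:k]               # column indices
I = qr(A[:, J].T, pivoting=True, mode="economic")[2][:k]        # row indices
A_cur = A[:, J] @ np.linalg.pinv(A[I][:, J]) @ A[I, :]          # cross approximation

print(np.linalg.norm(A - A_cur) / np.linalg.norm(A))
```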
Multiple orthogonal polynomials (MOPs) are a generalization of orthogonal polynomials. They originally appeared in Hermite-Padé approximation (simultaneous rational approximation) and number theory. Recently, they turned out to be very useful in random matrix theory, combinatorics, and Markov chains.
Given $ r $ weight functions $w^{(i)}(x) \ge 0,$ with support $ \Delta^{(i)}, \; i=1,\ldots,r,$ on the real line, the sequence of MOPs $ \left\{p_n(x)\right\}_{ n=0}^{\infty} $ satisfies the following orthogonality conditions:
$$
\int_{\Delta^{(i)}} p_{n}(x) x^k w^{(i)}(x)dx = 0, \quad 0\le k \le n_{i}-1,
$$ with $ n= \sum_{i=1}^r n_i. $
A set of MOPs satisfies an $(r+2)$-term recurrence relation.
Without loss of generality, we focus on the case $r=2.$ Therefore, the set of monic MOPs satisfies a $4$-term recurrence relation
$$
xp_{i}(x) = a_i p_{i+1}(x) + b_i p_{i}(x) + c_i p_{i-1}(x) + d_i p_{i-2}(x), \quad a_i=1,\;\; i=0, \ldots, n-1,
$$ with $p_{-2}(x)=p_{-1}(x)=0.$
Writing this in matrix form, we obtain
$$
H_n
\left[\begin{array}{c}
p_0(x) \\ p_1 (x) \\ \vdots \\ p_{n-1}(x)
\end{array}\right] +
a_{n-1}
\left[\begin{array}{c}
0 \\ \vdots \\ 0 \\p_n (x)
\end{array}\right] =
x
\left[\begin{array}{c}
p_0(x) \\ p_1 (x) \\ \vdots \\ p_{n-1}(x)
\end{array}\right],
$$ where $ H_n $ is the $n\times n$ banded lower Hessenberg matrix with two subdiagonals and one superdiagonal:
$$
H_n := \left[\begin{array} {ccccccc} b_0 & a_0 & 0 & 0 & 0 & \ldots & 0 \\
c_1 & b_1 & a_1 & 0 & 0 & \ldots & 0 \\
d_2 & c_2 & b_2 & a_2 & 0 & \ldots & 0 \\
0 & d_3 & c_3 & b_3 & a_3 & \ddots & 0 \\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\
0 & \ldots & 0 & d_{n-2} & c_{n-2} & b_{n-2} & a_{n-2} \\
0 & \ldots & 0 & 0 & d_{n-1} & c_{n-1} & b_{n-1} \\
\end{array} \right].
$$ The zeros of $ p_n(x)$ are the eigenvalues of $ H_n, $ and they are the nodes of the simultaneous Gaussian quadrature rules associated with MOPs.
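The banded Hessenberg structure is easy to set up numerically; the toy sketch below builds $H_n$ from randomly generated (hence not genuine MOP) recurrence coefficients with the monic normalization $a_i = 1$ and computes its eigenvalues, i.e. the quadrature nodes.

```python
import numpy as np

# Toy construction of the banded lower Hessenberg matrix H_n from 4-term
# recurrence coefficients (random placeholders, only to show the structure).
n = 12
rng = np.random.default_rng(0)
a = np.ones(n)                          # monic normalization: a_i = 1
b = rng.standard_normal(n)
c = rng.uniform(0.5, 1.5, n)
d = rng.uniform(0.1, 0.5, n)

H = np.diag(b) + np.diag(a[:n-1], 1) + np.diag(c[1:], -1) + np.diag(d[2:], -2)
nodes = np.linalg.eigvals(H)            # zeros of p_n = simultaneous quadrature nodes
```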
Gegenbauer-Sobolev polynomials $p_n(x), \; n=0,1,\ldots,$ are orthogonal with respect to the Sobolev inner product
$$
\left( p,q \right)_S
=
\int_{-1}^{1} p(x)\, q(x)\, \omega^{(\alpha)}(x)\, dx + \gamma \int_{-1}^{1} p'(x)\, q'(x)\, \omega^{(\alpha-\frac{1}{2})}(x)\, dx, \quad \gamma >0,\;\alpha > -\frac{1}{2}.
$$
The monic Gegenbauer-Sobolev polynomial $p_n(x)$ of degree $n\ge 0$ is odd if $n$ is odd and even if $n$ is even (odd-even property).
The Althammer polynomial $p_n(x)$ of degree $n$ satisfies a long recurrence relation, whose coefficients can be arranged into a Hessenberg matrix of order $n$ whose eigenvalues are the zeros of $p_n(x)$.
Unfortunately, the eigenvalues of the aforementioned Hessenberg matrices are very ill-conditioned, and standard balancing procedures do not improve their condition numbers.
In this talk we describe fast and reliable algorithms for computing the zeros of the aforementioned polynomials.
The first step of the algorithm is to construct a balancing matrix that, applied to $H_n$, drastically reduces the conditioning of the Hessenberg eigenvalue problem, thereby allowing the associated simultaneous Gaussian quadrature rule to be computed reliably in floating-point arithmetic.
Specifically, this step consists of determining a diagonal matrix $\Delta_n= {\tt diag}(\delta_1,\ldots,\delta_n)$, balancing $ H_n,$ i.e., $ \hat{H}_n= \Delta_n^{-1} H_n\Delta_n,$ such that $ {\tt triu}(\hat{H}_n,-1) $ is symmetric, where $ {\tt triu}(A, k)$ denotes the elements on and above the $k$th diagonal of $ A.$
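Read literally, the symmetry of ${\tt triu}(\hat{H}_n,-1)$ amounts to making the superdiagonal of $\hat{H}_n$ equal to its first subdiagonal, which fixes the ratios $\delta_{i+1}/\delta_i$; the toy sketch below (again with placeholder coefficients, assumed to give positive ratios $c_{i+1}/a_i$) carries out this step.

```python
import numpy as np

# Toy balancing step: choose Δ_n = diag(δ_1, ..., δ_n) so that in
# Ĥ_n = Δ_n^{-1} H_n Δ_n the superdiagonal equals the first subdiagonal.
n = 12
rng = np.random.default_rng(1)
a, b = np.ones(n), rng.standard_normal(n)
c, d = rng.uniform(0.5, 1.5, n), rng.uniform(0.1, 0.5, n)
H = np.diag(b) + np.diag(a[:n-1], 1) + np.diag(c[1:], -1) + np.diag(d[2:], -2)

delta = np.ones(n)
for i in range(n - 1):
    delta[i + 1] = delta[i] * np.sqrt(c[i + 1] / a[i])   # makes the new a_i equal the new c_{i+1}
H_bal = np.diag(1.0 / delta) @ H @ np.diag(delta)

print(np.allclose(np.diag(H_bal, 1), np.diag(H_bal, -1)))
```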
The second step of the algorithm consists of computing the eigenvalue decomposition of $ \hat{H}_n$.
This can be done by applying the $ QR$ algorithm to $ \hat{H}_n$, requiring $ \mathcal{O} (n^3) $ floating point operations and $ \mathcal{O} (n^2) $ memory.
In order to reduce complexity and memory, for the second step we propose an alternative procedure based on a variant of the Aberth-Ehrlich method to compute the eigenvalues and associated left and right eigenvectors, requiring only $ {\mathcal O}(n)$ memory and $ {\mathcal O}(n^2)$ floating point operations.
References
Coussement, J., Van Assche, W.: Gaussian quadrature for multiple orthogonal polynomials, https://doi.org/10.1016/j.cam.2004.0 (2005)
Laudadio, T., Mastronardi, N., Van Dooren, P.: Computing Gaussian quadrature rules with high relative accuracy, https://doi.org/10.1007/s11075-022-01297-9 (2023)
Laudadio, T., Mastronardi, N., Van Dooren, P.: On computing modified moments for half-range Hermite weights, https://doi.org/10.1007/s11075-023-01615-9 (2024)
Laudadio, T., Mastronardi, N., Van Dooren, P.: Computational aspects of simultaneous Gaussian quadrature, https://doi.org/10.1007/s11075-024-01785-0 (2024)
Laudadio, T., Mastronardi, N., Van Assche, W., Van Dooren, P.: A Matlab package computing simultaneous Gaussian quadrature rules for multiple orthogonal polynomials, https://doi.org/10.1016/j.cam.2024.116109 (2024)
Laudadio, T., Mastronardi, N., Van Dooren, P.: Fast and reliable algorithms for computing the zeros of Althammer polynomials, https://doi.org/10.1016/j.apnum.2024.09.004 (2025)
Multiple orthogonal polynomials (MOPs) arise in various applications, including approximation theory, random matrix theory, and numerical integration. To define MOPs, one needs multiple inner products, leading to two types of MOPs, which are mutually biorthogonal. These MOPs satisfy recurrence relations, which can be linked to linear algebra via discretization. As a result, we obtain an inverse eigenvalue problem to retrieve the matrix of recurrences directly. We provide two algorithms for solving the inverse eigenvalue problem, one based on classical equivalence transformations and one linked to (block) Krylov methods. Numerical illustrations are provided to demonstrate the correctness of the approach and also to reveal the strong sensitivity of the problem.
The deep connection between Krylov methods, scalar orthogonal polynomials, and moment matrices is well established, especially for Hermitian and unitary matrices. In this talk, we consider the extension of this framework to block Krylov methods and orthogonal matrix polynomials.
By representing elements of a block Krylov subspace through matrix polynomials, we consider the matrix-valued inner product introduced in [1] that, under non-degeneracy, defines a linear isometry. This yields a one-to-one correspondence between orthonormal matrix polynomials and orthonormal bases of the block Krylov subspace.
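As background for this correspondence, a minimal block Arnoldi sketch is given below: it produces an orthonormal basis of the block Krylov subspace $\mathrm{span}\{B, AB, \ldots, A^{m-1}B\}$ together with the block Hessenberg matrix of recurrence coefficients; the test matrix and block size are illustrative, and breakdown/deflation handling is omitted.

```python
import numpy as np

# Minimal block Arnoldi with block Gram-Schmidt (no deflation handling).
def block_arnoldi(A, B, m):
    n, s = B.shape
    Q, _ = np.linalg.qr(B)
    blocks, H = [Q], np.zeros((s * (m + 1), s * m))
    for j in range(m):
        W = A @ blocks[j]
        for i, Qi in enumerate(blocks):
            H[i*s:(i+1)*s, j*s:(j+1)*s] = Qi.T @ W
            W = W - Qi @ H[i*s:(i+1)*s, j*s:(j+1)*s]
        Qn, R = np.linalg.qr(W)
        H[(j+1)*s:(j+2)*s, j*s:(j+1)*s] = R
        blocks.append(Qn)
    return np.hstack(blocks[:-1]), H

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200)); A = A + A.T      # e.g. a Hermitian test matrix
B = rng.standard_normal((200, 3))
V, H = block_arnoldi(A, B, 5)
print(np.linalg.norm(V.T @ V - np.eye(V.shape[1])))   # orthonormal-basis check
```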
For normal matrices, the block Gauss discretization [1,2] of such an inner product admits an integral representation familiar from the theory of orthogonal matrix polynomials. As an application, we extend a Szegő-type short recurrence [3] to the block Arnoldi algorithm applied to unitary matrices.
Finally, we analyze the structure of the block moment matrix and explore its relation to orthogonal matrix polynomials and recurrence coefficients via a Cholesky-like factorization.
[1] Lund, K.: A New Block Krylov Subspace Framework with Applications to Functions of Matrices Acting on Multiple Vectors, https://www.proquest.com/docview/2046209015?sourcetype=Dissertations%20&%20Theses, (2018)
[2] Zimmerling, J., Druskin, V., Simoncini, V.: Monotonicity, bounds and acceleration of block Gauss and Gauss-Radau quadrature for computing $B^T\phi(A)B$, https://doi.org/10.1007/s10915-025-02799-z, (2025)
[3] Sinap, A., Van Assche, W.: Orthogonal matrix polynomials and applications, https://doi.org/10.1016/0377-0427(95)00193-X, (1996)
Reduced rank extrapolation (RRE) [1,2] can be used to accelerate convergent vector sequences. These sequences are often generated by an iterative process to solve algebraic equations.
In this presentation, I discuss the generalization of this extrapolation framework to sequences of low-rank matrices generated by iterative methods for large-scale matrix equations, such as low-rank alternating directions implicit methods for Lyapunov and Riccati equations [3]. Special emphasis will also be given to inserting these RRE approaches into nonstationary iterations [4] for general linear matrix equations.
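For reference, the classical vector RRE step can be written in a few lines: with iterates $x_0,\ldots,x_k$ and first differences $U=[\Delta x_0,\ldots,\Delta x_{k-1}]$, one minimizes $\|U\gamma\|$ subject to $\sum_i \gamma_i = 1$ and extrapolates with the resulting weights. The sketch below uses a linear fixed-point iteration as data; the talk's setting replaces these vectors by low-rank matrix iterates.

```python
import numpy as np

def rre(X):
    # X has columns x_0, ..., x_k; returns sum_i gamma_i x_i with sum(gamma) = 1,
    # where gamma minimizes ||U gamma|| for U = [Δx_0, ..., Δx_{k-1}] (normal equations).
    U = np.diff(X, axis=1)
    y = np.linalg.solve(U.T @ U, np.ones(U.shape[1]))
    gamma = y / y.sum()
    return X[:, :-1] @ gamma

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n)); M *= 0.9 / np.linalg.norm(M, 2)   # contraction
b = rng.standard_normal(n)
x_star = np.linalg.solve(np.eye(n) - M, b)

X = np.zeros((n, 16)); X[:, 0] = rng.standard_normal(n)
for k in range(15):
    X[:, k + 1] = M @ X[:, k] + b                                  # x_{k+1} = M x_k + b

print(np.linalg.norm(X[:, -1] - x_star), np.linalg.norm(rre(X) - x_star))
```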
In this talk, we address the exact D-optimal experimental design problem by proposing an efficient algorithm that rapidly identifies the support of its continuous relaxation. Our method leverages a column generation framework to solve this continuous relaxation, where each restricted master problem is tackled using a primal-dual interior-point semidefinite programming solver. This enables fast and reliable detection of the design's support. The identified support is subsequently used to construct a feasible exact design that is provably close to optimal. We show that, for large-scale instances in which the number of regression points far exceeds the number of experiments, our approach achieves superior performance compared to existing branch-and-bound-based algorithms in both computational efficiency and solution quality.
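For contrast with the column-generation approach of the talk, the continuous relaxation itself can be solved (slowly) by the classical multiplicative weight update of Titterington, sketched below with random candidate regression points; this is not the proposed algorithm, only a baseline that shows how the support of the optimal continuous design emerges as most weights are driven toward zero.

```python
import numpy as np

# Continuous D-optimal design relaxation: maximize log det(sum_i w_i x_i x_i^T)
# over the simplex, via the classical multiplicative (Titterington) update.
rng = np.random.default_rng(0)
N, p = 500, 6
X = rng.standard_normal((N, p))            # candidate regression points (rows)

w = np.full(N, 1.0 / N)
for _ in range(500):
    M = X.T @ (w[:, None] * X)                       # information matrix
    var = np.einsum("ij,jk,ik->i", X, np.linalg.inv(M), X)
    w *= var / p                                     # keeps sum(w) = 1 exactly
support = np.flatnonzero(w > 1e-6)                   # support of the (near-)optimal design
print(len(support), np.linalg.slogdet(M)[1])
```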
After a long struggle, the concept of the matrix geometric mean has come to be understood as a barycenter with respect to a non-Euclidean geometry on the set of positive definite matrices.
This result may seem surprising at first, but it appears more natural once one observes that, for positive real numbers, a substantial equivalence exists between geodesics of Riemannian geometries and associative means.
We exploit this connection to provide ways to extend means from positive numbers to positive definite matrices and investigate the properties of the different extensions.
Some case studies, such as the definitions of power means of matrices, are presented.
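A concrete instance of the geodesic-mean correspondence: for two positive definite matrices, the weighted geometric mean $A \#_t B = A^{1/2}(A^{-1/2} B A^{-1/2})^{t} A^{1/2}$ traces the Riemannian geodesic from $A$ to $B$, and for $1\times 1$ matrices it reduces to the scalar weighted geometric mean $a^{1-t}b^{t}$. The sketch below is a plain numpy rendering of this standard formula, not of the new extensions discussed in the talk.

```python
import numpy as np

def spd_power(A, t):
    # A^t for a symmetric positive definite matrix, via the eigendecomposition
    lam, Q = np.linalg.eigh(A)
    return (Q * lam**t) @ Q.T

def geodesic(A, B, t):
    # A #_t B = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}
    Ah, Ahi = spd_power(A, 0.5), spd_power(A, -0.5)
    return Ah @ spd_power(Ahi @ B @ Ahi, t) @ Ah

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5)); A = X @ X.T + 5 * np.eye(5)
X = rng.standard_normal((5, 5)); B = X @ X.T + 5 * np.eye(5)

G = geodesic(A, B, 0.5)                               # the matrix geometric mean
print(np.linalg.norm(G @ np.linalg.inv(A) @ G - B))   # Riccati characterization of the mean
```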
This is a joint work with Nadia Chouaieb and Maher Moakher of the University of Tunis El Manar, Tunisia.