SPASS - Probability, Stochastic Analysis and Statistics Seminar

A two-scale complexity measure for stochastic neural networks

by Massimiliano Datres (Università di Trento)

Europe/Rome
Sala Seminari (Dipartimento di Matematica)

Description
Over-parametrized deep learning models achieve outstanding performance on complex tasks such as image classification, object detection, and natural language processing. Despite the risk of overfitting, these models generalize remarkably well after training. Defining appropriate complexity measures is therefore crucial for understanding and quantifying the generalization capabilities of deep learning models. In this talk, I will introduce a new complexity measure, the two-scale effective dimension (2sED), which is a box-covering dimension with respect to the metric induced by the Fisher information matrix of the parametric model. I will then show how the 2sED can be used to derive a generalization bound. Furthermore, I will present an approximation of the 2sED for Markovian models, called the lower 2sED, which can be computed sequentially, layer by layer, at a lower computational cost. Finally, I will present experimental evidence that the post-training performance of a given parametric model correlates with both the 2sED and the lower 2sED.
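To give a concrete feel for the object underlying the 2sED, here is a minimal NumPy sketch (not the speaker's construction) of the Fisher information matrix for a toy logistic model; for a Bernoulli likelihood p(y=1|x, θ) = σ(θ·x) the Fisher matrix has the closed form F = E_x[p(1−p) x xᵀ], which we estimate empirically over a sample of inputs. The model, data, and function names are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def empirical_fisher(theta, X):
    """Empirical Fisher information matrix of a logistic model
    p(y=1|x, theta) = sigmoid(theta . x), using the Bernoulli
    closed form F = E_x[ p(1-p) x x^T ] averaged over the sample.
    This is an illustrative toy, not the 2sED construction itself."""
    p = sigmoid(X @ theta)          # model probabilities per sample
    w = p * (1.0 - p)               # Bernoulli variance term p(1-p)
    return (X * w[:, None]).T @ X / X.shape[0]

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))       # 500 inputs, 3 parameters
theta = np.array([1.0, -0.5, 0.2])  # arbitrary parameter point
F = empirical_fisher(theta, X)
# F is symmetric positive semi-definite; effective-dimension-style
# complexity measures are built from the geometry (spectrum) of
# metrics like this one over the parameter space.
```

The matrix F induces a Riemannian metric on parameter space; a box-covering dimension with respect to that metric, rather than the raw parameter count, is the kind of quantity the 2sED measures.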