Informal Statistical Physics Seminar

Date
Tue, Mar 26, 2019 1:15 pm - 2:15 pm
Location
IPST 1116 Conference Room (Bldg. #085)

Description

Speaker: Dr. Grant Rotskoff, New York University, Courant Institute

Title: Neural Networks as Interacting Particle Systems: Understanding Global Convergence of Parameter Optimization Dynamics

Abstract: The performance of neural networks on high-dimensional data distributions suggests that it may be possible to parameterize a representation of a target high-dimensional function with controllably small errors, potentially outperforming standard interpolation methods. We demonstrate, both theoretically and numerically, that this is indeed the case. We map the parameters of a neural network to a system of particles relaxing with an interaction potential determined by the loss function. This mapping gives rise to a deterministic partial differential equation that governs the parameter evolution under gradient descent dynamics. We also show that in the limit where the number of parameters n is large, the landscape of the mean-squared error becomes convex and the representation error in the function scales like n^{-1}. In this limit, we prove a dynamical variant of the universal approximation theorem showing that the optimal representation can be attained by stochastic gradient descent, the algorithm ubiquitously used for parameter optimization in machine learning. This conceptual framework can be leveraged to develop algorithms that accelerate optimization using non-local transport. I will conclude by showing that using neuron birth/death processes in parameter optimization guarantees global convergence and provides a substantial acceleration in practice.
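
As a rough illustration of the particle picture described in the abstract (not the speaker's code), the sketch below trains a two-layer ReLU network in the mean-field scaling f_n(x) = (1/n) sum_i c_i * relu(a_i . x + b_i) with plain stochastic gradient descent, so that each hidden unit theta_i = (c_i, a_i, b_i) plays the role of one particle. The target function, network width, learning rate, and data distribution are illustrative assumptions.

# Hypothetical sketch: two-layer ReLU network in the mean-field (particle) scaling
import numpy as np

rng = np.random.default_rng(0)

n = 512      # number of particles (hidden units); assumption for illustration
d = 2        # input dimension
lr = 0.1     # learning rate; assumption
steps = 5000

# Particles: outer weight c_i, inner weights a_i, bias b_i
c = rng.normal(size=n)
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def target(x):
    # Illustrative target function to be represented
    return np.sin(x @ np.ones(d))

def model(x):
    # f_n(x) = (1/n) * sum_i c_i * relu(a_i . x + b_i)
    pre = x @ A.T + b
    return np.maximum(pre, 0.0) @ c / n

for t in range(steps):
    x = rng.normal(size=(64, d))          # minibatch from an assumed data distribution
    y = target(x)
    pre = x @ A.T + b                     # (batch, n)
    act = np.maximum(pre, 0.0)
    resid = act @ c / n - y               # residual f_n(x) - y, shape (batch,)

    # Gradients of the mean-squared error with respect to each particle
    grad_c = (resid[:, None] * act).mean(axis=0) / n
    grad_pre = resid[:, None] * (pre > 0) * c[None, :] / n
    grad_A = np.einsum('bi,bj->ij', grad_pre, x) / x.shape[0]
    grad_b = grad_pre.mean(axis=0)

    # SGD step; the rate is scaled by n so each particle feels an O(1) force,
    # matching the mean-field scaling of the parameter dynamics
    c -= lr * n * grad_c
    A -= lr * n * grad_A
    b -= lr * n * grad_b

x_test = rng.normal(size=(2000, d))
print("test MSE:", np.mean((model(x_test) - target(x_test)) ** 2))

In this scaling the empirical distribution of the particles (c_i, a_i, b_i) is the object whose evolution the abstract's partial differential equation describes, and increasing n is what drives the n^{-1} decay of the representation error.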