Continual learning (CL) refers to the problem of acquiring new information as new data, new classes, or new tasks are presented over time. A key goal of CL is to maintain performance on previously learned classes or tasks while effectively leveraging previously acquired knowledge to perform well on the new data, classes, or tasks. Three CL settings are commonly distinguished: (i) task-incremental learning, where the goal is to incrementally learn a set of clearly distinguishable tasks; (ii) domain-incremental learning, a more challenging variant of task-incremental learning in which the underlying data distributions, not just the data or tasks, change over time; and (iii) class-incremental learning (CIL), where the goal is to incrementally learn to distinguish between a growing number of classes. In many applications, it is also necessary to consider the regression counterpart of CIL.
Existing CL methods have several significant limitations: (i) CL methods have been evaluated primarily on limited computer vision benchmark data, on which some very simple baseline CL methods have been shown to work well; (ii) different CL methods often make different assumptions, making it hard to rigorously compare them; (iii) the surprisingly good performance of simple baselines on commonly used benchmarks calls into question the adequacy of the benchmarks, the current state of progress on CL, or both; (iv) barring a few exceptions, there are few theoretical guarantees about the performance of most CL algorithms; and (v) CL methods generally fail to explicitly account for the costs of storing and processing old data, storing the learned model, or adapting a learned model to a new task. Hence, there is an urgent need for rigorous formulations of CL problems, CL algorithms with strong performance guarantees, and better benchmarks for rigorous evaluation. Our recent work has led to:
- A simple, fast algorithm inspired by adaptive resonance theory (ART) (Ashtekar et al., 2023). To cope with the curse of dimensionality and avoid catastrophic forgetting, we apply incremental principal component analysis (IPCA) to the model’s previously learned weights (see the first sketch after this list). Experiments show that this approach approximates the performance achieved using static PCA and is competitive with continual deep learning methods.
- A new method for efficient continual learning of sparse models (EsaCL) that can automatically prune redundant parameters without adversely impacting the model’s predictive power, circumventing the need for retraining (Ren et al., 2024). Based on a theoretical analysis of the loss landscape, we design a sharpness-informed directional pruning (SDP) strategy guided by the sharpness of the loss function with respect to the model parameters (see the second sketch after this list). SDP yields a sparse model with minimal loss of predictive accuracy, accelerating the learning of sparse models at each stage. To further accelerate model updates, we use a data selection strategy that identifies the instances most informative for estimating the loss landscape. The results of our experiments show that EsaCL achieves performance competitive with state-of-the-art methods.
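The IPCA step can be sketched with scikit-learn's IncrementalPCA. This is a minimal sketch of the general idea of maintaining a low-dimensional subspace over streamed weight vectors; the subspace dimension, batch sizes, and the use of random data below are illustrative assumptions, not the exact procedure of Ashtekar et al. (2023).

```python
# Illustrative sketch (not the exact algorithm of Ashtekar et al., 2023):
# maintain a low-dimensional subspace of previously learned weight vectors
# with incremental PCA, updating it as new tasks arrive.
import numpy as np
from sklearn.decomposition import IncrementalPCA

n_components = 16  # assumed subspace dimension (illustrative)
ipca = IncrementalPCA(n_components=n_components)

def update_subspace(weight_batch: np.ndarray) -> None:
    """Fold a batch of learned weight vectors (n_samples x n_features) into
    the running principal subspace without revisiting earlier batches."""
    ipca.partial_fit(weight_batch)

def project(weights: np.ndarray) -> np.ndarray:
    """Project weight vectors onto the current low-dimensional subspace."""
    return ipca.transform(weights)

# Example: stream batches of 64-dimensional weight vectors from successive tasks.
rng = np.random.default_rng(0)
for task in range(3):
    update_subspace(rng.normal(size=(32, 64)))   # 32 vectors per task (illustrative)
print(project(rng.normal(size=(1, 64))).shape)   # -> (1, 16)
```

Because `partial_fit` updates the subspace from each new batch alone, old weight vectors need not be stored, which is what allows the approach to approximate static PCA at a fraction of the memory cost.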
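The second sketch illustrates, in PyTorch, the general idea behind sharpness-informed pruning: take the gradient at a slightly perturbed point (as in sharpness-aware minimization), score each weight by how much the loss is expected to change if it is removed, and zero out the lowest-scoring weights. The perturbation scale `rho`, the |w · g| scoring rule, and the per-tensor pruning ratio are illustrative assumptions, not the exact SDP procedure of EsaCL (Ren et al., 2024).

```python
# Illustrative sketch of sharpness-informed pruning (not EsaCL's SDP itself).
import torch
import torch.nn as nn

def sharpness_scores(model: nn.Module, loss_fn, x, y, rho: float = 0.05):
    """Score each weight by |w * grad|, with the gradient taken at a slightly
    perturbed point so weights in sharp regions of the loss score higher."""
    loss_fn(model(x), y).backward()
    perturbation = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            eps = rho * p.grad / (p.grad.norm() + 1e-12)  # ascent direction
            perturbation[name] = eps
            p.add_(eps)
    model.zero_grad()
    loss_fn(model(x), y).backward()  # gradient at the perturbed point
    scores = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None or name not in perturbation:
                continue
            p.sub_(perturbation[name])         # restore original weights
            scores[name] = (p * p.grad).abs()  # proxy for loss change if pruned
    model.zero_grad()
    return scores

def prune_lowest(model: nn.Module, scores, ratio: float = 0.5):
    """Zero out the lowest-scoring fraction of weights in each tensor."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name not in scores:
                continue
            k = max(1, int(ratio * p.numel()))
            threshold = scores[name].flatten().kthvalue(k).values
            p.mul_((scores[name] > threshold).to(p.dtype))

# Example usage on a tiny model (illustrative):
model = nn.Linear(10, 2)
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
prune_lowest(model, sharpness_scores(model, nn.CrossEntropyLoss(), x, y))
```

The single extra forward/backward pass at the perturbed point is what ties the pruning decision to the local sharpness of the loss, which is the intuition behind pruning without a separate retraining phase.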