Continual learning (CL) refers to the problem of acquiring new information as new data, new classes, or new tasks are presented over time. A key goal of CL is to maintain performance on previously learned classes or tasks while effectively leveraging previously acquired knowledge to perform well on the new data, classes, or tasks. Three CL settings are commonly distinguished: (i) task-incremental learning, where the goal is to incrementally learn a set of clearly distinguishable tasks; (ii) domain-incremental learning, a more challenging variant of task-incremental learning in which the underlying data distributions, not just the data or tasks, change over time; and (iii) class-incremental learning (CIL), where the goal is to incrementally learn to distinguish between a growing number of classes. In many applications, it is also necessary to consider the regression counterpart of CIL.
Existing CL methods have several significant limitations: (i) CL methods have been evaluated primarily on limited computer vision benchmark data, on which some very simple baseline CL methods have been shown to work well; (ii) different CL methods often make different assumptions, making it hard to rigorously compare them; (iii) the surprisingly good performance of simple baselines on commonly used benchmarks calls into question the adequacy of the benchmarks, the current state of progress on CL, or both; (iv) barring a few exceptions, there are few theoretical guarantees about the performance of most CL algorithms; and (v) CL methods generally fail to explicitly account for the costs of storing and processing old data, storing the learned model, or adapting a learned model to a new task. Hence, there is an urgent need for rigorous formulations of CL problems, CL algorithms with strong performance guarantees, and better benchmarks for rigorous evaluation. Our recent work has led to:
- A simple, fast algorithm inspired by adaptive resonance theory (ART) (Ashtekar et al., 2023). To cope with the curse of dimensionality and avoid catastrophic forgetting, we apply incremental principal component analysis (IPCA) to the model’s previously learned weights (see the first sketch after this list). Experiments show that this approach approximates the performance achieved using static PCA and is competitive with continual deep learning methods.
- A new method for efficient continual learning of sparse models (EsaCL) that can automatically prune redundant parameters without adversely impacting the model’s predictive power, circumventing the need for retraining (Ren et al., 2024). Based on a theoretical analysis of the loss landscape, we design a sharpness-informed directional pruning (SDP) strategy guided by the sharpness of the loss function with respect to the model parameters (see the second sketch after this list). SDP yields a sparse model with minimal loss of predictive accuracy, accelerating the learning of sparse models at each stage. To further accelerate model updates, we use a data selection strategy that identifies the instances most informative for estimating the loss landscape. The results of our experiments show that EsaCL achieves performance competitive with state-of-the-art methods.
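The IPCA step can be sketched with scikit-learn's IncrementalPCA. This is a minimal sketch of the general idea of maintaining a low-dimensional subspace over streamed weight vectors; the subspace dimension, batch sizes, and the use of random data below are illustrative assumptions, not the exact procedure of Ashtekar et al. (2023).

```python
# Illustrative sketch (not the exact algorithm of Ashtekar et al., 2023):
# maintain a low-dimensional subspace of previously learned weight vectors
# with incremental PCA, updating it as new tasks arrive.
import numpy as np
from sklearn.decomposition import IncrementalPCA

n_components = 16  # assumed subspace dimension (illustrative)
ipca = IncrementalPCA(n_components=n_components)

def update_subspace(weight_batch: np.ndarray) -> None:
    """Fold a batch of learned weight vectors (n_samples x n_features) into
    the running principal subspace without revisiting earlier batches."""
    ipca.partial_fit(weight_batch)

def project(weights: np.ndarray) -> np.ndarray:
    """Project weight vectors onto the current low-dimensional subspace."""
    return ipca.transform(weights)

# Example: stream batches of 64-dimensional weight vectors from successive tasks.
rng = np.random.default_rng(0)
for task in range(3):
    update_subspace(rng.normal(size=(32, 64)))   # 32 vectors per task (illustrative)
print(project(rng.normal(size=(1, 64))).shape)   # -> (1, 16)
```

Because `partial_fit` updates the subspace from each new batch alone, old weight vectors need not be stored, which is what allows the approach to approximate static PCA at a fraction of the memory cost.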
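The second sketch illustrates, in PyTorch, the general idea behind sharpness-informed pruning: take the gradient at a slightly perturbed point (as in sharpness-aware minimization), score each weight by how much the loss is expected to change if it is removed, and zero out the lowest-scoring weights. The perturbation scale `rho`, the |w · g| scoring rule, and the per-tensor pruning ratio are illustrative assumptions, not the exact SDP procedure of EsaCL (Ren et al., 2024).

```python
# Illustrative sketch of sharpness-informed pruning (not EsaCL's SDP itself).
import torch
import torch.nn as nn

def sharpness_scores(model: nn.Module, loss_fn, x, y, rho: float = 0.05):
    """Score each weight by |w * grad|, with the gradient taken at a slightly
    perturbed point so weights in sharp regions of the loss score higher."""
    loss_fn(model(x), y).backward()
    perturbation = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None:
                continue
            eps = rho * p.grad / (p.grad.norm() + 1e-12)  # ascent direction
            perturbation[name] = eps
            p.add_(eps)
    model.zero_grad()
    loss_fn(model(x), y).backward()  # gradient at the perturbed point
    scores = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if p.grad is None or name not in perturbation:
                continue
            p.sub_(perturbation[name])         # restore original weights
            scores[name] = (p * p.grad).abs()  # proxy for loss change if pruned
    model.zero_grad()
    return scores

def prune_lowest(model: nn.Module, scores, ratio: float = 0.5):
    """Zero out the lowest-scoring fraction of weights in each tensor."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name not in scores:
                continue
            k = max(1, int(ratio * p.numel()))
            threshold = scores[name].flatten().kthvalue(k).values
            p.mul_((scores[name] > threshold).to(p.dtype))

# Example usage on a tiny model (illustrative):
model = nn.Linear(10, 2)
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
prune_lowest(model, sharpness_scores(model, nn.CrossEntropyLoss(), x, y))
```

The single extra forward/backward pass at the perturbed point is what ties the pruning decision to the local sharpness of the loss, which is the intuition behind pruning without a separate retraining phase.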