What can we do when the performance of an AI system, such as a Recommender System, degrades due to changes in the data that it is processing? This is the central question we seek to answer in our recent project: “Hyper-Parameter Optimization for Latent Spaces in Dynamic Recommender Systems”
When an AI system is deployed, it is usually assumed that the data the system is processing is similar to the data it has been trained on. However, there are many scenarios in which this assumption is likely to be violated. Think, for example, of a recommendation system. These are systems designed to predict user preferences for items, such as movies, songs or books, and are commonly used by technology companies like Amazon, Netflix or Spotify to recommend new items or content to their users. These systems are a good example to illustrate the relevance of the problem studied in this work: User preferences are based on their taste and we all know that taste most definitely changes over time, just as the collection of items is constantly updated, with new items added or old items removed. Overall, we are dealing with a dynamic, rather than a static system, which means we have to rethink the way a learning algorithm is deployed.
Recommender systems are often based on collaborative filtering algorithms, and these algorithms have associated hyper-parameters, such as embedding size or learning rate, affecting the behavior and performance of the system. When the incoming data changes, for example because the collection of items is updated, there might be a new, unknown set of hyper-parameters that works more effectively on this new data than the previous setting. Our approach therefore involves i) detecting when system performance degrades (concept drift) and ii) initialise re-configuration of the matrix factorization algorithm. For this, we employ an adaption of the Nelder-Mead algorithm, which uses a set of heuristics to produce a number of potentially good configurations, from which the best one is selected and deployed.
To optimally control the characteristics of the stream, we also built a data generator, which produces realistic streams of preference data and injects drift into the data by means of expanding or shrinking the latent feature space or, in other words, change the user preferences.
We consider multiple baselines, including an online adaption of the well-known configurator SMAC as well as a carefully tuned static model, and yield promising results. In future work, we want to apply this method to situations in which the parameter space is more complex, for example, when using deep learning models requiring multiple components and specific tuning of the underlying components, such as convolutional architectures, recurrent layers or even attention models based on transformers.
This project is joint work by Matthias König and Holger H. Hoos (ADA group at LIACS, Leiden University), together with Bruno Veloso (Portucalense University, Portugal), Luciano Caroprese (ICAR-CNR, Italy), Sónia Teixeira (University of Porto, Portugal & LIAAD-INESC TEC, Portugal), Giuseppe Manco (ICAR-CNR, Italy) and João Gama (University of Porto, Portugal & LIAAD-INESC TEC, Portugal). It has been executed in the context of the HumanE-AI-Net, one of the two ICT-48 Networks of Centres of Excellence recently established by the European commission, in which ADA plays a key role. The accompanying paper will be presented at ECML PKDD 2021 and can be found here, along with the code and other materials.