The ADA research group welcomes Richard Middelkoop as a new Bachelor student

Besides working in the ADA research group, Richard studies Computer Science & Economics, where his main interest lies in applying code to real-life problems. He therefore found the ADA group an excellent environment in which to work on his bachelor project.

Richard’s project is concerned with Parallel Algorithm Portfolios and with adding this new functionality to the Sparkle platform. The problems that this functionality can solve will include decision problems, optimisation problems with a single solution, and optimisation problems with several possible solutions. All being well, the project will showcase a practical application of Parallel Portfolios.

AutoML adoption in software engineering for machine learning

This blog post was originally published on automl.org.

By Koen van der Blom, Holger Hoos, Alex Serban, Joost Visser

In our global survey among teams that build ML applications, we found ample room for increased adoption of AutoML techniques. While AutoML is adopted at least partially by more than 70% of teams in research labs and tech companies, for teams in non-tech and government organisations adoption is still below 50%. Full adoption remains limited to under 10% in research and tech, and to less than 5% in non-tech and government.

Software engineering for machine learning

With the inclusion of ML techniques in software, the risks and requirements of the software also change. In turn, the best ways to maintain and develop software with ML components are different from those of traditional software engineering (TSE). We call this new field software engineering for machine learning (SE4ML).

Based on a study of the academic and “grey” literature (the latter comprises blog articles, presentation slides and white papers), we identified best practices for SE4ML, and composed a recommended reading list on the topic. These practices include the use of AutoML techniques to improve the use of ML components in software. All practices are described in our practice catalogue. We encourage readers to have a look and to send us their suggestions for additions.

Figure 1: Engineering best practices for machine learning.

Having a solid overview of the best practices, we set up a survey to find out the extent to which these SE4ML practices have already been adopted. We asked teams working on software with machine learning components to what degree they followed each practice in their work. In the resulting data we saw that tech companies have the highest adoption of the new ML-related best practices, while research labs have the highest adoption of AutoML. Overall, more practices are adopted by teams that are larger and more experienced, with practices originating from TSE seeing slightly lower adoption than the new ML-specific practices.

Effects of best practice adoption

Besides the adoption of best practices, we also investigated the effects of the practices. This resulted in insights into which specific practices relate to which desired outcomes. For instance, we found that software quality is influenced most by continuous integration, automated regression tests, and static analysis. On the other hand, agility is primarily affected by automated model deployment, having a cohesive multi-disciplinary team, and enabling parallel training experiments. With these insights into the effectiveness of different practices, we hope to increase practice adoption and improve the quality of software with ML components.

A brief overview of additional survey results can be found in our piece titled “The 2020 State of Engineering Practices for Machine Learning”, and in our technical publication “Adoption and Effects of Software Engineering Best Practices in Machine Learning”.

How about automated ML?

A recent line of research in ML focuses on automated machine learning (AutoML), an area that aims to automate (parts of) the construction and use of ML pipelines, enabling a wider audience to make effective and responsible use of ML without needing to become experts in the field. We took a closer look at our survey results and found that, compared to non-tech companies and governmental organisations, research labs and tech companies are ahead in the adoption of AutoML practices (see Figure 2).

Figure 2: AutoML adoption by organisation type.

While overall AutoML adoption is similar across continents, non-profit and low-tech organisations see higher adoption in Europe than in North America. We also found that teams with multiple years of experience are more likely to adopt AutoML techniques. Finally, across the board, there is significant room to increase adoption of AutoML, but this is especially true for non-tech companies and governmental organisations.

More detailed information on our findings regarding AutoML adoption can be found in our piece titled “The 2020 State of AutoML Adoption”.

Looking forward

Based on feedback from and interviews with participants, we recently revised our survey to learn more. Specifically, in our latest version of the survey, we ask in more detail about responsible use of ML and about the adoption of different AutoML techniques such as automated feature selection and neural architecture search. You can help us by taking the 10-minute survey and by spreading the word. If you want to stay up to date with our progress, keep an eye on our website.

A fairly long time ago in an office actually rather close by…

Koen van der Blom joined the ADA research group as a post-doctoral researcher. Since March 2019 he has been working with Holger Hoos in the area of meta-algorithmics. One of the things he works on is a tool called Sparkle. Sparkle aims to make meta-algorithmic techniques such as algorithm selection and configuration easier to use for a wide audience. Besides this, he is also interested in performance analysis and prediction. What can be said about the expected performance on a new instance, based on previously seen instances? And how can you compare performance in a fair way?

Before this, during his PhD, Koen worked on multi-objective mixed-integer evolutionary algorithms applied to early-stage building design under the supervision of Michael Emmerich, Hèrm Hofmeyer, and Thomas Bäck. He continues to be interested in these problems, particularly when it comes to optimisation in mixed-integer spaces. Who knows, perhaps combining aspects of the old and the new will yet lead to other exciting work.