9 Advanced Topics (Synthetic and Augmented Data, Green ML, Symbolic Regression, Mapping Functions, Ensembles, and AutoML)

On the other hand, omitting things makes the model simpler. But the real world is complex.

Nick Huntington-Klein

Synopsis

This chapter covers some of the other great ML ideas that we did not get to in previous chapters. It is in this chapter that an engineer can truly appreciate the power of ML. I will present the techniques and ideas behind the following concepts: generating synthetic and augmented data, Green ML, symbolic regression, mapping functions, ensembles, and AutoML. Personally, I lumped these into one chapter because I believe that these techniques can be, and are likely to be, handy in practical or research problems for senior/graduate students and practicing engineers.

9.1 Synthetic and Augmented Data

In reality, given a healthy dataset, the majority of accepted ML algorithms are likely to work well. These algorithms have been verified against many datasets and applications. When applied to a new problem, they may need a bit of tuning to arrive at decent performance. Beyond that, an algorithm can only perform as well as the data allows! As we have seen, our data can make or break ML models.

9.1.1 Big Ideas

Simply put, to improve model performance affordably, we can either modify the structure of the algorithm or work on our data. However, if you remember ...
