From Software Engineer to AI Data Scientist
Published by Pearson
For software engineers exploring a career move into Data Science, GenAI, LLMs
- If you’re a software engineer or have some coding experience, this is a fast-track class targeted specifically for you
- This is an intensive, interactive and immersive way to learn about Data Science in 5 hours
- The class will aim to make a career transition into data science – should you choose the path!
The dizzying rate of progress in GenAI has taken us all by surprise. Even industry experts are repeatedly caught off guard by Frontier models that break records. The field of Data Science is an extraordinarily exciting place to be.
The good news? Thanks to all the open-source frameworks, models and datasets, the barrier to entry is super low. You can be training your own LLM literally in minutes. And that’s exactly what we’ll do in the first section of the class.
Even better news? As a software engineer, much of this will come very naturally to you. There are dark arts in Data Science that we will cover in the middle of the class, including a small amount of math, and the practice of “hyper-parameter tuning” that feels a lot like a fancy name for trial-and-error. This can all be explained using code – and that’s a language you’re already fluent in.
In the later sections we’ll return to the LLM we created at the start – it should now make complete sense. We’ll improve it with some techniques, including multi-modality, RAG, agentic workflows and more. The class wraps up by exploring the career trajectory for a Software Engineer into this world and by giving you the resources to decide if this is the path for you – and if so, what to do next.
What you’ll learn and how you can apply it
- The foundations of Machine Learning, Deep Learning, and Transformers
- How to solve problems as a Data Scientist using Frontier models and open-source models
- The tools and techniques of a Data Scientist
This live event is for you because...
- You are a current software engineer
- You’re intrigued by the power of LLMs and their potential for business impact and you’d like to know more
- You’re open to exploring a career in Data Science
Prerequisites
- Intermediate Python knowledge, as a software engineer
- Awareness of Jupyter Notebooks
Course Set-up
- Access to Github repo
- Not required but recommended: If you have time before the class, try following the README instructions in the repo to set up a local environment. There are troubleshooting tips if needed.
- Not required, but recommended – and full instructions will be included in the README of the github repo well in advance of the class:
- A Google colab account to follow along with the colab examples
- An OpenAI API key to run the examples yourself
- A Hugging Face account
Recommended Preparation
- Watch: Next Level Python by Arianne Dee
Recommended Follow-up
- Watch: Machine Learning Foundations by Jon Krohn
- Attend: Hugging Face in 4 Hours by Sinan Ozdemir
- Attend: Retrieval-Augmented Generation (RAG) and Agents Using LLMs by Sinan Ozdemir
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Introduction (10 mins)
- Welcome, intros
- Agenda & Goals – what you will leave with
- Setting you up for success – GitHub & resources
- Testing out Live Event features!
Segment 1: TEASER - Straight in at the deep end! (30 mins)
- A brief history of DS
- The dizzying rise of LLMs
- 5 ways to use LLMs
- Directly calling a Frontier Model
- Training your own LLM, literally in minutes
- Coding multi-modality, LLM battles, and more
Q&A (10 minutes)
Break (10 minutes)
Segment 2: A crash course in Machine Learning (60 mins)
The basics
- Motivations, definitions
- Supervised, unsupervised, and reinforcement learning
- Examples of classification, regression, clustering
The foundations
- Training vs. inference
- Training, validation, testing
- Generalizing and overfitting
Coding time
- NumPy, SciPy, pandas, motplatlib, scikit-learn
- Putting it into practice
- How it feels to generalize – and to overfit!
Building on the foundations
- Traditional NLP
- Advanced techniques like SVM, random forests, kNN
- Emsembling and the Netflix Prize
- State of the Art prior to Deep Learning
Q&A (10 minutes)
Break (10 minutes)
Segment 3: From Machine Learning to Deep Learning to Transformers (60 mins)
The basics: the Neural Network
- A neuron
- Weights, Biases, and Activations
- A layer of neurons
Foundations: a tiny bit of math
- The forward pass and backward pass
- Enter the matrix
- The backprop trick
Coding time
- PyTorch, TensorFlow
- A simple neural network
- Training and comparing with the traditional approach
Building on the foundations
- Encoding vs. Generating
- Vector Embeddings
- Model weights and hyper-parameters
What happened next
- The trifecta: data, algorithms, compute (with GPUs and TPUs)
- The Transformer architecture
- LLMs and the Generative Pre-trained Transformer
Q&A (10 minutes)
Break (10 minutes)
Segment 4: Time to be a Data Scientist! (50 mins)
Tools of the trade
- A tour of Hugging Face
- Working with Google Colab
- Using Weights & Biases
- The joy of Gradio
Techniques
- Multi-shot prompting
- Multi-modality
- RAG
- Agentic workflows
- Fine-tuning
Coding time
- Revisiting the example from the start
- Training
- Carrying out Research & Development
- The results
Segment 5: Career Trajectories into Data Science (20 mins)
Quick class recap
Career paths
- Transitioning from Software Engineer to Data Science
- Data Engineers, Data Analytics, ML Engineers, ML Ops
- Resources for exploring career paths
Tools and further learning
- A fun extra for those who stayed till the end!!
- Takeaways and survey
Final Q&A (10 mins)
Your Instructor
Ed Donner
Ed Donner is a technology leader and repeat founder of AI startups. He’s the co-founder and CTO of Nebula.io, the platform to source, understand, engage and manage talent, using Generative AI and other forms of machine learning. Nebula matches people and roles with greater accuracy and speed than previously imaginable — no keywords required. Nebula’s long-term goal is to help people discover their potential and pursue their reason for being. Previously, Ed was the founder and CEO of AI startup untapt, an Accenture Fintech Innovation Lab company, acquired in 2020. Before that, Ed was a Managing Director at JPMorgan Chase, leading a team of 300 software engineers in Risk Technology across 3 continents, after a 15-year technology career on Wall Street. Ed holds a patent for a Deep Learning matching engine issued in 2023, and an MA in Physics from Oxford.