From Software Engineer to AI Data Scientist

Published by Pearson

Beginner to intermediate

For software engineers exploring a career move into Data Science, GenAI, and LLMs

If you’re a software engineer or have some coding experience, this is a fast-track class targeted specifically for you
This is an intensive, interactive and immersive way to learn about Data Science in 5 hours
The class aims to help you make a career transition into data science, should you choose that path!

The dizzying rate of progress in GenAI has taken us all by surprise. Even industry experts are repeatedly caught off guard by Frontier models that break records. The field of Data Science is an extraordinarily exciting place to be.

The good news? Thanks to all the open-source frameworks, models and datasets, the barrier to entry is super low. You can be training your own LLM literally in minutes. And that’s exactly what we’ll do in the first section of the class.

Even better news? As a software engineer, much of this will come very naturally to you. There are dark arts in Data Science that we will cover in the middle of the class, including a small amount of math, and the practice of “hyper-parameter tuning” that feels a lot like a fancy name for trial-and-error. This can all be explained using code – and that’s a language you’re already fluent in.

In the later sections we’ll return to the LLM we created at the start – it should now make complete sense. We’ll improve it with some techniques, including multi-modality, RAG, agentic workflows and more. The class wraps up by exploring the career trajectory for a Software Engineer into this world and by giving you the resources to decide if this is the path for you – and if so, what to do next.

What you’ll learn and how you can apply it

The foundations of Machine Learning, Deep Learning, and Transformers
How to solve problems as a Data Scientist using Frontier models and open-source models
The tools and techniques of a Data Scientist

This live event is for you because...

You are a current software engineer
You’re intrigued by the power of LLMs and their potential for business impact and you’d like to know more
You’re open to exploring a career in Data Science

Prerequisites

Intermediate Python knowledge, as a software engineer
Awareness of Jupyter Notebooks

Course Set-up

Access to the GitHub repo
Not required but recommended: If you have time before the class, try following the README instructions in the repo to set up a local environment. There are troubleshooting tips if needed.
Also not required but recommended (full instructions will be included in the README of the GitHub repo well in advance of the class):
- A Google Colab account to follow along with the Colab examples
- An OpenAI API key to run the examples yourself
- A Hugging Face account

Recommended Preparation

Watch: Next Level Python by Arianne Dee

Recommended Follow-up

Watch: Machine Learning Foundations by Jon Krohn
Attend: Hugging Face in 4 Hours by Sinan Ozdemir
Attend: Retrieval-Augmented Generation (RAG) and Agents Using LLMs by Sinan Ozdemir

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Introduction (10 mins)

Welcome, intros
Agenda & Goals – what you will leave with
Setting you up for success – GitHub & resources
Testing out Live Event features!

Segment 1: TEASER - Straight in at the deep end! (30 mins)

A brief history of DS
The dizzying rise of LLMs
5 ways to use LLMs
Directly calling a Frontier Model
Training your own LLM, literally in minutes
Coding multi-modality, LLM battles, and more

Q&A (10 minutes)

Break (10 minutes)

Segment 2: A crash course in Machine Learning (60 mins)

The basics

Motivations, definitions
Supervised, unsupervised, and reinforcement learning
Examples of classification, regression, clustering

The foundations

Training vs. inference
Training, validation, testing
Generalizing and overfitting

Coding time

NumPy, SciPy, pandas, matplotlib, scikit-learn
Putting it into practice
How it feels to generalize – and to overfit!
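
To give a flavor of what overfitting looks like in code, here is a minimal sketch that is not taken from the class materials. It uses only the Python standard library rather than the libraries listed above, and the dataset, the 20% label noise, and the helper names (make_data, knn_predict, accuracy) are all assumptions made for illustration. A 1-nearest-neighbor model memorizes its noisy training data perfectly, while a larger k smooths over the noise.

```python
import random

def make_data(n, rng):
    """Points on [0, 1]. True label is 1 when x > 0.5, with 20% of labels flipped as noise."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:  # hypothetical noise level, chosen for illustration
            label = 1 - label
        data.append((x, label))
    return data

def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return int(votes * 2 > k)

def accuracy(train, data, k):
    """Fraction of points in data that a k-NN model built on train labels correctly."""
    correct = sum(knn_predict(train, x, k) == label for x, label in data)
    return correct / len(data)

rng = random.Random(0)
train = make_data(200, rng)
test = make_data(200, rng)

for k in (1, 15):
    print(f"k={k}: train={accuracy(train, train, k):.2f}, test={accuracy(train, test, k):.2f}")
```

With k=1 every training point is its own nearest neighbor, so training accuracy is perfect even though a fifth of the labels are wrong. Comparing the train and test columns for each k is the habit to build: a large gap between them is the usual sign of overfitting.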

Building on the foundations

Traditional NLP
Advanced techniques like SVM, random forests, kNN
Ensembling and the Netflix Prize
State of the Art prior to Deep Learning

Q&A (10 minutes)

Break (10 minutes)

Segment 3: From Machine Learning to Deep Learning to Transformers (60 mins)

The basics: the Neural Network

A neuron
Weights, Biases, and Activations
A layer of neurons
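
As a rough preview of these three ideas, the sketch below expresses a neuron and a layer in plain Python. It is an illustrative assumption rather than the code used in class: the sigmoid activation, the function names, and the example weights are all made up for this sketch.

```python
import math

def sigmoid(z):
    """A common activation function that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """One neuron: weighted sum of the inputs, plus a bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

def layer(inputs, weight_rows, biases, activation=sigmoid):
    """A layer is several neurons that all read the same inputs."""
    return [neuron(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

# Two inputs feeding a layer of two neurons (weights and biases chosen by hand).
inputs = [1.0, 2.0]
outputs = layer(inputs, [[0.5, -0.25], [1.0, 1.0]], [0.0, -3.0])
print(outputs)
```

Both example neurons happen to compute a weighted sum of exactly zero, so each outputs sigmoid(0) = 0.5. In a real network the weights and biases are learned during training rather than set by hand.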

Foundations: a tiny bit of math

The forward pass and backward pass
Enter the matrix
The backprop trick
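
The forward and backward passes can be illustrated on the smallest possible model: a single weight. This is a hand-written sketch under stated assumptions (a squared-error loss, a made-up learning rate, and one training example), not the class's own example, and frameworks such as PyTorch compute these gradients automatically.

```python
def forward(w, x):
    """Forward pass: the model's prediction for input x."""
    return w * x

def backward(w, x, target):
    """Backward pass: gradient of the loss (prediction - target)^2 with respect to w.

    Chain rule: dloss/dw = dloss/dprediction * dprediction/dw.
    """
    prediction = forward(w, x)
    dloss_dpred = 2.0 * (prediction - target)
    dpred_dw = x
    return dloss_dpred * dpred_dw

w = 0.0                # initial weight
learning_rate = 0.1    # hypothetical hyper-parameter, chosen for illustration
x, target = 1.0, 3.0   # a single training example: we want w * 1.0 to equal 3.0

for step in range(100):
    w -= learning_rate * backward(w, x, target)  # step against the gradient

print(w)
```

Each step shrinks the remaining error by a constant factor, so w moves steadily toward 3. Backpropagation applies the same chain rule through many layers at once.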

Coding time

PyTorch, TensorFlow
A simple neural network
Training and comparing with the traditional approach

Building on the foundations

Encoding vs. Generating
Vector Embeddings
Model weights and hyper-parameters

What happened next

The trifecta: data, algorithms, compute (with GPUs and TPUs)
The Transformer architecture
LLMs and the Generative Pre-trained Transformer

Q&A (10 minutes)

Break (10 minutes)

Segment 4: Time to be a Data Scientist! (50 mins)

Tools of the trade

A tour of Hugging Face
Working with Google Colab
Using Weights & Biases
The joy of Gradio

Techniques

Multi-shot prompting
Multi-modality
RAG
Agentic workflows
Fine-tuning

Coding time

Revisiting the example from the start
Training
Carrying out Research & Development
The results

Segment 5: Career Trajectories into Data Science (20 mins)

Quick class recap

Career paths

Transitioning from Software Engineer to Data Science
Data Engineers, Data Analytics, ML Engineers, ML Ops
Resources for exploring career paths

Tools and further learning

A fun extra for those who stayed till the end!!
Takeaways and survey

Final Q&A (10 mins)

Your Instructor

Ed Donner
Ed Donner is a technology leader and repeat founder of AI startups. He’s the co-founder and CTO of Nebula.io, the platform to source, understand, engage and manage talent, using Generative AI and other forms of machine learning. Nebula matches people and roles with greater accuracy and speed than previously imaginable — no keywords required. Nebula’s long-term goal is to help people discover their potential and pursue their reason for being. Previously, Ed was the founder and CEO of AI startup untapt, an Accenture Fintech Innovation Lab company, acquired in 2020. Before that, Ed was a Managing Director at JPMorgan Chase, leading a team of 300 software engineers in Risk Technology across 3 continents, after a 15-year technology career on Wall Street. Ed holds a patent for a Deep Learning matching engine issued in 2023, and an MA in Physics from Oxford.


Skill covered

Data Science

