Skip to Content
Software Engineering for Data Scientists
book

Software Engineering for Data Scientists

by Catherine Nelson
April 2024
Intermediate to advanced
260 pages
6h 22m
English
O'Reilly Media, Inc.
Content preview from Software Engineering for Data Scientists

Chapter 1. What Is Good Code?

This book aims to help you write better code. But first, what makes code “good”? There are a number of ways to think about this: the best code could be the code that runs fastest. Or it could be easiest to read. Another possible definition is that good code is easy to maintain. That is, if the project changes, it should be easy to go back to the code and change it to reflect the new requirements. The requirements for your code will change frequently because of updates to the business problem you’re solving, new research directions, or updates elsewhere in the codebase.

In addition, your code shouldn’t be complex, and it shouldn’t break if it gets an unexpected input. It should be easy to add a simple new feature to your code; if this is hard it suggests your code is not well written. In this chapter, I’ll introduce aspects of good code and show examples for each. I’ll divide these into five categories: simplicity, modularity, readability, performance, and robustness.

Why Good Code Matters

Good code is especially important when your data science code integrates with a larger system. This could be putting a machine learning model into production, writing packages for wider distribution, or building tools for other data scientists. It’s most useful for larger codebases that will be run repeatedly. As your project grows in size and complexity, the value of good code will increase.

Sometimes, the code you write will be a one-off, a prototype that needs ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

Data Science: The Hard Parts

Data Science: The Hard Parts

Daniel Vaughan
Software Engineering at Google

Software Engineering at Google

Titus Winters, Tom Manshreck, Hyrum Wright

Publisher Resources

ISBN: 9781098136192Errata Page