The Quick Python Book, Third Edition

by Naomi Ceder, David Fugate
May 2018
Beginner
472 pages
15h 3m
English
Manning Publications
Content preview from The Quick Python Book, Third Edition

Chapter 21. Processing data files

This chapter covers

  • Using ETL (extract-transform-load)
  • Reading text data files (plain text and CSV)
  • Reading spreadsheet files
  • Normalizing, cleaning, and sorting data
  • Writing data files

Much of the data available is contained in text files. This data can range from unstructured text, such as a corpus of tweets or literary texts, to more structured data in which each row is a record and the fields are delimited by a special character, such as a comma, a tab, or a pipe (|). Text files can be huge; a data set can be spread over tens or even hundreds of files, and the data in it can be incomplete or horribly dirty. With all the variations, it’s almost inevitable that you’ll need to read and use data from text ...
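
As a rough illustration of the structured case described above, where each row is a record and the fields are separated by a special character, the following minimal sketch reads pipe-delimited text with Python's standard `csv` module. The sample data, the field names, and the helper `read_delimited` are hypothetical and are not taken from the book.

```python
import csv
import io

# Hypothetical sample data: a header row followed by two pipe-delimited records.
sample = "name|city|temp\nAda|London|18\nLin|Taipei|29\n"


def read_delimited(text_stream, delimiter=","):
    """Return a list of dicts, one per record, keyed by the header row."""
    reader = csv.DictReader(text_stream, delimiter=delimiter)
    return [dict(row) for row in reader]


# io.StringIO stands in for an open text file so the sketch is self-contained.
records = read_delimited(io.StringIO(sample), delimiter="|")
for record in records:
    print(record["name"], record["city"], record["temp"])
```

Passing `delimiter="\t"` or leaving the default comma would handle the tab- and comma-separated variants in the same way. Every field comes back as a string, so numeric columns such as `temp` still need converting before use.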



Publisher Resources

ISBN: 9781617294037