Bioinformatics Data Skills

by Vince Buffalo
July 2015
Intermediate to advanced
538 pages
15h 29m
English
O'Reilly Media, Inc.

Chapter 7. Unix Data Tools

We often forget how science and engineering function. Ideas come from previous exploration more often than from lightning strokes.

John W. Tukey

In Chapter 3, we learned the basics of the Unix shell: using streams, redirecting output, pipes, and working with processes. These core concepts allow us not only to use the shell to run command-line bioinformatics tools, but also to leverage Unix as a modular work environment for working with bioinformatics data. In this chapter, we’ll see how we can combine the Unix shell with command-line data tools to explore and manipulate data quickly.

Unix Data Tools and the Unix One-Liner Approach: Lessons from Programming Pearls

Understanding how to use Unix data tools in bioinformatics isn’t only about learning what each tool does; it’s about mastering the practice of connecting tools together to create programs from Unix pipelines. By connecting data tools with pipes, we can construct programs that parse, manipulate, and summarize data. Unix pipelines can be developed in shell scripts or as “one-liners”: tiny programs built by connecting Unix tools with pipes directly on the shell. Whether in a script or as a one-liner, building more complex programs from small, modular tools capitalizes on the design and philosophy of Unix (discussed in “Why Do We Use Unix in Bioinformatics? Modularity and the Unix Philosophy”). The pipeline approach to building programs is a well-established tradition in Unix (and bioinformatics) ...
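
As a rough illustration of the one-liner idea, the sketch below chains standard tools (`cut`, `sort`, `uniq`, `awk`) to parse, manipulate, and summarize a small tab-delimited dataset. The rows and the variable names `data` and `summary` are made up for this example and do not come from the book; real input would normally be a file rather than an inline string.

```shell
# Hypothetical sample data (chromosome, start, end), tab-delimited.
# These rows are illustrative only and are not taken from the book.
data=$(printf 'chr1\t100\t200\nchr2\t150\t300\nchr1\t400\t500\nchr3\t10\t90\nchr1\t700\t800')

# Parse: keep column 1. Manipulate: sort so identical values are adjacent.
# Summarize: count each value, order by descending count (ties by name),
# then print as "name<TAB>count".
summary=$(printf '%s\n' "$data" \
  | cut -f1 \
  | sort \
  | uniq -c \
  | sort -k1,1nr -k2,2 \
  | awk '{print $2 "\t" $1}')

printf '%s\n' "$summary"
```

Each stage does one small job and hands its output to the next through a pipe, so any stage can be swapped or inspected on its own by truncating the pipeline at that point.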

ISBN: 9781449367480