Bioinformatics Data Skills

by Vince Buffalo
July 2015
Intermediate to advanced
538 pages
15h 29m
English
O'Reilly Media, Inc.

Chapter 5. Git for Scientists

In Chapter 2, we discussed organizing a bioinformatics project directory and how this helps keep your work tidy during development. Good organization also facilitates automating tasks, which makes our lives easier and leads to more reproducible work. However, as our project changes over time and possibly incorporates the work of our collaborators, we face an additional challenge: managing different file versions.

It’s likely that you already use some sort of versioning system in your work. For example, you may have files with names such as thesis-vers1.docx, thesis-vers3_CD_edits.docx, analysis-vers6.R, and thesis-vers8_CD+GM+SW_edits.docx. Storing these past versions is helpful because it allows us to go back and restore whole files or sections if we need to. File versions also help us differentiate our copies of a file from those edited by a collaborator. However, this ad hoc file versioning system doesn’t scale well to complicated bioinformatics projects—our otherwise tidy project directories would be muddled with different versioned scripts, R analyses, README files, and papers.

Project organization only gets more complicated when we work collaboratively. We could share our entire directory with a colleague through a service like Dropbox or Google Drive, but we run the risk of something getting deleted or corrupted. It’s also not possible to drop an entire bioinformatics project directory into a shared directory, as it likely contains gigabytes ...


Publisher Resources

ISBN: 9781449367480