Getting Started with RStudio

by John Verzani
September 2011
Beginner
94 pages
2h 26m
English
O'Reilly Media, Inc.

Chapter 2. Case Study: Data Cleaning

Now that we know how to start RStudio, let’s dive in. We’ll begin with a blow-by-blow account of a sample data analysis in which we read in some data, clean it up, and then format it for further study. We deliberately chose an example that will take us on some detours, as the point of the exercise is to show how many of RStudio’s features can be used along the way to speed up the task. We will postpone for now an example of the “development” aspect of RStudio.

The data set we look at here comes from a colleague and contains records from a psychology experiment on a colony of naked mole rats. The experimenter is interested in both the behavior of each naked mole rat over time and the social aspect of the colony as a whole.

Each rat wears an RFID chip that allows the researcher to track its motion. The experiment consists of 15 chambers (bubbles) in a linear arrangement separated by 14 tubes. Each tube has a gate with a sensor. When a mole rat passes through the tube, the time and gate are recorded. Unfortunately, gates can be missed, and the recording device can erroneously replicate values, so the raw data must be cleaned up.
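
To make the two problems concrete, here is a minimal sketch in R of what cleaning such records could look like. It is an illustration only, not the approach the chapter goes on to take: the data frame, its column names (`rat`, `time`, `gate`), and the values are all hypothetical, and it assumes the gates are numbered 1 through 14 in order along the line of chambers.

```r
# Hypothetical records, one row per gate crossing.
# Column names and values are assumptions made for illustration.
records <- data.frame(
  rat  = c("A", "A", "A", "B", "B"),
  time = c(10, 10, 25, 12, 12),
  gate = c(3, 3, 5, 7, 7),
  stringsAsFactors = FALSE
)

# Replicated values: duplicated() flags rows that repeat an earlier row
# exactly, so negating it keeps only the first copy of each reading.
cleaned <- records[!duplicated(records), ]

# Missed gates: in a linear layout, consecutive crossings by the same rat
# can differ by at most one gate number. A larger jump suggests that a
# crossing in between was not recorded.
gate_jump <- ave(cleaned$gate, cleaned$rat,
                 FUN = function(g) c(0, abs(diff(g))))
cleaned$possible_miss <- gate_jump > 1
```

In this toy data, rat A appears to go from gate 3 straight to gate 5, so that row is flagged as a possible miss. Exact-duplicate removal is the simplest possible rule; real recordings may need a looser definition of a replicated value.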

This data comes to us in rich-text format (rtf). This quasi text-based format is a bit unusual for data transfer but presumably is used by the recording apparatus. We will see that this format has some idiosyncrasies that will require us to work a little harder than we might normally do to read data into an RStudio session, ...


ISBN: 9781449314798