
Data Analysis with R - Second Edition by Tony Fischetti


More normalization

Since titles are so finicky, a more common way of linking book entities is to use bibliographic codes such as ISBNs, when they are available. Unfortunately for us (but, I suppose, fortunately from a pedagogical point of view), a quick look at our ISBNs reveals that there's nonsense going on there, too, as the following output shows:

> lib$ISBN
[1] "9781447286813"         "147460725X"            "144205374-7 "
[4] "9780525433576"         "1-405-88229-8"         "8496886611fsds34Recur"
[7] NA                      "889882002x"            "978-0060000578"

Some of the problems that we can spot are:

  • ISBN-13s are mixed with ISBN-10s
  • Some of the X check digits (the check digit is the last character of an ISBN) are lower case (x) and some are upper case (X)
  • Some of the ISBNs are hyphenated and some are not
  • One of the ISBNs has trailing whitespace ...
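
The simpler problems listed above (whitespace, hyphens, and the case of the check digit) can be sketched in a few lines of base R. This is a minimal illustration, not necessarily the approach the book goes on to take: the vector isbns is copied from the output shown earlier, and normalize_isbn and looks_valid are hypothetical names introduced only for this example. The sketch does not reconcile ISBN-10s with ISBN-13s and does not verify check digits; it only tidies the strings and flags values whose shape cannot be an ISBN.

```r
# The nine values from the lib$ISBN output shown earlier (NA is a missing ISBN)
isbns <- c("9781447286813", "147460725X", "144205374-7 ",
           "9780525433576", "1-405-88229-8", "8496886611fsds34Recur",
           NA, "889882002x", "978-0060000578")

# Hypothetical helper (an assumption for illustration, not from the book):
# applies the three purely cosmetic fixes
normalize_isbn <- function(x) {
  x <- trimws(x)                       # drop leading/trailing whitespace
  x <- gsub("-", "", x, fixed = TRUE)  # remove hyphens
  x <- sub("x$", "X", x)               # upper-case a trailing 'x' check digit
  x
}

cleaned <- normalize_isbn(isbns)

# Shape check only: ten characters (nine digits plus a digit or X),
# or thirteen digits. NA and malformed strings come out as FALSE.
looks_valid <- grepl("^([0-9]{9}[0-9X]|[0-9]{13})$", cleaned)

print(data.frame(original = isbns, cleaned = cleaned, looks_valid = looks_valid))
```

With this input, the hyphenated and padded entries collapse to plain 10- or 13-character strings, while the missing value and the entry with trailing junk ("8496886611fsds34Recur") are flagged as not ISBN-shaped and would still need separate handling.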
