Parsing Comma-Separated Data
Problem
You have a string or a file of lines containing comma-separated values (CSV) that you need to read in. Many MS-Windows-based spreadsheets and some databases use CSV to export data.
Solution
Use my CSV class or a regular expression (see Chapter 4).
Discussion
CSV is deceptive. It looks simple at first glance, but the values may be quoted or unquoted. If quoted, they may further contain escaped quotes. This far exceeds the capabilities of the StringTokenizer class (Section 3.3). Either considerable Java coding or the use of regular expressions is required. I’ll show both ways.
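To make the StringTokenizer limitation concrete, here is a minimal sketch. The class name TokenizerPitfall, the helper countTokens( ), and the sample line are illustrative assumptions, not part of the recipe. The line holds two logical fields, but the comma inside the quoted value is treated as just another delimiter.

```java
import java.util.StringTokenizer;

/* Hypothetical demo: StringTokenizer knows nothing about quoting. */
public class TokenizerPitfall {

    /** Counts the tokens produced by splitting the line on commas only. */
    static int countTokens(String line) {
        return new StringTokenizer(line, ",").countTokens();
    }

    public static void main(String[] args) {
        // Two logical fields: "Smith, John" and 42
        String line = "\"Smith, John\",42";
        System.out.println(countTokens(line)); // prints 3, not 2
    }
}
```

The quoted value is split into two tokens, and both keep their stray quote character, which is why a quote-aware parser is needed.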
First, a Java program. Assume for now that we have a class called CSV that has a no-argument constructor, and a method called parse( ) that takes a string representing one line of the input file. The parse( ) method returns a list of fields. For flexibility, this list is returned as an Iterator (see Section 7.5). I simply use the Iterator’s hasNext( ) method to control the loop, and its next( ) method to get the next object.
import java.util.*;

/* Simple demo of CSV parser class. */
public class CSVSimple {
    public static void main(String[] args) {
        CSV parser = new CSV( );
        Iterator it = parser.parse(
            "\"LU\",86.25,\"11/4/1998\",\"2:19PM\",+4.0625");
        while (it.hasNext( )) {
            System.out.println(it.next( ));
        }
    }
}
Once the compiler has processed the backslash-escaped quotes, the string being parsed is actually the following:
"LU",86.25,"11/4/1998","2:19PM",+4.0625
Running CSVSimple yields the following output:
> java ...
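As a rough idea of what a parse( ) method of this shape might do internally, here is a minimal sketch of a character-by-character scanner. It is not the CSV class used above: the class name CsvSketch is hypothetical, and it assumes that a quote inside a quoted field is escaped by doubling it (""), which is a common CSV convention but not something stated here.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/* Hypothetical stand-in for a CSV parser; an illustrative sketch only. */
public class CsvSketch {

    /** Splits one line into fields, honoring quotes and doubled-quote escapes. */
    public Iterator<String> parse(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);        // commas are ordinary text here
                } else if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"');      // "" inside quotes is one literal quote
                    i++;
                } else {
                    inQuotes = false;       // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;            // opening quote
            } else if (c == ',') {
                fields.add(field.toString());   // field separator
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());       // last field has no trailing comma
        return fields.iterator();
    }

    public static void main(String[] args) {
        Iterator<String> it = new CsvSketch().parse(
            "\"LU\",86.25,\"11/4/1998\",\"2:19PM\",+4.0625");
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

For the sample line, this sketch prints the five fields LU, 86.25, 11/4/1998, 2:19PM, and +4.0625, one per line, with the surrounding quotes removed. It does not handle quoted fields that span multiple lines or whitespace trimming around fields.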