Apache Spark 2.x for Java Developers by Sumit Kumar, Sourav Gulati

Working with JSON data

JSON is a schemaless, human-readable file format. The rise of web and mobile applications has made JSON a very popular interchange format for API services and data storage. There are different ways to read JSON files in Spark, such as reading the file as text and then processing it record by record, or using SparkSession/Spark SQL, which supports JSON natively.
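As a quick illustration of the native route, the following is a minimal sketch of loading a JSON file through SparkSession and querying it with Spark SQL. It assumes the file holds one JSON object per line and that the records carry `firstName`, `county`, and `sex` fields like the sample record in this section. The class name, the `readPeople` helper, and the input path taken from the first program argument are illustrative choices, not the book's own code.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

class JsonWithSparkSql {

    // Hypothetical helper: loads a file with one JSON object per line.
    // Spark infers the schema by scanning the records.
    static Dataset<Row> readPeople(SparkSession spark, String path) {
        return spark.read().json(path);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("JsonWithSparkSql")
                .master("local[*]") // local mode, for experimentation only
                .getOrCreate();

        Dataset<Row> people = readPeople(spark, args[0]);
        people.printSchema(); // shows the inferred columns and types

        // Register the data as a temporary view and query it with SQL.
        people.createOrReplaceTempView("people");
        spark.sql("SELECT firstName, county FROM people WHERE sex = 'M'").show();

        spark.stop();
    }
}
```

Running the sketch needs the Spark SQL module on the classpath, with the path to the JSON file passed as the first argument.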

Let's first understand how we can process JSON record by record using an external JSON library/parser, assuming that each record in the file is a JSON object. Consider a JSON file that contains data about people. The following is an example JSON record:

 { "year": "2013", "firstName": "DAVID", "county": "KINGS", "sex": "M", "cid": 272,"dateOfBirth":"2016-01-07T00:01:17Z" ...
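The record-based approach can be sketched as follows: read the file as lines of text, then parse each line with a JSON library. This sketch assumes Jackson's `ObjectMapper` as the external parser (the text does not name a specific library) and one complete JSON object per line. The class name, the `firstNames` helper, and the choice to extract only the `firstName` field are hypothetical, for illustration.

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

import java.util.ArrayList;
import java.util.List;

class JsonPerRecord {

    // Hypothetical helper: parses every line as one JSON object and
    // returns the value of its firstName field.
    static JavaRDD<String> firstNames(JavaSparkContext jsc, String path) {
        JavaRDD<String> lines = jsc.textFile(path);
        // mapPartitions creates one ObjectMapper per partition
        // instead of one per record.
        return lines.mapPartitions(records -> {
            ObjectMapper mapper = new ObjectMapper();
            List<String> names = new ArrayList<>();
            while (records.hasNext()) {
                JsonNode record = mapper.readTree(records.next());
                // path() yields a "missing" node rather than null
                // when the field is absent, so asText() returns "".
                names.add(record.path("firstName").asText());
            }
            return names.iterator();
        });
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("JsonPerRecord")
                .master("local[*]") // local mode, for experimentation only
                .getOrCreate();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        firstNames(jsc, args[0]).collect().forEach(System.out::println);

        spark.stop();
    }
}
```

Parsing inside `mapPartitions` is a design choice rather than a requirement: a plain `map` that builds a parser per record would also work, at the cost of extra object creation.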
