Apache Spark 2.x for Java Developers by Sumit Kumar, Sourav Gulati

Working with CSV data

CSV files can be treated as plain text files that use a comma as the delimiter, and reading them this way generally works as expected, provided no field itself contains a comma. Suppose we have movie data in the following format:

movieId, title, genre
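To make the plain-text treatment concrete, here is a minimal sketch in plain Java (no Spark) of splitting one record in this format on commas. The class name `CsvLineSketch`, the helper `splitRecord`, and the sample record are assumptions made for illustration; they do not come from the book.

```java
public class CsvLineSketch {

    // Splits one CSV record on commas and trims whitespace around each field.
    // Assumption: no field contains an embedded comma, quote, or line break.
    public static String[] splitRecord(String line) {
        // A limit of -1 keeps trailing empty fields instead of dropping them.
        String[] fields = line.split(",", -1);
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical record in the movieId, title, genre layout.
        String[] fields = splitRecord("1, Toy Story, Animation");
        Integer movieId = Integer.valueOf(fields[0]);
        System.out.println(movieId + " | " + fields[1] + " | " + fields[2]);
    }
}
```

A naive split like this is only safe for simple data: a quoted title such as `"Monsters, Inc."` would be cut into two fields, so data with embedded commas calls for a real CSV parser.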

Let's load this file as an RDD of Movie objects, as follows:

Movie POJO:

import java.io.Serializable;

public class Movie implements Serializable {
    private Integer movieId;
    private String title;
    private String genre;

    public Movie() {}

    public Movie(Integer movieId, String title, String genre) {
        super();
        this.movieId = movieId;
        this.title = title;
        this.genre = genre;
    }

    public Integer getMovieId() {
        return movieId;
    }

    public void setMovieId(Integer movieId) {
        this.movieId = movieId;
    }

    public String getTitle() {
        return title;
    }

    public void setTitle(String ...
