February 2017
As noted in the previous section, you can start off by using collect(), show(), or take() to view the data within your DataFrame (with the last two including the option to limit the number of returned rows).
To get the number of rows within your DataFrame, you can use the count() method:
swimmers.count()
This gives the following output:
Out[13]: 3
To filter rows, use the filter() method; the following code snippet also uses select() to specify which columns to return:
# Get the id, age where age = 22
swimmers.select("id", "age").filter("age = 22").show()
# Another way to write the above query is below
swimmers.select(swimmers.id, swimmers.age).filter(swimmers.age ...