Bioinformatics with Python Cookbook - Second Edition

by Tiago Antao
November 2018
Intermediate to advanced
360 pages
9h 36m
English
Packt Publishing
Content preview from Bioinformatics with Python Cookbook - Second Edition

How to do it...

Let's take a look at the following steps:

  1. Let's start by making sure that we can access PySpark:
import sys
sys.path.append('/PATH_TO/spark-2.3.2-bin-hadoop2.7/python/')
# Not conda
# Careful with Java version
# conda install py4j

Be sure to change PATH_TO to whatever path you have for your Spark installation.
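
The path setup above can be wrapped in a small check. The following is a minimal sketch, not the book's code: the helper names `add_spark_python_path` and `pyspark_available` are hypothetical, and it assumes the Spark distribution keeps its Python bindings in a `python` subdirectory, as the path in the step suggests.

```python
import importlib.util
import os
import sys


def add_spark_python_path(spark_home):
    """Append <spark_home>/python to sys.path (once) and return that path.

    Hypothetical helper; spark_home is wherever Spark was unpacked.
    """
    python_dir = os.path.join(spark_home, 'python')
    if python_dir not in sys.path:
        sys.path.append(python_dir)
    return python_dir


def pyspark_available():
    """Report whether `import pyspark` would find the package."""
    return importlib.util.find_spec('pyspark') is not None


# PATH_TO is a placeholder, exactly as in the step above.
add_spark_python_path('/PATH_TO/spark-2.3.2-bin-hadoop2.7')
print('pyspark importable:', pyspark_available())
```

If this reports `False`, the path is probably wrong. Note that the comment in the step also lists py4j as a separate requirement, which this check does not cover.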

  2. Now, let's import pyspark:
import pyspark as spark
from pyspark.sql.functions import col, round as round_

We will be using Spark's round function, but we import it as round_ so that it does not shadow Python's built-in round function.
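
The effect of the alias can be seen without Spark at all. In the sketch below, `spark_style_round` is a hypothetical stand-in for a column function such as `pyspark.sql.functions.round`; it only returns a string so that the example runs anywhere. The two helper functions are likewise illustrative names.

```python
def spark_style_round(column_name, scale=0):
    """Hypothetical stand-in for a column function such as
    pyspark.sql.functions.round: it describes a computation
    instead of returning a number."""
    return f'round({column_name}, {scale})'


def with_shadowing():
    # Equivalent to importing the function under its original name:
    # inside this scope, `round` no longer refers to the built-in.
    round = spark_style_round
    return round('price', 2)


def with_alias():
    # Equivalent to `import ... round as round_`: both stay usable.
    round_ = spark_style_round
    return round(3.14159, 2), round_('price', 2)


print(with_shadowing())
print(with_alias())
```

With the alias, ordinary numeric rounding and the column-level function can be used side by side in the same script.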

  3. Let's connect to our Spark server:
sc = spark.SparkContext('spark://127.0.1.1:7077')
  4. Now, we will create an SQLContext:
sqlc = spark.SQLContext(sc)

There are other contexts for Spark, and we will discuss ...
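
The connection steps can be gathered into one place. This is a minimal sketch, assuming a standalone Spark master is listening at the address used above. `master_url` and `connect` are hypothetical helper names, and pyspark is imported inside `connect` so that the URL helper can be used without a Spark installation.

```python
def master_url(host='127.0.1.1', port=7077):
    """Build a standalone-cluster master URL such as spark://127.0.1.1:7077."""
    return f'spark://{host}:{port}'


def connect(url):
    """Create the SparkContext and SQLContext from the steps above.

    Requires pyspark on sys.path and a Spark master reachable at `url`.
    """
    import pyspark as spark  # lazy import: only needed when connecting

    sc = spark.SparkContext(url)
    sqlc = spark.SQLContext(sc)
    return sc, sqlc


print(master_url())

# With a running master, the contexts would be created like this:
# sc, sqlc = connect(master_url())
# ... work with sqlc ...
# sc.stop()  # release the cluster resources when done
```

Calling `sc.stop()` at the end of a session releases the resources held on the cluster, which matters when the same master is shared between notebooks.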

ISBN: 9781789344691