2 Data Input
2.1 Data Input in Pandas
The pandas library provides reader functions for many data formats.
The most commonly used is read_csv, which reads comma-separated values. It can read directly from a URL, for example:
anscombe=pd.read_csv("https://vincentarelbundock.github.io/Rdatasets/csv/datasets/anscombe.csv")
See the top few lines at http://nbviewer.jupyter.org/gist/decisionstats/3737642751895f470d5c07194302f53e. © GitHub repository.
Alternatively, read CSV data from a local file, as in the notebook at http://nbviewer.jupyter.org/gist/decisionstats/4142e98375445c5e4174, which is reproduced here:
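As a minimal sketch of the same call without a network connection or a file on disk, read_csv also accepts a file-like object. The CSV text and the name demo below are illustrative assumptions, not part of the original notebook.

```python
import io

import pandas as pd

# Illustrative CSV text standing in for a URL or a local file (an assumption).
csv_text = "id,x1,y1\n1,10,8.04\n2,8,6.95\n3,13,7.58\n"

# read_csv takes a path, a URL, or any file-like object.
demo = pd.read_csv(io.StringIO(csv_text))

print(demo.head(2))   # first two rows
print(demo.dtypes)    # column types inferred by pandas
```

The first line of the text is treated as the header row by default, so the columns are named id, x1, and y1.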
In [1]:
import pandas as pd # import packages
import os
In [2]:
#pd.describe_option() #describe options for customizing
In [3]:
#pd.get_option("display.memory_usage") # get the current value of an option
In [4]:
os.getcwd() #current working directory
Out [4]:
'/home/ajay'
In [5]:
os.chdir('/home/ajay/Desktop')
In [6]:
os.getcwd()
Out [6]:
'/home/ajay/Desktop'
In [7]:
a=os.getcwd()
os.listdir(a)
Out [7]:
['adult.data']
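Changing the working directory is one way to locate a file. A possible alternative, sketched here, is to build the full path with os.path.join and leave the working directory alone. The temporary folder and the file sample.csv are hypothetical stand-ins for the Desktop folder and adult.data.

```python
import os
import tempfile

import pandas as pd

# A temporary folder stands in for the Desktop directory (an assumption).
with tempfile.TemporaryDirectory() as data_dir:
    csv_path = os.path.join(data_dir, "sample.csv")

    # Write a tiny header-less CSV file to read back.
    with open(csv_path, "w") as fh:
        fh.write("1,a\n2,b\n")

    files = os.listdir(data_dir)                 # list the folder contents
    sample = pd.read_csv(csv_path, header=None)  # read using the full path

print(files)
print(sample.shape)
```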
In [8]:
names2=["age","workclass","fnlwgt","education","education-num","marital-status","occupation","relationship","race","sex","capital-gain","capital-loss","hours-per-week","native-country","income"]
In [9]:
len(names2)
Out [9]:
15
In [10]:
adult=pd.read_csv("adult.data",header=None)
In [11]:
len(adult)
Out [11]:
32562
In [12]:
adult.columns
Out [12]:
Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14], ...
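With header=None, pandas labels the columns 0 through 14, as the output above shows. The excerpt defines names2 but does not show it being applied. One way to attach those labels, sketched here as an assumption rather than as the notebook's own next step, is to pass names=names2 to read_csv. The two sample rows and the name adult_demo are illustrative, and skipinitialspace=True assumes the fields are separated by a comma followed by a space.

```python
import io

import pandas as pd

names2 = ["age", "workclass", "fnlwgt", "education", "education-num",
          "marital-status", "occupation", "relationship", "race", "sex",
          "capital-gain", "capital-loss", "hours-per-week",
          "native-country", "income"]

# Two illustrative rows in a header-less, comma-plus-space layout (an assumption).
sample_rows = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K\n"
)

# names= supplies the column labels; skipinitialspace drops the space after each comma.
adult_demo = pd.read_csv(io.StringIO(sample_rows), header=None,
                         names=names2, skipinitialspace=True)

print(adult_demo.columns.tolist())
print(adult_demo[["age", "workclass", "income"]])
```

Assigning adult.columns = names2 after reading would give the same labels on an existing DataFrame.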