7 Machine Learning Made Easier

Machine learning is the buzzword of the decade, as students and companies vie to acquire the skill for business applications. However, many parts of machine learning are quite easy. In supervised learning, we know what we are trying to predict (a class in classification, a number in regression), whereas in unsupervised learning there is no given label to predict, so we use association analysis and cluster analysis instead. Text mining, on the other hand, looks at the frequency of words for pattern analysis. Social network analysis looks at relationships among actors, represented as nodes and edges, to see how networks behave. Deep learning is an even more recent advance in these techniques.
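The difference between supervised and unsupervised learning can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the humidity readings and the "dry"/"rain" labels are made up, and the helper names (`fit_centroids`, `predict`, `two_means`) are hypothetical, not part of any library. The supervised half learns from known labels; the unsupervised half sees only the numbers and looks for groups.

```python
from statistics import mean

# Hypothetical toy data: made-up humidity readings, for illustration only.
humidity = [30, 35, 40, 80, 85, 90]

# Supervised: every reading comes with a known label that we want to predict.
labels = ["dry", "dry", "dry", "rain", "rain", "rain"]


def fit_centroids(xs, ys):
    """Return the mean reading for each known label."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: mean(values) for label, values in groups.items()}


def predict(centroids, x):
    """Predict the label whose mean reading is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


centroids = fit_centroids(humidity, labels)
prediction = predict(centroids, 70)


# Unsupervised: no labels are given, so we can only look for groups.
def two_means(xs, iterations=10):
    """Split one-dimensional data into two clusters around two moving centers."""
    low, high = min(xs), max(xs)
    for _ in range(iterations):
        near_low = [x for x in xs if abs(x - low) <= abs(x - high)]
        near_high = [x for x in xs if abs(x - low) > abs(x - high)]
        low, high = mean(near_low), mean(near_high)
    return near_low, near_high


clusters = two_means(humidity)
print(prediction)
print(clusters)
```

In practice a library would do both jobs, but the shape of the problem is the same: the supervised function needs `labels`, while the clustering function never sees them.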

One of the most widely used techniques is decision trees.
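A decision tree is built by repeatedly choosing the split that leaves the resulting groups as pure as possible. The sketch below shows one such step, using Gini impurity as the purity measure (one common choice, and an assumption here, since the text does not name a criterion). The rows are invented for illustration and are not taken from the rattle weather data; `gini` and `best_threshold` are hypothetical helper names.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best 'value <= threshold' split."""
    best = None
    # Skip the largest value: splitting there would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical rows, invented for illustration only.
humidity_3pm = [20, 35, 45, 60, 75, 90]
rain_tomorrow = ["No", "No", "No", "Yes", "Yes", "Yes"]

threshold, impurity = best_threshold(humidity_3pm, rain_tomorrow)
print(threshold, impurity)
```

A full tree would apply the same search again inside each resulting group, and across every available column, until the groups are pure or a stopping rule is reached.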

Decision trees in Python (weather dataset)

https://nbviewer.jupyter.org/gist/decisionstats/47a2324b14ebfd22657b40ec1ae5b480

# The rattle package in R has the weather dataset
# (see help at http://artax.karlin.mff.cuni.cz/r-help/library/rattle/html/weather.html)

 In [259]:

import os

 In [260]:

import pandas as pd

 In [261]:

os.getcwd()

 Out[261]:

'/home/ajayohri'

 In [262]:

os.listdir()

 Out[262]:

['.hplip',
 '.xsession-errors.old',
 'VirtualBox VMs',
 'filename.pkl_04.npy',
 '.thunderbird',
 'SVM.R',
 'R',
 'Desktop',
 'filename.pkl_07.npy',
 '.cache',
 '.webex',
 'file.R',
 '.ipython',
 'unique_ids_for_list.html',
 'filename.pkl_11.npy',
 '.Xauthority',
 'Dropbox', ...
