December 2018
Intermediate to advanced
318 pages
8h 28m
English
We will start by importing the relevant packages. The pandas package will be used to enable data frame capabilities. The sklearn package will be used to divide the data into training and testing datasets. We will also use the logistic regression available in sklearn:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score
We load the tab-separated SMSSpamCollectionDataSet file with pandas and split it into training and testing sets, as follows:
dataframe = pd.read_csv('SMSSpamCollectionDataSet', delimiter='\t', header=None)
X_train_dataset, X_test_dataset, y_train_dataset, y_test_dataset = train_test_split(dataframe[1], dataframe[0])
...
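As a minimal sketch of how the imported pieces could fit together, the following example vectorizes the messages with TfidfVectorizer and fits a LogisticRegression classifier. It is an illustration under stated assumptions, not the book's own continuation: the rows are made-up stand-ins for the real file, and the test_size, random_state, and stratify arguments are added here only to keep the tiny example reproducible.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Made-up stand-in rows (NOT from the real dataset):
# column 0 holds the label, column 1 holds the message text,
# mirroring the layout that read_csv(..., header=None) produces.
rows = [
    ("spam", "WIN a free prize now, call this number"),
    ("ham", "Are we still meeting for lunch today?"),
    ("spam", "Free entry to win cash, text WIN now"),
    ("ham", "I will call you when I get home"),
    ("spam", "Urgent! You have won a free prize, claim now"),
    ("ham", "Can you pick up some milk on the way back?"),
    ("spam", "Claim your free cash prize, call now"),
    ("ham", "See you at the meeting tomorrow morning"),
]
dataframe = pd.DataFrame(rows)

# Split messages (column 1) and labels (column 0).
# The extra arguments are assumptions made for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    dataframe[1], dataframe[0],
    test_size=0.25, random_state=0, stratify=dataframe[0],
)

# Learn the vocabulary and IDF weights on the training messages only,
# then reuse them to transform the test messages.
vectorizer = TfidfVectorizer()
X_train_tfidf = vectorizer.fit_transform(X_train)
X_test_tfidf = vectorizer.transform(X_test)

# Fit the classifier on the TF-IDF features and predict test labels.
classifier = LogisticRegression()
classifier.fit(X_train_tfidf, y_train)
predictions = classifier.predict(X_test_tfidf)

for message, label in zip(X_test, predictions):
    print(f"{label}: {message}")
```

Fitting the vectorizer on the training split alone keeps vocabulary and IDF statistics from the test messages out of the model, so the test predictions give a fairer picture of how it handles unseen text.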