January 2017
Beginner to intermediate
446 pages
8h 46m
English
Gender identification is an interesting problem. Here we will use a heuristic to construct a feature vector and then use that vector to train a classifier. The heuristic is the last N letters of a given name. For example, a name ending in "ia" is most likely a female name, such as Amelia or Genelia, whereas a name ending in "rk" is likely a male name, such as Mark or Clark. Since we don't know in advance how many letters work best, we will vary this parameter and compare the results. Let's see how to do it.
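As a minimal sketch of the heuristic described above, the feature vector can be a one-entry dictionary holding the last N letters of the name. The function name `extract_features`, the parameter `num_letters`, and the dictionary key `suffix` are illustrative assumptions, not names taken from the book.

```python
def extract_features(name, num_letters=2):
    """Return a feature dict with the last `num_letters` letters of `name`, lowercased.

    Hypothetical helper: the name, parameter, and "suffix" key are assumptions.
    """
    return {"suffix": name[-num_letters:].lower()}


# Show how the feature changes as N varies, using names from the text
for n in (1, 2, 3):
    print(n, extract_features("Amelia", n), extract_features("Mark", n))
```

With N = 2, Amelia and Genelia both map to the suffix "ia", while Mark and Clark both map to "rk", which is the regularity the classifier is meant to pick up.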
Create a new Python file and import the following packages:
import random
from nltk import NaiveBayesClassifier
from nltk.classify import accuracy
...
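One way these imports might be used to compare different values of N is sketched below. It is not the book's code: the tiny hand-labeled name list, the helper names `suffix_features` and `evaluate_suffix_lengths`, the 75/25 split, and the seed are all assumptions made for illustration. With so few names the accuracy figures are not meaningful; a real experiment would use a much larger labeled list of names.

```python
import random

from nltk import NaiveBayesClassifier
from nltk.classify import accuracy

# Toy labeled names, for illustration only (assumption: not the book's dataset)
labeled_names = [
    ("Amelia", "female"), ("Genelia", "female"), ("Sophia", "female"),
    ("Olivia", "female"), ("Julia", "female"), ("Maria", "female"),
    ("Emma", "female"), ("Anna", "female"),
    ("Mark", "male"), ("Clark", "male"), ("Frank", "male"),
    ("Derek", "male"), ("Patrick", "male"), ("Jack", "male"),
    ("Erik", "male"), ("Dirk", "male"),
]


def suffix_features(name, num_letters):
    """Feature dict holding the last `num_letters` letters of the name."""
    return {"suffix": name[-num_letters:].lower()}


def evaluate_suffix_lengths(data, lengths=(1, 2, 3, 4), train_fraction=0.75, seed=7):
    """Train one Naive Bayes classifier per suffix length and return {N: accuracy}."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    split = int(train_fraction * len(shuffled))

    results = {}
    for n in lengths:
        featuresets = [(suffix_features(name, n), label) for name, label in shuffled]
        train_set, test_set = featuresets[:split], featuresets[split:]
        classifier = NaiveBayesClassifier.train(train_set)
        results[n] = accuracy(classifier, test_set)
    return results


results = evaluate_suffix_lengths(labeled_names)
for n, acc in results.items():
    print(f"Last {n} letter(s): accuracy = {acc:.2f}")
```

The same shuffled split is reused for every N so that differences in accuracy come from the feature, not from a different train/test partition.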