September 2018
Intermediate to advanced
412 pages
11h 12m
English
Feature extraction is the first step of model building. Data scientists typically use common tools such as Jupyter and Zeppelin for data exploration and feature extraction, and eventually deploy the resulting model as an ML pipeline to operationalize it. In this step, the incoming data, whether it arrives as a transformed stream or sits at rest in a database or filesystem, is tokenized and transformed to identify the features that are relevant to the business use case. Once the relevant features are identified, a sample dataset can be selected from the explored data for the next steps of the pipeline.
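As a rough illustration of the tokenize, transform, and sample flow described above, the following minimal sketch uses only the Python standard library. The function names, the sample records, and the document-frequency threshold are all hypothetical choices made for this example; they are not taken from the text, and a real pipeline would more likely rely on a dedicated ML library inside a notebook.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def select_vocabulary(records, min_doc_freq=2):
    """Keep tokens that appear in at least min_doc_freq records.

    Assumption: document frequency stands in for "relevance to the
    business use case"; real projects would apply domain criteria.
    """
    doc_freq = Counter()
    for text in records:
        # Count each token once per record, however often it repeats.
        doc_freq.update(set(tokenize(text)))
    return sorted(tok for tok, n in doc_freq.items() if n >= min_doc_freq)


def to_feature_vector(text, vocabulary):
    """Count how often each vocabulary token occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[tok] for tok in vocabulary]


def sample_dataset(records, k, seed=0):
    """Draw a reproducible random sample of k records."""
    return random.Random(seed).sample(records, k)


# Hypothetical data at rest, e.g. rows read from a database or file.
records = [
    "Payment failed for order 1001",
    "Payment received for order 1002",
    "Customer asked about delivery delay",
    "Delivery delay reported for order 1003",
]

vocabulary = select_vocabulary(records)
features = [to_feature_vector(r, vocabulary) for r in records]
sample = sample_dataset(records, k=2)
```

Here `vocabulary` holds the tokens kept as features, `features` holds one count vector per record, and `sample` is the subset that would be handed to the next step of the pipeline. The fixed seed is only there to make the sample repeatable during exploration.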