Approach to data science

Data science solutions are typically built using the following staged approach:

Knowledge mining (discovery) in datasets is an interactive and iterative process involving several steps for identifying valid, useful, and understandable patterns in data.

  • Data processing: Data cleansing transforms the raw data into a clean, convenient format for later use. Related tasks include:
    • Sampling: selecting representative subsets from a large population of data
    • Noise removal
    • Missing data handling for incomplete rows
    • Normalization of data
    • Feature extraction: extracting the data that is useful in a particular context
  • Data ...
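
The data processing tasks listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a definitive pipeline. The record fields (age, income), the noise threshold, the choice of median imputation, and the choice of min-max scaling are all assumptions made for the example; none of them come from the text.

```python
import random
import statistics

# Hypothetical raw records; None marks a missing value.
RAW = [
    {"age": 25, "income": 40000.0},
    {"age": 32, "income": None},
    {"age": 47, "income": 82000.0},
    {"age": 51, "income": 91000.0},
    {"age": 38, "income": 5000000.0},  # implausible value, treated as noise
    {"age": 29, "income": 52000.0},
]


def sample(rows, k, seed=0):
    """Sampling: pick k rows at random (seeded so runs are repeatable)."""
    return random.Random(seed).sample(rows, k)


def remove_noise(rows, key, upper):
    """Noise removal: drop rows whose value exceeds an assumed upper bound.
    Rows with a missing value are kept so they can be filled in later."""
    return [r for r in rows if r[key] is None or r[key] <= upper]


def fill_missing(rows, key):
    """Missing data handling: replace None with the median of observed values."""
    median = statistics.median(r[key] for r in rows if r[key] is not None)
    return [{**r, key: median if r[key] is None else r[key]} for r in rows]


def min_max_normalize(rows, key):
    """Normalization: rescale one field to the range [0, 1]."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant fields
    return [{**r, key: (r[key] - lo) / span} for r in rows]


def extract_features(rows, keys):
    """Feature extraction: keep only the fields relevant to the task."""
    return [[r[k] for k in keys] for r in rows]


def preprocess(rows):
    """Run the cleaning steps in order and return numeric feature vectors."""
    rows = remove_noise(rows, "income", upper=1_000_000.0)
    rows = fill_missing(rows, "income")
    for key in ("age", "income"):
        rows = min_max_normalize(rows, key)
    return extract_features(rows, ["age", "income"])


if __name__ == "__main__":
    subset = sample(RAW, 4)
    for vector in preprocess(subset):
        print(vector)
```

The order of the steps is a design choice in this sketch: noise is removed before imputation so that the implausible income does not distort the median used to fill the gap.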

