Machine Learning for Civil and Environmental Engineers

by M. Z. Naser
August 2023
Content level: Intermediate to advanced
608 pages
19h 29m
English
Wiley

3 Data and Statistics

Data have no meaning in themselves; they are meaningful only in relation to a conceptual model of the phenomenon studied.

G. Box, W. Hunter, and J. Hunter

Synopsis

This chapter covers the principles and techniques of data collection, handling, and manipulation needed for a variety of machine learning (ML) investigations. It also reviews the fundamentals of statistics and serves as a refresher on some of the essential principles behind statistical analyses. First, we define data and then lay out strategies for visualizing and plotting the different types of data we are likely to encounter in our field. Then, we spend some time exploring and diagnosing data, presenting and showcasing a series of methods for doing so. I will be sharing techniques to carry out fundamental data analysis and visualization using Excel, Python, R, and Exploratory.

3.1 Data and Data Science

We have previously defined data as pieces (or units) of information, facts, quantities, and statistics collected for the purpose of reference or analysis. Data are a major element of ML, especially ML that is data-driven. Each unit of data has two main components: an explanatory portion (a numerical value or a category) and a label describing that unit. Numerical data are made of numbers, and categorical data are made of categories or classes. We also have footage data, audio data, video data, text data, time series ...
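To make the distinction between numerical and categorical data concrete, here is a minimal Python sketch. It is only an illustration and not a method taken from this chapter: the records, the labels (`span_m`, `load_kN`, `material`), and the helper functions `kind_of` and `classify_columns` are all hypothetical and were made up for this example. Each record pairs a label with a value, and the sketch checks whether the values stored under a label are numbers or categories.

```python
from numbers import Number

# Hypothetical records: each unit of data pairs a label (the key)
# with an explanatory portion (a numerical value or a category).
samples = [
    {"span_m": 6.0, "load_kN": 120.5, "material": "concrete"},
    {"span_m": 8.5, "load_kN": 98.0, "material": "steel"},
    {"span_m": 7.2, "load_kN": 110.3, "material": "timber"},
]


def kind_of(values):
    """Return 'numerical' if every value is a number, else 'categorical'."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
        return "numerical"
    return "categorical"


def classify_columns(records):
    """Map each label to the kind of data stored under it."""
    labels = records[0].keys()
    return {label: kind_of([r[label] for r in records]) for label in labels}


column_kinds = classify_columns(samples)
print(column_kinds)
```

With these made-up records, the two measured quantities are reported as numerical and the material name as categorical. Real datasets often need more care, for example when categories are stored as integer codes.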


Publisher Resources

ISBN: 9781119897606