25 Dealing with Missing Values in a Relation Dataset Using the DROPNA Function in Python

Vikash Kumar Mishra1*, Shoney Sebastian2, Maria Iqbal2 and Yashwin Anand2

1School of Computing Science and Engineering, Galgotias University, Greater Noida, Uttar Pradesh, India

2Christ (Deemed to be) University Delhi-NCR, Ghaziabad, Uttar Pradesh, India

Abstract

Python offers a rich data structure library called PANDAS, which supports fast and efficient data transformation and analysis. The name PANDAS is often expanded as Python Data Analysis Library. PANDAS provides optimized, flexible data structures designed for working with “relational” or “labeled” data. The library is meant to serve as a high-level, high-performance building block for real-world data analysis. PANDAS lets users import data from a variety of sources, such as CSV files, SQL databases, and Microsoft Excel spreadsheets [1]. It helps with data preparation as well as data modeling in projects that analyze data to extract information. It is intended as a foundational layer for future statistical computing in Python. The study in [2] describes the library’s design and features and discusses future areas of work and growth opportunities for statistics and data analytics applications built on Python. In this chapter, we address the problem of missing values in a dataset using the DROPNA function of the PANDAS library.
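As a minimal sketch of the idea summarized above, the following example applies the DROPNA function (spelled `dropna` in code) to a small table. The dataset, its column names, and its values are hypothetical and chosen only for illustration; they are not taken from the chapter.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; names and values are illustrative only.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dina"],
    "age": [29, np.nan, 41, 35],
    "city": ["Delhi", "Noida", None, "Ghaziabad"],
})

# Count the missing values in each column before dropping anything.
missing_per_column = df.isna().sum()

# Default behavior: drop every row that has at least one missing value.
complete_rows = df.dropna()

# Drop a row only when the "age" column is missing.
rows_with_age = df.dropna(subset=["age"])

# Drop every column that has at least one missing value.
complete_columns = df.dropna(axis=1)

print(missing_per_column)
print(complete_rows)
```

Note that `dropna` returns a new DataFrame and leaves `df` unchanged, so the original data stays available for comparison. Dropping rows is the simplest remedy but discards information; whether that is acceptable depends on how many rows are affected.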

Keywords: Python, DROPNA, ...

Get Mathematics and Computer Science, Volume 1 now with the O’Reilly learning platform.