May 2018
Beginner to intermediate
364 pages
7h 43m
English
The following dataset contains three valid values, 2, 3, and 4, whose mean is 3, and two missing values (NA). We replace each NA with that mean, that is, 3 in this case. The following R code achieves this:
> x <- c(NA, 2, 3, 4, NA)
> y <- na.omit(x)
> m <- mean(y)
> m
[1] 3
> x[is.na(x)] <- m
> x
[1] 3 2 3 4 3
For Python, see the following program:
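import numpy as np  # NaN comes from NumPy; recent SciPy releases no longer expose sp.nan
import pandas as pd
# One column with a single missing value
df = pd.DataFrame({'A': [2, np.nan, 3, 4]})
print(df)
# df.mean() skips NaN, so column A's mean is (2 + 3 + 4) / 3 = 3
df.fillna(df.mean(), inplace=True)
print(df)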
The related output is:
     A
0  2.0
1  NaN
2  3.0
3  4.0
     A
0  2.0
1  3.0
2  3.0
3  4.0
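For a closer match to the R session, the same five-element vector with its two missing values can be handled as a pandas Series. This is only a minimal sketch: the names x, m, and filled are chosen here for illustration, and it relies on Series.mean() skipping NaN by default, so no separate na.omit step is needed.

```python
import numpy as np
import pandas as pd

# Same five values as the R vector: NA, 2, 3, 4, NA
x = pd.Series([np.nan, 2, 3, 4, np.nan])

# mean() ignores NaN by default (skipna=True), so only 2, 3, and 4 are averaged
m = x.mean()

# fillna returns a new Series; x itself is left unchanged
filled = x.fillna(m)

print(m)                # 3.0
print(filled.tolist())  # [3.0, 2.0, 3.0, 4.0, 3.0]
```

Unlike the inplace=True call in the program above, this version keeps the original data intact, which can be useful when the imputed and raw values need to be compared later.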