Imputer.py

A Python implementation of missing value imputation with kNN (k-nearest neighbors).

MIT License

Install

git clone https://github.com/bwanglzu/Imputer.py.git
cd Imputer.py
# install dependencies
pip install -r requirements.txt
# install imputer
python setup.py install

Usage

from imputer import Imputer
impute = Imputer()

Default usage (X should be a pandas.DataFrame or np.ndarray; column is the name or index of the column to impute):

X_imputed = impute.knn(X=data, column='age') # default 10nn
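
For readers unfamiliar with the technique, the following is a minimal, dependency-free sketch of what kNN imputation of a numerical column does in principle. It is not this package's implementation: the function name knn_impute_numeric, the plain list-of-lists input, the Euclidean distance, and the unweighted mean are all assumptions chosen for illustration, and the sketch assumes every other column is numeric and fully observed.

```python
import math


def knn_impute_numeric(rows, target, k=10):
    """Fill None entries in column `target` with the mean of the k nearest rows.

    Hypothetical helper for illustration only; not the package's code.
    Assumes at least one row has an observed target value and that all
    non-target columns are numeric with no missing entries.
    """
    # Rows with an observed target value are the candidate neighbors.
    donors = [r for r in rows if r[target] is not None]
    # Work on a copy so the caller's data is left untouched.
    imputed = [list(r) for r in rows]

    def distance(a, b):
        # Euclidean distance over every column except the target.
        return math.sqrt(sum((a[i] - b[i]) ** 2
                             for i in range(len(a)) if i != target))

    for row in imputed:
        if row[target] is None:
            nearest = sorted(donors, key=lambda d: distance(row, d))[:k]
            row[target] = sum(d[target] for d in nearest) / len(nearest)
    return imputed


# Toy data: two features followed by an 'age'-like column with one gap.
data = [
    [1.0, 1.0, 20.0],
    [1.1, 0.9, 22.0],
    [5.0, 5.0, 60.0],
    [5.2, 4.8, 64.0],
    [1.05, 1.0, None],
]
result = knn_impute_numeric(data, target=2, k=2)
print(result[-1][2])  # mean of the two closest rows' values (20.0 and 22.0)
```

With k=2 the gap is filled from the two rows that lie closest in the feature columns; a larger k averages over more, and more distant, rows.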

Change the number of neighbors k:

X_imputed = impute.knn(X=data, column='age', k=3)

By default, the column is imputed as a numerical feature. To impute a categorical feature, set is_categorical=True:

X_imputed = impute.knn(X=data, column='gender', k=10, is_categorical=True)
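
For a categorical column, averaging neighbor values is not meaningful, so a common choice is a majority vote among the k nearest rows. The sketch below illustrates that idea under stated assumptions: the name knn_impute_categorical, the list-of-lists input, and the majority-vote rule are hypothetical and are not taken from this package's source.

```python
import math
from collections import Counter


def knn_impute_categorical(rows, target, k=10):
    """Fill None entries in column `target` with the most common neighbor label.

    Hypothetical helper for illustration only; not the package's code.
    Assumes at least one row has an observed label and that all
    non-target columns are numeric with no missing entries.
    """
    donors = [r for r in rows if r[target] is not None]
    imputed = [list(r) for r in rows]  # copy; the input stays unchanged

    def distance(a, b):
        # Euclidean distance over the numeric (non-target) columns.
        return math.sqrt(sum((a[i] - b[i]) ** 2
                             for i in range(len(a)) if i != target))

    for row in imputed:
        if row[target] is None:
            nearest = sorted(donors, key=lambda d: distance(row, d))[:k]
            votes = Counter(d[target] for d in nearest)
            # most_common(1) returns [(label, count)] for the top label.
            row[target] = votes.most_common(1)[0][0]
    return imputed


# Toy data: two numeric features followed by a label column with one gap.
samples = [
    [170.0, 65.0, "A"],
    [172.0, 68.0, "A"],
    [185.0, 90.0, "B"],
    [188.0, 95.0, "B"],
    [168.0, 60.0, "A"],
    [171.0, 66.0, None],
]
filled = knn_impute_categorical(samples, target=2, k=3)
print(filled[-1][2])  # label shared by the three closest rows
```

Here the three nearest rows all carry the label "A", so the missing entry receives "A".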

Test

nosetests --with-coverage

Reference

Troyanskaya O, Cantor M, Sherlock G, et al. Missing value estimation methods for DNA microarrays. Bioinformatics, 2001, 17(6): 520-525.
