Use Prince (https://github.com/MaxHalford/Prince) instead of this library.

skmca

A scikit-learn pipeline API compatible implementation of Multiple Correspondence Analysis (MCA).

Usage


.. code-block:: python

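   import pandas as pd
   from skmca import MCA

   # Read every column as a pandas categorical, as MCA.fit requires.
   df = pd.read_csv('http://www.statoek.wiso.uni-goettingen.de/'
                    'CARME-N/download/wg93.txt',
                    sep='\t', dtype='category')
   # Fit the MCA model on the categorical DataFrame.
   mca = MCA()
   mca.fit(df)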


Crucially, the input to ``MCA.fit`` must be a ``pandas.DataFrame``
where all the columns have a ``category`` dtype. This is necessary
to ensure that the dummy encoding of the columns is consistent across
training and test datasets.
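
The following sketch illustrates why a shared ``category`` dtype matters. It
uses only pandas, not skmca itself; the toy frames and the column name
``answer`` are hypothetical and chosen purely for illustration.

.. code-block:: python

   import pandas as pd
   from pandas.api.types import CategoricalDtype

   # Hypothetical toy data: the test set happens to lack the level "c".
   train = pd.DataFrame({"answer": ["a", "b", "c", "a"]})
   test = pd.DataFrame({"answer": ["b", "a"]})

   # Declare the full set of levels once and apply it to both frames.
   answer_dtype = CategoricalDtype(categories=["a", "b", "c"])
   train["answer"] = train["answer"].astype(answer_dtype)
   test["answer"] = test["answer"].astype(answer_dtype)

   # With a shared categorical dtype, the dummy encoding has one column
   # per declared level, even for levels absent from a given frame.
   train_dummies = pd.get_dummies(train)
   test_dummies = pd.get_dummies(test)

   # Without it, the encoding depends on which values happen to occur.
   naive = pd.get_dummies(pd.DataFrame({"answer": ["b", "a"]}))

   print(list(train_dummies.columns))
   print(list(test_dummies.columns))
   print(list(naive.columns))

Here ``test_dummies`` has the same columns as ``train_dummies``, whereas
``naive`` is missing the column for the unseen level.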

Background

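MCA is the analogue of PCA_ for categorical data: it maps observations described by many categorical variables onto a small number of continuous dimensions. You can use it to visualize high-dimensional datasets, or as a pre-processing step that reduces dimensionality before clustering, which mitigates the curse of dimensionality.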
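
To make the idea concrete, here is a minimal sketch of one standard
formulation of MCA: correspondence analysis of the dummy-encoded indicator
matrix, computed with an SVD. It is not taken from skmca's source and may
differ from it in scaling or corrections; the helper name
``mca_row_coordinates`` and the toy data are hypothetical.

.. code-block:: python

   import numpy as np
   import pandas as pd


   def mca_row_coordinates(df, n_components=2):
       """Hypothetical helper: row coordinates from CA of the indicator matrix."""
       # Indicator (dummy) matrix, one column per observed category level.
       Z = pd.get_dummies(df).to_numpy(dtype=float)
       Z = Z[:, Z.sum(axis=0) > 0]  # drop levels that never occur

       P = Z / Z.sum()              # correspondence matrix
       r = P.sum(axis=1)            # row masses
       c = P.sum(axis=0)            # column masses

       # Standardized residuals from the independence model r c^T.
       expected = np.outer(r, c)
       S = (P - expected) / np.sqrt(expected)

       # The SVD gives the principal axes; squared singular values are
       # the principal inertias.
       U, sigma, _ = np.linalg.svd(S, full_matrices=False)
       coords = (U * sigma) / np.sqrt(r)[:, None]
       return coords[:, :n_components], sigma ** 2


   # Hypothetical toy data with two categorical variables.
   toy = pd.DataFrame({
       "color": ["red", "red", "blue", "blue", "green"],
       "size": ["S", "S", "L", "L", "L"],
   }).astype("category")

   coords, inertias = mca_row_coordinates(toy)
   print(coords)

Rows with identical category profiles (the first two rows of ``toy``) receive
identical coordinates, which is what makes the output usable for plotting or
as input to a clustering algorithm.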

skmca requires pandas and scikit-learn.

References


This library follows the setup in `Nenadic and Greenacre (2005)`_.

.. _PCA: http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
.. _Nenadic and Greenacre (2005): https://core.ac.uk/download/pdf/6591520.pdf
