kPCA-denoising-python

Reproduction of the experiments presented in Kernel PCA and De-noising in Feature Spaces, as a project in DD2434 Machine Learning Advanced Course during Winter 2016.


Kernel PCA for denoising

Project in DD2434 Machine Learning Advanced Course, Winter 2016.

Our team

Name                   GitHub
Federico Baldassarre   baldassarreFe
Zacharie Brodard       zach-b
Alfredo Fanghella      alfredojf
Lucas Rodés            lucasrodes

Our work

We reproduced the experiments presented in the paper Kernel PCA and De-noising in Feature Spaces by Sebastian Mika, Bernhard Schölkopf, Alex Smola, Klaus-Robert Müller, Matthias Scholz and Gunnar Rätsch. For details, see our report and our presentation.

Dependencies

To run the experiments, make sure you have all dependencies installed:

  • matplotlib (>=2.0.0)
  • pandas (>=0.19.2)
  • rpy2 (>=2.8.5)
  • scikit-image (>=0.12.3)
  • scipy (>=0.19.0)
  • numpy (>=1.12.1)
  • sklearn (>=0.0)

You can install them by typing

pip3 install -r requirements.txt

We strongly recommend using a virtual environment to keep these dependencies isolated from the rest of the system. Follow the instructions here to set up your virtual environment.

Running the experiments

In the paper, there are three major experiments:

  • Toy example: 11 Gaussians
  • Toy example: De-noising
  • Digit denoising (USPS Dataset)

The file our_kpca.py contains our own implementation of the kPCA method, based on the approach described in the paper.
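
For readers unfamiliar with the method, the following is a minimal sketch of kernel PCA de-noising with a Gaussian kernel and a fixed-point pre-image iteration, in the spirit of the paper. It is not the code in our_kpca.py: the function names (rbf_kernel, fit_kpca, denoise), the parameter names and the default iteration settings are assumptions made for illustration, and the sketch assumes the selected eigenvalues are strictly positive.

```python
import numpy as np


def rbf_kernel(A, B, c):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / c) between rows of A and B."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off
    return np.exp(-np.maximum(sq_dists, 0.0) / c)


def fit_kpca(X, n_components, c):
    """Return expansion coefficients alpha (n_samples x n_components) and K."""
    n = X.shape[0]
    K = rbf_kernel(X, X, c)
    one_n = np.full((n, n), 1.0 / n)
    # Center the data in feature space
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale so that each principal axis has unit norm in feature space
    # (assumes the selected eigenvalues are strictly positive)
    alpha = eigvecs / np.sqrt(eigvals)
    return alpha, K


def denoise(x, X, alpha, K, c, n_iter=100, tol=1e-9):
    """Approximate pre-image of the projection of x onto the kPCA subspace."""
    n = X.shape[0]
    k_x = rbf_kernel(x[None, :], X, c).ravel()
    # Center the kernel vector of x consistently with the training data
    k_centered = k_x - k_x.mean() - K.mean(axis=0) + K.mean()
    beta = alpha.T @ k_centered          # projections onto the components
    gamma = alpha @ beta                 # coefficients on centered features
    gamma = gamma + (1.0 - gamma.sum()) / n  # account for the feature-space mean
    z = x.copy()                         # start the iteration at the noisy point
    for _ in range(n_iter):
        w = gamma * rbf_kernel(z[None, :], X, c).ravel()
        denom = w.sum()
        if abs(denom) < 1e-12:           # iteration is undefined here; stop
            break
        z_new = w @ X / denom
        if np.linalg.norm(z_new - z) < tol:
            z = z_new
            break
        z = z_new
    return z
```

A call such as `alpha, K = fit_kpca(X_train, n_components=4, c=0.5)` followed by `denoise(x_noisy, X_train, alpha, K, c=0.5)` returns a point in input space; the kernel width and the number of components shown here are placeholders, not the values used in our experiments.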

Toy example: 11 Gaussians

The code related to this example can be found in example1.py.

Run the script as

python3 example1.py

By default, this script outputs the kPCA MSE, PCA MSE and their ratio for 45 different settings of sigma.
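
As an illustration of what the reported numbers mean, the sketch below generates data in the style of this toy example and computes the MSE for linear PCA only. It is not taken from example1.py. The setup (11 Gaussians in 10 dimensions with centers drawn uniformly from [-1, 1], 100 training and 33 test points per Gaussian, error measured as the mean squared distance of each de-noised test point to its center) follows our reading of the paper and should be treated as an assumption, as should all function names.

```python
import numpy as np


def make_gaussians(sigma, n_sources=11, dim=10, n_train=100, n_test=33, seed=0):
    """Sample training and test points around randomly placed centers."""
    rng = np.random.default_rng(seed)
    centers = rng.uniform(-1.0, 1.0, size=(n_sources, dim))
    train_centers = np.repeat(centers, n_train, axis=0)
    test_centers = np.repeat(centers, n_test, axis=0)
    train = train_centers + rng.normal(scale=sigma, size=train_centers.shape)
    test = test_centers + rng.normal(scale=sigma, size=test_centers.shape)
    return train, test, test_centers


def pca_denoise(train, test, n_components):
    """Project test points onto the leading principal directions of train."""
    mean = train.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:n_components]
    return mean + (test - mean) @ basis.T @ basis


def mse_to_centers(denoised, centers):
    """Mean squared distance between de-noised points and their centers."""
    return np.mean(np.sum((denoised - centers) ** 2, axis=1))


# Hypothetical settings: one value of sigma and four components
train, test, test_centers = make_gaussians(sigma=0.1)
pca_mse = mse_to_centers(pca_denoise(train, test, n_components=4), test_centers)
```

The ratio reported by the script would then be the kPCA MSE, computed the same way on the kPCA-de-noised test points, divided by pca_mse.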

Toy example: De-noising

The code related to this example can be found in example2.py.

Run the script as

python3 example2.py

Once the execution has ended, a figure with the results is displayed.

You might get some warnings; they can be safely ignored.

Digit denoising (USPS Dataset)

⚠️ Known issue: the USPS dataset is no longer available at mldata.org. We will look into an alternative source.

The code related to this example can be found in example3.py.

Run the script as

python3 example3.py