Kaggle

Kaggle competition library. Uses Python 3.4.1 with almost all known python libraries for Machine Learning

Stars
8

Kaggle

A few python scripts that perform well on Kaggle competions.

Competition

MNIST

Contains various scripts using either XGBoost, PCA + SVM or Convolutional Neural Networks.

  • CNNs tend to perform the best, with the VGG-like net performing the best but requiring the largest amount of time.
  • SqueezeNet is fast to train but does not perform as well as VGG
  • The current best script I posses is the DCCNN MNIST architecture, but since it has been trained on the entire data set it will
    over fit in this Kaggle dataset and give 100% accuracy. (Original accuracy = 99.77 %)

Titanic

Starter dataset to teach about the basics of Machine Learning principles such as data cleaning / preprocessing and feature construction.

  • Best performing model is XGBoost

Bike Sharing Demand

A time series data set which is very useful to understand how to manipulate and train time series datasets.

  • Best performing model is an ensemble of two XGBoost which learn the two different time factored outputs and then merge them into a single output.

BNP

Bank BNP Paribas contest.

  • Tried various combinations of stacking and neural nets. Best was a combination of neural nets and XGBoost stacked with Logistic Regression as final layer.