Comparing Keras, PyTorch and Gluon using neural collaborative filtering
This repo contains an implementation of Xiangnan He et al.'s 2017 neural collaborative filtering (original paper) in Keras, Gluon and PyTorch. The Keras code is mostly borrowed from the author's original repo, adapted to the Keras 2.2 API and Python 3. Of course, I strongly recommend reading their paper.
Everything one needs to run the experiment is in this repo. The code is organized as follows:
The core of the repo is, of course, GMF_DLFRAME.py, MLP_DLFRAME.py and NeuMF_DLFRAME.py, where DLFRAME is keras, pytorch or gluon.
I have also included data_preparation.py and data_comparison.ipynb. The first shows how to prepare the data for the experiment (not included in the author's original repo) and the second simply shows that the results of my data preparation and those of Xiangnan He are consistent.
If you are just interested in a comparison between the results obtained with Keras, PyTorch and Gluon, you can go directly to results_summary.ipynb.
All the experiments I ran are included in run_net.sh. If you clone this repo, you can copy and paste the contents of that file directly. For example, the following line will run a GMF model using Gluon with batch size 256, learning rate 0.01 and 32-dimensional embeddings for 30 epochs:
python GMF_gluon.py --batch_size 256 --lr 0.01 --n_emb 32 --epochs 30
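As a rough illustration of how a script might read the flags in the command above, here is a minimal sketch using Python's standard argparse module. The flag names come from the example command; the use of argparse, the `build_parser` helper and the default values are assumptions made for illustration and may not match the actual scripts in this repo.

```python
import argparse


def build_parser():
    # Flag names mirror the example command; defaults are placeholders,
    # not the values used in the repo's scripts.
    parser = argparse.ArgumentParser(description="Hypothetical GMF training CLI")
    parser.add_argument("--batch_size", type=int, default=256)
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--n_emb", type=int, default=8)
    parser.add_argument("--epochs", type=int, default=20)
    return parser


# Parse the same arguments as the example command, passed as a list
# so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    ["--batch_size", "256", "--lr", "0.01", "--n_emb", "32", "--epochs", "30"]
)
print(args.batch_size, args.lr, args.n_emb, args.epochs)
```

Passing the argument list explicitly keeps the sketch runnable in a notebook; a real script would simply call `parse_args()` with no arguments.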
The best performing GMF and MLP models are included in the models directory.
Given the relative simplicity of the model, I thought this would be a good exercise to illustrate the similarities and differences between the three frameworks. In addition, the results turned out to be quite interesting.
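To give a sense of how simple the model is, the sketch below shows a GMF-style forward pass in plain Python, following the formulation in the paper: the user and item embeddings are multiplied element-wise and passed through a single sigmoid output unit. This is not the code from this repo; the function name `gmf_score`, the optional bias term, the toy sizes and the random initialisation are all assumptions made for illustration.

```python
import math
import random


def gmf_score(user_vec, item_vec, out_weights, out_bias=0.0):
    # Element-wise product of the user and item embeddings.
    interaction = [u * i for u, i in zip(user_vec, item_vec)]
    # One linear output unit followed by a sigmoid gives a score in (0, 1).
    logit = sum(w * x for w, x in zip(out_weights, interaction)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))


random.seed(0)
n_users, n_items, n_emb = 4, 6, 8  # toy sizes, not the real dataset's

# Randomly initialised embedding tables and output weights (hypothetical).
user_emb = [[random.gauss(0, 0.1) for _ in range(n_emb)] for _ in range(n_users)]
item_emb = [[random.gauss(0, 0.1) for _ in range(n_emb)] for _ in range(n_items)]
out_weights = [random.gauss(0, 0.1) for _ in range(n_emb)]

# Score the interaction between user 1 and item 3.
score = gmf_score(user_emb[1], item_emb[3], out_weights)
print(f"predicted interaction score: {score:.4f}")
```

In each of the three frameworks the embedding tables and the output unit would be trainable layers; the arithmetic of the forward pass is the same.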
The figure below shows the Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG) at k=10 for the MLP and GMF models, and also the training time for the MLP model.
Top: Hit Ratio (HR) and Normalized Discounted Cumulative Gain (NDCG) at k=10 for both the GMF and MLP models vs the number of embeddings. Bottom: training time for the GMF and MLP models per batch size and number of embeddings respectively.
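For readers unfamiliar with the two metrics, the following sketch shows how HR@k and NDCG@k can be computed for a single user, assuming a leave-one-out setup in which each user has exactly one held-out positive item ranked among a set of candidates. The function names and the toy scores are hypothetical, and the evaluation code in this repo may be organised differently.

```python
import math


def hit_ratio_at_k(ranked_items, held_out_item, k=10):
    # 1 if the held-out item appears in the top-k list, 0 otherwise.
    return 1.0 if held_out_item in ranked_items[:k] else 0.0


def ndcg_at_k(ranked_items, held_out_item, k=10):
    # With a single relevant item the ideal DCG is 1, so NDCG reduces to
    # log(2) / log(position + 2), where position is the 0-based rank.
    for position, item in enumerate(ranked_items[:k]):
        if item == held_out_item:
            return math.log(2) / math.log(position + 2)
    return 0.0


# Toy predicted scores for four candidate items (made-up values).
scores = {"item_a": 0.91, "item_b": 0.15, "item_c": 0.78, "item_d": 0.42}
ranked = sorted(scores, key=scores.get, reverse=True)

# The held-out item "item_c" is ranked second, so it is a hit at k=2.
hr = hit_ratio_at_k(ranked, "item_c", k=2)
ndcg = ndcg_at_k(ranked, "item_c", k=2)
print(ranked, hr, round(ndcg, 4))
```

The reported metrics would then be the averages of these per-user values over all test users.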
For more details, see results_summary.ipynb.
For any suggestions, email me at: [email protected]