A PyTorch implementation of "SimGNN: A Neural Network Approach to Fast Graph Similarity Computation" (WSDM 2019).
GPL-3.0 License
This repository provides a PyTorch implementation of SimGNN as described in the paper:
SimGNN: A Neural Network Approach to Fast Graph Similarity Computation. Yunsheng Bai, Hao Ding, Song Bian, Ting Chen, Yizhou Sun, Wei Wang. WSDM, 2019. [Paper]
A reference TensorFlow implementation is available [here], and another implementation is [here].
The codebase is implemented in Python 3.5.2. The package versions used for development are listed below.
networkx 2.4
tqdm 4.28.1
numpy 1.15.4
pandas 0.23.4
texttable 1.5.0
scipy 1.1.0
argparse 1.1.0
torch 1.1.0
torch-scatter 1.4.0
torch-sparse 0.4.3
torch-cluster 1.4.5
torch-geometric 1.3.2
torchvision 0.3.0
scikit-learn 0.20.0
Every JSON file has the following key-value structure:
{"graph_1": [[0, 1], [1, 2], [2, 3], [3, 4]],
"graph_2": [[0, 1], [1, 2], [1, 3], [3, 4], [2, 4]],
"labels_1": [2, 2, 2, 2, 2],
"labels_2": [2, 3, 2, 2, 2],
"ged": 1}
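As a minimal sketch of how one such file could be read and sanity-checked with the standard library only: the helper names `parse_pair` and `summarize` are hypothetical and not part of the repository. The sketch also assumes that node identifiers are 0-based positions in the corresponding `labels_*` list, which matches the sample above but is not stated explicitly here.

```python
import json

# Sample pair copied from the key-value structure shown above.
SAMPLE = """
{"graph_1": [[0, 1], [1, 2], [2, 3], [3, 4]],
 "graph_2": [[0, 1], [1, 2], [1, 3], [3, 4], [2, 4]],
 "labels_1": [2, 2, 2, 2, 2],
 "labels_2": [2, 3, 2, 2, 2],
 "ged": 1}
"""

REQUIRED_KEYS = ("graph_1", "graph_2", "labels_1", "labels_2", "ged")


def parse_pair(text):
    """Parse one graph pair from JSON text and run basic consistency checks."""
    pair = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in pair]
    if missing:
        raise ValueError("missing keys: {}".format(missing))
    for suffix in ("1", "2"):
        edges = pair["graph_" + suffix]
        labels = pair["labels_" + suffix]
        # Assumption: every edge endpoint indexes into the label list.
        for source, target in edges:
            if not (0 <= source < len(labels) and 0 <= target < len(labels)):
                raise ValueError("edge [{}, {}] has no label".format(source, target))
    return pair


def summarize(pair):
    """Return node and edge counts for both graphs plus the stored GED value."""
    return {
        "nodes_1": len(pair["labels_1"]),
        "edges_1": len(pair["graph_1"]),
        "nodes_2": len(pair["labels_2"]),
        "edges_2": len(pair["graph_2"]),
        "ged": pair["ged"],
    }


if __name__ == "__main__":
    print(summarize(parse_pair(SAMPLE)))
```

To check a file on disk, pass the result of `open(path).read()` to `parse_pair` instead of `SAMPLE`.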
--training-graphs STR Training graphs folder. Default is `dataset/train/`.
--testing-graphs STR Testing graphs folder. Default is `dataset/test/`.
--filters-1 INT Number of filters in 1st GCN layer. Default is 128.
--filters-2 INT Number of filters in 2nd GCN layer. Default is 64.
--filters-3 INT Number of filters in 3rd GCN layer. Default is 32.
--tensor-neurons INT Neurons in tensor network layer. Default is 16.
--bottle-neck-neurons INT Bottleneck layer neurons. Default is 16.
--bins INT Number of histogram bins. Default is 16.
--batch-size INT Number of pairs processed per batch. Default is 128.
--epochs INT Number of SimGNN training epochs. Default is 5.
--dropout FLOAT Dropout rate. Default is 0.5.
--learning-rate FLOAT Learning rate. Default is 0.001.
--weight-decay FLOAT Weight decay. Default is 10^-5.
--histogram BOOL Include histogram features. Default is False.
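The sketch below shows how a subset of these options could be declared with `argparse`, using the defaults listed above. It is an illustration only, not the parser shipped in `src/`; the function name `build_parser` is hypothetical, and treating `--histogram` as a `store_true` switch is an assumption based on the flag being passed without a value on the command line.

```python
import argparse


def build_parser():
    """Build a parser mirroring some of the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of the SimGNN options.")
    parser.add_argument("--training-graphs", default="dataset/train/",
                        help="Training graphs folder.")
    parser.add_argument("--testing-graphs", default="dataset/test/",
                        help="Testing graphs folder.")
    parser.add_argument("--epochs", type=int, default=5,
                        help="Number of training epochs.")
    parser.add_argument("--batch-size", type=int, default=128,
                        help="Number of pairs processed per batch.")
    parser.add_argument("--bins", type=int, default=16,
                        help="Number of histogram bins.")
    parser.add_argument("--dropout", type=float, default=0.5,
                        help="Dropout rate.")
    parser.add_argument("--learning-rate", type=float, default=0.001,
                        help="Learning rate.")
    # Assumption: a boolean switch that is False unless the flag is present.
    parser.add_argument("--histogram", action="store_true",
                        help="Include histogram features.")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv for demonstration.
    args = build_parser().parse_args(["--epochs", "100", "--batch-size", "512"])
    print(args.epochs, args.batch_size, args.histogram)
```

Note that `argparse` converts dashes in option names to underscores, so `--batch-size` is read back as `args.batch_size`.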
Training a SimGNN model with the default parameters.
python src/main.py
Training a SimGNN model for 100 epochs with a batch size of 512.
python src/main.py --epochs 100 --batch-size 512
Training a SimGNN with histogram features.
python src/main.py --histogram
Training a SimGNN with histogram features and a larger number of bins.
python src/main.py --histogram --bins 32
Increasing the learning rate and the dropout.
python src/main.py --learning-rate 0.01 --dropout 0.9
You can save the trained model by adding the --save-path parameter.
python src/main.py --save-path /path/to/model-name
You can then load a pretrained model using the --load-path parameter. Note that the model will be used as-is and no training will be performed.
python src/main.py --load-path /path/to/model-name
License