Benchmark datasets, data loaders, and evaluators for graph machine learning
MIT License
Published by weihua916 almost 2 years ago
Published by weihua916 about 2 years ago
This release introduces the following two:
ogbl-vessel dataset (described here) @jqmcginnis
Published by weihua916 about 3 years ago
We have included two updates:
Published by weihua916 over 3 years ago
Thanks to the DGL Team, all the LSC data is now hosted on AWS. This significantly improves the download speed around the globe! The underlying data stays exactly the same.
Published by weihua916 over 3 years ago
This release includes the three large-scale datasets for OGB-LSC at KDD Cup 2021. Details of the datasets and the KDD Cup can be found here.
Published by weihua916 over 3 years ago
Dataset downloads now use HTTP instead of HTTPS.
Published by weihua916 over 3 years ago
This version provides a major change in ogbg-code.
ogbg-code has been deprecated due to prediction target (i.e., method name) leakage in the input AST.
ogbg-code2 has been introduced to fix the issue: the method name and its recursive definition in the AST are replaced with a special token _mask_.
We sincerely thank Charles Sutton (@casutton) for finding the data leakage in our dataset.
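To make the masking idea concrete, here is a minimal sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It is an illustration only and is not the preprocessing used to build ogbg-code2; the class name `MaskMethodName` and the toy function `fact` are assumptions made for this example.

```python
import ast

MASK = "_mask_"  # special token named in the release note


class MaskMethodName(ast.NodeTransformer):
    """Replace a function's name, and recursive references to it, with MASK."""

    def __init__(self, target):
        self.target = target

    def visit_FunctionDef(self, node):
        if node.name == self.target:
            node.name = MASK
        self.generic_visit(node)  # continue into the body to catch recursive calls
        return node

    def visit_Name(self, node):
        if node.id == self.target:
            node.id = MASK
        return node


source = "def fact(n):\n    return 1 if n <= 1 else n * fact(n - 1)\n"
tree = ast.parse(source)
target = tree.body[0].name  # the prediction target: the method name
masked_tree = MaskMethodName(target).visit(tree)
masked_source = ast.unparse(masked_tree)
print(masked_source)
```

After masking, the name `fact` no longer appears in the definition or in the recursive call, so a model cannot read the target off the input.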
Published by weihua916 almost 4 years ago
This release fixes the dataset bug in negative samples in ogbl-wikikg and ogbl-citation and releases new versions of them: ogbl-wikikg2 and ogbl-citation2. The old versions are deprecated.
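Scripts that still refer to the deprecated names could translate them before loading. The helper below is a hypothetical sketch and is not part of the OGB package; `DEPRECATED_DATASETS` and `resolve_dataset_name` are names invented for this example.

```python
# Hypothetical helper, not part of OGB: maps deprecated dataset names
# (as listed in this release) to their replacements.
DEPRECATED_DATASETS = {
    "ogbl-wikikg": "ogbl-wikikg2",
    "ogbl-citation": "ogbl-citation2",
}


def resolve_dataset_name(name):
    """Return the replacement for a deprecated name, or the name unchanged."""
    return DEPRECATED_DATASETS.get(name, name)


for old_name in ("ogbl-wikikg", "ogbl-citation", "ogbl-collab"):
    print(old_name, "->", resolve_dataset_name(old_name))
```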
Published by weihua916 about 4 years ago
This release enhances the OGB package in the following ways.
Made ogbn-papers100M data loading more tractable by using compressed binary files https://github.com/snap-stanford/ogb/issues/46
Published by weihua916 about 4 years ago
This release is mainly for changing the evaluation metric of ogbg-molpcba from PRC-AUC to Average Precision (AP). AP is shown to be more appropriate to summarize the non-convex nature of the Precision Recall Curve [1]. The leaderboard and our paper have been updated accordingly.
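As a rough illustration of the metric, AP can be computed as the recall-weighted sum of precisions along the ranked predictions, AP = Σ_n (R_n − R_{n−1}) P_n. The sketch below is a single-task, pure-Python version that assumes binary labels and no tied scores; it is not the OGB evaluator, and `average_precision` is a name chosen for this example.

```python
def average_precision(y_true, y_score):
    """Average Precision for binary labels, assuming no tied scores.

    Each positive example raises recall by 1 / n_pos, so AP reduces to the
    mean of the precision values measured at the rank of each positive.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            ap += (true_positives / rank) / n_pos
    return ap


labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
ap = average_precision(labels, scores)
print(round(ap, 4))  # (1/1 + 2/3 + 3/4) / 3
```

Unlike a trapezoidal area under the Precision Recall Curve, this sum does not linearly interpolate between operating points.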
We also fix an issue and add a feature:
[1] Jesse Davis and Mark Goadrich. The relationship between Precision-Recall and ROC curves. In International Conference on Machine Learning (ICML), pp. 233–240, 2006.
Published by weihua916 over 4 years ago
This release fixes bugs in a dataset, evaluator, and data loader.
ogbn-mag are removed. The updated dataset will be downloaded and processed automatically as you run your script for ogbn-mag. #40
ogbl-collab and ogbl-ddi are updated. Specifically, ogbl-collab now uses Hits@50, and ogbl-ddi now uses Hits@20.
ogbn-mag and ogbl-biokg is fixed. #36
Published by weihua916 over 4 years ago
This is the second major release of OGB, in which we have curated many more exciting graph datasets, including heterogeneous graphs and a web-scale gigantic graph (100+ million nodes, 1+ billion edges).
First, we note that there is no change in the datasets released in version 1.1.1. Therefore, any experimental results obtained using 1.1.1 on the existing datasets are compatible with version 1.2.0.
In this new release, we have additionally released 5 new datasets listed below.
ogbn-papers100M: Web-scale gigantic paper citation network.
ogbn-mag: Heterogeneous academic graph.
ogbl-biokg: Heterogeneous biomedical knowledge graph.
ogbl-ddi: Drug-drug interaction network.
ogbg-code: Source code Abstract Syntax Trees.
Published by weihua916 over 4 years ago
The OGB package can now automatically fetch the datasets if they have been updated.
Published by weihua916 over 4 years ago
This is the first major release of OGB.
A number of changes have been made to the datasets, which are summarized below.
mapping/ directory that contains information to map node/edge/graph/label indices to real-world entities (e.g., mapping from nodes in PPA to unique protein identifiers, mapping from molecular graphs into the SMILES strings).
ogbn-proteins node features, and put them in the species variable.
ogbl-reviews datasets.
ogbn-arxiv, ogbl-citation, ogbl-collab, ogbl-wikikg.
Renamed ogbg-ppi to ogbg-ppa.
Renamed ogbg-mol-hiv and ogbg-mol-pcba to ogbg-molhiv and ogbg-molpcba, respectively.
get_split_edge() interface in LinkPropPredDataset. The downloaded dataset files are also changed accordingly.
num_classes attribute for multi-class classification datasets.
Published by rusty1s over 4 years ago
OGB datasets can now be imported more conveniently, e.g.:
from ogb.graphproppred import GraphPropPredDataset
from ogb.graphproppred import PygGraphPropPredDataset
from ogb.graphproppred import DglGraphPropPredDataset
Note that this will throw an ImportError if OGB cannot find installations of PyG or DGL, respectively.
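A script that should keep working when PyG is not installed could guard the import. This is a minimal sketch of one common pattern; the flag name `HAS_PYG` is an assumption for this example, and the same approach would apply to the DGL loader.

```python
# Guarded import: fall back gracefully when OGB or PyG is missing.
try:
    from ogb.graphproppred import PygGraphPropPredDataset
    HAS_PYG = True
except ImportError:  # also covers ModuleNotFoundError when ogb itself is absent
    PygGraphPropPredDataset = None
    HAS_PYG = False

if HAS_PYG:
    print("PyG loader available")
else:
    print("PyG loader unavailable; install PyG to use PygGraphPropPredDataset")
```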