ogb

Benchmark datasets, data loaders, and evaluators for graph machine learning

MIT License

Downloads: 62.7K
Stars: 1.9K
Committers: 25


ogb - Pandas 2.0 compatibility (Latest Release)

Published by weihua916 over 1 year ago

ogb - Fix stuck import bug

Published by weihua916 almost 2 years ago

ogb - ogbl-vessel and improved rank prediction

Published by weihua916 about 2 years ago

This release introduces two changes: the ogbl-vessel dataset and improved rank prediction.

ogb - OGB-LSC dataset updates

Published by weihua916 about 3 years ago

We have included two updates:

  • WikiKG90M --> WikiKG90Mv2
  • PCQM4M --> PCQM4Mv2

ogb - Hosting LSC data on AWS

Published by weihua916 over 3 years ago

Thanks to the DGL Team, all the LSC data is now hosted on AWS. This significantly improves the download speed around the globe! The underlying data stays exactly the same.

ogb - Including datasets for KDD Cup 2021

Published by weihua916 over 3 years ago

This release includes the three large-scale datasets for OGB-LSC at KDD Cup 2021. Details of the datasets and the KDD Cup can be found here.

ogb - Fix download bug

Published by weihua916 over 3 years ago

The dataset downloading now uses http instead of https.

ogb - Deprecate ogbg-code and update to ogbg-code2

Published by weihua916 over 3 years ago

This version provides a major change in ogbg-code.

  • ogbg-code has been deprecated due to leakage of the prediction target (i.e., the method name) in the input AST.
  • ogbg-code2 has been introduced to fix the issue: the method name and its recursive definition in the AST are replaced with a special token _mask_.

We sincerely thank Charles Sutton (@casutton) for finding the data leakage in our dataset.
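
As a rough illustration of the masking described above, the sketch below replaces a function's name and its recursive references with _mask_ in a Python AST, using the standard ast module. The helper names mask_method_name and identifiers are hypothetical, and the traversal is an assumption made for illustration; this is not the actual ogbg-code2 preprocessing code.

```python
import ast

MASK = "_mask_"  # special token named in the release note


def mask_method_name(source: str) -> ast.Module:
    """Hypothetical helper: mask the first function's name and recursive uses of it."""
    tree = ast.parse(source)
    # ast.walk is breadth-first, so this picks the outermost/first function.
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    target = func.name
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == target:
            node.name = MASK  # the definition itself
        elif isinstance(node, ast.Name) and node.id == target:
            node.id = MASK  # plain recursive call, e.g. fact(n - 1)
        elif isinstance(node, ast.Attribute) and node.attr == target:
            node.attr = MASK  # method-style recursive call, e.g. self.fact(...)
    return tree


def identifiers(tree: ast.AST) -> set:
    """Collect function names and variable names that remain in the tree."""
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            names.add(node.name)
        elif isinstance(node, ast.Name):
            names.add(node.id)
    return names


src = "def fact(n):\n    return 1 if n <= 1 else n * fact(n - 1)\n"
masked = mask_method_name(src)
remaining = identifiers(masked)
```

After masking, the original name no longer appears among the identifiers in the tree, so a model cannot read the prediction target from its input.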

ogb - Fix dataset bug, release new datasets

Published by weihua916 almost 4 years ago

This release fixes the dataset bug in negative samples in ogbl-wikikg and ogbl-citation and releases new versions of them: ogbl-wikikg2 and ogbl-citation2. The old versions are deprecated.

ogb - 1.2.3

Published by weihua916 about 4 years ago

This release enhances the OGB package in the following ways.

ogb - Fix evaluation metric of ogbg-molpcba

Published by weihua916 about 4 years ago

This release mainly changes the evaluation metric of ogbg-molpcba from PRC-AUC to Average Precision (AP). AP has been shown to be more appropriate for summarizing the non-convex nature of the Precision-Recall Curve [1]. The leaderboard and our paper have been updated accordingly.
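
For reference, AP can be computed as a step-wise sum over the ranked predictions rather than by linearly interpolating between points on the Precision-Recall Curve. The sketch below is a plain-Python version of that definition for a single binary task, assuming no tied scores and no missing labels. The function name average_precision is chosen here for illustration, and the package's own evaluator may differ in details such as tie handling and averaging over multiple tasks.

```python
def average_precision(y_true, y_score):
    """Mean of precision@rank taken at the rank of each positive example.

    Simplifying assumptions: binary labels (0/1) and no tied scores.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positive labels")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Positives land at ranks 1 and 3, so AP = (1/1 + 2/3) / 2.
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```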

We also fix an issue and add a feature:

[1] Jesse Davis and Mark Goadrich. The relationship between Precision-Recall and ROC curves. In International Conference on Machine Learning (ICML), pp. 233–240, 2006.

ogb - Minor fix and update

Published by weihua916 over 4 years ago

This release fixes bugs in a dataset, evaluator, and data loader.

  • Duplicated edges in ogbn-mag are removed. The updated dataset will be downloaded and processed automatically as you run your script for ogbn-mag. #40
  • Evaluators for ogbl-collab and ogbl-ddi are updated. Specifically, ogbl-collab now uses Hits@50, and ogbl-ddi now uses Hits@20.
  • DGL data loader bug for ogbn-mag and ogbl-biokg is fixed. #36
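
To make the Hits@K metrics mentioned above concrete, here is a minimal sketch. It assumes the common definition in which Hits@K is the fraction of positive edges scored strictly above the K-th highest-scoring negative edge. The function name hits_at_k and the convention of returning 1.0 when fewer than K negatives exist are assumptions for illustration, not a copy of the package's evaluator.

```python
def hits_at_k(pos_scores, neg_scores, k):
    """Fraction of positive edges ranked above the k-th best negative edge."""
    if len(neg_scores) < k:
        # Assumed convention: with fewer than k negatives, every positive counts as a hit.
        return 1.0
    kth_neg = sorted(neg_scores, reverse=True)[k - 1]
    return sum(score > kth_neg for score in pos_scores) / len(pos_scores)


pos = [0.9, 0.5, 0.2]
neg = [0.8, 0.6, 0.3, 0.1]
hits2 = hits_at_k(pos, neg, k=2)  # threshold is 0.6, so only 0.9 counts
hits3 = hits_at_k(pos, neg, k=3)  # threshold is 0.3, so 0.9 and 0.5 count
```

Under this definition, a larger K lowers the threshold a positive edge must beat, so Hits@50 is a more lenient criterion than Hits@20 on the same predictions.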

ogb - Second major release

Published by weihua916 over 4 years ago

This is the second major release of OGB, in which we have curated many more exciting graph datasets, including heterogeneous graphs and a web-scale gigantic graph (100+ million nodes, 1+ billion edges).

First, we note that there is no change in the datasets released in version 1.1.1. Therefore, any experimental results obtained using 1.1.1 on the existing datasets are compatible with version 1.2.0.

In this new release, we have additionally released 5 new datasets listed below.

  • ogbn-papers100M: Web-scale gigantic paper citation network.
  • ogbn-mag: Heterogeneous academic graph.
  • ogbl-biokg: Heterogeneous biomedical knowledge graph.
  • ogbl-ddi: Drug-drug interaction network.
  • ogbg-code: Source code Abstract Syntax Trees.

ogb - Automatic dataset update

Published by weihua916 over 4 years ago

The OGB package can now automatically fetch datasets when they have been updated.

ogb - First major release

Published by weihua916 over 4 years ago

First Major Release

This is the first major release of OGB.
A number of changes have been made to the datasets, which are summarized below.

  1. Re-indexed all the nodes in the node/link datasets (the graphs remain essentially the same).
  2. In the dataset folders for all the datasets, added a mapping/ directory that contains information to map node/edge/graph/label indices to real-world entities (e.g., mapping from nodes in PPA to unique protein identifiers, or from molecular graphs to SMILES strings).
  3. Deleted the ogbn-proteins node features, and put them in the species variable.
  4. Deleted ogbl-reviews datasets.
  5. Added 4 datasets: ogbn-arxiv, ogbl-citation, ogbl-collab, ogbl-wikikg.
  6. Renamed ogbg-ppi to ogbg-ppa.
  7. Renamed ogbg-mol-hiv and ogbg-mol-pcba to ogbg-molhiv and ogbg-molpcba, respectively.
  8. Changed the evaluation metric of imbalanced molecule datasets (e.g., pcba) from ROC-AUC to PRC-AUC.
  9. Changed the get_split_edge() interface in LinkPropPredDataset. The downloaded dataset files are also changed accordingly.
  10. Added num_classes attribute for multi-class classification datasets.

ogb - 1.0.1

Published by rusty1s over 4 years ago

Minor Changes

OGB datasets can now be imported more conveniently, e.g.:

from ogb.graphproppred import GraphPropPredDataset
from ogb.graphproppred import PygGraphPropPredDataset
from ogb.graphproppred import DglGraphPropPredDataset

Note that this will throw an ImportError if OGB cannot find an installation of PyG or DGL, respectively.
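
One way to avoid the ImportError in user code is to check which backend is installed before choosing an import. The sketch below uses only the standard library. The helper names backend_available and pick_dataset_class_name are hypothetical, and the top-level module names torch_geometric and dgl are assumed to be the import names of PyG and DGL; this is not how OGB itself performs the check.

```python
import importlib.util

# Assumed mapping from backend import name to the class named in the note above.
BACKEND_CLASSES = {
    "torch_geometric": "PygGraphPropPredDataset",
    "dgl": "DglGraphPropPredDataset",
}


def backend_available(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pick_dataset_class_name(backends=("torch_geometric", "dgl")) -> str:
    """Return the dataset class name for the first installed backend.

    Falls back to the library-agnostic GraphPropPredDataset.
    """
    for name in backends:
        if name in BACKEND_CLASSES and backend_available(name):
            return BACKEND_CLASSES[name]
    return "GraphPropPredDataset"


class_name = pick_dataset_class_name()
```

The returned name can then guide which of the three imports shown above to use, so that a missing optional dependency does not abort the script.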

Package Rankings
Top 1.55% on Pypi.org
Top 17.0% on Spack.io
Top 16.43% on Conda-forge.org