noisy-quadratic-model

The major contributors to this repository include Roger Grosse and Guodong Zhang.

Introduction

This repository contains toy code to reproduce the noisy quadratic model (NQM) results from the paper "Which Algorithmic Choices Matter at Which Batch Sizes? Insights From a Noisy Quadratic Model".

In particular, this code reproduces our results on momentum (left figure), preconditioning (both figures), exponential moving average (right figure), and learning rate decay. Here are a few figures from our paper.
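For readers new to the NQM, the sketch below may help show what kind of quantity these experiments track. It is not the code in this repository. It assumes a diagonal quadratic loss with curvatures `h`, per-coordinate gradient noise variances `c` that shrink as `1 / batch_size`, and plain SGD with no momentum, preconditioning, or averaging. The function name `expected_loss_sgd` and all of its parameters are hypothetical and chosen only for illustration.

```python
# Minimal NQM sketch (illustrative assumptions, not the repository's code).
# Loss: L(theta) = 0.5 * sum_i h[i] * theta[i]**2
# Stochastic gradient per coordinate: h[i] * theta[i] + noise,
# where the noise has variance c[i] / batch_size (assumed).


def expected_loss_sgd(h, c, lr, batch_size, steps, init_second_moment=1.0):
    """Return the expected loss after `steps` plain SGD updates.

    Tracks m[i] = E[theta[i]**2] exactly, using the recursion
    m[i] <- (1 - lr * h[i])**2 * m[i] + lr**2 * c[i] / batch_size.
    """
    m = [init_second_moment] * len(h)
    for _ in range(steps):
        m = [
            (1.0 - lr * hi) ** 2 * mi + lr ** 2 * ci / batch_size
            for hi, ci, mi in zip(h, c, m)
        ]
    return 0.5 * sum(hi * mi for hi, mi in zip(h, m))


if __name__ == "__main__":
    # Hypothetical small problem: curvatures 1/i, noise variances equal to h.
    dim = 100
    h = [1.0 / i for i in range(1, dim + 1)]
    c = list(h)
    for batch_size in (1, 16, 256):
        loss = expected_loss_sgd(h, c, lr=0.5, batch_size=batch_size, steps=1000)
        print(batch_size, loss)
```

Because the recursion is deterministic, the expected loss can be compared across batch sizes and learning rates without sampling any noise. Under these assumptions, a larger batch size lowers the noise term for a fixed learning rate and step count.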

Citation

To cite this work, please use

@inproceedings{zhang2019algorithmic,
  title={Which algorithmic choices matter at which batch sizes? insights from a noisy quadratic model},
  author={Zhang, Guodong and Li, Lala and Nado, Zachary and Martens, James and Sachdeva, Sushant and Dahl, George E and Shallue, Christopher J and Grosse, Roger},
  booktitle={Advances in Neural Information Processing Systems},
  year={2019}
}
