GIT TRACKING FOR PYTHON NOTEBOOKS
A simple idea: GitNB doesn't actually track python notebooks. Instead, GitNB creates and updates python versions of your notebooks, which are in turn tracked by git.
This quick-start is just an example. It looks long (due to the bash output), but it's quick: 1-2 minutes tops.
A. INITIALIZE GIT REPO
test| $ tree
.
├── A-Notebook.ipynb
├── A_BUGGY_NOTEBOOK.ipynb
├── Py2NB.ipynb
├── another_python_file.py
├── some_python_file.py
└── widget
    ├── I\ have\ spaces\ in\ my\ name.ipynb
    ├── Notebook1.ipynb
    └── widget.py
1 directory, 8 files
test| $ git init
Initialized empty Git repository in /Users/brook/code/jupyter/gitnb/test/.git/
test| $ git add .
test| $ git commit -am "Initial Commit: python files"
[master (root-commit) b29b6c4] ...
B. INITIALIZE GitNB, ADD NOTEBOOKS TO GitNB TO BE TRACKED
# initialize gitnb
test|master $ gitnb init
gitnb: INSTALLED
- nbpy.py files will be created/updated/tracked
- install user config with: $ gitnb configure
# let's list our (untracked) notebooks
test|master $ gitnb list
gitnb[untracked]:
Py2NB.ipynb
A-Notebook.ipynb
widget/I have spaces in my name.ipynb
A_BUGGY_NOTEBOOK.ipynb
widget/Notebook1.ipynb
# adding an individual file
test|master $ gitnb add A_BUGGY_NOTEBOOK.ipynb
gitnb: add (A_BUGGY_NOTEBOOK.ipynb | nbpy/A_BUGGY_NOTEBOOK.nbpy.py)
# adding all the files in a directory
test|master $ gitnb add widget
gitnb: add (widget/I have spaces in my name.ipynb | nbpy/I have spaces in my name.nbpy.py)
gitnb: add (widget/Notebook1.ipynb | nbpy/Notebook1.nbpy.py)
# the default directory for the python versions of the notebooks is nbpy/
test|master $ tree
.
├── A-Notebook.ipynb
├── A_BUGGY_NOTEBOOK.ipynb
├── Py2NB.ipynb
├── another_python_file.py
├── nbpy
│   ├── A_BUGGY_NOTEBOOK.nbpy.py
│   ├── I\ have\ spaces\ in\ my\ name.nbpy.py
│   └── Notebook1.nbpy.py
├── some_python_file.py
└── widget
    ├── I\ have\ spaces\ in\ my\ name.ipynb
    ├── Notebook1.ipynb
    └── widget.py
2 directories, 11 files
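Judging only from the output above, the default destination keeps the notebook's file name, drops its directory, and adds the nbpy identifier. Here is a minimal sketch of that mapping (and of the reverse one used by tonb later on). The function names and parameters are hypothetical, not gitnb's actual API, and the real destination and identifier are configurable.

```python
from pathlib import PurePosixPath


def default_nbpy_path(notebook_path, nbpy_dir="nbpy", identifier="nbpy"):
    """Map widget/Notebook1.ipynb -> nbpy/Notebook1.nbpy.py (hypothetical helper)."""
    stem = PurePosixPath(notebook_path).stem  # file name without directory or .ipynb
    return str(PurePosixPath(nbpy_dir) / f"{stem}.{identifier}.py")


def default_notebook_path(nbpy_path, notebook_dir="nbpy_nb"):
    """Map nbpy/X.nbpy.py -> nbpy_nb/X.nbpy.ipynb (hypothetical helper)."""
    stem = PurePosixPath(nbpy_path).stem  # keeps the ".nbpy" part, drops ".py"
    return str(PurePosixPath(notebook_dir) / f"{stem}.ipynb")


print(default_nbpy_path("widget/Notebook1.ipynb"))
print(default_notebook_path("nbpy/A_BUGGY_NOTEBOOK.nbpy.py"))
```

Note that in this sketch two notebooks with the same file name in different directories would map to the same nbpy.py path, so pass an explicit destination path in that case.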
# our list now contains tracked and untracked notebooks
test|master $ gitnb list
gitnb[tracked]:
widget/Notebook1.ipynb
widget/I have spaces in my name.ipynb
A_BUGGY_NOTEBOOK.ipynb
gitnb[untracked]:
A-Notebook.ipynb
Py2NB.ipynb
# note these files are now in our git repo
test|master $ git status
On branch master
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: nbpy/A_BUGGY_NOTEBOOK.nbpy.py
new file: nbpy/I have spaces in my name.nbpy.py
new file: nbpy/Notebook1.nbpy.py
# git commit the new nbpy.py versions
test|master $ git commit -am "add nbpy.py versions of notebooks"
[master 868b0a2] ...
C. QUICK LOOK AT A "NBPY.PY" VERSION OF A NOTEBOOK
test|master $ cat nbpy/A_BUGGY_NOTEBOOK.nbpy.py
"""[markdown]
## This is a notebook with bugs
"""
"""[code]"""
import numpy as np
""""""
"""[code]"""
def feature(food=True):
    if foo:
        return "I am not a bug"
    else:
        return "I told you I am not a bug"
""""""
"""[code]"""
print("Are you a bug?")
print(feature(True))
""""""
D. UPDATE NBPY.PY FILE AFTER EDITING YOUR NOTEBOOK
That notebook is buggy ...[updating python notebook]... I just went to the python-notebook and fixed the bugs. Let's see what happened:
# note the changes have not appeared in our nbpy.py file
test|master $ git diff
# however, we can see the changes with 'gitnb diff'
test|master $ gitnb diff A_BUGGY_NOTEBOOK.ipynb
gitnb[diff]: A_BUGGY_NOTEBOOK.ipynb[->nbpy.py] - nbpy/A_BUGGY_NOTEBOOK.nbpy.py
---
+++
@@ -1,7 +1,7 @@
 """[markdown]
-## This is a notebook with bugs
+## This is a notebook without bugs
 """
@@ -11,7 +11,7 @@
 """[code]"""
-def feature(food=True):
+def feature(foo=True):
     if foo:
         return "I am not a bug"
     else:
# we now use 'gitnb update' to update the tracked files
# this creates a new nbpy.py version and adds the changes
# to the git repo
test|master $ gitnb update
# now we can see the bug fixes with 'git diff'
test|master $ git diff
diff --git a/nbpy/A_BUGGY_NOTEBOOK.nbpy.py b/nbpy/A_BUGGY_NOTEBOOK.nbpy.py
index e80204b..955b359 100644
--- a/nbpy/A_BUGGY_NOTEBOOK.nbpy.py
+++ b/nbpy/A_BUGGY_NOTEBOOK.nbpy.py
@@ -1,7 +1,7 @@
 """[markdown]
-## This is a notebook with bugs
+## This is a notebook without bugs
 """
@@ -11,7 +11,7 @@ import numpy as np
 """[code]"""
-def feature(food=True):
+def feature(foo=True):
     if foo:
         return "I am not a bug"
     else:
# commit the changes
test|master $ git commit -am "fixed bug: i fixed .ipynb, gitnb fixed .nbpy.py"
[master 812a4f0] ...
E. CREATE PYTHON-NOTEBOOK FROM NBPY.PY FILE
Finally, let's say we actually need that buggy notebook after all:
test|master $ git checkout 868b0a2
Note: checking out '868b0a2'.
[... git detached head messaging ...]
HEAD is now at 868b0a2... add nbpy.py versions of notebooks
# create notebook from nbpy.py file
test|(HEAD detached at 868b0a2) $ gitnb tonb nbpy/A_BUGGY_NOTEBOOK.nbpy.py
# the default directory for generated notebook versions is nbpy_nb/
test|(HEAD detached at 868b0a2) $ tree nbpy_nb
nbpy_nb
└── A_BUGGY_NOTEBOOK.nbpy.ipynb
0 directories, 1 file
My bugs are back!
If the quick-start seemed like too much how about this...
$ gitnb commit -am "I just updated and committed every notebook in my project"
How in the what? Two things are going on here
# ./gitnb.config.yaml
...
GIT_ADD_ON_GITNB_UPDATE: True
AUTO_TRACK_ALL_NOTEBOOKS: True
...
Now each time I run gitnb commit, git commit --allow-empty is called.
Note: the --allow-empty flag is there because, at the time of the commit (before the nbpy.py files are generated), there may or may not be changes to commit.
Here's the super-quick-quick-start-example:
test| $ git init
Initialized empty Git repository in /Users/brook/code/jupyter/gitnb/test/.git/
test| $ gitnb init
gitnb: INSTALLED
- nbpy.py files will be created/updated/tracked
- install user config with: $ gitnb configure
test| $ gitnb configure
gitnb: USER CONFIG FILE ADDED (./gitnb.config.yaml)
... go update gitnb.config.yaml ...
test| $ gitnb commit -am "Initial Commit with everything"
gitnb: add (A-Notebook.ipynb | nbpy/A-Notebook.nbpy.py)
gitnb: add (A_BUGGY_NOTEBOOK.ipynb | nbpy/A_BUGGY_NOTEBOOK.nbpy.py)
gitnb: add (Py2NB.ipynb | nbpy/Py2NB.nbpy.py)
gitnb: add (widget/I have spaces in my name.ipynb | nbpy/I have spaces in my name.nbpy.py)
gitnb: add (widget/Notebook1.ipynb | nbpy/Notebook1.nbpy.py)
[master (root-commit) cb8c106] ...
test|master $ tree
.
├── A-Notebook.ipynb
├── A_BUGGY_NOTEBOOK.ipynb
├── Py2NB.ipynb
├── another_python_file.py
├── gitnb.config.yaml
├── nbpy
│   ├── A-Notebook.nbpy.py
│   ├── A_BUGGY_NOTEBOOK.nbpy.py
│   ├── I\ have\ spaces\ in\ my\ name.nbpy.py
│   ├── Notebook1.nbpy.py
│   └── Py2NB.nbpy.py
├── some_python_file.py
└── widget
    ├── I\ have\ spaces\ in\ my\ name.ipynb
    ├── Notebook1.ipynb
    └── widget.py
2 directories, 14 files
pip install gitnb
git clone https://github.com/brookisme/gitnb.git
cd gitnb
sudo pip install -e .
$ gitnb --help
usage: gitnb [-h] {init,configure,list,update,add,remove,topy,tonb} ...
git commit
Initialize Project:
git init is required before gitnb init
$ gitnb init
Install Config:
$ gitnb configure
Update .gitignore:
Appends (or creates) gitignore with the recommended settings. Namely,
$ gitnb gitignore
List Project Notebooks, or nbpy.py files
positional arg (type):
$ gitnb list --help
usage: gitnb list [-h] [type]
positional arguments:
type notebooks: ( all | tracked | untracked ), or nbpy
Add notebook to gitnb:
Runs git add on the nbpy.py file(s) when the corresponding config option is True
$ gitnb add --help
usage: gitnb add [-h] path [destination_path]
positional arguments:
path path to ipynb file
destination_path if falsey uses default destination path
Remove notebook from gitnb:
$ gitnb remove --help
usage: gitnb remove [-h] path
positional arguments:
path path to ipynb file
Update nbpy files:
$ gitnb update
Update and Commit:
Runs git commit --allow-empty with optional flags [a|m]. The --allow-empty flag is needed because the nbpy files are not yet updated when the commit starts.
#
# this line of code is equivalent to
# - $ gitnb update
# - $ git add .
# - $ git commit --allow-empty -am "COMMIT MESSAGE"
#
$ gitnb commit [-a] [-m "COMMIT MESSAGE"]
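To make the equivalence above concrete, here is a minimal sketch that spells out the three commands as argument lists and (optionally) runs them in order. This is not how gitnb is implemented internally; the function names are hypothetical and it simply shells out to the commands listed in the comment block.

```python
import subprocess


def commit_steps(message=None, all_flag=False):
    """Return the three commands the comment block lists, as argv lists."""
    commit = ["git", "commit", "--allow-empty"]
    if all_flag:
        commit.append("-a")
    if message is not None:
        commit += ["-m", message]
    return [["gitnb", "update"], ["git", "add", "."], commit]


def run_steps(steps):
    """Run each command in order, stopping at the first failure."""
    for argv in steps:
        subprocess.run(argv, check=True)


for step in commit_steps("COMMIT MESSAGE", all_flag=True):
    print(" ".join(step))
```

Calling run_steps(commit_steps("COMMIT MESSAGE", all_flag=True)) inside a project would then perform the update, the add, and the commit in sequence.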
Diff for recent changes.
Creates a diff between the most recent nbpy.py version of the notebook and the nbpy.py version of the notebook in its current state (the working copy).
$ gitnb diff <PATH-TO-NOTEBOOK(.ipynb)-FILE>
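A diff like this can be sketched with Python's standard difflib. This is only an illustration under the assumption that both nbpy.py versions are available as strings; the function name is hypothetical and gitnb's own implementation may differ.

```python
import difflib


def nbpy_diff(tracked_text, current_text):
    """Unified diff between the tracked nbpy.py text and a freshly generated one."""
    return "".join(difflib.unified_diff(
        tracked_text.splitlines(keepends=True),
        current_text.splitlines(keepends=True),
    ))


# the tracked version still has the bug, the working copy has the fix
tracked = 'def feature(food=True):\n    if foo:\n        return "I am not a bug"\n'
current = 'def feature(foo=True):\n    if foo:\n        return "I am not a bug"\n'
print(nbpy_diff(tracked, current), end="")
```

When the two versions are identical the result is an empty string, which matches the idea that there is nothing to update.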
To-Python:
$ gitnb topy --help
usage: gitnb topy [-h] path [destination_path]
positional arguments:
path path to ipynb file
destination_path if falsey uses default destination path
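For a feel of what the conversion involves, here is a minimal sketch that renders a parsed notebook in the nbpy.py layout shown in the quick-start (a """[markdown] ... """ block per markdown cell, a """[code]""" ... """""" block per code cell). It assumes the notebook JSON has already been loaded into a dict; the function name is hypothetical, and details such as the spacing between cells are configurable in gitnb and ignored here.

```python
def notebook_to_nbpy(notebook):
    """Render a parsed .ipynb dict in the nbpy.py layout (sketch, not gitnb's code)."""
    lines = []
    for cell in notebook.get("cells", []):
        source = cell.get("source", "")
        # notebooks store a cell's source as one string or as a list of lines
        if isinstance(source, list):
            source = "".join(source)
        source = source.rstrip("\n")
        if cell.get("cell_type") == "markdown":
            lines += ['"""[markdown]', source, '"""']
        elif cell.get("cell_type") == "code":
            lines += ['"""[code]"""', source, '""""""']
        # other cell types (e.g. raw) are skipped in this sketch
    return "\n".join(lines) + "\n"


example = {"cells": [
    {"cell_type": "markdown", "source": ["## This is a notebook with bugs"]},
    {"cell_type": "code", "source": ["import numpy as np"]},
]}
print(notebook_to_nbpy(example), end="")
```

Cell outputs and metadata are dropped on purpose: only the sources end up in the file git tracks.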
To-Notebook:
$ gitnb tonb --help
usage: gitnb tonb [-h] path [destination_path]
positional arguments:
path path to ipynb file
destination_path if falsey uses default destination path
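Going the other way can be sketched as a small line-by-line parser that turns the nbpy.py markers back into notebook cells. This is an assumption-laden illustration, not gitnb's parser: the function name is hypothetical, and a cell whose own content contains a bare marker line (for example a line that is just """) would confuse it.

```python
import json

MARKDOWN_OPEN, MARKDOWN_CLOSE = '"""[markdown]', '"""'
CODE_OPEN, CODE_CLOSE = '"""[code]"""', '""""""'


def nbpy_to_notebook(text):
    """Parse nbpy.py text into a minimal nbformat-4 style dict (sketch)."""
    cells, kind, buffer = [], None, []
    for line in text.splitlines():
        stripped = line.strip()
        if kind is None:
            # between cells: look for an opening marker, ignore anything else
            if stripped == MARKDOWN_OPEN:
                kind, buffer = "markdown", []
            elif stripped == CODE_OPEN:
                kind, buffer = "code", []
        elif stripped == (MARKDOWN_CLOSE if kind == "markdown" else CODE_CLOSE):
            cell = {"cell_type": kind, "metadata": {}, "source": "\n".join(buffer)}
            if kind == "code":
                cell["execution_count"] = None
                cell["outputs"] = []
            cells.append(cell)
            kind = None
        else:
            buffer.append(line)
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 2}


sample = '"""[markdown]\n## This is a notebook with bugs\n"""\n\n"""[code]"""\nimport numpy as np\n""""""\n'
print(json.dumps(nbpy_to_notebook(sample), indent=1))
```

The regenerated notebook has no outputs, so you would re-run its cells after opening it.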
The configure method installs gitnb.config.yaml in your root directory. This is a copy of the default config. Note that at any time you can go back to the default configuration by simply deleting the user config file (gitnb.config.yaml).
There are comment-docs in the config file that should explain what each configuration controls. However, I thought I'd touch on a couple of the perhaps more interesting configurations here.
see But I'm Lazy!
If True, the add method will perform a git add after creating the nbpy file and adding it to the gitnb tracking list. You can set this to False if you want to explicitly call git add yourself after looking over the file.
If True, the gitnb update method will automatically be called when performing a git commit (during the pre-commit hook).
If True, gitnb add . (see add method) will automatically be called when performing a git commit (during the pre-commit hook). This will add all notebooks in your project to gitnb.
Note that if the only thing that has changed is your notebooks, you'll still need to explicitly call gitnb update or add the --allow-empty flag to your git commit.
A list of directories not to include when searching for notebooks
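As a rough picture of what such an exclusion list does, here is a minimal sketch of a notebook search that skips the listed directories. The function name and the default exclusions are assumptions made for this example, not gitnb's actual defaults.

```python
import os
import tempfile


def find_notebooks(root=".", exclude_dirs=(".git", ".ipynb_checkpoints")):
    """Return .ipynb paths under root (relative to root), skipping exclude_dirs."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # prune in place so os.walk never descends into excluded directories
        dirnames[:] = sorted(d for d in dirnames if d not in exclude_dirs)
        for name in sorted(filenames):
            if name.endswith(".ipynb"):
                found.append(os.path.relpath(os.path.join(dirpath, name), root))
    return found


# tiny demo in a throwaway directory
with tempfile.TemporaryDirectory() as root:
    for rel in ("A-Notebook.ipynb", "widget/Notebook1.ipynb", "nbpy_nb/Old.nbpy.ipynb"):
        path = os.path.join(root, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    print(find_notebooks(root, exclude_dirs=("nbpy_nb",)))
```

Excluding the directory that holds regenerated notebooks, as in the demo, keeps them from being picked up as new notebooks to track.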
You can also configure the default location for new files, whether they include an identifier (like 'nbpy' in somefile.nbpy.py), the spacing in nbpy files, and more. Check the comment-docs for more info.