# Create Jupyter Git

A CLI command that generates a fresh Git repository with the files and configs needed to optimize version control for Jupyter Notebooks.

## Description

A common use of Jupyter Notebooks is for learning, taking notes, and keeping code examples that you can modify and run later. Other common uses include prototyping data analysis or machine learning, which generates a lot of output in the form of images and data. In both of these scenarios, the output changes frequently and is not as important as the notebook's content. The output can easily be regenerated for many use cases.

The output generated by notebooks is a great candidate to be ignored in a `Git` repository, so that commits stay minimal and point to meaningful code rather than data derived from that code.

There are a number of ways to version control your Jupyter Notebooks and ignore the output.

One of the best ([documented here](https://github.com/toobaz/ipynb_output_filter)) is to use a `Git` filter that targets `*.ipynb` files and strips the `outputs` field from the notebook JSON before it gets staged.

This approach requires a few setup steps that you may not want to deal with every time you create a new repo for Jupyter Notebooks, so this CLI command creates and initializes a `Git` repository with the configs already in place. Simply start up your Jupyter Notebooks and commit when you hit a meaningful checkpoint.
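To make the filtering step concrete, here is a minimal sketch of what stripping output from a notebook can look like. It is not the linked script or the code this CLI installs; the function names are hypothetical, and it assumes the nbformat 4 layout, where each code cell carries an `outputs` list and an `execution_count`.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Clear outputs and execution counts on code cells (modifies in place)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clean_notebook_text(text: str) -> str:
    """Take notebook JSON as text and return it with all output removed."""
    notebook = strip_outputs(json.loads(text))
    # Stable key order and indentation keep diffs between commits small.
    return json.dumps(notebook, indent=1, sort_keys=True, ensure_ascii=False) + "\n"
```

A filter script built on this would read the notebook from standard input and write the result of `clean_notebook_text` to standard output, which is how Git exchanges file content with a `clean` filter. The notebook in your working directory keeps its output; only the staged copy is cleaned.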

## Installation

Install the CLI

```bash
pip install create-jupyter-git
```

## Usage

Run the CLI and specify the path where you want your **NEW** Git repository created:

```bash
create-jupyter-git <new repository path>
```

This repository will have a `.gitignore` to ensure checkpoints aren't versioned. The command also creates a `.gitattributes` with a configuration for filtering and adds `.git/config` values that point to the Python scripts that handle the filtering through a Git `clean` filter.
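As a rough illustration of how those three pieces fit together, the sketch below writes a `.gitignore` and a `.gitattributes` and builds the matching `git config` command. The filter name `strip_notebook_output` and the script `strip_output.py` are placeholders, not the names this CLI actually uses, and the command is returned rather than executed.

```python
from __future__ import annotations

from pathlib import Path

FILTER_NAME = "strip_notebook_output"      # hypothetical filter name
FILTER_COMMAND = "python strip_output.py"  # hypothetical filter script


def write_repo_files(repo: Path) -> list[str]:
    """Write the tracked config files and return the `git config` command
    that would register the clean filter (not run here)."""
    repo.mkdir(parents=True, exist_ok=True)
    # Keep Jupyter's autosave checkpoints out of version control.
    (repo / ".gitignore").write_text(".ipynb_checkpoints/\n")
    # Route every notebook through the named filter.
    (repo / ".gitattributes").write_text(f"*.ipynb filter={FILTER_NAME}\n")
    # Git runs this command when a notebook is staged and stores its stdout.
    return ["git", "config", f"filter.{FILTER_NAME}.clean", FILTER_COMMAND]
```

The `.gitattributes` entry only names the filter. The command that runs lives in `.git/config`, which is not versioned, so a fresh clone needs that value set again before the filter takes effect.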

### Start Jupyter

```bash
cd <new repository path>
jupyter lab notebooks
```


### Start Jupyter with `.venv`

This setup is great for pulling in dependencies just for your Notebooks without cluttering your global or user Python packages.

Set up your `.venv` and allow your global or user Jupyter install to be used:

```bash
cd <new repository path>
python3 -m venv .venv --system-site-packages
```

Activate the `.venv`:

```bash
source .venv/bin/activate
```

Add your `.venv` as a Jupyter kernel:

```bash
python -m ipykernel install --user --name=.venv
```

Start JupyterLab:

```bash
jupyter lab notebooks
```

### Commit Your Changes

You can create directories, notebooks, and fill your notebooks with wonderful code and generate beautiful output. When you are at a meaningful spot in your development, simply stage and commit your changes with `git add` and `git commit`. The Git configuration that is in place filters all output out of your notebook files as they are staged.

If you push up to a remote repository like GitHub, you will see that the output fields in your notebooks are empty! Great!

You will also notice GitHub renders a `*.ipynb` file as a formatted preview when you view it, so your code and Markdown cells stay easy to read in the browser. The preview is static and GitHub does not run the notebook, so the stripped output is not shown there. To see the output again, just re-run the notebook locally.

# Development

## Publishing

First bump the version

```bash
bumpversion --current-version x.x.x <major | minor | patch> setup.py create_jupyter_git/__init__.py
```

Next generate the distribution files

```bash
python setup.py sdist bdist_wheel
```

Validate the package

```bash
twine check dist/*
```

Upload the package to TestPyPI for verification

```bash
twine upload --repository-url https://test.pypi.org/legacy/ dist/*
```

Upload the package for publication

```bash
twine upload dist/*
```