This page is the fourth part of an introduction to Git:
- Installation
- Quickstart
- Cheat Sheet
- Gitignore
Git works best when used to version control plaintext files, such as source code and scripts.
Git does not work well with binary files. These are files stored in a format that is not human-readable text, so Git cannot show what has changed inside them; it can only tell that something has changed. Some examples are Word documents (.doc, .docx), spreadsheets (.xls, .xlsx), images (.jpg, .png) and PDFs (.pdf).
Plaintext data files, such as .csv and .json files, may or may not work well in Git, depending on how large they are.
In general, large files and binary files cause a Git repo to increase in size (which slows it down because the Git commands take longer to run) for no benefit (because they are not able to take advantage of Git’s functionality). As a result, you should avoid adding them. A “Gitignore file” can help you do this: it is a file that tells Git to intentionally ignore certain folders, files and file types.
To create a Gitignore file, create a new file and name it “.gitignore” (with a dot at the start and all lowercase) in the top level of your Git repo:
touch .gitignore
Open the file in a plaintext editor (e.g. Notepad, TextEdit, gedit or Sublime Text) and add the files, folders and file types that you want Git to ignore:
- filename.txt will cause all files with that exact name to be ignored
- foldername/ will cause all folders with that exact name to be ignored (including any and all files and sub-folders inside of them)
- *.docx will cause all files that end with that extension to be ignored
Lines that start with a hash are treated as comments:
# Like this
Files and folders that have already been added to a Git repo are not affected by a Gitignore rule that is added after them. In other words, changes made to them will continue to be tracked. This stops being true if the file or folder is moved into a different folder, or renamed, so that its new path also matches a rule in the Gitignore file.
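If a file was committed before its ignore rule was written, one common way to stop tracking it is to remove it from Git's index while leaving the copy on disk. The sketch below works in a throwaway repo; the file name report.docx and the commit identity are hypothetical and chosen only for illustration.

```shell
# Work in a temporary repo so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Track a file first, then add an ignore rule for it afterwards.
echo "draft" > report.docx
git add report.docx
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Add report"
echo "*.docx" > .gitignore

# The file is still tracked, so Git lists it as modified after an edit.
echo "more" >> report.docx
git status --porcelain report.docx

# Remove it from the index only; the file itself stays on disk.
git rm --cached -q report.docx
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Stop tracking report.docx"

# The ignore rule now applies, so further edits are no longer reported.
echo "even more" >> report.docx
git status --porcelain report.docx
```

The second status call should print nothing, because the file is now untracked and matches the *.docx rule.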
Creating a folder called “gitignored” and then adding
gitignored/
to the .gitignore file will cause anything placed inside that folder to be ignored. This can be a simple way to keep on top of what within your repo is being tracked and what isn’t: if it’s in the folder it’s ignored and if it’s not it’s not! This can be useful as an ‘archive’ for storing outputs, logs and intermediate files that are not immediately useful but which may still have value as references.
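The effect of the “gitignored” folder can be checked in a scratch repo. This is a minimal sketch; the file names output.log and script.txt are made up for the example.

```shell
# Work in a temporary repo so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Create the folder and tell Git to ignore it.
mkdir gitignored
echo "gitignored/" > .gitignore

# One file inside the ignored folder, one outside it.
echo "old results" > gitignored/output.log
echo "analysis" > script.txt

# Untracked files are listed here; nothing under gitignored/ should appear.
git status --porcelain

# Ask Git which rule is responsible for ignoring the file.
git check-ignore -v gitignored/output.log
```

The `git check-ignore -v` command reports the Gitignore file, line number and pattern that matched, which helps when a file is ignored (or not ignored) unexpectedly.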
Here’s an example of what you might have in your .gitignore file:
# Anything explicitly ignored
gitignored/
# R local files
*.Rproj.user*
*.Rhistory
*.RDataTmp
*.Rproj
# Files generated by roxygen or package building/test/checking
*.Rd
*.Rcheck*
*.RData
# Notebook files
*.nb.html
*.html
# Archive files
*.tar.gz
*.tar
*.gz
*.zip
*.pkl
# Configuration/initialisation file
*.ini
# Binary files
*.doc
*.docx
*.jpeg
*.jpg
*.ods
*.pdf
*.png
*.ppt
*.pptx
*.tiff
*.xls
*.xlsx
# Python
*.egg*
*.pyc
venv/
__pycache__
# Hidden ipython notebook files
.ipynb_checkpoints
# Pycharm helper files
.idea
# Test result files
*.xml
# Open files
*.~lock.*
# Latex files
*.aux
*.bbl
*.blg
*.cls
*.dtx
*.dvi
*.fdb_latexmk
*.fls
*.glo
*.idx
*.ins
*.log
*.out
*.tex.bak
*.toc
*.nav
*.snm
# macOS hidden files with info about folder display
*.DS_Store
# Mat files
*.mat
# Font files
*.ttf
# Data files
*.csv
*.json
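With a long Gitignore file like the example above, it can be hard to see which rule is responsible for ignoring a given file. The sketch below copies a handful of those rules into a scratch repo and asks Git to explain its decisions; the file and folder names (data/results.csv, analysis.nb.html, main.py and so on) are hypothetical.

```shell
# Work in a temporary repo so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# A few rules copied from the example Gitignore file.
printf '%s\n' \
    '# Notebook files' \
    '*.nb.html' \
    '*.html' \
    '# Python' \
    '__pycache__' \
    '# Data files' \
    '*.csv' > .gitignore

# Some files to test the rules against.
mkdir -p data __pycache__
touch data/results.csv analysis.nb.html __pycache__/module.pyc main.py

# -v prints the source file, line number and pattern for each ignored path.
# Paths that are not ignored (main.py here) produce no output line.
# When several patterns match a path, the last matching one is reported.
git check-ignore -v data/results.csv analysis.nb.html __pycache__/module.pyc main.py
```

Note that a broad rule such as *.csv or *.json applies everywhere in the repo, so it is worth running a check like this on any data or configuration file you do want Git to track.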