Introduction to Git and GitHub

Published

April 22, 2024

1 Git and GitHub

Git and GitHub are different but complementary tools for developing programming projects and related work. Git is a version control system used throughout the software world. GitHub is the most popular online platform for hosting programming projects (i.e., code). However, this definition is simplistic: GitHub offers many other tools that facilitate collaboration on, and publication of, programming-related projects.

  • Git → version control
  • GitHub → online repository

 

2 Git: Version Control

A version control system tracks the history of changes as people and teams collaborate on projects together. As the project evolves, teams can run tests, fix bugs, and contribute new code with the confidence that any version can be retrieved at any time.

Figure: version control (taken from Bryan 2018)

Version control:

  • Keeps track of changes made
  • Facilitates collaborative work
  • Allows knowing who changed what and when
  • Allows reverting changes

Git does not need a permanent connection to a central repository, which allows collaborators to work asynchronously.

 

2.1 Installing Git


2.1.1 Install Git on Linux

Git was originally written to manage the development of the Linux kernel, so it is straightforward to set up on Linux. You can install it with the package manager that comes with your distribution. On Debian-based distributions such as Ubuntu:

  • Git packages are available through apt
  • It’s a good idea to refresh the package index first so that you get the latest available version. Open a terminal and run: sudo apt update
  • To install Git, run: sudo apt install git-all (git-all also pulls in optional extras; the git package alone is enough for this tutorial)
  • Once the command finishes, you can check the installation by typing: git --version

2.1.2 Install Git on Windows

  • Download the latest Git for Windows installer
  • Run the installer and follow the instructions in the Git Setup wizard until installation is complete
  • Open the Windows terminal or Git Bash
  • Type git --version to verify that Git is installed

Note: We recommend downloading Git for Windows from git-scm (https://git-scm.com). This installer provides the latest version of Git and includes Git Bash.


2.1.3 Install Git on macOS

Most versions of macOS have Git installed. However, if you don’t have Git installed, you can install the latest version of Git using one of these methods:

2.1.3.1 Install Git from an installer

  • Run git --version in the terminal. If Git is not installed yet, you will be prompted to install it
  • Once the installer has started, follow the instructions provided until installation is complete

2.1.3.2 Install Git from Homebrew

Homebrew is a package manager for macOS. If you already have Homebrew installed, you can follow these steps to install Git:

  • Open a terminal window and install Git using the following command: brew install git
  • Once the command output is complete, you can verify the installation by typing: git --version
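
Whichever platform you use, a quick way to confirm that the installation worked is to ask the shell whether it can find git. The snippet below is a minimal sketch (the variable name git_status is only illustrative) that should work in Git Bash as well as in macOS and Linux terminals:

```shell
# check whether the git executable is on the PATH
if command -v git >/dev/null 2>&1; then
    git_status="installed ($(git --version))"
else
    git_status="not installed"
fi

echo "Git is $git_status"
```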

 

2.2 Using Git for Version Control

Git is typically used from a terminal (the console where commands are run). On Windows, the Git Bash console is recommended. On Unix-based operating systems (macOS and Linux), Git can be used directly from the default terminal.

 

2.2.1 Steps

  1. Set up your Git identity

Git uses a name and an email address to identify commits with an author (i.e., track which person is making which changes). Your Git identity is not the same as your GitHub account (we’ll talk about GitHub shortly), although both should use the same email address. Typically, you only need to set up your Git identity once per computer.

You can set your Git username like this:

Code
# define username in git
git config --global user.name "Your name"

… and you can set the email like this:

Code
# define email in git
git config --global user.email "your@email.com"
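
To check which identity Git will record, the values can be read back with git config. The sketch below does this in a throwaway repository and omits --global, so the setting applies to that repository only and the computer-wide identity is left untouched. The folder, name, and email are placeholders chosen for illustration:

```shell
# throwaway repository in a temporary folder (illustrative only)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# without --global the identity is stored for this repository only
git config user.name "Example Author"
git config user.email "author@example.com"

# read the values back
git config user.name
git config user.email
```

A repository-level identity like this takes precedence over the global one, which can be handy when one computer is used for both personal and work projects.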

 

  2. Create a local repository

Navigate to the folder where you want to keep the repository (cd) and initialize the local repository (git init):

Code
# example folder
cd /path/to/my/repository

# initialize local repository
git init

git init initializes a new Git repository and begins tracking an existing directory. It adds a hidden subfolder (.git) within that directory that houses the internal data structure needed for version control.
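
As a quick check, the hidden subfolder can be listed right after initialization. The sketch below uses a temporary folder as a stand-in for a project folder (an assumption made for illustration):

```shell
# temporary folder standing in for a project folder (illustrative only)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet

# -a also lists hidden entries, so .git shows up
ls -a

# ask Git where it keeps its internal data; at the top of a repository this prints .git
git rev-parse --git-dir
```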

 

  3. Add a new file

Create a new file. It can be an .R file or a README.md file describing the new repository (in this case, it’s just an example). Now we can use git status to check the current status of the project:

Code
# check status
git status

Note: Git can show the specific changes within a file when it is a text (non-binary) file. For binary files, Git records each version but cannot show what changed inside the file.
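
For example, right after a file is created, git status reports it as untracked. A minimal sketch in a temporary repository (the folder and file names are illustrative), using the compact --short format:

```shell
# temporary repository (illustrative only)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet

# create a new file that Git does not know about yet
echo "# My project" > README.md

# "??" marks files that are not tracked yet
git status --short
```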

 

  4. Commit changes in Git

To start tracking a new file, you must use the following command:

Code
# start tracking a file
git add filename

 

You can also do it for all new and modified files in the current folder like this:

Code
# stage all new and modified files in the current folder
git add .

 

Once you’ve started tracking file(s), you can commit the changes to the project’s version history:

Code
# commit changes
git commit -m "short but explanatory message here"

 

A common workflow is to edit files in your project, stage the files you want to save with git add, and then commit the staged changes with git commit. At any point, git status shows the project’s current status.
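
The edit, add, commit cycle can be sketched end to end as follows. Everything here runs in a temporary repository, and the file name, messages, and author details are placeholders chosen for illustration:

```shell
# temporary repository with a repository-level identity (illustrative only)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

# first round: create a file, start tracking it, commit it
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "add notes file"

# second round: edit the file and check what Git sees
echo "second line" >> notes.txt
git status --short        # reports notes.txt as modified

# stage and commit the edit
git add notes.txt
git commit --quiet -m "extend notes file"

# the history now contains both commits
git log --oneline
```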

  5. Add branches
  • A clean way to make changes is to keep them on a parallel development line known as a “branch”
  • It’s also the safest way to work on different versions of a repository simultaneously
  • By default, a repository has a single branch (named master or main, depending on the Git configuration), which is considered the main version
  • We use branches to experiment and make edits before merging them into the main branch
  • Branches can maintain independent versions of the project that may or may not later be merged into the main branch

 

The following code creates a new branch and activates it:

Code
# add a branch
git checkout -b new_branch

 

Now we can make changes to the project without affecting the main branch. Once you are satisfied with the changes, they can be merged back into the main branch:

Code
# go back to the main branch (named master here; it may be main in your repository)
git checkout master

# merge new branch
git merge new_branch

 

… and delete the alternate branch:

Code
# delete branch
git branch -d new_branch
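
Put together, the life of a branch can be sketched as follows. The example runs in a temporary repository, reads the name of the main branch from Git instead of assuming master or main, and uses placeholder file names and messages:

```shell
# temporary repository with a repository-level identity (illustrative only)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

# name of the default branch (master or main, depending on the Git setup)
main_branch="$(git symbolic-ref --short HEAD)"

# a first commit on the main branch
echo "stable text" > report.txt
git add report.txt
git commit --quiet -m "add report"

# create the branch, switch to it, and commit an experiment there
git checkout --quiet -b new_branch
echo "experimental text" >> report.txt
git commit --quiet -am "try an experimental change"

# back on the main branch the experimental line is absent
git checkout --quiet "$main_branch"
cat report.txt

# merge the experiment, then delete the branch
git merge --quiet new_branch
git branch -d new_branch
```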

 

Additional Commands to Monitor Project Status in Git

 

  • git log: returns a history of commits
  • git reflog: shows a log of recent movements of HEAD, such as commits, branch switches, and resets
  • git diff: shows the specific changes in non-binary files (i.e., text files)
  • git reset: resets the current branch to a previous state
  • Use git --help to see a list of other useful commands for tracking/modifying histories
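
A short session with these commands might look like the sketch below. It runs in a temporary repository with placeholder content. Note that git reset --hard also rewrites the working files, so it should be used with care on real projects:

```shell
# temporary repository with a repository-level identity (illustrative only)
repo_dir="$(mktemp -d)"
cd "$repo_dir"
git init --quiet
git config user.name "Example Author"
git config user.email "author@example.com"

# two commits, to have some history to inspect
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "add notes file"
echo "second line" >> notes.txt
git commit --quiet -am "extend notes file"

# history of commits, one line each
git log --oneline

# what changed between the previous commit and the current one
git diff HEAD~1 HEAD

# move the current branch back one commit
git reset --quiet --hard HEAD~1

# the reflog still records the commit that was just left behind
git reflog
```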

 


2.3 Using GitHub to Host Projects

GitHub is a platform for storing code, managing versions, and facilitating collaboration. It allows users to organize and store copies of their code, and to collaborate remotely on programming-related projects.

2.3.1 Steps

  1. Create an account on GitHub
  • Go to https://github.com/join and click the “Create an account” button
  2. Create a repository
  • In the top right corner, next to your identification icon, click “+” and then select “New repository”
  • Name the repository and write a brief description
  • Select “Initialize this repository with a README file” (optional but recommended)

 

  3. Create a branch
  • Branches on GitHub are equivalent to those we create in Git and have the same function: running parallel versions of a project

 

  4. Merge the secondary branch
  • To merge branches, a pull request must be made
  • The pull request is a way to alert the repository owner that you want to make some changes to their code
  • The request makes it possible to review the code and make sure it looks good before the changes are included in the main branch. Once it is reviewed, the owner can decide whether to accept (“merge”) or reject those changes.

First, a pull request must be opened. Then, after ensuring there are no conflicts, the alternate branch and the main branch can be merged.

 

 

2.4 Git + GitHub

  • Git-managed local repositories can be hosted and/or synchronized with remote repositories hosted on GitHub
  • This combination of tools is particularly useful for remote collaborations and for sharing code with the community

 


 

2.4.1 Steps

  1. Clone the repository locally
  • This step is done using git on your computer
  • git clone creates a local copy of a project that already exists remotely
  • The clone includes all files, history, and branches of the project
  • We can use the course repository as an example: https://github.com/maRce10/curso_reproducible
  • First, copy the https address of the repository (https://github.com/maRce10/curso_reproducible.git)

 


 

  • And then run git clone locally:
Code
# clone remote repository
git clone https://github.com/maRce10/curso_reproducible.git

 

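  2. Set up the remote repository locally
Code
# enter the folder
cd ./curso_reproducible

# git clone has already initialized the repository and registered the
# remote as "origin"; list it to confirm the address
git remote -v

# only for a folder created with git init instead of git clone:
# git remote add origin https://github.com/maRce10/curso_reproducible.git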
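
A clone records the address it was copied from as a remote called origin, which can be checked without any network access. In the sketch below, a local bare repository stands in for the repository on GitHub; this stand-in and the folder names are assumptions made for illustration:

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# a local bare repository stands in for the repository on GitHub (assumption)
git init --quiet --bare remote.git

# clone it; Git may warn that the repository is empty, which is expected here
git clone --quiet remote.git local_copy
cd local_copy

# list the remotes that the clone configured automatically
git remote -v
```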

 

  3. Send changes to GitHub
  • Once you have worked on a project locally, these changes can be synchronized with the repository on GitHub
  • git push updates the remote development line with updates from its local counterpart
  • The change must be tracked and committed to be synchronized
Code
# track new changes
git add .

# commit changes
git commit -m "local change x"

# send the changes to github the first time (-u remembers origin/master for later pushes)
git push -u origin master

# send subsequent changes to github
git push
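
Local commits are sent to a remote with git push. The sketch below rehearses this without a GitHub account: a local bare repository stands in for the remote (an assumption made for illustration), and the branch name is read from Git instead of being assumed to be master or main:

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# a local bare repository stands in for GitHub (assumption)
git init --quiet --bare remote.git
git clone --quiet remote.git local_copy
cd local_copy
git config user.name "Example Author"
git config user.email "author@example.com"

# name of the current branch (master or main, depending on the Git setup)
branch="$(git symbolic-ref --short HEAD)"

# make, track, and commit a local change
echo "local change x" > notes.txt
git add .
git commit --quiet -m "local change x"

# first push: -u links the local branch to its remote counterpart
git push --quiet -u origin "$branch"

# later pushes need no arguments
echo "local change y" >> notes.txt
git commit --quiet -am "local change y"
git push --quiet

# count the commits that the stand-in remote now holds
git --git-dir="$work_dir/remote.git" rev-list --count --all
```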

 

  4. Update local repository
  • If your collaborators have synchronized changes with the repository on GitHub, you can bring those changes into your local copy using git pull
  • git pull updates the local development line with updates from its remote counterpart
Code
# synchronize remote changes the first time
git pull origin master

# synchronize remote changes
git pull
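
Changes published by a collaborator reach a local copy through git pull. The sketch below simulates this on one computer: a local bare repository stands in for GitHub, and two clones play the roles of a collaborator and yourself. All names are assumptions made for illustration:

```shell
work_dir="$(mktemp -d)"
cd "$work_dir"

# a local bare repository stands in for GitHub (assumption)
git init --quiet --bare remote.git

# a collaborator publishes a first commit
git clone --quiet remote.git collaborator
cd collaborator
git config user.name "Example Collaborator"
git config user.email "collaborator@example.com"
branch="$(git symbolic-ref --short HEAD)"
echo "first line" > notes.txt
git add notes.txt
git commit --quiet -m "add notes file"
git push --quiet -u origin "$branch"

# you clone the repository and receive that first commit
cd "$work_dir"
git clone --quiet remote.git mine

# the collaborator publishes another change
cd "$work_dir/collaborator"
echo "second line" >> notes.txt
git commit --quiet -am "extend notes file"
git push --quiet

# pulling brings the new commit into your local copy
cd "$work_dir/mine"
git pull --quiet --ff-only
cat notes.txt
```

The --ff-only option makes the pull succeed only when the local branch can simply be moved forward, which is the case here because no local commits were made in the meantime.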

 

 

2.5 Exercise 1

  • In your GitHub account, create a new repository

  • Create a README file remotely (on GitHub)

  • Clone the repository locally

  • Synchronize the local repository with GitHub

  • Make changes locally (e.g., add a text file) and send these changes to GitHub

 


2.6 References

  • Bryan, J. (2018). Excuse Me, Do You Have a Moment to Talk About Version Control? The American Statistician, 72(1), 20–27. https://doi.org/10.1080/00031305.2017.1399928