Skip to the content. Disable Animations :x: :movie_camera:

Extra: Advanced Git

🔗 Do I need to learn git? No.

Many people make do with never really learning git, but use it nevertheless. How to do this? Keep a curated list of commands with you at all times that you can copy and paste from. This typically includes:

  • git clone <URL>
  • git checkout <branch>
  • git checkout -b <branch> (create new branch based on current)
  • git add <file>
  • git commit -m "<message>"
  • git merge <branch>(merge branch into current)

So! If you’re happy nuking the repository every time something unexpected happens, and manually copying files around, then stick with the above and there’s nothing else you need.

However, if you would like to understand more about Git to see all the ways it can help you with different projects and scenarios, then read on. Everything that follows is what I find useful in my day to day work.

🔗 Why Git

I’m going to tell you what git does by telling you why people needed something like it, and how it addresses those needs. Then I’ll go over the fundamental git commands, which make all other git commands make sense.

Needs:

  • Version control and team coding, obviously
  • Multiple off-site backups, private cloud repos, or public-facing repos
  • Storing user-specific settings, like username, preferred editor, etc.
  • One folder: All information should move with the repo folder (removing git should involve deleting one thing)
  • Allow offline work, with periodic sharing to a remote repository that can be shared

The Solution:

  • Git stores all branch and user information in a hidden .git/ directory
    • git init creates the .git/ in current directory, making it a local repository
    • rm -rf .git destroys the .git/ in current directory, making it a normal folder again
  • Git allows sharing/publishing your local repository changes to any other repository. We call those other repositories remotes. For instance, you might want to share/publish code to an internal company shared repo, an off-site backup, or public-facing release server. Each local repository on your computer can be told about the existence of other repositories if you know their addresses. To easily work with those remotes you should name them when adding a remote to your repository. Here’s how to work with remotes.
    • git remote add our-internal git@aws.company.com:team/project.git
    • git remote add backup-server git@bk2.company.com:team/project.git
    • git remote add public git@company.com:team/project.git
    • git remote add personal git@github.com:jory/project.git (IP concerns for a company!)
    • git remote remove personal
    • git remote add nonSSHpersonal https://www.github.com/jory/project.git (IP concerns for a company!)
    • git remote -v (list all available - in this case, the above 4)
  • User-specific settings stored in .git/config set using:
    • git config --local user.name "Me"
    • git config --local user.email "me@uni.edu"
    • git config --global core.editor "vim" (global settings are in file ${HOME}/.gitconfig)
    • Note that remote URLs can use either https (https://...) or ssh (git@server...), which affects how you can push changes to the remote. https is fairly common, because we often have read-only access to most repositories, and if you try to push, git will prompt you for your login information for that repo (such as github username and password). SSH is nice to work with when you’ve set up ssh authentication for your repository, so git will automatically check your local ssh files in your home directory, and use that to authenticate for the remote, meaning you never have to manually log in when pushing changes. In github, you can set up ssh keys in your profile/settings to facilitate this.
  • Git allows offline work, but that means having to cache the state of the remote server(s):
    • Git stores a read-only “understanding” of what is on the remote server, so you can build on that code even if you’re not online.
      • git branch -a to list all branches, and you’ll see these read-only branches like origin/master, often colored red
    • remotes are just communally accessible git repos (like one you created with git init) If your computer is on a network and people could access your computer, then your git projects could be remotes for other people! ex:git remote add jane-does-repo git@192.168.1.32:temp/project.git
    • your local repository can be synchronized with any remote, sharing/publishing your new changes. To do this, you need to ensure you have an up-to-date cache of the remote server, so git knows how to propose your changes to the server.
    • git fetch backup-server (synchronize your read-only copy of the remote server)
    • git push backup-server my-feature-branch

🔗 Fundamental git commands

🔗 Getting a remote repository: fetch vs clone

It is often confusing when first learning git, what is the difference between fetch and clone? clone is a convenience git command, that automates several basic git commands. Here is the basic way to accomplish downloading a new remote repository, followed by the clone convenient way:

  • mkdir myrepo
  • cd myrepo
  • git init (make myrepo a [empty] git repository)
  • git remote add <invent_a_name> <repo_address> (usually, origin is the name used)
  • git fetch <invent_a_name> (synchronize-to-local / get latest state of remote) (I use this often)
  • git checkout master (switch working tree to the master branch)

OR

  • git clone <repo_address> myrepo (same as above 6 commands!) (myrepo is optional new name)

🔗 Pros & Cons: fetch vs clone

You should probably use the clone method most of the time, though be aware it assumes origin as remote name, and downloads all branches by default.

Use the fetch method if you don’t like those default assumptions and want to change any of it.

🔗 Updating your branch with remote changes: fetch-&-merge vs pull

There is also a convenience command for updating your branch to the latest from the remote server. pull is the convenience command, but it is an automation of 2 basic commands: fetch and merge.

IF

you are on the development branch, for instance:

  • git checkout development (switch to the development branch)

THEN

  • git fetch origin (synchronize-to-local / get latest state of remote; assuming origin is the remote name)
  • git merge origin/development (merge in all the changes from your read-only local copy of origin’s development branch, and automatically try to resolve conflicts)

OR

  • git pull (same as fetching then merging)

🔗 Pros & Cons: fetch-&-merge vs pull

You should use the pull method most of the time, but it assumes you want all remote changes applied.

Use the fetch-&-merge method when you do not automatically want to merge in all changes, such as if you want to tailor the merge command, or use a different command (cherry-pick) to get some commits but not all from the branch in origin.

🔗 What is HEAD? What is a branch?

Kind of like a linked list, git needs to know where the current location is in your string of commits, onto which it will add the next commit, or determine if two branches are different. HEAD is like a pointer, and every branch has a HEAD. You can think of it like a write head in analog hard drive speak: it’s where the changes will happen next when you make changes. It usually points at the most recent commit, but in some cases you can force it back to an earlier commit, as a way to “undo” some later commits.

Similarly, branches are just pointers to commits, but can have extra data associated with them, such as a HEAD and tracking information.

Create a branch:

  • git checkout -b <new_name> creates a new branch set up to track current branch as parent
  • git branch <new_name> creates a new branch

Tracking information lets git automatically perform checks and enable useful commands for merging. For instance, if a new feature branch based on develop is not set up to track develop, then git pull will not work, and git will also not tell you when you are ahead with commits, behind, or have diverging commits from its parent branch.

You don’t always need to have tracking information, such as using a new branch (remember, pointer) to keep track of a certain commit for a moment while you change HEAD. But tracking information is good when you are developing code in a branch.

You can add branch tracking information after the fact with git branch -u <remote_server_branch_to_track>, but usually it’s done automatically if you’ve created a new branch using the checkout -b method.

🔗 Fixing a bad commit or bad uncommitted changes

Mistakes happen! Here’s a list of scenarios and how to use git to fix them while preserving all your careful work to record changes over time.

If you just forgot to add some files to the most recent commit (D), or wanted a different commit message, then you can:

commit history BEFORE
--A--B--C--D(HEAD)
commit history AFTER
--A--B--C--D(HEAD)
  • git add <forgotten files>
  • git commit --amend -m "new message"

If you wanted to make different changes to the files, you can roll back HEAD, but keep the file changes of later commits:

commit history BEFORE
--A--B--C--D(HEAD)
commit history AFTER
          _E(HEAD)
         /
--A--B--C--D
  • git reset <commit_id> (get the commit_id from git log)
  • git reset HEAD^1 (shortcut for resetting to 1 commit previous to current HEAD location)
  • (make more changes)
  • git add <fixed_changes>
  • git commit -m "<new_message>" This leaves all changes after the new HEAD location (here, D) as uncommitted changes in your working directory, so you don’t lose them, and can recommit them if you want to.

If you want to ignore a whole string of commits, and all their changes, you can roll back and wipe the history by using the above process, but adding the --hard flag to the reset command:

commit history BEFORE
--A--B--C--D(HEAD)
commit history AFTER
       _E(HEAD)
      /
--A--B--C--D
  • git reset --hard <commit_id> And it will be like those commits never existed and you lose the changes in those commits (here, C,D). (look up git reflog if you really want them back)

Stage and Unstage

  • git add <file> stages for commit
  • git reset <file> unstages from staged

If you don’t care for changes you’ve made in a file, and want to reset it back to last known commit:

  • git checkout -- <file>

Temporarily test some earlier version of the code, then reset it back to the commit where you were:

commit history BEFORE
--A--B--C--D(HEAD)
commit history DURING testing
--A--B(HEAD)--C--D(bookmark)
commit history AFTER
--A--B--C--D(HEAD)
  • git branch bookmark I named my new branch bookmark (it’s just a pointer to a commit)
  • git reset --hard <commit_id> Go to some earlier commit
  • (build and test, etc.)
  • git reset --hard bookmark (get back to the commit you were at)
  • git branch --delete bookmark (remove the bookmark)

🔗 Other Commands

Here are some other commands or sequences I use quite often.

  • git fetch followed by git status will tell me if my current branch has diverged from the remote, or is simply behind the remote
  • git branch -a lists all branches, including the read-only local copies of the remotes (usually listed in red) (-a: all)
  • git remote -v lists all remotes, names and addresses (-v: verbose) Why not -a? ¯\(ツ)/¯ that’s git for you!
  • git push <remote_name> <branch> push local commits to the remote; doesn’t have to be current branch.
  • git push <remote_name> HEAD shortcut for pushing changes of current branch
  • git checkout - go back to the previous branch (switch between most recent branches)
  • git log --oneline -8 see the most recent 8 commits, all 1-line each (easier to read!)
  • git log --graph --oneline see an ascii graph of merge history
  • git diff or git diff --staged tells you about changes in the working tree, or changes that are staged. Useful to confirm you are adding or committing what you think you are.
  • git rebase -i <earlier_commit> lets me ‘squash’ several commits together and give the new commit a new message. Useful for cleaning up a bunch of commits that eventually led to working code. Do NOT do this on a branch other people are working with because you will need to force-rewrite remote history if the branch already exists on the remote. Uses your text editor — look up git rebase -i for more info.