Getting Git

The following is an in-progress guide to Git that I'm working on for my work, so there's some small stuff pertaining to our particular work environment in this guide. Maybe some day I'll write a "How Not to Use Git" guide, with the flawed process we've been using written out as satirical instructions.


An Introduction to Git

Git is a version-control system (VCS)—sometimes called a source-control manager (SCM).

A crucial part in understanding Git is understanding the context that resulted in its creation.

1969–1982

Unix-based operating systems saw fairly widespread use in academic institutions, and computer nerds of the day liked Unix because it was modular, self-contained, and aimed at programmers. Trouble was, Unix was proprietary, owned by AT&T, and licenses to use it had restrictions (which could make it a gray area to change the OS to support whatever program you were working on).

Enter Richard Matthew Stallman

In 1983, in a Usenet post, Richard Matthew Stallman kicked off the Free Software Movement by announcing his project: a completely free, open-source operating system. Because he is a computer nerd, he called it 'GNU', which stands for "GNU's Not Unix" (nerdy recursion joke). The project was to basically replace Unix but have the same "feel."

1991

GNU was still not complete. Many of the utility programs used in day-to-day operation were implemented, but the kernel (which manages access to the CPU, memory, devices, and other hardware, and handles system calls) was the remaining thing to be finished.

Enter Linus Torvalds: a Finnish guy who wrote his own free, open-source kernel that used the GNU utils, and named it Linux.

2005

Linux has become very, very prevalent. Used everywhere.
See this archived Wikipedia article to see the state of Linux adoption in 2005. (Linux is currently the largest software development project on the planet. It is the basis for Android, ChromeOS, and far too many other things to list.)

A large chunk of the kernel is drivers for various hardware that only bloats as time goes on. Another chunk is a lot of networking stuff. Loads and loads of people contributing to the repository and making pull requests daily.

Probably not that much fun to maintain if you're Linus, right?

Right.

To simplify events, Linus was annoyed with the version control tool he was using, so he wrote a new one in a weekend.

He made sure his new tool:

  • had users working on local copies of the repository
  • had support for distributed merging
  • could allow for easy renaming of files
  • was fast
  • was open source

Linus named this new tool 'Git'—named after himself, since 'Git' means 'an unpleasant person'. You may be entertained to glance over the original Git readme from the first ever Git commit on GitHub. Git has seen widespread use partly due to GitHub.

Basic Git

Here's a couple of interactive tutorials that should familiarize you with Git.

Our Version Control Process

Git is flexible and can be used in many ways, some more incorrect than others. Much of the remainder of this guide outlines the version control process we will use.

Features

We need to think in terms of 'features' instead of 'tasks'. We don't want to integrate changes if the feature is not complete. This means if server-side code is finished, but styles are not complete, the changes for the server-side code should not be merged into the integration branch until the client-side styles and scripts are complete.

The convention in Git—in fact, the default—is to name the primary, working branch "master". We will adhere to this convention.

Don't Be Afraid of Merging!

If you aren't familiar with Git but are familiar with other SCMs, merging may have been quite painful for you in the past.

Git was designed for easy merging, and because of its light underlying cryptography, this is possible. We will use branching in Git with branches for features and these branches will get merged together. That is, we will use Git as intended.

It should be noted that it can be somewhat difficult to merge two branches that have diverged due to lack of proper maintenance. It is also difficult if the branches have had changes brought in from one-another in a way that circumvents the underlying cryptography as it is usually used, losing the link between data that Git uses to make merging easy.

Key Workflow Points

  • Features get their own branches (even for simple text changes, because branches are simple!). The benefits of having a branch for each feature are:
      • we will not "lose" commits anymore
      • we will not have to constantly resolve merge conflicts when releasing features that have multiple commits associated with them
      • we will not have the problem of commits being applied in the "wrong order"
  • There is one beta branch used for integration, code reviews, and testing
  • "Merge resolutions" between two features with conflicting changes get their own branches. This will afford us the benefit of being able to choose which features get released. If, for example, there are two features (feature A and feature B) in development, there are four possible release-related situations that can occur:
      • feature A and feature B are both being released
      • only feature A is being released
      • only feature B is being released
      • neither feature is being released
    By having the 'resolution branch' instead of fixing the merge conflicts between the two features on the beta branch directly, and then having to later track down the merge in the beta branch (and filter through other merges and beta-branch-only changes), we can simply select the correct branch for our release scenario.
  • Releases get their own branches

General Workflow

At the start of an iteration/sprint, the master branch, the beta (integration) branch, and the most recent prod (release) branch all have the exact same changes. All feature branches "branch-off" from the master branch.

  • Feature branches get updated by the developers of the feature at the discretion of the developers.
  • The beta branch will get updated as features are completed.
  • The master branch will get updated during the next release, and a new prod branch will be created from the master branch and it will be "frozen" where no additional changes will be made to that particular release branch (except for hotfixes, which will be explained in another section). This process continues indefinitely, or until the sun explodes.
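The starting state described above can be sketched in a throwaway repository. This is only an illustration under assumptions: the file name, the identity settings, and the version number are made up, and a real setup would have these branches on origin rather than created locally.

```shell
set -eu
# Throwaway repository; every name below is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # force the branch name regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial release"

# At the start of a sprint, master, beta, and the latest prod branch
# all point at the same commit.
git branch beta
git branch prod/v1.3.0

master_tip=$(git rev-parse master)
beta_tip=$(git rev-parse beta)
prod_tip=$(git rev-parse prod/v1.3.0)
echo "master=$master_tip beta=$beta_tip prod=$prod_tip"

# A feature branch created now "branches off" from that shared commit.
git checkout -q -b dev/f00654321 master
feature_base=$(git merge-base dev/f00654321 master)
echo "feature branch starts at $feature_base"
```

Comparing branch tips with git rev-parse is a quick way to confirm that nothing has drifted before new feature branches are cut.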

Feature Branches

A 'feature' ('work request') may be broken up into multiple 'tasks', where some tasks are dependent upon others.

The following steps are applicable for tasks that are not dependent upon other tasks. For tasks that are dependent on other tasks, refer to the "Dependent Task Workflow" section.

At the start of an iteration, you will check out the master branch and ensure that you have the latest changes. This can be achieved via your favorite Git-friendly command prompt with:

git checkout master
git pull

You will then create a new branch for your feature.
Remember, you must create the new branch from the master branch.
To check your current branch, you can run:

git status

You should see something like:

On branch master
nothing to commit, working directory clean

After you are sure you are on the master branch, you are ready to create your new feature branch. Let's say the work request number is 654321. This can be done as follows:

git checkout -b dev/f00654321

Now you can do as many commits as your heart desires! However, please refrain from committing anything that doesn't compile.

To commit all your changes (use discretion):

git add .
git commit -m " #654321 Enter short description here"

NOTE: If you are putting the task number at the start of your commit message starting with a '#' so that TFS will track the commit for you, put an extra space in before the '#' since '#' is also the default comment character when rebasing interactively. If you forget the space, your commit message could potentially disappear when the changes are moved back into the master branch.

It is also recommended that you run the following command after your first commit on the new feature branch:

git push -u origin dev/f00654321

This will push the changes in your branch up to origin (TFS in our case). That way if your computer explodes or is stolen, your changes aren't lost. Pushing the changes up to origin is fine because your changes are on your branch only and don't step on anyone else's tasks. Having the changes on origin also means that multiple developers can easily collaborate on a single feature branch.

Once the feature branch is tracking its counterpart on origin, you can push any later local commits by simply doing:

git push
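The whole feature-branch sequence can be tried end to end without touching TFS. The sketch below assumes a local bare repository standing in for origin; the directory layout, identity settings, and commit messages are invented for the example.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/origin.git"      # local stand-in for the real origin (assumption)
git init -q "$work/clone"
cd "$work/clone"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/origin.git"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git push -q -u origin master

# Start the feature from an up-to-date master.
git checkout -q master
git pull -q
git checkout -q -b dev/f00654321

echo "feature work" >> app.txt
git add .
git commit -q -m " #654321 Add feature work"

# The first push sets the upstream; later pushes need no arguments.
git push -q -u origin dev/f00654321
echo "more work" >> app.txt
git commit -q -a -m " #654321 More feature work"
git push -q

local_tip=$(git rev-parse dev/f00654321)
remote_tip=$(git rev-parse origin/dev/f00654321)
upstream=$(git rev-parse --abbrev-ref 'dev/f00654321@{upstream}')
echo "tracking $upstream"
```

After the second push, the local branch and its remote-tracking branch point at the same commit, which is the state you want before walking away from your machine.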

Integrating Changes

Once you are finished with your feature and you're sure your code will work perfectly and that you've covered every edge case, it may be time to merge the changes into the beta branch.

NOTE: If your task still requires front-end CSS/styles, don't merge the changes into beta yet. The front-end work is a dependent task and the feature is not complete. In such cases, make a note in the testing dialog that your task is not to be tested until the dependent task(s) are complete. Send your code for review and notify the developers of the dependent tasks that they may begin. Only once a feature is fully implemented should the following commands be entered.

git checkout beta
git pull
git merge --squash dev/f00654321
git commit
git push

If you see output pertaining to merge conflicts, follow the steps in the "Handling Merge Conflicts" subsection.
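To see what the squash merge leaves behind on beta, here is a small sketch in a scratch repository. The file names and messages are assumptions, and the commit uses -m only so the script does not open an editor.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch beta

# A feature made of two commits: server-side work, then styles.
git checkout -q -b dev/f00654321
echo "server-side change" > server.txt
git add server.txt
git commit -q -m " #654321 Server-side work"
echo "styles" > styles.txt
git add styles.txt
git commit -q -m " #654321 Client-side styles"

# Squash merge: all feature changes are staged on beta, then committed once.
git checkout -q beta
git merge -q --squash dev/f00654321
git commit -q -m " #654321 Feature 654321 (squashed)"

beta_new_commits=$(git rev-list --count master..beta)
# rev-list --parents prints the commit followed by its parents.
parent_count=$(( $(git rev-list --parents -n 1 beta | wc -w) - 1 ))
echo "beta gained $beta_new_commits commit(s) with $parent_count parent(s)"
```

The feature's two commits arrive on beta as one ordinary commit with a single parent, which is what later makes it possible to back a whole feature out of beta with one revert.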

Once the merge on beta is pushed up to origin, update all the affected files on the beta environment.

At least one task should be sent for review at this point:

  • If your task had no dependent tasks, send the task for review.
  • If your task had dependent tasks, your task should've already been sent to review at this point, but the dependent task(s) should be sent for review at the current juncture.

Handling Merge Conflicts

It's possible you could experience a merge conflict when attempting to merge your feature branch into the beta branch. This is because you changed a line in your feature branch that someone else changed in their feature branch, and they have already merged their changes into the beta branch.

If you see something like:

Auto-merging change
CONFLICT (add/add): Merge conflict in change
Squash commit -- not updating HEAD
Automatic merge failed; fix conflicts and then commit the result.

then there is a merge conflict.

Step 1: Finding the Conflicting Files

In the case of a merge conflict, you'll want to determine what feature your changes are conflicting with. To do that, we'll find the conflicting files and refer to the history.

You might more-easily be able to determine the conflicting files from Visual Studio instead of issuing the following commands.

The output when initially attempting the merge will give you a hint at what files are conflicting, but for another way to easily see which files conflict, enter:

git status

If quite a few files are conflicting, the output generated when attempting the merge or when entering git status might become a little too cluttered. To list only the conflicting files, enter (or paste):

git diff --name-only --diff-filter=U

(It may be handy to copy and paste this outputted list into notepad.)

If the only conflicting files are generated/minified files, then we will permit a manual merge resolution on the beta branch and you can exit the rest of the "Handling Merge Conflicts" process. Essentially, in the case of generated/minified files, you just take one version or the other of the conflicting file to make Git shut up and regenerate the file before committing. You'd then continue whatever process you were going through before encountering the 'bogus' merge conflict.

(My preference is to put all generated file types in the .gitignore file and just be rid of this pain-point.)

Step 2: Canceling the Merge

Why cancel the merge on beta if there is a genuine merge conflict?

It's important that we don't accept changes that create conflicts and require cascading merge commits directly on the beta branch so that, when time comes to release only some tasks, we can pick the tasks needed and merge them with absolutely no effort at all.

Now that we know all the files that are conflicting, we can go ahead and abort the merge. This can be done by entering the following:

git reset --hard HEAD
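Steps 1 and 2 can be rehearsed safely in a scratch repository. The sketch below manufactures a conflict on beta, lists the conflicting files, and then cancels the merge; the file name, its contents, and the second WR number are made up for the example.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "color=red" > config.txt
git add config.txt
git commit -q -m "Initial commit"
git branch beta

# Two features change the same line.
git checkout -q -b dev/f00765432 master
echo "color=blue" > config.txt
git commit -q -a -m " #765432 Make it blue"
git checkout -q -b dev/f00654321 master
echo "color=green" > config.txt
git commit -q -a -m " #654321 Make it green"

# The other feature reaches beta first.
git checkout -q beta
git merge -q --squash dev/f00765432
git commit -q -m " #765432 Make it blue (squashed)"

# This merge is expected to fail, so keep `set -e` from stopping the script.
git merge --squash dev/f00654321 >/dev/null 2>&1 || true

# Step 1: list only the conflicting files.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicting: $conflicted"

# Step 2: cancel the merge and return beta to its last commit.
git reset -q --hard HEAD
status_after=$(git status --porcelain)
content_after=$(cat config.txt)
```

After the reset, beta is back at the other feature's squash commit with a clean working directory, as if the merge had never been attempted.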

Step 3: Finding the Conflicting Author

You might more-easily be able to determine the conflicting author from Visual Studio instead of issuing the following commands.

Now we'll see what feature(s) introduced conflicting changes and figure out who we need to work with:

git log CONFLICTING_FILE_PATH_HERE

Replace 'CONFLICTING_FILE_PATH_HERE' with the path to your conflicting file. This will show the history of the given file.

NOTE: Git plays nicer with Linux/Unix-like paths with forward slashes instead of back slashes, so be wary of tabbing causing issues with paths? TODO: Update this if it isn't an issue (not on a Windows machine at the time of writing . . .)

A more advanced command is:

git log master..beta

Which will list all the commits that beta has that master doesn't.
(There may be some noise with revert commits, but those can be ignored since they "cancel out" commits that were 'undone'/'removed' from beta.)

An even more advanced command is:

git log master..beta -- CONFLICTING_FILE_PATH_HERE

Which will list all the commits affecting the given file that beta has that master doesn't. This is probably the command you'd want to use.

One of these commands should show you the commit messages and help you figure out who you'll need to Slack/work-with to create a resolution branch.
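Here is a small sketch of how the path-limited range narrows things down. The two commits made directly on beta stand in for squash merges of two features, and the file names and WR numbers are invented.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Two commits that exist only on beta, each touching a different file.
git checkout -q -b beta
echo "a changed" > a.txt
git commit -q -a -m "Feature 765432 (squashed)"
echo "b changed" > b.txt
git commit -q -a -m "Feature 876543 (squashed)"

# Everything beta has that master doesn't.
beta_only_count=$(git rev-list --count master..beta)

# Only the beta-only commits that touched a.txt.
touching_a=$(git log --format=%s master..beta -- a.txt)
echo "$beta_only_count commits on beta only; a.txt was touched by: $touching_a"
```

With the path filter, the commit that only touched b.txt drops out of the list, leaving just the feature you need to ask about.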

Step 4: Creating the Resolution Branch

Communicate with the developer(s) of the conflicting feature and determine which of the two features would take the least amount of effort to modify in order to be compatible with the other.

Once the easier-to-alter feature and the harder-to-alter feature have been identified, check out the branch of the harder-to-alter feature.

git fetch
git checkout dev/f00harder-to-alter

(NOTE: instead of 'harder-to-alter' and 'easier-to-alter', use the actual WR numbers.)

Create a new resolution branch from the harder-to-alter branch.

git checkout -b dev/f00harder-to-alter-f00easier-to-alter

For example, creating a resolution branch between a WR 654321 and a WR 765432 would look like:

git checkout -b dev/f00654321-f00765432

Then begin the merge of the changes from the easier-to-alter branch (which, we know will conflict):

git merge --squash dev/f00easier-to-alter 

Resolve the conflicts with your favorite merge tool or manually.

Once the conflicts are resolved and committed, we'll push the resolution branch to origin:

git push -u origin dev/f00harder-to-alter-f00easier-to-alter

Make sure the developers of both features look at the result of the final merge and ensure that no behaviors were altered or broken.

Update the implementation dialog on both tasks in TFS to say something similar to the following:

If this task is being released with task [CONFLICTING_TASK], merge this branch instead of the two task branches: [RESOLUTION_BRANCH]
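As a rehearsal of Step 4, the sketch below builds a resolution branch between two conflicting features in a scratch repository. The file, its contents, and the "agreed" resolution are all made up, and the push to origin is left out because the scratch repository has no remote.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "color=red" > config.txt
git add config.txt
git commit -q -m "Initial commit"

# Harder-to-alter feature (654321) and easier-to-alter feature (765432).
git checkout -q -b dev/f00654321 master
echo "color=blue" > config.txt
git commit -q -a -m "Feature 654321"
git checkout -q -b dev/f00765432 master
echo "color=green" > config.txt
git commit -q -a -m "Feature 765432"

# The resolution branch starts from the harder-to-alter branch.
git checkout -q dev/f00654321
git checkout -q -b dev/f00654321-f00765432

# Known to conflict, so don't let `set -e` stop the script here.
git merge --squash dev/f00765432 >/dev/null 2>&1 || true

# Resolve by hand (hypothetical outcome agreed on by both developers).
echo "color=blue-green" > config.txt
git add config.txt
git commit -q -m "Resolve 654321 with 765432"

remaining=$(git diff --name-only --diff-filter=U)
resolved=$(cat config.txt)
resolution_base=$(git merge-base dev/f00654321-f00765432 dev/f00654321)
harder_tip=$(git rev-parse dev/f00654321)
```

The resolution branch sits directly on top of the harder-to-alter feature, so it carries both features and can be merged in place of the two individual branches.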

Step 5: 'Removing' the First of the Two Conflicting Features from beta

Once you and the developer that worked on the other branch with conflicting changes have created a resolution branch, one of you will need to remove the original feature's changes that were merged into beta. This is so that we can merge the new resolution branch instead and get both of the originally-conflicting features onto the beta branch without conflicts.

Since we "squash merge" feature tasks onto the beta branch to begin with, all we have to do is revert the "squash merge" commit for that feature.

Let's say the commit-hash of the first of the conflicting features on beta that we're going to remove was deadbeefcafe. We enter:

git revert deadbeefcafe

Then, we squash merge the resolution branch:

git merge --squash dev/f00harder-to-alter-f00easier-to-alter
git commit

Now we update the beta environment with the affected files and continue whatever process we were working on prior (you would probably send the task to review at this point).
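The swap on beta can be rehearsed from start to finish in a scratch repository. This is a sketch under assumptions: the file, its contents, and the resolution are made up, and the commits use -m or --no-edit only so that no editor opens.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "color=red" > config.txt
git add config.txt
git commit -q -m "Initial commit"
git branch beta

# Two features that touch the same line.
git checkout -q -b dev/f00654321 master
echo "color=blue" > config.txt
git commit -q -a -m "Feature 654321"
git checkout -q -b dev/f00765432 master
echo "color=green" > config.txt
git commit -q -a -m "Feature 765432"

# Feature 654321 reached beta first, as a squash merge.
git checkout -q beta
git merge -q --squash dev/f00654321
git commit -q -m "Feature 654321 (squashed)"
squash_commit=$(git rev-parse HEAD)

# Resolution branch built as in Step 4 (the resolution itself is invented).
git checkout -q -b dev/f00654321-f00765432 dev/f00654321
git merge --squash dev/f00765432 >/dev/null 2>&1 || true
echo "color=blue-green" > config.txt
git add config.txt
git commit -q -m "Resolve 654321 with 765432"

# Step 5: back the single feature out of beta, then bring in the resolution.
git checkout -q beta
git revert --no-edit "$squash_commit" >/dev/null
git merge -q --squash dev/f00654321-f00765432
git commit -q -m "Features 654321 and 765432 (resolved, squashed)"

beta_content=$(cat config.txt)
beta_status=$(git status --porcelain)
```

Because the revert puts the file back to what master has, the resolution branch then merges without a conflict and beta ends up with the agreed result.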

Dependent Task Workflow

The dependent task workflow is fairly simple, from the perspective of someone working a dependent task. Let's say you are such a person.

You have some changes to do on top of someone else's changes. That person's changes have not yet completed the feature.

Simply check out the branch they were working on:

git fetch
git checkout dev/f00654321

Commit your changes to that branch.

If your changes complete the feature, jump to the "Integrating Changes" subsection of the "General Workflow" section.

Related tasks should be identified at the beginning of an iteration before any code has been written in order to come up with a release strategy that calls for this workflow, since the related tasks can effectively become dependent on code written/changed in the other task(s) and create a situation where it becomes difficult to release one task without releasing the other.

A potential pattern if you must use code written in another unreleased feature:

git fetch
git checkout dev/f00654321
git checkout -b dev/f00876543

If a related task must be released without the initial task where some code was written also being released, we could potentially cherry-pick the code the related task uses, and then cherry-pick/rebase the related task changes, but this could complicate the release of the branch where the code was cherry-picked from.

Releasing Features

Check the corresponding task descriptions for notes about resolution branches. If there are resolution branches between two features, and both features are being released, use that branch instead of each feature branch individually.

For simplicity, let's say you were releasing a feature1, a feature2, and a feature3:

git checkout master
git merge feature1 feature2 feature3
git push
git checkout -b prod/v1.3.0
git push -u origin prod/v1.3.0

And you're done!
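The release sequence can be tried locally first. In the sketch below the three features each add their own file so that they merge cleanly; the names are placeholders, --no-edit only keeps the merge from opening an editor, and the pushes are omitted because the scratch repository has no origin.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Three independent features, each branched from master.
for n in 1 2 3; do
  git checkout -q -b "feature$n" master
  echo "feature $n" > "feature$n.txt"
  git add "feature$n.txt"
  git commit -q -m "Add feature $n"
done

# Merge all three into master in one go, then cut the release branch.
git checkout -q master
git merge -q --no-edit feature1 feature2 feature3
git checkout -q -b prod/v1.3.0

master_tip=$(git rev-parse master)
prod_tip=$(git rev-parse prod/v1.3.0)
echo "release branch cut at $prod_tip"
```

Merging several branches in one command only works when they don't conflict with each other, which is one more reason to settle conflicts on a resolution branch ahead of time.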

If any features were not included in the release and they are on the beta environment, the "squash merge" commits for those features should be reverted on the beta branch if the features are not scheduled for the next release. This is so development branches do not diverge from the production code.

Any tasks still in development should at this point be rebased on top of the latest changes in the master branch.

Rebasing Currently In-Development Features

When changes are released, the master branch is updated and a new release branch is created. If you are still working on a task, you'll need to update your feature branch with the changes that are currently in production.

To do this, first make sure you have the latest changes in both your feature branch (if you were collaborating with someone else) and the latest changes in the master branch.

git checkout master
git pull
git checkout dev/f00654321
git pull

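In order to get the latest changes, we'll rebase the feature branch onto the master branch, so that our changes are replayed "on top of" the latest master branch changes. Because rebasing rewrites the feature branch's history, a branch that was already pushed to origin has to be force-pushed afterward.

git rebase master
git push --force-with-lease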
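What "on top of" means can be checked in a scratch repository. The sketch below uses git rebase master to move a feature branch onto an updated master; the file names and messages are made up, and nothing is pushed because there is no origin here.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Work in progress on a feature branch.
git checkout -q -b dev/f00654321
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Feature work"

# Meanwhile a release lands on master.
git checkout -q master
echo "released" > released.txt
git add released.txt
git commit -q -m "Released change"

# Replay the feature commits on top of the new master.
git checkout -q dev/f00654321
git rebase -q master

base=$(git merge-base dev/f00654321 master)
master_tip=$(git rev-parse master)
ahead=$(git rev-list --count master..dev/f00654321)
echo "feature is $ahead commit(s) ahead of master"
```

After the rebase, the feature branch contains everything on master plus its own commit, so it no longer lags behind production.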

Hotfixes

Unfortunately, things don't always go perfectly and sometimes emergency fixes need to be performed after some code has gone live.

Check out the latest release branch:

git checkout prod/v1.3.0

You can optionally create a new branch from this release branch for your hotfix changes:

git checkout -b hotfix/v1.3.0

Or, create a new, incremented release branch:

git checkout -b prod/v1.3.1

Or just make a new commit on the current release branch.

Once the hotfix has been implemented and committed, we'll get the fix on the relevant branches.

If you took the option with a hotfix/vX.X.X type of branch, you'd then do:

git push -u origin hotfix/v1.3.0
git checkout prod/v1.3.0
git merge hotfix/v1.3.0
git push

If you went with the incremented version release branch with the hotfix directly on that branch:

git push -u origin prod/v1.3.1

If you went with the hotfix committed directly on the latest release branch, you'd do:

git push

You now need to get your hotfix into the master branch. The release branch is directly mergeable if there are no other commits in the release branch that should not go into the master branch (potentially something like livelocaloverride being committed).

If there are no differences other than the hotfix commit:

git checkout master
git merge prod/v1.3.1
git push

Alternatively, the hotfix commit can be cherry-picked. This would most likely be done if there were other commits on the release branch that for some reason shouldn't get merged back into the master branch.

Let's say the hotfix commit hash was deadbeefcafe.

git checkout master
git cherry-pick deadbeefcafe -x
git push
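The cherry-pick route can be rehearsed as follows. The sketch puts a release-only commit and a hotfix on a release branch, then carries just the hotfix to master; the file names and messages are invented, and the push is omitted because the scratch repository has no origin.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# The release branch has one commit that must stay off master, plus the hotfix.
git checkout -q -b prod/v1.3.0
echo "local override" > override.txt
git add override.txt
git commit -q -m "Release-only setting"
echo "fixed" > app.txt
git commit -q -a -m "Hotfix: correct app behavior"
hotfix_commit=$(git rev-parse HEAD)

# Bring only the hotfix over; -x records where it came from.
git checkout -q master
git cherry-pick -x "$hotfix_commit" >/dev/null

picked_message=$(git log -1 --format=%B)
master_content=$(cat app.txt)
echo "$picked_message"
```

The -x flag appends a "(cherry picked from commit ...)" line to the new commit's message, which makes it easier to trace the fix back to the release branch later.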

Send a slack to the team channel and tell the devs that they'll have to rebase their current in-progress feature branches now that the master branch has been updated.

Our Branch Name Conventions

The following is a set of rules not enforced by Git. They are merely conventions that we will adhere to.

Feature Branches

  • All features are developed in their own branches.
  • We will begin by using the convention "dev/f{paddedWrNumber}", where paddedWrNumber is the work request number padded to 8 digits. For example, a branch for a WR 654321 would be named "dev/f00654321".
      • The use of the "/" in the branch name will represent all features sharing the same "dev" segment as a single, collapsible directory in Visual Studio.
      • Segments should follow the same rules that the names of variables usually follow, hence the "f" before the work request number in "dev/f00654321". (However, dashes are fine.)
      • The WR number is padded so that feature branches can continue to be easily sorted chronologically. If WR numbers surpass 99999999 (and we aren't pruning with a valid reason or if we have some tasks < 99999999 that aren't to be pruned yet), then we could change the convention to dev/fe and use numbers padded to 10 digits to preserve the sort. (We could theoretically continue this type of pattern until we spell "feature".)
      • Branch names should contain only lowercase characters. Windows file systems are typically not case-sensitive, but Git was developed on Linux and is case-sensitive.
      • An optional feature description segment may be added after the paddedWrNumber segment. For example: "dev/f00654321/my-feature"
      • WARNING: if a branch named "segment" already exists, you cannot create a branch named "segment/thing"
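The padding rule and the WARNING above can both be checked with a few lines of shell. The helper name feature_branch is hypothetical (it is not part of Git or of our tooling), and the repository is a throwaway.

```shell
set -eu
# Hypothetical helper: build a feature branch name from a WR number.
# Pass the number without leading zeros, since printf would read those as octal.
feature_branch() {
  printf 'dev/f%08d\n' "$1"
}

name=$(feature_branch 654321)
echo "$name"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Once "dev/f00654321" exists as a branch, Git refuses to create a branch
# that would need "dev/f00654321" to act as a directory.
git branch "$name"
if git branch "$name/my-feature" 2>/dev/null; then
  nested_created=yes
else
  nested_created=no
fi
echo "nested branch created: $nested_created"
```

In practice this means deciding up front whether a feature uses the plain name or the name with a description segment, because the two forms cannot coexist for the same WR number.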

Resolution Branches

  • Resolution branches are branches designated to resolve conflicts between two feature branches if conflicts occur.
  • We will use the convention "dev/f{feature1}-f{feature2}", where feature1 is the padded WR number for the "heavier"/"more-difficult-to-change" feature and feature2 is the padded WR number for the "lighter"/"easier-to-change" feature. For example, the name of a resolution branch between a WR 654321 and a WR 765432 would look like: "dev/f00654321-f00765432"

Release Branches

  • Release branches are created for each release. This makes it very easy to 'rollback'. We just checkout a previous release branch and deploy code from that branch.
  • We will use the convention "prod/v{versionNumber}". For a version number of '1.3.0', the branch name would be "prod/v1.3.0".

Bugfix Branches

  • Bugfixes are fixes to released code that are not high-priority and can be released in the next scheduled release. As such, bugfixes follow the same workflow that features use, with the only difference being the branch name.
  • The bugfix branch name convention is "dev/b{paddedBugNumber}", where paddedBugNumber is the TFS bug number padded in the same manner that feature WR numbers are padded. For example, a bugfix for a bug with the number 987654 on TFS would be: "dev/b00987654"

Hotfix Branches

  • Hotfixes are fixes to released code that are high-priority and must be released immediately.
  • Since we have not yet formally defined a single hotfix approach and have proposed three different processes, the naming convention will not be specified until a single hotfix approach is defined. Refer to the "Hotfixes" section for the three potential hotfix workflows and their implicit naming conventions.

A Note on Appropriate Cherry-Picking

If someone hands you an ice cream sundae where the banana is rotten and the ice cream was made from spoiled milk but the cherry on top was fresh and could be enjoyed on its own, you might pick the cherry out of the bowl and toss the rest.

If you started work on a branch and stopped because the changes were no longer needed, or a better solution was designed, but you did make a change that still could be useful on its own, you might cherry-pick the change from the branch and toss the rest of the changes.

Cherry-picks shouldn't really be used in place of merging, since the changes lose the parent commit reference, and cherry-picks can end up being applied out of order and really turn the release process into a nice nightmare.

Long-Lived Feature Branches

For now, we'll just advise rebasing the long-lived feature branch on top of the master branch each time master is updated. Note that this approach means if multiple developers were working on the feature branch, then they would have to update their local branches to match the rebased feature branch.

This article may additionally be of assistance if the same merge conflicts arise for very-long-lived feature branches:
https://hackernoon.com/fix-conflicts-only-once-with-git-rerere-7d116b2cec67

Understanding Git Internals

For the curious, in order to get a good understanding of how Git works, you will need to know what a cryptographic hash function is and then what a Merkle tree is (this may also require a small amount of graph theory to understand the terminology often used, as well). Other knowledge of cryptography gives insights into the system: knowing what a commitment scheme is will give intuition as to why we call changesets 'commits'.
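A few plumbing commands make the hashing idea concrete. The sketch below is only an illustration in a scratch repository: it shows that identical content always gets the same object ID, and that a commit records its parent by hash, which is what chains history together.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Object IDs are derived from content: same bytes, same ID.
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello\n' | git hash-object --stdin)
id_c=$(printf 'hello!\n' | git hash-object --stdin)
echo "same content: $id_a $id_b"
echo "different content: $id_c"

echo "one" > file.txt
git add file.txt
git commit -q -m "First"
first=$(git rev-parse HEAD)

echo "two" > file.txt
git commit -q -a -m "Second"

# A commit object names its tree and its parent commit by hash.
git cat-file -p HEAD
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')
```

Since each commit's ID depends on its tree and on its parent's ID, changing anything earlier in history changes every ID after it, which is the Merkle tree property mentioned above.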

Some resources in no particular order:

If somewhat familiar with NodeJS, here's a bonus, well-commented JS implementation of Git:
http://gitlet.maryrosecook.com/docs/gitlet.html

