Git is a distributed version control system that allows multiple people to work on a project at the same time without overwriting each other's changes. It was created by Linus Torvalds in 2005 to manage the development of the Linux kernel. Since then, it has become a standard tool in the software development industry, used by individuals and organizations around the world to manage and track changes to their code.
This glossary explains the core terms and concepts of Git, how each one works, and how to use it effectively in software development. It covers basic terms like 'repository' and 'commit' as well as more involved concepts like 'branching' and 'merging', and is aimed at software engineers who want to deepen their knowledge of Git.
Repository
A repository, often shortened to 'repo', is the most fundamental component of Git. It is essentially a directory or storage space where your project lives. It can contain folders and files, images, videos, spreadsheets – in short, anything your project needs. What makes it special is that Git records all changes to this directory.
There are two types of repositories in Git: local and remote. A local repository is stored on the machine where you are working, while a remote repository is stored on a server or in another location. You synchronize the two by pushing and pulling commits.
Local Repository
A local repository is created on your local machine for individual use. It is where you'll do all your work, including creating, editing, deleting and organizing files. Saving a file does not record anything by itself: Git records what was changed, by whom and when, each time you commit.
The local repository consists of three "trees" maintained by Git. The first one is the Working Directory which holds the actual files. The second one is the Index which acts as a staging area, and finally the HEAD which points to the last commit you've made.
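The relationship between the three trees can be sketched as a toy model in Python. This is only an illustration of the idea, not how Git stores data on disk; the dictionaries and the function names (stage, commit, unstaged_changes, staged_changes) are made up for this example.

```python
# Toy model of Git's three "trees". Each tree maps a file path to its content.
# All names here are illustrative assumptions, not Git's real data structures.

def unstaged_changes(working_dir, index):
    """Paths whose working copy differs from what is staged."""
    return sorted(p for p in working_dir if working_dir[p] != index.get(p))

def staged_changes(index, head):
    """Paths whose staged copy differs from the last commit."""
    return sorted(p for p in index if index[p] != head.get(p))

def stage(working_dir, index, path):
    """Roughly the idea behind `git add <path>`: copy the file into the index."""
    index[path] = working_dir[path]

def commit(index):
    """Roughly the idea behind `git commit`: the new HEAD snapshot is the index."""
    return dict(index)

working_dir = {"README.md": "draft v2"}  # the file was edited and saved
index = {"README.md": "draft v1"}        # but not yet staged
head = {"README.md": "draft v1"}         # last committed snapshot

print(unstaged_changes(working_dir, index))  # edited but not staged
stage(working_dir, index, "README.md")
print(staged_changes(index, head))           # staged but not committed
head = commit(index)
print(staged_changes(index, head))           # nothing left to commit
```

In this model, an edit only reaches HEAD after passing through the index, which is why the Index is described as a staging area.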
Remote Repository
Remote repositories are versions of your project that are hosted on the Internet or network somewhere. They can be on a completely different system, and they allow collaboration between multiple people. Each collaborator has a local copy of the entire project, and changes are synchronized between the repositories.
Common platforms that provide remote repository hosting services include GitHub, Bitbucket, and GitLab. These platforms provide a user-friendly interface for managing your Git repositories, as well as additional features like issue tracking, user permissions, and more.
Commit
In Git, a commit is a snapshot of the tracked files in your project at a point in time; the change a commit introduces is the difference from its parent commit. It is like saving a file, except that every commit gets a unique ID (a.k.a. the "SHA" or "hash") and records who made the change and when. Commits usually contain a commit message, which is a brief description of what was changed.
Commits make up the history of your project and allow you to return to its state at any earlier commit. So, if you mess something up, you can go back to a previous state of your project. By default, each commit is identified by a SHA-1 hash, written as a 40-character hexadecimal string (digits 0-9 and letters a-f).
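To make the 40-character ID less mysterious, here is a minimal sketch of how Git derives an object ID in a repository that uses SHA-1: it hashes a short header (object type, content size, a NUL byte) followed by the content. The function name git_object_id is an assumption of this example. It is shown on a file ("blob") because that is the simplest object; commit objects are hashed the same way, with the type "commit" and the commit's text as content.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the SHA-1 object ID Git would assign to this content.

    The hashed bytes are "<type> <size>\\0" followed by the raw content.
    """
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

object_id = git_object_id("blob", b"Hello, Git!\n")
print(object_id)       # a string of hexadecimal digits
print(len(object_id))  # 40
```

Because the ID is computed from the content, the same content always produces the same ID, and any change to the content produces a different one.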
Commit Message
Commit messages are crucial as they provide a log of the changes made to the project over time. A well-crafted commit message allows other contributors (and your future self) to understand why a particular change was made, making troubleshooting and debugging much easier.
By convention, a commit message is composed of a header, a body and a footer; Git itself does not enforce this structure. The header is a short summary of the changes, the body is a detailed description of what was changed and why, and the footer is where you can reference any issues or pull requests related to the commit.
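The header/body/footer layout can be illustrated with a small Python sketch that splits a message on blank lines. The function split_commit_message, the rule it uses to detect a footer (the last paragraph, when at least two paragraphs follow the header) and the sample message with its issue number are all assumptions made for this example; Git does not parse messages this way.

```python
def split_commit_message(message: str) -> dict:
    """Split a commit message into header, body and footer.

    Hypothetical heuristic: paragraphs are separated by blank lines, the
    first one is the header, and the last one counts as the footer only
    when at least two paragraphs follow the header.
    """
    paragraphs = [p.strip() for p in message.strip().split("\n\n") if p.strip()]
    header = paragraphs[0] if paragraphs else ""
    rest = paragraphs[1:]
    footer = rest.pop() if len(rest) > 1 else ""
    return {"header": header, "body": "\n\n".join(rest), "footer": footer}

# Made-up commit message; the issue number is purely illustrative.
message = """Fix crash when config file is missing

Fall back to built-in defaults instead of raising an error,
so that a first run works without any setup.

Refs: #42"""

parts = split_commit_message(message)
print(parts["header"])
print(parts["footer"])
```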
Branch
A branch in Git is simply a lightweight movable pointer to a commit. Historically, the default branch name in Git is master, although newer Git versions let you configure it (init.defaultBranch) and many hosting platforms now default to main. When you make your first commit, you're given a master branch that points to it. Every time you commit, the pointer of the branch you are on moves forward automatically.
Branching serves as an efficient way to work on different versions of a project at the same time without affecting the main codebase. It allows developers to branch out from the original codebase and isolate their work from others. It not only allows teams to work on multiple features at the same time, but also helps in easily switching between different features and managing them.
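The idea of a branch as a movable pointer can be sketched with a toy Python model. The class ToyRepo and the commit IDs c1, c2, c3 are assumptions for illustration only; real commit IDs are hashes and real Git stores considerably more.

```python
class ToyRepo:
    """Toy model: a branch is just a name that points at a commit ID."""

    def __init__(self):
        self.parents = {}   # commit ID -> parent commit ID (None for the first)
        self.branches = {}  # branch name -> commit ID the branch points to

    def commit(self, branch, commit_id):
        """Record a commit on `branch` and move the branch pointer to it."""
        self.parents[commit_id] = self.branches.get(branch)
        self.branches[branch] = commit_id

    def create_branch(self, name, start):
        """Create a new pointer at the commit `start` currently points to."""
        self.branches[name] = self.branches[start]

repo = ToyRepo()
repo.commit("master", "c1")
repo.commit("master", "c2")
repo.create_branch("feature", "master")  # cheap: only a new pointer
repo.commit("feature", "c3")             # only the feature pointer moves
print(repo.branches)
```

Creating the branch copies no files and no commits in this model, which is why branches are described as lightweight.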
Master Branch
The master branch is the default branch that Git creates for you when a new repository is initialized. In many workflows it stores the official release history, while all other branches hold work in progress.
In such workflows, the tip of the master branch is expected to reflect a production-ready state at all times. It is common practice to start all new development (new features, non-emergency bug fixes) from the master branch.
Feature Branch
Feature branches (also commonly called topic branches) are used to develop new features for an upcoming or more distant release. Once the feature has been tested and passes the automated tests, the feature branch is merged into master.
Creating a feature branch not only provides an isolated environment for every change to your codebase, but it also frees the developer to experiment and explore new concepts.
Merge
Merging is Git's way of putting a forked history back together. The git merge command lets you take the independent lines of development created by git branch and integrate them into a single branch.
When you merge one branch into another, the changes made on the source branch are integrated into the target branch, which is the branch you currently have checked out. The result is a single unified history that contains the commits from both branches.
Fast-Forward Merge
A fast-forward merge can occur when there is a linear path from the current branch tip to the tip of the branch being merged. Instead of creating a new commit, Git just moves the current branch pointer forward to that tip. Hence the name fast-forward.
When you create a new branch and start making changes, the original branch does not change. When you’re done with your feature and want to merge it back into master, if there have been no changes on master since you branched off, Git will perform a fast-forward merge.
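The fast-forward condition can be sketched in Python as a reachability check: a fast-forward is possible when the current branch tip is an ancestor of the tip being merged. The functions is_ancestor and can_fast_forward and the commit IDs are assumptions of this toy model, not Git's implementation.

```python
def is_ancestor(parents, ancestor, commit_id):
    """True if `ancestor` is reachable from `commit_id` via parent links."""
    stack, seen = [commit_id], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False

def can_fast_forward(parents, current_tip, other_tip):
    """A fast-forward only needs to move the pointer from current_tip to other_tip."""
    return is_ancestor(parents, current_tip, other_tip)

# commit ID -> list of parent IDs (illustrative IDs)
history = {"c1": [], "c2": ["c1"], "c3": ["c2"], "c4": ["c3"]}
# master is at c2, feature is at c4: master has not moved since branching
print(can_fast_forward(history, "c2", "c4"))

# Now master gains its own commit c5, so the two histories have diverged
history["c5"] = ["c2"]
print(can_fast_forward(history, "c5", "c4"))
```

The first check succeeds and the second fails, which is the situation where Git falls back to a merge that creates a new commit.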
3-Way Merge
When the histories of two branches have diverged, Git cannot perform a fast-forward merge. Instead, it performs a 3-way merge. This involves finding the common ancestor of the two branches (the commit they both branched off from), comparing the changes in each branch since that commit, and then merging those changes together.
In case of conflicts, manual intervention is required to decide which changes to keep. Once resolved, a new commit is created that includes the changes from both branches, along with a merge commit message.
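The decision rule behind a 3-way merge can be sketched in Python. This simplified version works on whole files, whereas Git also merges changes within a file; the function three_way_merge and the sample snapshots are assumptions for illustration.

```python
def three_way_merge(base, ours, theirs):
    """Merge two snapshots against their common ancestor, file by file.

    Each snapshot maps a path to its content. Returns (merged, conflicts).
    Simplification: a file changed differently on both sides is a conflict.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o            # both sides agree (or neither changed it)
        elif o == b:
            result = t            # only their side changed it
        elif t == b:
            result = o            # only our side changed it
        else:
            conflicts.append(path)  # both changed it in different ways
            continue
        if result is not None:      # None means the file was deleted
            merged[path] = result
    return merged, conflicts

base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "2"}
theirs = {"a.txt": "1", "b.txt": "3", "c.txt": "3"}

merged, conflicts = three_way_merge(base, ours, theirs)
print(merged)     # a.txt from ours, b.txt from theirs
print(conflicts)  # c.txt needs manual resolution
```

The common ancestor is what lets the merge tell "only one side changed this" apart from "both sides changed this", which a plain two-way comparison cannot do.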
Clone
Cloning a Git repository means creating a local copy of a remote Git repository. This allows you to make changes to your copy of the project without affecting the original source code. If you want to contribute to a project, you can clone the project, make your changes and then send the changes to the original project.
When you clone a repository, Git automatically creates a remote connection called origin pointing back to the original repository. This is useful for pulling updates made to the original repository into your local clone.
Shallow Clone
A shallow clone is created by running git clone with the --depth option followed by the number of commits to keep. It produces a copy of the repository with a truncated commit history. This can be useful when working with a repository that has a long history, as it can significantly reduce the time and disk space required to clone it.
However, some operations, such as git pull and git merge, may not work as expected in a shallow clone, as they rely on the repository's history. Therefore, shallow clones are best suited for scenarios where you only need the latest version of a repository, and not its entire history.
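What a depth limit does to history can be sketched with a toy Python model. It assumes a purely linear history (one parent per commit), which is a simplification, and the function shallow_history and the commit IDs are made up for this example.

```python
def shallow_history(parents, tip, depth):
    """Commits kept when history is cut off `depth` commits back from `tip`.

    Simplifying assumption: linear history, so each commit has one parent.
    """
    kept, current = [], tip
    while current is not None and len(kept) < depth:
        kept.append(current)
        current = parents.get(current)
    return kept

# commit ID -> parent ID (illustrative IDs), newest commit is c5
parents = {"c1": None, "c2": "c1", "c3": "c2", "c4": "c3", "c5": "c4"}

print(shallow_history(parents, "c5", 1))   # only the latest commit
print(shallow_history(parents, "c5", 3))   # the latest three commits
print(shallow_history(parents, "c5", 99))  # depth exceeds history: everything
```

In this model the commits older than the cut-off are simply absent, which is why operations that need to look further back in history can behave unexpectedly.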
Deep Clone
A deep clone, more commonly just called a full clone, is a clone that copies the complete history of the repository, including the commits of all its branches. It is what you get by default when you run git clone without the --depth option.
A deep clone includes all the project files, as well as the history of changes to those files. This can be useful when you need to see the entire history of a project, or when you need to work with all of the branches in a repository.
Pull
The git pull command fetches content from a remote repository and immediately updates the current local branch to match it. In the simplest terms, git pull does a git fetch followed by a git merge.
You can use git pull to bring your local branch up-to-date with its remote version, which can be particularly useful if other people have been committing to the same branch.
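The two steps behind git pull can be sketched with a toy Python model. It only covers the simple case where the local branch has no commits of its own, so the merge is a fast-forward; the dictionaries and the functions fetch, merge_fast_forward and pull are assumptions for illustration.

```python
def fetch(local, remote, branch):
    """Download new commits and update the remote-tracking ref.

    The local branch itself is left untouched.
    """
    local["commits"].update(remote["commits"])
    local["refs"]["origin/" + branch] = remote["refs"][branch]

def merge_fast_forward(local, branch):
    """Move the local branch to the remote-tracking ref (fast-forward only)."""
    local["refs"][branch] = local["refs"]["origin/" + branch]

def pull(local, remote, branch):
    """Toy version of pull: a fetch followed by a merge."""
    fetch(local, remote, branch)
    merge_fast_forward(local, branch)

# commit ID -> parent ID; someone else has pushed c3 to the remote
remote = {"commits": {"c1": None, "c2": "c1", "c3": "c2"},
          "refs": {"master": "c3"}}
local = {"commits": {"c1": None, "c2": "c1"},
         "refs": {"master": "c2", "origin/master": "c2"}}

fetch(local, remote, "master")
print(local["refs"])  # origin/master moved, master did not
pull(local, remote, "master")
print(local["refs"])  # master now matches origin/master
```

Running fetch on its own is a way to see what changed on the remote before deciding to integrate it.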
Pull Request
A pull request is a method of submitting contributions to an open development project. It is a feature of hosting platforms rather than of Git itself (GitLab calls it a merge request). It occurs when a developer asks for changes committed to an external repository to be considered for inclusion in a project's main repository after peer review.
It's important to note that a pull request is only a proposal for changes to be pulled into a repository. It's not a command or an order. The owner of the repository (or anyone with write access) can decide whether or not to accept the proposed changes.
Pull Rebase
git pull --rebase is a command that integrates changes from a remote repository into the current branch. Where a plain git pull merges the fetched changes, git pull --rebase reapplies your local commits on top of the fetched commits, which produces a linear sequence of commits. The commit history stays cleaner because no merge commits are created.
git pull --rebase is particularly useful in scenarios where you want to keep your feature branch up to date with the latest changes from the master branch (for example, with git pull --rebase origin master), but you don't want to create a merge commit.
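The effect of a rebase on history can be sketched with a toy Python model: local commits are replayed, oldest first, on top of the fetched tip, and each replayed commit gets a new ID. The function rebase, the trailing apostrophe used to mark replayed commits and the commit IDs are assumptions for illustration; real Git also reapplies the file changes, which this sketch ignores.

```python
def rebase(parents, local_commits, new_base):
    """Replay `local_commits` (oldest first) on top of `new_base`.

    Returns the ID of the new branch tip. Each replayed commit gets a new
    ID, modelled here by appending an apostrophe to the old one.
    """
    tip = new_base
    for commit_id in local_commits:
        new_id = commit_id + "'"
        parents[new_id] = tip
        tip = new_id
    return tip

# commit ID -> parent ID. Local work (l1, l2) and the remote commit c3
# both started from c2, so the histories have diverged.
parents = {"c1": None, "c2": "c1", "c3": "c2", "l1": "c2", "l2": "l1"}

new_tip = rebase(parents, ["l1", "l2"], "c3")

# Walk back from the new tip to show that the history is one straight line
line, current = [], new_tip
while current is not None:
    line.append(current)
    current = parents[current]
print(line)
```

The walk from the new tip passes through the replayed local commits and then c3, c2 and c1, with no merge commit in between.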