841
votes

I keep hearing people say they're forking code in Git. Git "fork" sounds suspiciously like Git "clone" plus some (meaningless) psychological willingness to forgo future merges. There is no fork command in Git, right?

GitHub makes forks a little more real by stapling correspondence onto it. That is, you press the fork button and later, when you press the pull request button, the system is smart enough to email the owner. Hence, it's a little bit of a dance around repository ownership and permissions.

Yes/No? Any angst over GitHub extending Git in this direction? Or any rumors of Git absorbing the functionality?

10
Yeah, it is just a type of clone which is tracked by the github database.Paŭlo Ebermann
Doesn't GitHub do something special to avoid doubling the storage requirements (on GitHub's own servers)?Keith Thompson
Not mentioned yet: Deleting a private repo deletes all its forks. Deleting a public repo keeps the forks but promotes one fork to be the new parent repo. If your boss makes your public repo private, it breaks all the existing forks and you won't be able to make pull requests from them to the private repo. help.github.com/articles/…Plato
I believe (without proof since GitHub do not show this to us) that the actual mechanism here is Git's "alternates". In other words, the fork is a mirror clone with --reference used. Exactly how public repos and deletions are handled is not at all clear (move alternates to randomly chosen promoted repo? point all forks to some common alternate that's not part of the original fork?) but the use of alternates explains various observable behaviors.torek

10 Answers

947
votes

Fork, in the GitHub context, doesn't extend Git.
It only allows clone on the server side.

When you clone a GitHub repository on your local workstation, you cannot contribute back to the upstream repository unless you are explicitly declared as "contributor". That's because your clone is a separate instance of that project. If you want to contribute to the project, you can use forking to do it, in the following way:

  • clone that GitHub repository on your GitHub account (that is the "fork" part, a clone on the server side)
  • contribute commits to that GitHub repository (it is in your own GitHub account, so you have every right to push to it)
  • signal any interesting contribution back to the original GitHub repository (that is the "pull request" part by way of the changes you made on your own GitHub repository)

Check also "Collaborative GitHub Workflow".

If you want to keep a link with the original repository (also called upstream), you need to add a remote referring that original repository.
See "What is the difference between origin and upstream on GitHub?"

fork and upstream

And with Git 2.20 (Q4 2018) and more, fetching from fork is more efficient, with delta islands.

140
votes

I keep hearing people say they're forking code in git. Git "fork" sounds suspiciously like git "clone" plus some (meaningless) psychological willingness to forgo future merges. There is no fork command in git, right?

"Forking" is a concept, not a command specifically supported by any version control system.

The simplest kind of forking is synonymous with branching. Every time you create a branch, regardless of your VCS, you've "forked". These forks are usually pretty easy to merge back together.

The kind of fork you're talking about, where a separate party takes a complete copy of the code and walks away, necessarily happens outside the VCS in a centralized system like Subversion. A distributed VCS like Git has much better support for forking the entire codebase and effectively starting a new project.

Git (not GitHub) natively supports "forking" an entire repo (ie, cloning it) in a couple of ways:

  • when you clone, a remote called origin is created for you
  • by default all the branches in the clone will track their origin equivalents
  • fetching and merging changes from the original project you forked from is trivially easy

Git makes contributing changes back to the source of the fork as simple as asking someone from the original project to pull from you, or requesting write access to push changes back yourself. This is the part that GitHub makes easier, and standardizes.

Any angst over Github extending git in this direction? Or any rumors of git absorbing the functionality?

There is no angst because your assumption is wrong. GitHub "extends" the forking functionality of Git with a nice GUI and a standardized way of issuing pull requests, but it doesn't add the functionality to Git. The concept of full-repo-forking is baked right into distributed version control at a fundamental level. You could abandon GitHub at any point and still continue to push/pull projects you've "forked".

82
votes

Yes, fork is a clone. It emerged because, you cannot push to others' copies without their permission. They make a copy of it for you (fork), where you will have write permission as well.

In the future if the actual owner or others users with a fork like your changes they can pull it back to their own repository. Alternatively you can send them a "pull-request".

39
votes

"Fork" in this context means "Make a copy of their code so that I can add my own modifications". There's not much else to say. Every clone is essentially a fork, and it's up to the original to decide whether to pull the changes from the fork.

27
votes

Cloning involves making a copy of the git repository to a local machine, while forking is cloning the repository into another repository. Cloning is for personal use only (although future merges may occur), but with forking you are copying and opening a new possible project path

11
votes

I think fork is a copy of other repository but with your account modification. for example, if you directly clone other repository locally, the remote object origin is still using the account who you clone from. You can't commit and contribute your code. It is just a pure copy of codes. Otherwise, If you fork a repository, it will clone the repo with the update of your account setting in you github account. And then cloning the repo in the context of your account, you can commit your codes.

11
votes

Forking is done when you decide to contribute to some project. You would make a copy of the entire project along with its history logs. This copy is made entirely in your repository and once you make these changes, you issue a pull request. Now its up-to the owner of the source to accept your pull request and incorporate the changes into the original code.

Git clone is an actual command that allows users to get a copy of the source. git clone [URL] This should create a copy of [URL] in your own local repository.

11
votes

There is a misunderstanding here with respect to what a "fork" is. A fork is in fact nothing more than a set of per-user branches. When you push to a fork you actually do push to the original repository, because that is the ONLY repository.

You can try this out by pushing to a fork, noting the commit and then going to the original repository and using the commit ID, you'll see that the commit is "in" the original repository.

This makes a lot of sense, but it is far from obvious (I only discovered this accidentally recently).

When John forks repository SuperProject what seems to actually happen is that all branches in the source repository are replicated with a name like "John.master", "John.new_gui_project", etc.

GitHub "hides" the "John." from us and gives us the illusion we have our own "copy" of the repository on GitHub, but we don't and nor is one even needed.

So my fork's branch "master" is actually named "Korporal.master", but the GitHub UI never reveals this, showing me only "master".

This is pretty much what I think goes on under the hood anyway based on stuff I've been doing recently and when you ponder it, is very good design.

For this reason I think it would be very easy for Microsoft to implement Git forks in their Visual Studio Team Services offering.

6
votes

Apart from the fact that cloning is from server to your machine and forking is making a copy on the server itself, an important difference is that when we clone, we actually get all the branches, labels, etc.

But when we fork, we actually only get the current files in the master branch, nothing other than that. This means we don't get the other branches, etc.

Hence if you have to merge something back to the original repository, it is a inter-repository merge and will definitely need higher privileges.

Fork is not a command in Git; it is just a concept which GitHub implements. Remember Git was designed to work in peer-to-peer environment without the need to synchronize stuff with any master copy. The server is just another peer, but we look at it as a master copy.

3
votes

In simplest terms,

When you say you are forking a repository, you are basically creating a copy of the original repository under your GitHub ID in your GitHub account.

and

When you say you are cloning a repository, you are creating a local copy of the original repository in your system (PC/laptop) directly without having a copy in your GitHub account.