255
votes

I'm not clear on what the following means (from the Git submodule update documentation):

...will make the submodules HEAD be detached, unless --rebase or --merge is specified...

How does --rebase/--merge change things?

My main use case is to have a bunch of central repositories, which I will embed via submodules into other repositories. I would like to be able to improve on these central repositories, either directly in their original location, or from within their embedding repositories (the ones that use them via submodule).

  • From within these submodules, can I create branches/modifications and use push/pull just like I would in regular repositories, or are there things to be cautious about?
  • How would I advance the submodule referenced commit from say (tagged) 1.0 to 1.1 (even though the head of the original repository is already at 2.0), or pick which branch's commit is used at all?
4
On the topic of "detached head", see also stackoverflow.com/questions/964876/head-and-orighead-in-git and stackoverflow.com/questions/237408/… for a practical example (not submodule-related, but still)VonC
"you cannot modify the contents of the submodule from within the main project": yes, true. And I have edited my answer to shed some light on that apparent contradiction (non-modifiable submodule, that you still can modify from the main project repo!)VonC

4 Answers

318
votes

This GitPro page does summarize the consequence of a git submodule update nicely

When you run git submodule update, it checks out the specific version of the project, but not within a branch. This is called having a detached head — it means the HEAD file points directly to a commit, not to a symbolic reference.
The issue is that you generally don’t want to work in a detached head environment, because it’s easy to lose changes.
If you do an initial submodule update, commit in that submodule directory without creating a branch to work in, and then run git submodule update again from the superproject without committing in the meantime, Git will overwrite your changes without telling you. Technically you won’t lose the work, but you won’t have a branch pointing to it, so it will be somewhat difficult to retrieve.


Note March 2013:

As mentioned in "git submodule tracking latest", a submodule now (git1.8.2) can track a branch.

# add submodule to track master branch
git submodule add -b master [URL to Git repo];

# update your submodule
git submodule update --remote 
# or (with rebase)
git submodule update --rebase --remote

See "git submodule update --remote vs git pull".

MindTooth's answer illustrate a manual update (without local configuration):

git submodule -q foreach git pull -q origin master

In both cases, that will change the submodules references (the gitlink, a special entry in the parent repo index), and you will need to add, commit and push said references from the main repo.
Next time you will clone that parent repo, it will populate the submodules to reflect those new SHA1 references.

The rest of this answer details the classic submodule feature (reference to a fixed commit, which is the all point behind the notion of a submodule).


To avoid this issue, create a branch when you work in a submodule directory with git checkout -b work or something equivalent. When you do the submodule update a second time, it will still revert your work, but at least you have a pointer to get back to.

Switching branches with submodules in them can also be tricky. If you create a new branch, add a submodule there, and then switch back to a branch without that submodule, you still have the submodule directory as an untracked directory:


So, to answer your questions:

can I create branches/modifications and use push/pull just like I would in regular repos, or are there things to be cautious about?

You can create a branch and push modifications.

WARNING (from Git Submodule Tutorial): Always publish (push) the submodule change before publishing (push) the change to the superproject that references it. If you forget to publish the submodule change, others won't be able to clone the repository.

how would I advance the submodule referenced commit from say (tagged) 1.0 to 1.1 (even though the head of the original repo is already at 2.0)

The page "Understanding Submodules" can help

Git submodules are implemented using two moving parts:

  • the .gitmodules file and
  • a special kind of tree object.

These together triangulate a specific revision of a specific repository which is checked out into a specific location in your project.


From the git submodule page

you cannot modify the contents of the submodule from within the main project

100% correct: you cannot modify a submodule, only refer to one of its commits.

This is why, when you do modify a submodule from within the main project, you:

  • need to commit and push within the submodule (to the upstream module), and
  • then go up in your main project, and re-commit (in order for that main project to refer to the new submodule commit you just created and pushed)

A submodule enables you to have a component-based approach development, where the main project only refers to specific commits of other components (here "other Git repositories declared as sub-modules").

A submodule is a marker (commit) to another Git repository which is not bound by the main project development cycle: it (the "other" Git repo) can evolves independently.
It is up to the main project to pick from that other repo whatever commit it needs.

However, should you want to, out of convenience, modify one of those submodules directly from your main project, Git allows you to do that, provided you first publish those submodule modifications to its original Git repo, and then commit your main project refering to a new version of said submodule.

But the main idea remains: referencing specific components which:

  • have their own lifecycle
  • have their own set of tags
  • have their own development

The list of specific commits you are refering to in your main project defines your configuration (this is what Configuration Management is all about, englobing mere Version Control System)

If a component could really be developed at the same time as your main project (because any modification on the main project would involve modifying the sub-directory, and vice-versa), then it would be a "submodule" no more, but a subtree merge (also presented in the question Transferring legacy code base from cvs to distributed repository), linking the history of the two Git repo together.

Does that help understanding the true nature of Git Submodules?

139
votes

To update each submodule, you could invoke the following command (at the root of the repository):

git submodule -q foreach git pull -q origin master

You can remove the -q option to follow the whole process.

19
votes

To address the --rebase vs. --merge option:

Let's say you have super repository A and submodule B and want to do some work in submodule B. You've done your homework and know that after calling

git submodule update

you are in a HEAD-less state, so any commits you do at this point are hard to get back to. So, you've started work on a new branch in submodule B

cd B
git checkout -b bestIdeaForBEver
<do work>

Meanwhile, someone else in project A has decided that the latest and greatest version of B is really what A deserves. You, out of habit, merge the most recent changes down and update your submodules.

<in A>
git merge develop
git submodule update

Oh noes! You're back in a headless state again, probably because B is now pointing to the SHA associated with B's new tip, or some other commit. If only you had:

git merge develop
git submodule update --rebase

Fast-forwarded bestIdeaForBEver to b798edfdsf1191f8b140ea325685c4da19a9d437.
Submodule path 'B': rebased into 'b798ecsdf71191f8b140ea325685c4da19a9d437'

Now that best idea ever for B has been rebased onto the new commit, and more importantly, you are still on your development branch for B, not in a headless state!

(The --merge will merge changes from beforeUpdateSHA to afterUpdateSHA into your working branch, as opposed to rebasing your changes onto afterUpdateSHA.)

11
votes

Git 1.8.2 features a new option ,--remote, that will enable exactly this behavior. Running

git submodule update --rebase --remote

will fetch the latest changes from upstream in each submodule, rebase them, and check out the latest revision of the submodule. As the documentation puts it:

--remote

This option is only valid for the update command. Instead of using the superproject’s recorded SHA-1 to update the submodule, use the status of the submodule’s remote-tracking branch.

This is equivalent to running git pull in each submodule, which is generally exactly what you want.

(This was copied from this answer.)