What to Expect Below
Why would anyone want to read this long post?
Because while previous answers clearly
understand the problem with the original question,
they fall short of correct/meaningful results;
or accurately solve a different problem.
Feel free to just review the first section;
it solves the "find something" problem and
should highlight the scope of the problem.
For some, that may be sufficient.
This will show you a way to
extract correct & meaningful results from git
(you may not like them),
and demonstrate one way to apply
your knowledge of your conventions
to those results
to extract what you are really looking for.
Sections below cover:
- An Unbiased Question & Solution:
  - nearest git branches using git show-branch
  - what expected results should look like
- Example Graph & Results
- Batching Branches: work around limits of git show-branch
- A Biased Question & Solution: introducing (naming) conventions to improve results
The Problem with The Question
As has been mentioned, git does not track relationships between branches;
branches are simply names referencing a commit.
In official git documentation and other sources we'll often encounter somewhat misleading diagrams such as:
A---B---C---D   <- master branch
     \
      E---F     <- work branch
Let's change the form of the diagram and the hierarchically suggestive names to show an equivalent graph:
      E---F   <- jack
     /
A---B
     \
      C---D   <- jill
The graph (and hence git) tells us absolutely nothing about which branch was created first (hence, which was branched off the other).
That master is a parent of work in the first graph is a matter of convention.
Therefore
- simple tooling will produce responses that ignore the bias
- more complex tooling incorporates conventions (biases).
An Unbiased Question
First, I must acknowledge primarily Joe Chrysler's response, other responses here, and the many comments/suggestions all around;
they inspired and pointed the way for me!
Allow me to rephrase Joe's rephrasal, giving consideration to multiple branches related to that nearest commit (it happens!):
"What is the nearest commit that resides on a branch other than the
current branch, and which branches is that?"
Or, in other words:
Q1
Given a branch B:
consider the commit C nearest to B'HEAD (C could be B'HEAD)
that is shared by other branches:
what branches, other than B, have C in their commit history?
An Unbiased Solution
Apologies up front; it seems folks prefer one-liners. Feel free to suggest (readable/maintainable) improvements!
#!/usr/bin/env bash

# git show-branch supports 29 branches; reserve 1 for current branch
GIT_SHOW_BRANCH_MAX=28

if ! CURRENT_BRANCH="$(git rev-parse --abbrev-ref HEAD)"; then
    echo "Failed to determine git branch; is this a git repo?" >&2
    exit 1
fi

##
# Given Params:
#   EXCEPT : $1
#   VALUES : $2..N
#
# Return all values except EXCEPT, in order.
#
function valuesExcept() {
    local except=$1 ; shift
    local value
    for value in "$@"; do
        if [[ "$value" != "$except" ]]; then
            echo "$value"
        fi
    done
}

##
# Given Params:
#   BASE_BRANCH : $1           : base branch; default is current branch
#   BRANCHES    : [ $2 .. $N ] : list of unique branch names (no duplicates);
#                                perhaps possible parents.
#                                Default is all branches except base branch.
#
# For the most recent commit in the commit history for BASE_BRANCH that is
# also in the commit history of at least one branch in BRANCHES: output all
# BRANCHES that share that commit in their commit history.
#
function nearestCommonBranches() {
    local BASE_BRANCH
    if [[ -z "${1+x}" || "$1" == '.' ]]; then
        BASE_BRANCH="$CURRENT_BRANCH"
    else
        BASE_BRANCH="$1"
    fi

    shift
    local -a CANDIDATES
    if [[ -z "${1+x}" ]]; then
        CANDIDATES=( $(git rev-parse --symbolic --branches) )
    else
        CANDIDATES=("$@")
    fi
    local BRANCHES=( $(valuesExcept "$BASE_BRANCH" "${CANDIDATES[@]}") )

    local BRANCH_COUNT=${#BRANCHES[@]}
    if (( BRANCH_COUNT > GIT_SHOW_BRANCH_MAX )); then
        echo "Too many branches: limit $GIT_SHOW_BRANCH_MAX" >&2
        exit 1
    fi

    # grep -E is the non-deprecated spelling of egrep
    local MAP=( $(git show-branch --topo-order "${BRANCHES[@]}" "$BASE_BRANCH" \
                    | tail -n +$(( BRANCH_COUNT + 3 )) \
                    | sed "s/ \[.*$//" \
                    | sed "s/ /_/g" \
                    | sed "s/*/+/g" \
                    | grep -E '^_*[^_].*[^_]$' \
                    | head -n1 \
                    | sed 's/\(.\)/\1\n/g'
          ) )

    local idx
    for idx in "${!BRANCHES[@]}"; do
        ## to include "merge", symbolized by '-', use
        ## ALT: if [[ "${MAP[$idx]}" != "_" ]]
        if [[ "${MAP[$idx]}" == "+" ]]; then
            echo "${BRANCHES[$idx]}"
        fi
    done
}

# Usage: gitr [ baseBranch [branchToConsider]* ]
#   baseBranch: '.' (no quotes needed) corresponds to default current branch
#   branchToConsider* : list of unique branch names (no duplicates);
#                       perhaps possible (bias?) parents.
#                       Default is all branches except base branch.
nearestCommonBranches "${@}"
How it Works
Considering Output of: git show-branch
For git show-branch --topo-order feature/g hotfix master release/2 release/3 feature/d, the output would look similar to:
! [feature/g] TEAM-12345: create X
 * [hotfix] TEAM-12345: create G
  ! [master] TEAM-12345: create E
   ! [release/2] TEAM-12345: create C
    ! [release/3] TEAM-12345: create C
     ! [feature/d] TEAM-12345: create S
------
+      [feature/g] TEAM-12345: create X
+      [feature/g^] TEAM-12345: create W
     + [feature/d] TEAM-12345: create S
     + [feature/d^] TEAM-12345: create R
     + [feature/d~2] TEAM-12345: create Q
       ...
  +    [master] TEAM-12345: create E
 *     [hotfix] TEAM-12345: create G
 *     [hotfix^] TEAM-12345: create F
 *+    [master^] TEAM-12345: create D
+*+++  [release/2] TEAM-12345: create C
+*++++ [feature/d~8] TEAM-12345: create B
A few points:
- the original command listed N (6) branch names on the command line
- those branch names appear, in order, as the first N lines of the output
- the lines following the header represent commits
- the first N columns of the commit lines represent (as a whole) a "branch/commit matrix", where a single character in column X indicates the relationship (or lack of) between a branch (header row X) and the current commit.
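To make the "branch/commit matrix" idea concrete, here is a minimal sketch that keeps only the matrix columns of captured git show-branch output. The helper name branchCommitMatrix is made up for illustration, and it assumes every branch (including the base branch) was named on the command line, so the header is N lines plus one separator row.

```shell
# Hypothetical helper: read `git show-branch` output on stdin and print only
# the branch/commit matrix, with spaces shown as underscores.
# $1 = number of branches named on the command line (header rows).
branchCommitMatrix() {
    # Commit rows start after the N header rows and the separator row.
    tail -n +"$(( $1 + 2 ))" \
        | sed -e 's/ \[.*$//' -e 's/ /_/g'
}

# Usage (inside a repo):
#   git show-branch --topo-order feature/g hotfix master | branchCommitMatrix 3
```

Each output line then has exactly one character per branch, in command-line order.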
Primary Steps
- Given a BASE_BRANCH
- Given an ordered set (unique) BRANCHES that does not include BASE_BRANCH
- For brevity, let N be BRANCH_COUNT, which is the size of BRANCHES; it does not include BASE_BRANCH
- git show-branch --topo-order $BRANCHES $BASE_BRANCH:
  - Since BRANCHES contains only unique names (presumed valid), the names will map 1-1 with the header lines of the output, and correspond to the first N columns of the branch/commit matrix.
  - Since BASE_BRANCH is not in BRANCHES, it will be the last of the header lines, and corresponds to the last column of the branch/commit matrix.
- tail: start with line N+3; throw away the first N+2 lines: N branches + base branch + separator row ---..
- sed: these could be combined in one... but are separated for clarity
  - remove everything after the branch/commit matrix
  - replace spaces with underscores '_'; my primary reason was to avoid potential IFS parsing hassles, and for debugging/readability.
  - replace * with +; base branch is always in last column, and that's sufficient. Also, if left alone it goes through bash pathname expansion, and that's always fun with *
- egrep (i.e. grep -E): grep for commits that map to at least one branch ([^_]) AND to the BASE_BRANCH ([^_]$). Maybe that base branch pattern should be \+$?
- head -n1: take the first remaining commit
- sed: separate each character of the branch/commit matrix to separate lines.
- Capture the lines in an array MAP, at which point we have two arrays:
  - BRANCHES: length N
  - MAP: length N+1: first N elements 1-1 with BRANCHES, and the last element corresponding to the BASE_BRANCH.
- Iterate over BRANCHES (that's all we want, and it's shorter) and check the corresponding element in MAP: output BRANCHES[$idx] if MAP[$idx] is +.
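The last few steps can be tried without a repo. The sketch below is only an illustration with made-up helper names (firstSharedRow, branchesForRow), and it assumes rows already cleaned up as described (underscores for spaces, + for *, base branch in the last column).

```shell
# Keep the first row that touches at least one candidate branch AND the base
# branch (the last column); same pattern as in the script.
firstSharedRow() {
    grep -E '^_*[^_].*[^_]$' | head -n 1
}

# Print each branch whose column in the row is '+'.
# $1 = matrix row, $2.. = branch names in column order (base branch omitted).
branchesForRow() (
    row=$1; shift
    for branch in "$@"; do
        mark=${row%"${row#?}"}   # first character of what is left of the row
        row=${row#?}             # drop that character
        if [ "$mark" = "+" ]; then
            printf '%s\n' "$branch"
        fi
    done
)

# Usage:
#   row=$(printf '%s\n' '+___' '___+' '_+_+' | firstSharedRow)
#   branchesForRow "$row" feature/g master release/2
```

The last character of the row belongs to the base branch, so iterating over the branch names alone never reads it.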
Example Graph & Results
Consider the following somewhat contrived example graph:
- Biased names will be used, as they help (me) weigh and consider results.
- Presume merges exist and are being ignored.
- The graph generally attempts to highlight branches as such (forking),
without visually suggesting a preference/hierarchy;
ironically master stands out after I was done with this thing.
                    J                       <- feature/b
                   /
                  H
                 / \
                /   I                       <- feature/a
               /
              D---E                         <- master
             / \
            /   F---G                       <- hotfix
           /
  A---B---C                                 <- feature/f, release/2, release/3
       \   \
        \   W--X                            <- feature/g
         \
          \       M                         <- support/1
           \     /
            K---L                           <- release/4
                 \
                  \       T---U---V         <- feature/e
                   \     /
                    N---O
                         \
                          P                 <- feature/c
                           \
                            Q---R---S       <- feature/d
Unbiased Results for Example Graph
Assuming the script is in executable file gitr, then run:
gitr <baseBranch>
For different branches B we obtain the following results:

| GIVEN B | Shared Commit C | Branches !B with C in their history? |
|---|---|---|
| feature/a | H | feature/b |
| feature/b | H | feature/a |
| feature/c | P | feature/d |
| feature/d | P | feature/c |
| feature/e | O | feature/c, feature/d |
| feature/f | C | feature/a, feature/b, feature/g, hotfix, master, release/2, release/3 |
| feature/g | C | feature/a, feature/b, feature/f, hotfix, master, release/2, release/3 |
| hotfix | D | feature/a, feature/b, master |
| master | D | feature/a, feature/b, hotfix |
| release/2 | C | feature/a, feature/b, feature/f, feature/g, hotfix, master, release/3 |
| release/3 | C | feature/a, feature/b, feature/f, feature/g, hotfix, master, release/2 |
| release/4 | L | feature/c, feature/d, feature/e, support/1 |
| support/1 | L | feature/c, feature/d, feature/e, release/4 |
Batching Branches
[Presented at this stage
because it fits best into final script at this point.
This section is not required, feel free to skip forward.]
git show-branch limits itself to 29 branches.
That may be a blocker for some (no judgement, just sayin!).
We can improve results, in some situations,
by grouping branches into batches.
- BASE_BRANCH must be submitted with each batch.
- If there are a large number of branches in a repo
this may have limited value, by itself.
- May provide more value if you find other ways
to limit the branches (that would be batched).
- The previous point fits my use case,
so charging ahead!
This mechanism is NOT perfect:
as the result size approaches the max (29),
expect it to fail. Details below.
Batch Solution
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##

##
# Given:
#   BASE_BRANCH : $1           : first param on every batch
#   BRANCHES    : [ $2 .. $N ] : list of unique branch names (no duplicates);
#                                perhaps possible parents
#                                Default is all branches except base branch.
#
# Output all BRANCHES that share that commit in their commit history.
#
function repeatBatchingUntilStableResults() {
    local BASE_BRANCH="$1"

    shift
    local -a CANDIDATES
    if [[ -z "${1+x}" ]]; then
        CANDIDATES=( $(git rev-parse --symbolic --branches) )
    else
        CANDIDATES=("$@")
    fi
    local BRANCHES=( $(valuesExcept "$BASE_BRANCH" "${CANDIDATES[@]}") )

    local SIZE=$GIT_SHOW_BRANCH_MAX
    local COUNT=${#BRANCHES[@]}
    local LAST_COUNT=$(( COUNT + 1 ))

    # Keep going while the candidate list is still shrinking.
    local NOT_DONE=1
    while (( NOT_DONE && COUNT < LAST_COUNT )); do
        NOT_DONE=$(( SIZE < COUNT ))
        LAST_COUNT=$COUNT

        local -a BRANCHES_TO_BATCH=( "${BRANCHES[@]}" )
        local -a AGGREGATE=()
        while (( ${#BRANCHES_TO_BATCH[@]} > 0 )); do
            local -a BATCH=( "${BRANCHES_TO_BATCH[@]:0:$SIZE}" )
            AGGREGATE+=( $(nearestCommonBranches "$BASE_BRANCH" "${BATCH[@]}") )
            BRANCHES_TO_BATCH=( "${BRANCHES_TO_BATCH[@]:$SIZE}" )
        done
        BRANCHES=( "${AGGREGATE[@]}" )
        COUNT=${#BRANCHES[@]}
    done
    if (( ${#BRANCHES[@]} > SIZE )); then
        echo "Unable to reduce candidate branches below MAX for git-show-branch" >&2
        echo "  Base Branch : $BASE_BRANCH" >&2
        echo "  MAX Branches: $SIZE" >&2
        echo "  Candidates  : ${BRANCHES[*]}" >&2
        exit 1
    fi
    echo "${BRANCHES[@]}"
}

repeatBatchingUntilStableResults "$@"
exit 0
How it Works
Repeat until results stabilize
- Break BRANCHES into batches of GIT_SHOW_BRANCH_MAX (aka SIZE) elements
- Call nearestCommonBranches BASE_BRANCH BATCH
- Aggregate results into a new (smaller?) set of branches
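The batching step on its own looks like this. It is a portable sketch with a made-up helper name (printBatches), and it uses plain positional parameters rather than the array slicing used in the script.

```shell
# Hypothetical helper: print the given names in batches of at most $1 names,
# one batch per line.
printBatches() (
    size=$1; shift
    batch=; count=0
    for name in "$@"; do
        batch="${batch:+$batch }$name"
        count=$((count + 1))
        if [ "$count" -eq "$size" ]; then
            printf '%s\n' "$batch"
            batch=; count=0
        fi
    done
    # Emit the final, possibly shorter, batch.
    if [ -n "$batch" ]; then
        printf '%s\n' "$batch"
    fi
)

# Usage:
#   printBatches 2 release/2 release/3 release/4 support/1 master
```

Each printed line would then be handed to nearestCommonBranches together with the base branch.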
How it can fail
If the number of aggregated branches exceeds the max SIZE
and further batching/processing cannot reduce that number
then either:
- the aggregated branches IS the solution,
but that can't be verified by git show-branch, or
- each batch doesn't reduce;
possibly a branch from one batch would help reduce another
(diff merge base); the current algo admits defeat and fails.
Consider Alternative
Pair the base branch individually with every other branch of interest and determine a commit node (merge base) for each pair; then sort the set of merge bases in commit history order, take the nearest node, and determine all branches associated with that node.
I present that from a position of hindsight.
It's probably really the right way to go.
I'm moving forward;
perhaps there is value outside of the current topic.
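For what it's worth, here is a rough sketch of that alternative. The function name nearestByMergeBase is made up; it relies only on git merge-base and git rev-list --count, and it assumes merges are being ignored (two different merge bases at the same distance are not reconciled).

```shell
# Hypothetical sketch: among the candidate branches ($2..), print those whose
# merge base with the base branch ($1) is nearest to the base branch's HEAD.
nearestByMergeBase() (
    base=$1; shift
    bestDist=; bestCommit=; nearest=
    for branch in "$@"; do
        # Skip branches with no common ancestor (or unknown names).
        commit=$(git merge-base "$base" "$branch") || continue
        [ -n "$commit" ] || continue
        # Number of commits on base that are newer than the merge base.
        dist=$(git rev-list --count "$commit..$base")
        if [ -z "$bestDist" ] || [ "$dist" -lt "$bestDist" ]; then
            bestDist=$dist; bestCommit=$commit; nearest=$branch
        elif [ "$commit" = "$bestCommit" ]; then
            nearest="$nearest
$branch"
        fi
    done
    if [ -n "$nearest" ]; then
        printf '%s\n' "$nearest"
    fi
)

# Usage (inside a repo):
#   nearestByMergeBase hotfix master release/2 feature/a
```

This runs one merge-base per candidate, so it has no 29 branch limit, at the cost of more git invocations.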
A Biased Question
You may have noted that the core function nearestCommonBranches
in the earlier script answers more than question Q1 asks.
In fact, the function answers a more general question:
Q2
Given a branch B and
an ordered set (no duplicates) P of branches (B not in P):
consider the commit C nearest to B'HEAD (C could be B'HEAD)
that is shared by branches in P:
in order per order-of-P, what branches in P have C in their commit history?
Choosing P provides bias, or describes a (limited) convention.
To match all the characteristics of your biases/convention may require additional tools, which is out-of-scope for this discussion.
Modeling Simple Bias/Convention
Bias varies for different organization & practices,
and the following may not be suitable for your organization.
If nothing else, perhaps some of the ideas here might help
you find a solution to your needs.
A Biased Solution; Bias by Branch Naming Convention
Perhaps the bias can be mapped into, and extracted from,
the naming convention in use.
Bias by P (Those Other Branch Names)
We're going to need this for the next step,
so let's see what we can do by filtering branch names by regex.
The combined previous code and the new code below is available as a gist: gitr
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##

##
# Given Params:
#   BASE_BRANCH : $1           : base branch
#   REGEXs      : $2 [ .. $N ] : regex(s)
#
# Output:
#   - git branches matching at least one of the regex params
#   - base branch is excluded from result
#   - order: branches matching the Nth regex will appear before
#            branches matching the (N+1)th regex.
#   - no duplicates in output
#
function expandUniqGitBranches() {
    # Seed the "already seen" set with the base branch so it is never output.
    local -A BSET=( ["$1"]=1 )
    shift

    local ALL_BRANCHES=$(git rev-parse --symbolic --branches)
    local regex branch
    for regex in "$@"; do
        for branch in $ALL_BRANCHES; do
            ## RE: -z ${BSET[$branch]+x} ; expands to 'x' only if the key is set
            if [[ $branch =~ $regex && -z "${BSET[$branch]+x}" ]]; then
                echo "$branch"
                BSET[$branch]=1
            fi
        done
    done
}

##
# Params:
#   BASE_BRANCH: $1    : "." equates to the current branch;
#   REGEXS     : $2..N : regex(es) corresponding to other branches to include
#
function findBranchesSharingFirstCommonCommit() {
    if [[ -z "$1" ]]; then
        echo "Usage: findBranchesSharingFirstCommonCommit ( . | baseBranch ) [ regex [ ... ] ]" >&2
        exit 1
    fi

    local BASE_BRANCH
    if [[ -z "${1+x}" || "$1" == '.' ]]; then
        BASE_BRANCH="$CURRENT_BRANCH"
    else
        BASE_BRANCH="$1"
    fi

    shift
    local -a REGEXS
    if [[ -z "$1" ]]; then
        REGEXS=(".*")
    else
        REGEXS=("$@")
    fi

    local BRANCHES=( $(expandUniqGitBranches "$BASE_BRANCH" "${REGEXS[@]}") )

    ## nearestCommonBranches can also be used here, if batching not used.
    repeatBatchingUntilStableResults "$BASE_BRANCH" "${BRANCHES[@]}"
}

findBranchesSharingFirstCommonCommit "$@"
Biased Results for Example Graph
Let's consider the ordered set
P = { ^release/.*$ ^support/.*$ ^master$ }
Assuming the script (all parts) is in executable file gitr, then run:
gitr <baseBranch> '^release/.*$' '^support/.*$' '^master$'
For different branches B we obtain the following results:

| GIVEN B | Shared Commit C | Branches P with C in their history (in order) |
|---|---|---|
| feature/a | D | master |
| feature/b | D | master |
| feature/c | L | release/4, support/1 |
| feature/d | L | release/4, support/1 |
| feature/e | L | release/4, support/1 |
| feature/f | C | release/2, release/3, master |
| feature/g | C | release/2, release/3, master |
| hotfix | D | master |
| master | C | release/2, release/3 |
| release/2 | C | release/3, master |
| release/3 | C | release/2, master |
| release/4 | L | support/1 |
| support/1 | L | release/4 |
That's getting closer to a definitive answer; the responses for release branches aren't ideal. Let's take this one step further.
Bias by BASE_NAME and P
One direction to take this could be to use different P for different
base names. Let's work out a design for that.
Conventions
DISCLAIMER: A git flow purist I am not, make allowances for me please
- A support branch shall branch off master.
- There will NOT be two support branches sharing a common commit.
- A hotfix branch shall branch off a support branch or master.
- A release branch shall branch off a support branch or master.
- There may be multiple release branches sharing a common commit;
i.e. branched off master at the same time.
- A bugfix branch shall branch off a release branch.
- a feature branch may branch off a feature, release, support, or master:
- for the purpose of "parent",
one feature branch cannot be established as
a parent over another (see initial discussion).
- therefore: skip feature branches and
look for "parent" among release, support, and/or master branches.
- any other branch name to be considered a working branch,
with same conventions as a feature branch.
Let's see how far we git with this:

| Base Branch Pattern | Parent Branches, Ordered | Comment(s) |
|---|---|---|
| ^master$ | n/a | no parent |
| ^support/.*$ | ^master$ | |
| ^hotfix/.*$ | ^support/.*$ ^master$ | give preference to a support branch over master (ordering) |
| ^release/.*$ | ^support/.*$ ^master$ | give preference to a support branch over master (ordering) |
| ^bugfix/.*$ | ^release/.*$ | |
| ^feature/.*$ | ^release/.*$ ^support/.*$ ^master$ | |
| ^.*$ | ^release/.*$ ^support/.*$ ^master$ | Redundant, but keep design concerns separate |
Script
The combined previous code and the new code below is available as a gist: gitp
#
# Remove/comment-out the function call at the end of script,
# and append this to the end.
##

# bash associative arrays do not maintain key/entry order.
# So, use two arrays, values correlated by index:
declare -a MAP_BASE_BRANCH_REGEX=( "^master$" \
                                   "^support/.*$" \
                                   "^hotfix/.*$" \
                                   "^release/.*$" \
                                   "^bugfix/.*$" \
                                   "^feature/.*$" \
                                   "^.*$" )

declare -a MAP_BRANCHES_REGEXS=( "" \
                                 "^master$" \
                                 "^support/.*$ ^master$" \
                                 "^support/.*$ ^master$" \
                                 "^release/.*$" \
                                 "^release/.*$ ^support/.*$ ^master$" \
                                 "^release/.*$ ^support/.*$ ^master$" )

function findBranchesByBaseBranch() {
    local BASE_BRANCH
    if [[ -z "${1+x}" || "$1" == '.' ]]; then
        BASE_BRANCH="$CURRENT_BRANCH"
    else
        BASE_BRANCH="$1"
    fi

    local idx
    for idx in "${!MAP_BASE_BRANCH_REGEX[@]}"; do
        local BASE_BRANCH_REGEX=${MAP_BASE_BRANCH_REGEX[$idx]}
        if [[ "$BASE_BRANCH" =~ $BASE_BRANCH_REGEX ]]; then
            # Split on whitespace without pathname expansion of the regexes.
            local -a BRANCHES_REGEXS
            read -r -a BRANCHES_REGEXS <<< "${MAP_BRANCHES_REGEXS[$idx]}"
            if (( ${#BRANCHES_REGEXS[@]} > 0 )); then
                findBranchesSharingFirstCommonCommit "$BASE_BRANCH" "${BRANCHES_REGEXS[@]}"
            fi
            break
        fi
    done
}

findBranchesByBaseBranch "$1"
Biased Results for Example Graph
Assuming the script (all parts) is in executable file gitr, then run:
gitr <baseBranch>
For different branches B we obtain the following results:

| GIVEN B | Shared Commit C | Branches P with C in their history (in order) |
|---|---|---|
| feature/a | D | master |
| feature/b | D | master |
| feature/c | L | release/4, support/1 |
| feature/d | L | release/4, support/1 |
| feature/e | L | release/4, support/1 |
| feature/f | C | release/2, release/3, master |
| feature/g | C | release/2, release/3, master |
| hotfix | D | master |
| master | | (blank, no value) |
| release/2 | C | master |
| release/3 | C | master |
| release/4 | L | support/1 |
| support/1 | L | master |
Refactor for the Win!
Opportunities!
In this last example, the release branch shares a common commit
with multiple others: release, support, or master branches.
Let's "refactor" or re-evaluate the conventions in use, and tighten them a bit.
Consider this git usage convention:
When creating a new release branch:
immediately create a new commit; perhaps update a version, or the README file.
This ensures that feature/work branches
for the release (branched off the release)
will have the commit shared with the release branch
prior to (and not shared by) the commit for the underlying
support or master branch.
For example:
        G---H   <- feature/z
       /
      E         <- release/1
     /
A---B---C---D   <- master
     \
      F         <- release/2
A feature branch off release/1 could not have a common commit
that includes release/1 (its parent) and master or release/2.
That provides one result, the parent, for every branch,
with these conventions.
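As a sketch of that convention, a small wrapper could create the release branch and its first commit in one go. The name startRelease, the default start point, and the use of an empty commit are my assumptions for illustration; a version bump or README change would do the same job.

```shell
# Hypothetical helper: create release/<name> and immediately give it a commit
# of its own, so it no longer shares its tip with master or a support branch.
# $1 = release name, $2 = start point (default: master).
startRelease() {
    name=$1
    from=${2:-master}
    git checkout -b "release/$name" "$from" &&
        git commit --allow-empty -m "Start release/$name"
}

# Usage (inside a repo):
#   startRelease 1
#   startRelease 2 support/1
```

Feature branches cut from the release branch afterwards will then see that first commit as their nearest shared commit.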
DONE! with tools and conventions, I can live in an OCD friendly structured git world.
Your mileage may vary!
Parting thoughts
Foremost: I've come to the conclusion that,
beyond what's been presented here,
at some point one may need to accept that there may be multiple
branches to deal with.
- Perhaps validations might be done on all potential branches;
"at-least-one" or "all" or ?? rules might be applied.
It's weeks like this that I really think it's time I learn Python.