[cfe-dev] [llvm-dev] Sequential ID Git hook
Tom Honermann via cfe-dev
cfe-dev at lists.llvm.org
Thu Jun 30 08:13:29 PDT 2016
On 6/30/2016 7:43 AM, Renato Golin via llvm-dev wrote:
> Given the nature of our project's repository structure, triggers in
> each repository can't just update their own sequential ID (like
> Gerrit) because we want a sequence in order for the whole project, not
> just each component. But it's clear to me that we have to do something
> similar to Gerrit, as this has been proven to work on a larger
> infrastructure.
I'm assuming that pushes to submodules will result in a (nearly)
immediate commit/push to the umbrella repo to update it with the new
submodule head. Otherwise, checking out the umbrella repo won't get you
the latest submodule updates.
Since the umbrella project has to be updated anyway to pick up changes
to the submodules, it seems to me that an ID that applies to all
projects would have to be coordinated relative to the umbrella project.
> Design decisions
>
> This could be a pre/post-commit trigger on each repository that
> receives an ID from somewhere (TBD) and updates the commit message.
> When the umbrella project synchronises, it'll already have the
> sequential number in. In this case, the umbrella project is not
> necessary for anything other than bisect, buildbots and releases.
I recommend using git tag rather than updating the commit message
itself. Tags are more versatile.
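As a small illustration of one way tags differ from editing the message (my reading of "versatile", not a complete list): adding a tag leaves the commit hash untouched, while rewriting the commit message produces a new commit. The temporary repo, the demo identity, and the "[push-1]" message suffix below are all made up for this sketch.

```shell
#!/bin/sh
set -eu

# Throwaway identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Scratch repo with a single commit.
demo=$(mktemp -d)
cd "$demo"
git init -q
echo 1 > file1
git add file1
git commit -q -m "Added file1"

# Attach the ID as an annotated tag; the commit itself is not rewritten.
before=$(git rev-parse HEAD)
git tag -m "push-1" push-1
after_tag=$(git rev-parse HEAD)

# For contrast, put the ID into the commit message instead.
git commit -q --amend -m "Added file1 [push-1]"
after_amend=$(git rev-parse HEAD)

[ "$before" = "$after_tag" ] && echo "tagging kept the commit hash"
[ "$before" != "$after_amend" ] && echo "amending changed the commit hash"
```

A hash that stays stable matters here because the umbrella repo records submodule commits by hash.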
> I personally believe that having the trigger in the umbrella project
> will be harder to implement and more error prone.
Relative to a SQL database and a server, I think managing the ID from
the umbrella repository would be much simpler and more reliable.
Managing IDs from a repo using git metadata is pretty simple. Here's
an example script that creates a repo and allocates a push tag for each
commit in a sequence. For simplicity, I'm simulating pushes of
individual commits rather than using git hooks. I'm not a git expert,
so there may be better ways of doing this, but I don't know of any
problems with this approach.
#!/bin/sh
rm -rf repo

# Create a repo.
mkdir repo
cd repo
git init

# Create a well-known object.
PUSH_OBJ=$(echo "push ID" | git hash-object -w --stdin)
echo "PUSH_OBJ: $PUSH_OBJ"

# Initialize the push ID to 0.
git notes add -m 0 "$PUSH_OBJ"

# Simulate some commits and pushes.
for i in 1 2 3; do
    echo "$i" > "file$i"
    git add "file$i"
    git commit -m "Added file$i" "file$i"
    # Read the current push ID, increment it, and store it back.
    PUSH_TAG=$(git notes show "$PUSH_OBJ")
    PUSH_TAG=$((PUSH_TAG+1))
    git notes add -f -m "$PUSH_TAG" "$PUSH_OBJ"
    # Tag the new commit with the allocated ID.
    git tag -m "push-$PUSH_TAG" "push-$PUSH_TAG"
done

# List commits with push tags.
git log --decorate=full
Running the above shows a git log with the tags:
commit a4ca4a0b54d5fb61a2dacbab5732d00cf8216029 (HEAD, tag: refs/tags/push-3, refs/heads/master)
...
Added file3
commit e98e2669569d5cfb15bf4cd1f268507873bcd63f (tag: refs/tags/push-2)
...
Added file2
commit 5c7f29107838b4af91fe6fa5c2fc5e3769b87bef (tag: refs/tags/push-1)
...
Added file1
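Once the tags exist, going from a push ID to its commit and back takes one command each way. The following is only a sketch: it rebuilds a comparable three-commit repo in a temporary directory (with a made-up demo identity) and then does both lookups using standard git commands.

```shell
#!/bin/sh
set -eu

# Throwaway identity so the sketch runs without any global git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Scratch repo with three commits tagged push-1 .. push-3.
demo=$(mktemp -d)
cd "$demo"
git init -q
for i in 1 2 3; do
    echo "$i" > "file$i"
    git add "file$i"
    git commit -q -m "Added file$i"
    git tag -m "push-$i" "push-$i"
done

# Push ID -> commit: rev-list and log both peel the annotated tag.
commit2=$(git rev-list -n 1 push-2)
subject2=$(git log -1 --format=%s push-2)

# Commit -> push ID: find the push-* tag that points exactly at HEAD.
id_of_head=$(git describe --match 'push-*' --exact-match HEAD)

echo "push-2 is '$subject2' ($commit2)"
echo "HEAD is $id_of_head"
```

Dropping --exact-match makes git describe report the nearest earlier push tag for commits that are not tagged themselves.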
The above script is not transaction safe: reading the note,
incrementing it, and writing it back are separate commands, so two
concurrent pushes could be given the same ID. In a real deployment,
git hooks would be used and would rely on push locks to synchronize
updates. Those hooks could also distribute ID updates to the submodules
to keep them synchronized.
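To make the locking idea concrete, here is a rough sketch of how a hook could serialize the read-increment-write step. This is not a tested hook. The function name allocate_push_id and the lock path are hypothetical, and it uses a mkdir-based lock as a stand-in for whatever push locking the server actually provides.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: allocate_push_id REPO_DIR
# Prints the newly allocated ID on stdout.
allocate_push_id() {
    repo=$1
    lock="$repo/.git/push-id.lock"   # hypothetical path; a bare repo has no .git

    # mkdir is atomic, so only one caller at a time gets past this loop.
    until mkdir "$lock" 2>/dev/null; do
        sleep 1
    done

    # Same well-known object as in the script above.
    obj=$(echo "push ID" | git -C "$repo" hash-object -w --stdin)
    # Treat a missing note as ID 0.
    cur=$(git -C "$repo" notes show "$obj" 2>/dev/null || echo 0)
    next=$((cur + 1))
    git -C "$repo" notes add -f -m "$next" "$obj" >/dev/null 2>&1

    # Release the lock. A real hook would also need to clean up on failure.
    rmdir "$lock"
    echo "$next"
}

# Demo in a scratch repo, with a throwaway identity for the notes commits.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid
demo=$(mktemp -d)
git init -q "$demo"
first=$(allocate_push_id "$demo")
second=$(allocate_push_id "$demo")
echo "allocated: $first $second"
```

A hook would call the helper once per accepted push and then create the push-N tag on the new head, as the loop in the script above does.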
Tom.