Working with the LLVM git Repository
====================================

This guide provides information about how to access and interact with LLVM's git repository. Specifically, it covers how to check out the LLVM git repository, create patches for review, revise those patches as necessary and commit them to the LLVM upstream repository.

Initially, this guide is a sketch of how a transition to git may be done without disrupting the current LLVM review process. It may be adapted later into a full-fledged user guide.

This guide does not specifically cover interacting with Clang and other LLVM projects. However, the information here is likely to be adaptable to those projects.

This guide assumes the developer only wants a single clone of LLVM. It is possible to set up complex work environments with a local shared clone of LLVM and various clones of the shared local repository, but this guide does not attempt to describe such scenarios. Such setups are highly useful for tracking independent third-party development but they are beyond the scope of this document.

All of the git commands described here have associated man pages. The easiest way to access them is via git help:

  git help rebase
  git help send-email

A web search for the git command name will also return a link to its associated man page.

Getting LLVM
------------

To check out the current LLVM sources, use git clone:

  git clone http://llvm.org/git/llvm.git llvm

This will get you a repository that looks like this:

  ...-A[HEAD]

where "..." is the history of the project leading up to the most recent commit A. HEAD is a reference to this most recent commit.

Naming Upstream
---------------

The initial clone from upstream results in a git remote reference with the rather unhelpful name of "origin". As more remote sources get added (perhaps to track potential features developed by other LLVM contributors), it is easy to forget what "origin" is. Therefore, if you would like, give it a more descriptive name:
  git remote rename origin llvm-upstream

This guide will assume this renaming has been done.

Updating LLVM
-------------

Surprisingly many ways exist to update sources. There are good arguments for all of them. To keep things simple, we'll concentrate on one formula for getting updates. Adventurous readers can consult the Resources at the end of this guide to search out more information.

To update your clone to the latest LLVM sources, first fetch them from upstream:

  git fetch llvm-upstream

This may result in a dag like this:

        B-C{llvm-upstream/master}
       /
  ...-A{master}[HEAD]

Notice that git fetch automatically created a branch (technically, a remote-tracking branch) to hold the pending updates. This is useful for several reasons. You may examine the branch, do diffs against it, etc. to see what will be merged into your master branch. It also serves as a reference point to use to back out a merge to master if, for example, there are too many conflicts. It is very convenient to have a local branch like this available, which is why we advocate the "two phase" update process [2].

Once we've fetched the changes, we can apply them to our master branch:

  git rebase llvm-upstream/master

This results in the tree:

  ...-A-B-C{master}[HEAD]

showing the two new commits made since the last update.

We advocate using git rebase to merge in changes from upstream because keeping a linear history makes reviews easier. If you are always in the habit of using git rebase to merge changes from upstream, you won't forget to do it when you actually have local changes to apply to the new upstream commits.

Working with LLVM
-----------------

Actual code development in the work area proceeds much like any project, except that with git you may commit changes locally without disturbing upstream. Let us say that the current dag looks like this:

  ...-A[HEAD]

One can get a textual graph of the history dag with git log:

  git log --oneline --graph

This will show one line per commit and a dag structure in the left columns.
A tool such as gitk can show the same information graphically. Another trick is to create an alias that pumps git log --graph to graphviz [3]:

  [alias]
    graphviz = "!f() { echo 'digraph git {' ; git log --pretty='format: %h -> { %p }' \"$@\" | sed 's/[0-9a-f][0-9a-f]*/\"&\"/g' ; echo '}'; }; f"

Then do:

  git graphviz | dotty /dev/stdin

We want to make some changes to if conversion. First we create a local branch to hold the changes:

  git checkout -b ifconvert

Now we edit the files and commit them:

  git add lib/CodeGen/IfConversion.cpp
  git commit

git has some conventions about commit message format. The most important one is that if there is a single line separated from the rest by a blank line, like this:

  This is the subject

  This patch does some really funky stuff. I hope it will work!

then "This is the subject" will become the line printed when doing --oneline log displays and will also be the subject of e-mails sent to the mailing list. It is good gitiquette to provide such subject lines.

Our dag now looks like this:

  ...-A{master}
       \
        B{ifconvert}[HEAD]

Let's make some more changes:

  ...-A{master}
       \
        B-C-D{ifconvert}[HEAD]

Preparing a Commit
------------------

Let's say we want to commit the changes to if conversion, contained in revisions B and C. We don't yet want to send D for review. There are several strategies available, but the fundamental choice is whether to create a separate branch to do the merge and commit. Usually this is not necessary, but for very complex situations it can help to organize things a bit more explicitly.

Let's say we choose the simple strategy of doing the merge directly on master. The first step is to get the identifiers of B and C:

  git log --oneline

  8fe2a Finish if conversion work
  afe3d Middle of if conversion work, something interesting to commit
  44ef3 Start if conversion work

(we hope subjects are more descriptive than this!)
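The branch, commit and log steps described so far can be tried out safely in a throwaway repository. The following is a minimal sketch rather than part of the LLVM process itself: the temporary sandbox directory, the "Example Dev" identity and the "Initial import" commit are assumptions made up for illustration.

```shell
set -e

# Hypothetical sandbox: a throwaway repository standing in for an LLVM clone.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
git config user.name "Example Dev"        # made-up identity for the sandbox
git config user.email "dev@example.com"

# Commit A on master.
mkdir -p lib/CodeGen
echo "// placeholder" > lib/CodeGen/IfConversion.cpp
git add lib/CodeGen/IfConversion.cpp
git commit -q -m "Initial import"

# Commit B on a local feature branch. The first -m is the subject line;
# the second -m becomes the body, separated from it by a blank line.
git checkout -q -b ifconvert
echo "// tweak" >> lib/CodeGen/IfConversion.cpp
git add lib/CodeGen/IfConversion.cpp
git commit -q -m "Start if conversion work" \
    -m "This patch does some really funky stuff. I hope it will work!"

# Only the subject line is used for one-line displays.
subject=$(git log -1 --pretty=format:%s)
git log --oneline --graph
```

Passing -m twice is just a scriptable stand-in for writing the subject, a blank line and the body in the editor that a plain git commit opens.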
We go back to master to start the commit process:

  git checkout master

  ...-A{master}[HEAD]
       \
        B-C-D{ifconvert}

Now we choose the commits we want to send upstream:

  git cherry-pick 44ef3
  git cherry-pick afe3d

The dag looks like this:

  ...-A-B'-C'{master}[HEAD]
       \
        B-C-D{ifconvert}

B' and C' are the cherry-picked B and C. This graph emphasizes the reality that B' and C' are really different commits from B and C, and because we used cherry-pick git records no relationship between them. This is one of the downsides of cherry-pick but it is usually not a problem.

Note that if we hadn't gone and implemented D we could have done a merge from ifconvert to master. But it's often hard to predict how commits will be organized and presented to upstream before development of a feature is done. There are ways to back up history and put B and C on their own feature branch so as to aid merging and avoid cherry-picking to better preserve history [2].

Updating LLVM - Now with local changes!
---------------------------------------

We have the commits we want in the master branch but we should apply them against the latest upstream master. git rebase is the tool to do this. Just like before, we first fetch the upstream sources:

  git fetch llvm-upstream

        E-F{llvm-upstream/master}
       /
  ...-A-B'-C'{master}[HEAD]
       \
        B-C-D{ifconvert}

Now we want to take those upstream changes and apply our local changes on top of them. This maintains a linear history, making reviews much easier.

  git rebase llvm-upstream/master

There may be conflicts from this operation. If so, resolve them in the usual way. git has some tools to help but they are beyond the scope of this guide.

Our history looks like this:

  ...-A-E-F-B''-C''{master}[HEAD]
       \
        B-C-D{ifconvert}

where B'' and C'' are the new commits of B' and C' on top of the latest upstream master.

Note that this process is exactly the same as the previous update process. Keeping things consistent like this will help you develop a repeatable and thus debuggable process.
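The cherry-pick and rebase sequence above can be rehearsed end to end without touching the real LLVM repository. This sketch assumes a local directory playing the role of upstream; the directory names, the made-up identity and the single-letter commit subjects (chosen to match the dag labels) are all hypothetical.

```shell
set -e
work=$(mktemp -d)

# A local repository that plays the role of upstream, holding commit A.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo A > A.txt && git add A.txt && git commit -q -m "A"

# Our single clone, with the remote renamed as in "Naming Upstream".
git clone -q "$work/upstream" "$work/llvm"
cd "$work/llvm"
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote rename origin llvm-upstream

# Local development: B, C and D on the ifconvert branch.
git checkout -q -b ifconvert
for name in B C D; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "$name"
done

# Pick only B and C onto master, leaving D behind on ifconvert.
git checkout -q master
git cherry-pick ifconvert~2 > /dev/null
git cherry-pick ifconvert~1 > /dev/null

# Meanwhile, upstream gains E and F.
(
  cd "$work/upstream"
  for name in E F; do
    echo "$name" > "$name.txt"
    git add "$name.txt"
    git commit -q -m "$name"
  done
)

# Two-phase update: fetch, then replay our local commits on top.
git fetch -q llvm-upstream
git rebase -q llvm-upstream/master

# Subjects from oldest to newest, joined on one line.
history=$(git log --reverse --format=%s | paste -sd' ' -)
echo "$history"
```

Each commit touches its own file here, so the rebase cannot conflict; a real rebase against upstream may stop and ask for conflicts to be resolved.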
Now we are ready to begin the review process.

Sending Patches for Review
--------------------------

git includes a whole set of tools for managing the patch review process. We kick things off with git send-email:

  git send-email --annotate --compose --no-signed-off-by-cc --suppress-from \
    --thread HEAD~2..HEAD 2>&1 | tee email.out

HEAD~2..HEAD selects the two commits we want reviewed, B'' and C''. It's good practice to do this with the --dry-run switch first, just to make sure what will be sent is what you want. --annotate allows you to edit each e-mail before git sends it. --compose produces a cover letter for a patch series that you may edit to introduce the patches. This becomes a separate e-mail to which the patches are replies. Don't bother using --compose if you only have one patch to send.

This is a rather unwieldy command. There are several ways to mitigate this. You can create an alias in .git/config:

  [alias]
    email = send-email --annotate --no-signed-off-by-cc --suppress-from --thread

Then you would run:

  git email --compose HEAD~2..HEAD

You may also set some default switches in .git/config:

  [sendemail]
    smtpencryption = tls
    smtpserver =
    smtpuser =
    smtpserverport = 25
    thread = true
    chainreplyto = false
    to = llvm-commits@cs.uiuc.edu
    signedoffbycc = false
    from =
    suppressfrom = true

Fill in smtpserver, smtpuser and from with your own values. Then you would run:

  git send-email --annotate --compose HEAD~2..HEAD

If you don't set to/from/etc. in .git/config, git send-email will prompt you for them.

E-mails will get sent to the list with the subject "[PATCH n/2] <subject>" where "n" is the patch number (0 for the cover letter) and <subject> is the first line of the commit message. git help send-email gives all the details.

Some people prefer to look at all the e-mails in bulk before sending them. You can do this with git format-patch but we will not cover that here.

Your patches have been sent to llvm-commits for review. Interact over e-mail and respond to feedback.

Updating Patches
----------------

Your patches will probably require some editing. git rebase -i and git add -i are your friends.
For the typical case of editing your patches a bit, use git rebase -i:

  git rebase -i llvm-upstream/master

This rewinds history back to the last fetch you did from llvm-upstream (remember the convenient remote-tracking branch that got created?). git will then bring up an editor with a document that looks something like this:

  pick ef723 Start if conversion work
  pick 443de Middle of if conversion work, something interesting to commit

This is a control file you edit to state how git rebase should work. There are essentially four commands: pick, edit, fixup and squash.

"pick" tells rebase to apply that change as-is.

"edit" tells rebase to suspend working immediately after applying that commit. This allows you to edit the patch as necessary.

"squash" tells rebase to combine that commit with the previous one and apply them as a single change.

"fixup" is like squash except that the commit message of the squashed commit is discarded. This essentially does what it says: amends the previous commit with a fixup and commits the result with the original log message.

One can also reorder commits within the file to change the order in which they are applied. If you delete a line, that commit disappears from the rewritten history.

In this case, let's say that the first commit needs some work. We edit the control file to do that:

  edit ef723 Start if conversion work
  pick 443de Middle of if conversion work, something interesting to commit

After saving and quitting, git rebase does its work and we are left in this state, assuming no intervening upstream changes:

  ...-A-E-F-B''{master}[HEAD]
       \
        B-C-D{ifconvert}

Note that B'' is still applied. There are a couple of ways to proceed. One strategy is to back out B'', edit the files and re-commit it. However, that can involve some tricky "git reset" commands. Another, perhaps easier way to go is to just create a new commit.
First edit the files as needed, then commit the result:

  ...-A-E-F-B''-G{master}[HEAD]
       \
        B-C-D{ifconvert}

Now tell rebase to continue and finish up:

  git rebase --continue

  ...-A-E-F-B''-G-C''{master}[HEAD]
       \
        B-C-D{ifconvert}

The only remaining problem is that we have the fixup commit G, which shouldn't appear in the upstream history. We really want to combine it with B''. So we use git rebase again:

  git rebase -i llvm-upstream/master

  pick ef723 Start if conversion work
  pick 4542e Review fixup
  pick 443de Middle of if conversion work, something interesting to commit

We edit it like so:

  pick ef723 Start if conversion work
  fixup 4542e Review fixup
  pick 443de Middle of if conversion work, something interesting to commit

After saving the file and returning to git, git rebase will do its work.

  ...-A-E-F-B'''-C''{master}[HEAD]
       \
        B-C-D{ifconvert}

where B''' is the fixed-up version of B''.

Now we can either send the changes upstream or do another round of code review via git send-email.

Splitting Patches
-----------------

Sometimes reviewers request that a patch be broken up into smaller components. This is easy to do with git add -i. Let's say the first patch is too large.

  git rebase -i llvm-upstream/master

  pick ef723 Start if conversion work
  pick 443de Middle of if conversion work, something interesting to commit

Edit the first patch:

  edit ef723 Start if conversion work
  pick 443de Middle of if conversion work, something interesting to commit

After rebase suspends, back out the commit:

  git reset --mixed HEAD^

Your files are left in the "changed, but not staged for commit" state and will have to be added again. This is what you want because you need to add just a few files or parts of files. So decide what to add:

  git add -i

This brings up a little menu which lets you pick which files or hunks of files to add. Going through all the options is beyond the scope of this guide.
One good guide is here:

  http://book.git-scm.com/4_interactive_adding.html

After finishing the add process, do a commit:

  git commit

Do NOT do commit -a as that will re-add everything.

Go on to do some more add -i and commits to stage everything as desired. Then tell rebase to finish up:

  git rebase --continue

Merging Patches
---------------

Let's say we got a little overzealous in our splitting and our history looks like this:

  ...-A-E-F-B'''-G-H-I-J-C''{master}[HEAD]
       \
        B-C-D{ifconvert}

B''' is that part of B'' remaining from the original commit and G, H, I, J are bits of B'' split out into separate commits.

Again, rebase -i to the rescue!

  git rebase -i llvm-upstream/master

  pick bbff3 Start if conversion work
  pick 33421 Tinker a bit
  pick e23ac Turn that knob
  pick 887ef Almost there
  pick f93d2 Finish the beginning
  pick 443de Middle of if conversion work, something interesting to commit

This is a bit too fine-grained for our (and the reviewers') taste. So we want to combine H, I and J into one commit:

  pick bbff3 Start if conversion work
  pick 33421 Tinker a bit
  pick e23ac Turn that knob
  fixup 887ef Almost there
  fixup f93d2 Finish the beginning
  pick 443de Middle of if conversion work, something interesting to commit

We save the file, exit the editor and git rebase does its thing.

  ...-A-E-F-B'''-G-K-C''{master}[HEAD]
       \
        B-C-D{ifconvert}

Ah, much better. Now we are ready to send this stuff out into the world.

Sending to Upstream
-------------------

Finally, we are ready to send the patches. First make sure we are as up-to-date as possible:

  git fetch llvm-upstream master
  git rebase llvm-upstream/master

Now send your changes to upstream:

  git push llvm-upstream master

That's it!

Resources
---------

Here are some helpful git resources.

[1] Pro Git book: http://progit.org/
    This is a great book for grokking git. It explains the tools in a straightforward way. Get it in paperback for a great reference! All of the other resources will make much more sense after having read this book.
[2] git fetch/merge vs. pull: http://longair.net/blog/2009/04/16/git-fetch-and-merge/

[3] Git wiki: https://git.wiki.kernel.org/index.php/Aliases

[4] Git rebase workflow: http://mettadore.com/analysis/a-simple-git-rebase-workflow-explained/

The git manpages are pretty good once you are reasonably familiar with the tools. They contain a lot of good examples for various scenarios.
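As a closing sketch, the "Review fixup" step from Updating Patches can also be done without hand-editing the rebase control file. This is a swapped-in variation, not the procedure described in this guide: it relies on git commit --fixup and git rebase --autosquash, which arrange the control file automatically. The sandbox directory, the made-up identity and the upstream-base tag (standing in for llvm-upstream/master) are assumptions for illustration.

```shell
set -e

# Hypothetical sandbox repository.
sandbox=$(mktemp -d)
cd "$sandbox"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# A base commit; the tag is a stand-in for llvm-upstream/master.
echo base > base.txt && git add base.txt && git commit -q -m "Upstream base"
git tag upstream-base

# The two patches under review.
echo one > one.txt && git add one.txt
git commit -q -m "Start if conversion work"
echo two > two.txt && git add two.txt
git commit -q -m "Middle of if conversion work"

# Review feedback on the first patch, recorded as a fixup of that commit.
echo "one, revised" > one.txt
git add one.txt
git commit -q --fixup=HEAD~1

# --autosquash moves the fixup next to its target in the control file.
# GIT_SEQUENCE_EDITOR=true accepts that file unchanged, so no editor opens.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash upstream-base

count=$(git rev-list --count upstream-base..HEAD)
git log --oneline
```

The result is the same shape as the hand-edited fixup: two commits on top of the base, with the review change folded into the first one and its original log message kept.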