[llvm-dev] Commit module to Git after each Pass
Alexandre Isoard via llvm-dev
llvm-dev at lists.llvm.org
Fri Jun 15 10:49:10 PDT 2018
On Fri, Jun 15, 2018 at 9:52 AM Troy Johnson via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> > FWIW: We could also just have a mode that dumps 1 file per pass. That is
> > enough to make it convenient/easy to run diff between passes. (And if you
> > wanted to you could still make a git repository out of it with an
> > external script).
> > - Matthias
> I have done this before and would strongly encourage this approach as
> opposed to direct emission to std[out|err] or directly involving a source
> control system. The most convenient way was to add an additional option,
> -print-to-files, which modified the behavior of -print-after-all,
> -print-before-all, etc. The filename was constructed by massaging the pass
> name to comply with file system naming conventions and prepending a
> monotonically increasing integer (with suitable leading zeros) plus "bef"
> or "aft" to indicate sequencing. The only awkward part was modifying
> createPrinterPass to accept a filename, which had to be done because
> otherwise you end up having to keep each stream open from the time you
> set up the pass pipeline until the printing pass actually runs.
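For reference, the naming scheme described above can be sketched roughly as
follows. This is only an illustration: the function name dump_filename, the
field order, the 4-digit width, and the ".ll" suffix are assumptions, not the
actual patch.

```python
import re


def dump_filename(index: int, pass_name: str, when: str, width: int = 4) -> str:
    """Build a per-pass dump filename, e.g. '0007-aft-Loop_Vectorize.ll'.

    Hypothetical helper: the layout (counter, then bef/aft, then pass name)
    is one possible reading of the scheme, not the original implementation.
    """
    if when not in ("bef", "aft"):
        raise ValueError("when must be 'bef' or 'aft'")
    # Replace every run of characters that is not a letter, digit, dot,
    # or dash with a single underscore, so the name is file-system safe.
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", pass_name).strip("_")
    # Zero-pad the counter so that lexical order matches pipeline order.
    return f"{index:0{width}d}-{when}-{safe}.ll"
```

The zero padding is what keeps a plain directory listing in pipeline order,
which is what makes diffing consecutive files convenient.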
That was exactly the implementation we had, and it produced far too many
files for our file system: we would have had to create a subfolder every
~100 passes. It also took a lot of disk space, and the only metadata we
could store was in the file name. Do you skip passes that don't change the
module? How do you store the messages about missed optimization
opportunities?
On the other hand, with git, I can store much more in the commit message (I
actually extended the thing to allow a pass to tag a commit, and I am
planning to allow passes to print into the commit message itself).
Yesterday, I wanted to see where the compiler diverges when I tweak the SCEV
reduction rules. I ran the compiler once, switched the branch back to the
beginning, and did a second run with my modification; the git history then
automatically identifies the identical commits. That is, I directly get, in
the git history tree, the divergence point between the two versions.
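A minimal model of why this works, assuming each commit's identity covers
both its content and its parent: two runs share commit ids exactly up to the
first pass whose output differs. The names commit_ids and divergence_index
are made up for this sketch. Real git commit ids also cover author and
committer metadata, so the runs would presumably need deterministic
timestamps for commits to come out identical.

```python
import hashlib


def commit_ids(snapshots):
    """Chain-hash (pass_name, ir_text) pairs, folding in the parent id.

    This only models the idea of a commit id; it is not git's object format.
    """
    ids, parent = [], ""
    for pass_name, ir_text in snapshots:
        h = hashlib.sha1()
        h.update(parent.encode())
        h.update(b"\0" + pass_name.encode() + b"\0" + ir_text.encode())
        parent = h.hexdigest()
        ids.append(parent)
    return ids


def divergence_index(run_a, run_b):
    """Return the index of the first snapshot whose id differs, or None."""
    ids_a, ids_b = commit_ids(run_a), commit_ids(run_b)
    for i, (a, b) in enumerate(zip(ids_a, ids_b)):
        if a != b:
            return i
    # One run is a strict prefix of the other: they part ways where it ends.
    if len(ids_a) != len(ids_b):
        return min(len(ids_a), len(ids_b))
    return None
```

Because the parent id is folded into each hash, every commit after the
divergence point differs too, even if a later pass happens to produce the
same module text in both runs.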
And that's just scratching the surface. Git is designed to be a
version control system, true, but it can also be repurposed into a
tremendous toolbox.
I would seriously encourage going in the "git fast-import" direction, or
a semantically equivalent output format that we post-process, because I
think it would simplify the implementation (especially to allow a pass to
dump anything into the commit message). But don't pass up the actual
benefits of having a version control system backend.
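To make the fast-import idea concrete, here is a rough sketch of a
post-processing script that turns per-pass snapshots into a stream for
"git fast-import". The function name, the ref name, the committer identity,
the fixed timestamp, and the "module.ll" path are all placeholders chosen for
the example; only the stream commands (commit, committer, data, M ... inline)
come from git itself.

```python
def fast_import_stream(snapshots, ref="refs/heads/passes",
                       committer="pass-dump <pass-dump@example.invalid>",
                       timestamp=1500000000):
    """Return a git fast-import stream with one commit per snapshot.

    snapshots: iterable of (pass_name, note, ir_text) tuples. The note ends
    up in the commit message body, which is where a pass could put remarks.
    The timestamp is an arbitrary fixed value so reruns are deterministic.
    """
    out = bytearray()
    for pass_name, note, ir_text in snapshots:
        message = f"{pass_name}\n\n{note}\n".encode()
        blob = ir_text.encode()
        out += f"commit {ref}\n".encode()
        out += f"committer {committer} {timestamp} +0000\n".encode()
        # 'data' takes an exact byte count followed by the raw bytes.
        out += b"data %d\n" % len(message) + message
        # No 'from' line: within one stream, fast-import uses the current
        # tip of the ref as the parent, and a new ref starts with no parent.
        out += b"M 100644 inline module.ll\n"
        out += b"data %d\n" % len(blob) + blob
        out += b"\n"
    return bytes(out)
```

The resulting bytes would be piped into "git fast-import" inside an
initialized repository. Keeping the committer line fixed is a deliberate
choice here, since identical inputs then yield identical commits.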