[llvm-dev] [RFC] mir-canon: A new tool for canonicalizing MIR for cleaner diffing.

Puyan Lotfi via llvm-dev llvm-dev at lists.llvm.org
Mon Aug 21 23:45:26 PDT 2017


Ping.

Still working on preparing code for review. Will have a patch for review
ready in the coming days.

PL

On Tue, Aug 15, 2017 at 12:06 PM Puyan Lotfi <puyan.lotfi.llvm at gmail.com>
wrote:

> Hi,
>
>
> My name is Puyan and I've been exploring ways to improve the state of
> instruction level diffing using llvm and MIR. Below is a proposal for a new
> llvm tool meant to address issues encountered when diffing at the machine
> level. I'm eager to hear the community's feedback.
>
>
> Thanks
>
>
> PL
>
>
> mir-canon: A new tool for canonicalizing MIR for cleaner diffing.
>
>
> Problem Statement and Context:
>
> Diff-tools are regularly used for comparing IR and assembly files. This
> often involves reasoning through differences that are semantically
> equivalent and it can be very time consuming for a person to do said
> reasoning.
>
> Specifically in the context of GlobalISel development there are problems
> of correctness verification. There is a need to compare two programs,
> compiled from identical IR by two different instruction selectors in a way
> where the true differences stand out.
>
>
> Proposal:
>
> We propose a new tool that we have tentatively named 'mir-canon' that
> performs canonical transformations on MIR. The goal is for MIR
> pre-processed with mir-canon to show fewer differences than if it were not
> pre-processed.
>
> At the time of this writing we have a prototype canonicalization tool.
> We’ve come up with some techniques that show promise and would like to open
> discussion with the community to get feedback and suggestions on refining
> said techniques. Currently we think of this as an open ended project.
>
>
> Techniques:
>
> Our prototype does the following for each basic block in a Reverse Post
> Ordering:
>
>
>    -
>
>    Canonical instruction ordering is done by moving a given instruction
>    as close to the nearest use of its def as possible.
>
>
>
>    -
>
>    Next, canonical VReg renaming is done by building a collection of
>    candidate instructions that can be thought of as sinks in the def-use
>    graph: they are typically instructions that write to physical registers or
>    store to memory. These candidates are used as the root of a breadth first
>    walk over the vreg operand def-use graph that determines a canonical vreg
>    ordering.
>
>
>
>    -
>
>    Using said canonical vreg ordering we rename monotonically, but before
>    we do this we skip several vreg values in order to increase the chance that
>    we land on the same vreg number for two different input MIR files. We also
>    do this to reduce the chances that a difference in previously def-use walks
>    will affect the vreg renaming for subsequent walks. This skipping step
>    could be thought of as a kind of vreg number reckoning: we skip modulo n
>    vregs so that we are likely to land on the same vreg for two different
>    files.
>
>
>
> This approach is completely agnostic of ISA specific semantics so it
> should work for any target.
>
>
> Current status:
>
> At the moment we have a standalone llvm tool that uses a single pass to do
> the above described transformations. We have test inputs that show promise
> but we still need a wider set of tests as well as hard metrics.
>
> Our approach processes a single file at a time. The primary benefit to
> this approach is lower complexity in initial implementation and exploration
> of building such a tool. We are open to other approaches such as an
> llvm-diff like (two file at a time) approach, but we have not explored that
> avenue fully yet.
>
> We’re eager to hear community feedback and will be ready to share patches
> for review soon.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170822/f547aabc/attachment.html>


More information about the llvm-dev mailing list