[llvm-commits] [llvm] r67849 - /llvm/trunk/test/CodeGen/PowerPC/int-fp-conv-0.ll

Misha Brukman brukman at gmail.com
Wed Apr 1 13:33:56 PDT 2009


On Tue, Mar 31, 2009 at 9:04 PM, Chris Lattner <clattner at apple.com> wrote:

> On Mar 30, 2009, at 1:34 PM, Misha Brukman wrote:
>
> I'm proposing we test passes in C++ instead of a shell script that calls
> grep.
>
> What is the real value of doing this?  The costs I see are 1) increased
> link time and 2) more dependence on the API (more to update when an API
> changes).
>

I address your issue (1) in #3 below.  I accept your point (2) as valid, but
based on personal experience, although it's a non-zero effort, it's not as
painful as you imagine, and coupled with the benefits below, I think it's
worthwhile.  Consider it a tradeoff (of CPU time for human time), rather
than simply an added cost.

Let's look at the benefits:

1) Tests can be written at any level of the LLVM API, so you can write tests
to verify a single function, check that a class modifies its state properly
(even private state), verify a pass output, or verify that llc or opt does
everything end-to-end correctly.  For instance, let's say we have a
transformation that's dependent on an analysis, and say we have a test case
where the transformation should have no effect.

We could test with opt and diff to make sure the output is the same as the
input, but that's not really testing what we think it is: it's testing the
interaction of the two, so a bug in one can mask a bug in the other.  In
other words, there are many code paths that can "return false"
along the way, and cause the transformation not to be applied, but do you
really know WHICH one is responsible?

With a C++ test, you can test the analysis pass separately and test its
outputs on one test, and then in another test, inject a Mock (or Fake)
analysis object with known values, and see how the pass behaves.  This lets
you test each piece of code independently, and prevents complex interactions
from messing with your tests.

2) Although LLVM may have high line-coverage via large tests, we don't know
what code-path coverage it has, which is also important.  If we can only
test at the granularity of a pass, N independent decisions give 2**N code
paths, and it's hard
to contrive .ll inputs that will cause function foo() to be called with
inputs bar and quux (also makes debugging more difficult than it should be,
see #3 below).  With a single test, you can synthesize exactly what you want
the inputs to be, thus avoiding single-stepping through a debugger when you
don't need to.  So instead of writing, say O(2**N) tests, you can write O(N)
tests (in number, not runtime!) to get much better path coverage.

3) Currently, when something is broken, bugpoint is introduced to narrow
down the test case to something small, and then the developer steps through
the code with a debugger to (hopefully) find which part of which data
structure isn't being updated properly.  I have personally been spending
*MUCH* less time interacting with debuggers thanks to testing -- instead of
setting a break point and manually verifying state at function entry/exit, I
can set up the test scenario, and call EXPECT() to verify what I think it
should be, and let it run!  I'll quickly know if it's a problem or a red
herring.  This works for any granularity of code, and is really valuable.

Without tests, each developer manually repeats the debugging via "Oh, I
think it's caused when X happens and Y interacts with Z!".  If you commit a
test for that scenario to the repository, anyone who thinks it's the culprit
can just re-run the test and see that no, that guess was wrong; try again.

This will save everyone a LOT of time, and given that developer time is more
valuable than CPU time, I would rather have the machine compile/link/run
tests (which is inactive time for me -- I can switch to working on something
else), than firing up my debugger to single-step through code and inspect
complex data structures that could and should be verified automatically,
given my specification (in the form of a test).

4) There are other things we cannot test via opt/grep directly, or even
llvmirgrep:
(a) anything that doesn't have text-based output to grep through, e.g.
  (i) analysis passes (alias analysis, etc.)
  (ii) PassManager (e.g., verify pass ordering in the presence of
FunctionPasses and ModulePasses, etc. that's independent of code being run)
(b) things that read or emit data are only tested indirectly: bitcode
reader/writer, TableGen, code emitters, etc.

Bottom line -- I see at least 2 open bugs on Andersen's that would benefit.
How about we expose its class in a header to be able to write tests and see
how it goes?  And there is a bug for a pass that seems to run out of memory
that might also benefit.  And next time there's some difficult-to-debug
problem that you see as similar to the above, please file a bug and assign
it to me, I'll see what I can do for it with some targeted tests.  How does
that sound?

> How else can we check the correctness of a pass, which involves
> pattern-matching multiple instructions?  Grep can only look at a single line
> at a time, and doesn't have the semantic knowledge of what's an instruction
> and what's an operand, whereas we have all this information in C++.
>
> I think the answer is that we need to build *one* small "llvmirgrep" tool
> that does this, which could be used by many different tests.
>

This is a tool that doesn't exist yet, which means it will require time and
effort to design and implement it, as well as (dare I say it?) test it.  But
LLVM already has something like this in C++, llvm::PatternMatch.  I was
thinking this could be used to pattern-match on LLVM expressions to verify
the transformations.


> I don't know exactly the syntax for it, but I think it should be something
> like the tool we use in clang.  In clang, we write tests for diagnostics
> like this:
>
[snip example]

> In this case, the tool is built into clang, but it doesn't need to be.  The
> -verify mode basically runs the compiler as normal, but captures the
> generated diagnostics to a buffer.  It then scans the .c file for the
> "expected-" comments and "diffs" the generated and expected diagnostics.
>

Sure, this works because the diagnostic is on the same line as the error,
i.e., you don't need to pattern-match multiple lines.

 If we had an llvmirgrep tool, then we could do something like this:
>
> ; RUN: llvm-as < %s | opt -instcombine | llvm-irgrep %s
>
> define i32 @test(float %f) {
>       ; irgrep {%t1 = bitcast float %f to i32}
>       ; irgrep {ret i32 %t1}
>         %tmp7 = insertelement <4 x float> undef, float %f, i32 0
>         %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
>         %tmp19 = extractelement <4 x i32> %tmp17, i32 0
>         ret i32 %tmp19
> }
>
> Which would be smart enough to require that those two instructions be in
> that basic block of that function, and it would do a fuzzy match to ignore
> the fact that the generated code names the first instruction "%tmp19".
>

The syntax will get complex when you want to have a pass that deletes an
instruction, or moves it outside of a loop, or breaks up a loop header, etc.

What you want is structural equivalence, which is what I think
llvm::PatternMatch would do for instructions, or llvm-irgrep will
re-implement.  If it re-uses that API, we should just write tests with it
and extend when necessary.  If it's re-creating the API, then we're
duplicating effort.

Since we're hand-waving, let me hand wave some testing code.

TEST(InstCombine, BitCast) {
  // ... create the IR using IRBuilder or another method ...
  Function *F = ...;

  InstCombine.runOnFunction(*F);

  // We want to see a single BasicBlock with this code:
  //   %y = bitcast float %x to i32
  //   ret i32 %y
  EXPECT_EQ(1U, F->size());
  const BasicBlock &BB = F->front();
  const ReturnInst *Ret = dyn_cast<ReturnInst>(BB.getTerminator());
  ASSERT_TRUE(Ret != 0);
  Value *X;
  ASSERT_TRUE(llvm::PatternMatch::match(Ret->getReturnValue(),
                                        m_BitCast(m_Value(X))));
  EXPECT_EQ(Type::FloatTy, X->getType());
  EXPECT_EQ(Type::Int32Ty, Ret->getReturnValue()->getType());
}

Alternatively, if we write a test helper function to compare equality of two
BasicBlocks, we can just do:

  BasicBlock *ExpectedBB = ...;
  EXPECT_TRUE(AreBasicBlocksEqual(*ExpectedBB, F->front()));

after we set up the ExpectedBB with the code we want it to have.  It sounds
like llvm-irgrep will have to have this functionality anyway to run your
example, but here, you can compare anything structurally, e.g. the flow
graph, or even an entire Module, which will probably be hard to express in
text for llvm-irgrep.

Misha