[LLVMdev] MC Hammer Test results

Thu May 10 09:44:54 PDT 2012

Hello everyone

At EuroLLVM I presented some testing work we have been doing on improving
correctness of the MC Layer for ARM. There seemed to be interest from the
community in seeing the results of this test suite.

Background
-----------
We are using a test suite, called MC Hammer, that compares MC with an ARM
in-house implementation of the same functionality. The test space for this suite
is very large (O(10 trillion) points), so we are concentrating on small slices
at a time.

For further details you can check out the talk I did at EuroLLVM last month:
http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

Results
--------
The results below are:

 - for ARM instructions (i.e. not Thumb instructions)
 - for Cortex-A8 with VFPv3 and Advanced SIMDv1 extensions
 - for the encode/decode loop, described on slide 11 of the talk
 - for all instruction encodings with condition code AL (all 32-bit patterns
with the top 4 bits set to 0xE) [1]
 - for all silent codegen bugs [2], that is, bugs where:
   - The reference bitpattern is defined and predictable.
   - The generated bitpattern is defined.
   - The generated bitpattern differs from the reference bitpattern.
 - for LLVM r156468 (updated at 2012-05-09 08:08:58 BST.) 
 - for LLVM built with no assertions[3]
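
The three conditions for a silent codegen bug can be read as a simple
predicate. The following is a minimal sketch in Python; the Result record and
its field names are hypothetical, chosen for illustration, and are not MC
Hammer's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical record for one side of a comparison (not MC Hammer's real data model)."""
    bits: Optional[int]        # 32-bit pattern, or None if undefined
    predictable: bool = True   # only consulted for the reference side


def is_silent_codegen_bug(reference: Result, generated: Result) -> bool:
    """Apply the three conditions from the list above."""
    return (
        reference.bits is not None            # reference bitpattern is defined ...
        and reference.predictable             # ... and predictable
        and generated.bits is not None        # generated bitpattern is defined
        and generated.bits != reference.bits  # and differs from the reference
    )
```

Under this reading, a mismatch against an undefined or unpredictable reference
pattern is a different class of failure and is not counted here.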

Attached is a zipped copy of the full log from MC Hammer. 

Also attached is a triaged summary of these logs, showing that there are five
bugs found (summary from log reproduced below [4]). Currently we have patches
for three (bugs 1-3) of these that are ready to go upstream. We are actively
working on a patch for one other (bug 5). We would, of course, welcome a patch
for bug 4 (I have put a proposal for the fix in the attached report).

Unless there is a preference otherwise, the next set of results will be the list
of silent codegen bugs for the 0xF slice. I will then move to the list of silent
codegen bugs for the disassemble and assemble loops (slides 10 and 12
respectively) but, like MC Hammer's DJ, I can take requests.

Regards,

Richard Barton
ARM Ltd, Cambridge

====================

[1] The slice is run with only one condition code to reduce the size of the
logs, that is to say for all 32-bit patterns with the top 4 bits set to 0xE.
Running the whole space over all 32 bits hits the same bugs repeatedly, once for
each condition code. Condition code 0xF is for instructions that can only be
executed unconditionally. Notably, this includes most of the VFP instructions. 
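
To make the size of the slice concrete: fixing the top 4 bits leaves 28 free
bits, so one condition-code slice holds 2^28 patterns. The sketch below also
shows how the byte lists passed to llvm-mc in [4] map onto 32-bit words,
assuming little-endian byte order (which matches the 0xe7c20090 example in
bug 4). The helper names are illustrative and not part of MC Hammer.

```python
SLICE_SIZE = 1 << 28  # patterns in one condition-code slice (28 free bits)


def bytes_to_word(byte_list):
    """Combine four bytes, in the order given to llvm-mc, into a 32-bit word."""
    return int.from_bytes(bytes(byte_list), "little")


def cond_field(word):
    """Return the condition code held in the top 4 bits."""
    return (word >> 28) & 0xF


def in_al_slice(word):
    """True if the pattern belongs to the AL (0xE) slice."""
    return cond_field(word) == 0xE


word = bytes_to_word([0xB0, 0x00, 0x80, 0xE6])  # bytes from bug 1
print(hex(word), in_al_slice(word), SLICE_SIZE)
```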

[2] Other types of failure can be detected by MC Hammer. These results can be
published if there is interest in seeing them.

[3] LLVM built with assertions turned on hits SIGABRT on some bitpatterns. When
this occurs the rest of the slice is not run, so to cover the whole slice we are
using a version of LLVM with no assertions. One would expect an assertion to
catch a genuine codegen failure early, so hopefully bugs that would have
triggered an assertion will still show up with assertions turned off.

[4] (reproduced bug triage from the error output)

[bug 1] echo 0xb0 0x00 0x80 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable SEL r0, r0, r0. MC is
decoding this to an STR r0, [r0], r0, lsr #1 which it is incorrectly diagnosing 
as unpredictable.

[bug 2] echo 0x70 0x01 0x80 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable SXTAB16 r0, r0, r0. MC is
decoding this to an STR r0, [r0], r0, ror #2 which it is incorrectly diagnosing 
as unpredictable.

[bug 3] echo 0x70 0x01 0xc0 0xe6 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

This bitpattern should decode to an unpredictable UXTAB16 r0, r0, r0. MC is
decoding this to an STRB r0, [r0], r0, ror #2 which it is incorrectly diagnosing
as unpredictable.
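
One way to see why the STR/STRB decodings in bugs 1-3 are suspect: in the ARM
top-level encoding table, patterns with bits [27:25] = 0b011 split on bit 4.
With bit 4 clear they are register-offset load/stores; with bit 4 set they
fall in the media instruction space, which is where SEL, SXTAB16 and UXTAB16
live. The sketch below is based on that table in the ARM ARM and is not taken
from MC's decoder; the function name is illustrative.

```python
def top_level_class(word):
    """Classify a 32-bit ARM pattern using bits [27:25] and bit 4 only."""
    op1 = (word >> 25) & 0x7
    bit4 = (word >> 4) & 0x1
    if op1 == 0b011:
        return "media" if bit4 else "load/store (register offset)"
    return "other"


# The three patterns from bugs 1-3, as 32-bit words.
for word in (0xE68000B0, 0xE6800170, 0xE6C00170):
    print(hex(word), top_level_class(word))
```

All three patterns have bit 4 set, so on this reading none of them belongs to
the register-offset store encodings.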

[bug 4] echo 0x90 0x00 0xc0 0xe7 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble
        echo 0x90 0x01 0xc0 0xe7 | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

These bitpatterns decode to a BFI with an invalid mask operand, which is
unpredictable. The first example fails with an abort when assertions are turned
on, and otherwise creates the instruction BFI r0, r0, #32, #-32.

The second example does not abort and decodes to BFI r0, r0, #1, #2
(0xe7c20090). 

The ARMARM could be clearer on this point, but the real UAL should be

BFI r0, r0, #lsbit, #(msbit+1-lsbit), that is, BFI r0, r0, #3, #-2 for the
second example.

In my opinion, the root cause of the problem is that BFI MCInsts store the
mask as a 32-bit operand and convert to and from the msbit and lsbit fields
during encode, decode, assemble and disassemble. I think it should store the 
msbit and lsbit fields as operands and compute the mask at the instruction 
selection phase.
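
The field arithmetic behind these examples can be checked directly. The
following is a minimal sketch, assuming the A1 BFI encoding layout from the
ARM ARM (msbit in bits [20:16], lsbit in bits [11:7]). The function names are
illustrative and this is not MC's implementation.

```python
def bfi_fields(word):
    """Extract (lsbit, msbit, width) from a BFI bitpattern."""
    msbit = (word >> 16) & 0x1F
    lsbit = (word >> 7) & 0x1F
    width = msbit + 1 - lsbit  # goes to zero or negative when msbit < lsbit
    return lsbit, msbit, width


def bfi_encode(rd, rn, lsbit, width, cond=0xE):
    """Encode BFI rd, rn, #lsbit, #width for a valid (positive) width."""
    msbit = lsbit + width - 1
    return ((cond << 28) | (0b0111110 << 21) | (msbit << 16)
            | (rd << 12) | (lsbit << 7) | (0b001 << 4) | rn)


print(bfi_fields(0xE7C00090))        # first example
print(bfi_fields(0xE7C00190))        # second example
print(hex(bfi_encode(0, 0, 1, 2)))   # what MC produces for the second example
```

Keeping lsbit and msbit as separate values, as in bfi_fields, preserves the
invalid msbit < lsbit case that a single mask operand cannot represent.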

[bug 5] echo 0x03 0x0b 0x80 0xec | ./llvm-mc -triple armv7 --show-inst --show-encoding --disassemble

The bitpattern is decoding as VSTMIA r0, {d0} when it should decode to FSTMIAX
r0, {d0}.

These instructions are a bit of a curiosity in that they are pre-ARMv6 (VFPv1)
instruction mnemonics which were not superseded by UAL-style V* mnemonics. They
still exist in VFPv4 but their use is deprecated. Any VSTMs with odd-numbered
imm8 fields (bottom 8 bits) are the old-style F* encodings, and the encoding
for the immediate is different.
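
The imm8 parity rule can be sketched as a one-line check. This assumes the
word is already known to be a doubleword VSTM-style encoding and that imm8 is
the bottom 8 bits, as stated above; the helper name is illustrative.

```python
def is_old_style_f_encoding(word):
    """True if imm8 (bits [7:0]) is odd, i.e. the old-style F* form."""
    imm8 = word & 0xFF
    return imm8 % 2 == 1


word = int.from_bytes(bytes([0x03, 0x0B, 0x80, 0xEC]), "little")  # bug 5
print(hex(word), is_old_style_f_encoding(word))
```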

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ca8_ARM_enc_dec_alcond_diffsonly_raw.rpt.bz2
Type: application/octet-stream
Size: 1272798 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120510/40dc6f5e/attachment.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ca8_ARM_enc_dec_alcond_diffsonly.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120510/40dc6f5e/attachment.txt>