[PATCH] Switch from i32i8imm to i8 for immediate operands on x86 vector element shift instructions.

Lang Hames lhames at gmail.com
Wed Oct 16 11:17:32 PDT 2013


Hi Craig,

Thanks very much for taking a look at this. You're right that, for the IR
shift instructions, shifting by more than the width of the source is
undefined. However, the test case I'm looking at uses the llvm.x86.sse2.ps*i
intrinsics. I think the semantics of these intrinsics should be the same as
their corresponding x86 instructions, which do define the behavior for
larger shifts. What do you think?

Cheers,
Lang.



On Wed, Oct 16, 2013 at 12:08 AM, Craig Topper <craig.topper at gmail.com>wrote:

> Isn't this considered undefined behavior? For non-vectors, a constant
> shift that's too big is converted to undef.
>
>
> On Tue, Oct 15, 2013 at 12:48 PM, Lang Hames <lhames at gmail.com> wrote:
>
>> The immediate versions of current x86 vector element shift instructions
>> use i32i8imm as the type of the immediate. The i32i8imm type is for 32bit
>> wide immediates with only 8 significant bits of information.
>>
>> I suspect this is done to make the size of the shift operand the same
>> (always 32 bits) regardless of whether a register or immediate is used. It
>> is responsible for a serious bug though: The instruction only encodes 8
>> bits for immediate shifts, so when using i32i8imm the higher bits (>=bit 8)
>> are simply dropped. This leads, for example, to a left-shift by 0x101 of a
>> 4xi32 value becoming a left shift by 0x01 when encoded. That's obviously
>> not correct.
>>
>> The attached patch switches the type of the immediate to i8. Logical
>> shifts by sizes more than the element width are turned into constant zeros.
>> Arithmetic shifts are capped at element width - 1 bits, E.g. A shift by
>> >=32 becomes a shift by 31 for an N x i32 vector. Alternatively, the
>> arithmetic shifts could be capped at the highest encodable shift amount
>> (255). My preference is for the tighter bound though - in theory it could
>> cause more shifts to be CSE'd (though in practice I don't expect it to come
>> up often).
>>
>> The patch passes the regression tests (with minor modifications) and a
>> nightly test suite run.
>>
>> Intel experts - please let me know if you see any issues with this patch,
>> otherwise I'll go ahead and apply it to fix the bug.
>>
>> Cheers,
>> Lang.
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>
>>
>
>
> --
> ~Craig
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131016/28d859c5/attachment.html>


More information about the llvm-commits mailing list