[PATCH] D12032: Vector element extraction without stack operations on Power 8

Tue Aug 18 13:28:51 PDT 2015

wschmidt added inline comments.

================
Comment at: lib/Target/PowerPC/PPCInstrVSX.td:1284
@@ +1283,3 @@
+  dag LE_BYTE_1 = (i32 (EXTRACT_SUBREG (RLDICL LE_DWORD_0, 56, 8), sub_32));
+  dag LE_BYTE_2 = (i32 (EXTRACT_SUBREG (RLDICL LE_DWORD_0, 48, 16), sub_32));
+  dag LE_BYTE_3 = (i32 (EXTRACT_SUBREG (RLDICL LE_DWORD_0, 40, 24), sub_32));
----------------
wschmidt wrote:
> This code concerns me, starting with LE_BYTE_2.  I see from your test cases that it happens to work, but it looks fragile to me.
> 
> If you have bytes 7 6 5 4 3 2 1 0 and apply RLDICL 48, 16, you will get
> X X X X X X 3 2 where the Xs have been cleared to zero.  The correct answer is X X X X X X X 2.
> 
> For RLDICL 40, 24, you will get X X X X X 5 4 3, which is incorrect for both byte and halfword extraction.
> 
> The pattern you need is (RLDICL LE_DWORD_0 48, 8), followed by 40,8, etc.  Then you need separate patterns for LE_HWORD that uses 16 instead of 8.
> 
> Somehow in your tests you are ending up with extsb instructions that make this work, which I really don't understand based on the code you're specifying here (which just creates an i32, not an i8).  I am concerned that those are an artifact that could disappear.  Can you explain how those come to be generated?
Ah, I see why.  Your tests are returning an i8, which forces the conversion from i32 to i8 that generates the extsb.  Without that, the incorrect code generation would be exposed.

Repository:
  rL LLVM

http://reviews.llvm.org/D12032