[PATCH] D26546: [PPC] Add vec_insert4b/vec_extract4b to altivec.h

Sean Fertile via cfe-commits cfe-commits at lists.llvm.org
Mon Nov 14 19:02:31 PST 2016


sfertile added inline comments.


================
Comment at: lib/Headers/altivec.h:11908
+#define vec_extract4b(__a, __b)                                                \
+   vec_reve((vector unsigned long long)                                        \
+                __builtin_vsx_xxextractuw((__a), (12 - (__b & 0xF))))
----------------
nemanjai wrote:
> I find it difficult to follow and understand this logic when it's in the header.
> What I'd prefer to see here is that the macro simply expands into `__builtin_vsx_xxextractuw` and then handle all this logic in the code that emits an intrinsic call.
> Namely if the target is little endian, we adjust the parameter, emit the intrinsic call and finally emit a shufflevector.
I think this is a good idea, looking at the code its not obvious what is going on.


================
Comment at: lib/Headers/altivec.h:12014
+#define vec_insert4b(__a, __b, __c) \
+  ((vector unsigned char)__builtin_vsx_xxinsertw((__a), (__b), (__c) & 0xF))
+#endif
----------------
kbarton wrote:
> nemanjai wrote:
> > As far as I can tell by looking at this patch and the corresponding back end patch, the `__a` argument will have a word inserted into it and it will be returned.
> > 
> > Is that the semantics that the ABI specifies (I can't seem to make sense of the description).
> > 
> > ```
> > vector unsigned int a = { 0xAAAAAAAA, 0xBBBBBB, 0xCCCCCC, 0xDDDDDD };
> > vector unsigned char b = (vector unsigned char) 0xFF;
> > vector unsigned char c = vec_insert4b(a, b, 4);
> > // Do we expect vector c to be:
> > // { 0xAA, 0xAA, 0xAA, 0xAA, 0xFF, 0xFF, 0xFF, 0xFF, 0xCC, 0xCC, 0xCC, 0xCC, 0xDD, 0xDD, 0xDD, 0xDD }
> > ```
> I think the current version of the ABI document has an error in it. The description of the vec_insert4b is identical to the vec_extract4b, so I expect it was copy/pasted in error. I think we need to open up an (internal) bug against the ABI and wait for clarification to complete this. 
You have it correct Nemanja, word 1 will be extracted from b, and it will get inserted into a. The word will be inserted at the byte position starting at the 3rd argument. (so in this case byte offsets 4 to 7)

I talked to Bill Schmidt earlier today and he already has a bug open. It hasn't been updated yet, but it should roughly correspond to the description for the xxinsertw instruction:

The contents of word element 1 of VSR[XB] are placed
into byte elements UIM:UIM+3 of VSR[XT]. The contents
of the remaining byte elements of VSR[XT] are not
modified.


Repository:
  rL LLVM

https://reviews.llvm.org/D26546





More information about the cfe-commits mailing list