[PATCH] D10972: Add missing builtins to altivec.h for ABI compliance (vol. 3)
Bill Schmidt
wschmidt at linux.vnet.ibm.com
Thu Jul 9 06:39:19 PDT 2015
wschmidt added inline comments.
================
Comment at: lib/Basic/Targets.cpp:1015
@@ -1014,2 +1014,3 @@
DiagnosticsEngine &Diags) {
+ bool VSXDisabled = false;
for (unsigned i = 0, e = Features.size(); i !=e; ++i) {
----------------
nemanjai wrote:
> wschmidt wrote:
> > This way of handling the various -m arguments positionally is not quite right. If both -mvsx and -mno-vsx are present in the command line, the last one seen wins. That means that if the user specifies
> >
> > -mno-vsx -mpower8-vector -mvsx
> >
> > we should turn on both the VSX and the P8Vector features. The logic you are using will only turn on the VSX feature because the specification of -mpower8-vector happened to be in between -mno-vsx and -mvsx.
> >
> > Also, it is an error to specify -mpower8-vector or -mdirect-moves if -mno-vsx is the switch value that wins due to occurring last.
> >
> > So I suggest that you track each feature separately during this loop, obtaining individual values for HasVSX, HasP8Vector, and HasDirectMove. Following the loop, if !HasVSX && (HasP8Vector || HasDirectMove), report an error and fail the compilation.
> Thank you for this comment. I think I know what and how we need to do (this matches the gcc behaviour):
> - all three are set by default for ppc64le/pwr8
> - -mno-vsx disables all three defaults
> - if an -mno-vsx wins (appears after any -mvsx) and the user explicitly specifies one of the other two, a diagnostic is emitted
Agreed, that is more complete than what I wrote.
================
Comment at: lib/Headers/altivec.h:7013
@@ -6712,2 +7012,3 @@
+ return (vector signed long long)__res;
}
----------------
nemanjai wrote:
> wschmidt wrote:
> > Seems like too much casting above. It's ok to just use the original code here (and make "vector long long" into "vector unsigned long long"). The instruction will mask off all but the rightmost 6 bits of each element of b, so the signedness doesn't matter.
> I am not sure if you are referring only to the vector long long overload of vec_sr or to all of them. In any case, the reason for the casts is that Clang will produce an **`lshr`** when the LHS is unsigned and an **`ashr`** when the LHS is signed. This is what was causing LLVM to emit **`vsr[bhwd]`** for the unsigned ones and **`vsra[bhwd]`** for the signed ones. So the casts are simply to ensure that the IR contains the **`lshr`** instructions rather than **`ashr/lshr`** depending on signedness.
>
> Of course, I could have just defined builtins and called the same one for both signed and unsigned overloads, but I assumed that since IR instructions exist for this, it is better to retain the information about what operation is actually being performed in case the optimizer can use it.
You have wrongly changed the instruction to be generated. The user has chosen vec_sr, which means the user wants vsr[bhwd]. If they wanted vsra[bhwd], they could have chosen vec_sra. See figures 4-120 and 4-121 of http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf to see the required instruction mappings.
Repository:
rL LLVM
http://reviews.llvm.org/D10972
More information about the cfe-commits
mailing list