[llvm-dev] Implicit Defs and Uses are ignored by pre-RA schedulers

Fri Jan 7 02:56:06 PST 2022

Thanks a lot for the replies,

On 1/5/22 3:44 PM, Wang, Phoebe wrote:
>
> Did you try `hasSideEffects = 1`?
>
> I’m not familiar with AArch64. On X86, we have separate FPCR and FPSR. 
> The former is used for control (rounding, exception mask) and the 
> latter is for status. We modeled all FP instructions that may raise 
> exception by `mayRaiseFPException = 1` and using FPCR. Note that an 
> instruction reading FPCR is another use of FPCR, not a def, so it is 
> not necessary to keep such reads in source order; only a write to 
> FPCR needs that. I guess it is the same reason for AArch64? Maybe you 
> can check how the write of FPCR is handled.
>
> Thanks
>
> Phoebe
>
On our end, hasSideEffects = 1 and mayRaiseFPException = 1 (combined 
with implicit Defs and Uses of our $CS register) do not seem to be 
enough to prevent the reordering of floating-point instructions in 
pre-RA scheduling.
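
At the source level, one way to keep a specific floating-point operation on the right side of the register read is to tie its result to the asm statement as an operand. A minimal sketch, reworking the `toto` example from the original message quoted below; the helper name `read_fp_status_after` is hypothetical, the non-AArch64 branch is only a stand-in so the sketch builds on other hosts, and it relies on GCC/Clang extended asm:

```c
long fpcr;

/* Hypothetical helper (not an LLVM or Kalray API). The "+m" operand tells the
   compiler that the asm reads and writes *tied, so the value must be fully
   computed before the asm and reloaded after it. */
static inline long read_fp_status_after(float *tied) {
    long status = 0;
#if defined(__aarch64__)
    __asm__ volatile("mrs %0, FPCR" : "=r"(status), "+m"(*tied));
#else
    /* Assumption: other hosts get an empty asm that keeps only the ordering. */
    __asm__ volatile("" : "+r"(status), "+m"(*tied));
#endif
    return status;
}

int toto(float a, float b, float c, double d, double e) {
    float bc = b + c;                  /* produces the tied value: stays before the read */
    fpcr = read_fp_status_after(&bc);
    float abc = a + bc;                /* consumes the tied value: stays after the read */
    float dw = (float)d;               /* not tied: the compiler may still move it */
    float ew = (float)e;
    return (int)abc + (int)dw + (int)ew;
}
```

This only orders operations that produce or consume the tied value; the conversions of `d` and `e` remain free to move, and it does nothing about the missing dependencies in the pre-RA scheduler itself.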

On 1/6/22 9:32 PM, Kevin Neal wrote:

> Correct. You do need to add the required support to your backend.
>
> The X86, PowerPC, and SystemZ backends have basically complete support.
> The PowerPC backend has a fix to not reschedule floating-point
> instructions around function calls if the rounding mode may change. I
> haven't heard that the other two have this fix. AArch64 and RISC-V
> support are both a work in progress so one of the three fully-supported
> targets is best to examine and emulate.
>
> Also be aware that optimization of strict floating-point is a work in
> progress, so be prepared for not-so-great performance.
>
> Lastly, there's currently no way to have machine-specific llvm intrinsics
> respect "strict" mode. A fix has been proposed, but I don't think anything
> has been implemented.
>
> It might have been clang 12 where a warning was introduced that told you
> that "strict" floating-point doesn't work for that target and is therefore
> disabled. I don't remember exactly which release first had this.
>
> --
> Kevin P. Neal
> SAS/C and SAS/C++ Compiler
>
> Compute Services
>
> SAS Institute, Inc.

Thank you for the answer - it confirms what I have been seeing.

I will take a closer look at these backends, especially PowerPC's fix to 
not reschedule floating-point instructions above function calls.
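
For reference, the non-inlined wrapper workaround mentioned in the original message looks roughly like the sketch below. All names (`fake_cs`, `set_cs`, `get_cs`, `add_with_mode`) are hypothetical, and a volatile global stands in for $CS so the code runs on any host. Nothing in C guarantees that a call orders the surrounding floating-point arithmetic, which is why the PowerPC handling around calls is worth studying:

```c
#include <stdint.h>

/* Assumption: stand-in for the $CS system register. */
static volatile uint64_t fake_cs;

/* Non-inlined wrappers: the register access is hidden behind a call. */
__attribute__((noinline)) void set_cs(uint64_t value) { fake_cs = value; }
__attribute__((noinline)) uint64_t get_cs(void) { return fake_cs; }

/* Intended order: SET, then the floating-point add, then GET. */
float add_with_mode(uint64_t mode, float x, float y, uint64_t *status_out) {
    set_cs(mode);             /* e.g. select a rounding mode */
    float r = x + y;          /* should run under the new mode */
    *status_out = get_cs();   /* e.g. inspect the exception bits */
    return r;
}
```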

Cyril S

> *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> *On Behalf Of 
> *Cyril Six via llvm-dev
> *Sent:* Wednesday, January 5, 2022 7:44 PM
> *To:* llvm-dev <llvm-dev at lists.llvm.org>
> *Subject:* [llvm-dev] Implicit Defs and Uses are ignored by pre-RA 
> schedulers
>
> Hello,
>
> In our Kalray LLVM backend, we have builtins to get and set system 
> registers. One of them is $CS, which has sticky bits enforcing 
> rounding mode or storing masked floating-point exceptions. The 
> equivalent on AArch64 would be FPCR.
>
> In our user code, we would like to preserve the partial ordering 
> between a SET to $CS and a floating-point operation, since the SET to 
> $CS might be modifying the rounding mode. Similarly, we would like to 
> preserve the partial ordering between a GET from $CS and a 
> floating-point operation, since a user code might want to examine the 
> floating-point exception bits right after a given floating-point 
> operation.
>
> Another use-case we have is the following: we have a coprocessor that 
> is turned on by setting a given bit on a system register. This can be 
> accessed by a builtin. Such SET instruction must happen before using a 
> coprocessor instruction - the compiler should not break that 
> dependency when reordering instructions.
>
> We have tried to implement this by using implicit Defs and implicit 
> Uses in our instruction definitions, using for example `Defs = [CS] 
> in` and `Uses = [CS]` where relevant in our Target Description files.
>
> I have been running some experiments, examining the scheduling outputs 
> and the dependencies (using VLIWScheduler in pre-RA, 
> PostRASchedulerList in post-RA, and a child of VLIWPacketizerList for 
> bundling).
>
> I have found that the implicit defs and uses are indeed taken into 
> account by the post-RA schedulers. However, they seem to be ignored by 
> the pre-RA schedulers. Also, they do not appear as dependencies in the 
> SelectionDAG.
>
> If I look at what some other backends did, AArch64 does not seem to 
> model anything on FPCR. PowerPC sets MFFS as scheduling barrier 
> (isSchedulingBoundary) to prevent floating-point instructions being 
> ordered above it - but isSchedulingBoundary seems to be only used by 
> post-RA schedulers; pre-RA schedulers do not seem to care about that.
>
> The bad consequence for us: our programmers have to encapsulate the 
> SET instructions (touching system registers) in non-inlined functions 
> to keep the compiler from breaking these dependencies.
>
> We are looking for advice on how to treat this problem - we have 
> possible leads, like modifying the SelectionDAG to recover these 
> dependencies, or modifying the schedulers to scan the SelectionDAG and 
> enforce the source order when such dependency is detected (maybe by 
> having a look at how SourceScheduler works), but we have not yet 
> investigated it fully.
>
> Any such advice would be greatly appreciated.
>
> Also, another related issue: it would seem that the flag 
> -ffp-exception-behavior=strict does not preserve the exception 
> semantics like it says it does. Although the generated IR seems to 
> preserve it, there does not seem to be anything in the LLVM backends 
> enforcing the "strict" floating-point exception behavior.
>
> That last point can be seen in this piece of code: 
> https://godbolt.org/z/e96zP7jET
>
> ```
> long fpcr;
>
> int toto(float a, float b, float c, double d, double e){
>   float bc = b + c; // first faddd
>   asm("mrs %[result], FPCR" : [result] "=r" (fpcr) : :);
>   float abc = a + bc; // second faddd
>   float dw = (float) d; // fwidenlwd : should not happen before the second faddd
>   float ew = (float) e;
>   int dw_ewl = (int) dw + (int) ew;
>   int abcl_dw_ewl = (int) abc + dw_ewl;
>   return abcl_dw_ewl;
> }
>
> ```
>
> Compiling this piece of code with clang 11.0.0 for ARMv8-a gives the 
> following assembly code:
> ```
> toto:
>         fadd    s1, s1, s2
>         fcvt    s2, d3
>         fadd    s0, s1, s0
>         fcvt    s3, d4
>         fcvtzs  w9, s2
>         fcvtzs  w10, s0
>         add     w9, w10, w9
>         fcvtzs  w10, s3
>         add     w0, w9, w10
>         adrp    x9, fpcr
>         //APP
>         mrs     x8, FPCR
>         //NO_APP
>         str     x8, [x9, :lo12:fpcr]
>         ret
> ```
>
> Notice that mrs was moved below - which does not seem to preserve the 
> floating-point exception semantics of the compiled code.
>
> PS : apologies for the double message if any ; I sent the first to 
> llvm-dev-bounces by mistake
>
> Best regards,
>
> *Cyril Six*
> *Compiler Engineer • Kalray*
> csix at kalrayinc.com • www.kalrayinc.com
