[llvm-dev] Implicit Defs and Uses are ignored by pre-RA schedulers
Cyril Six via llvm-dev
llvm-dev at lists.llvm.org
Fri Jan 7 02:56:06 PST 2022
Thanks a lot for the replies,
On 1/5/22 3:44 PM, Wang, Phoebe wrote:
>
> Did you try `hasSideEffects = 1`?
>
> I'm not familiar with AArch64. On X86, we have separate FPCR and FPSR:
> the former is used for control (rounding mode, exception masks) and the
> latter is for status. We model all FP instructions that may raise an
> exception with `mayRaiseFPException = 1` and mark them as using FPCR.
> Note that an instruction reading FPCR is just another use of FPCR, not
> a def, so it is not necessary to keep such reads in source order; only
> writes to FPCR need that. I guess the same reasoning applies to AArch64?
> Maybe you can check what happens with a write to FPCR.
>
> Thanks
>
> Phoebe
>
On our end, hasSideEffects = 1 and mayRaiseFPException = 1 (combined
with implicit Defs and Uses of our $CS register) do not seem to be
enough to prevent the reordering of floating-point instructions in
pre-RA scheduling.
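
To make the expected behavior concrete, here is a minimal sketch in C of the
dependency rule that implicit Defs and Uses are supposed to feed. It is a toy
model, not LLVM code: the `insn` struct, `must_stay_ordered`, `check_cs_model`
and every register other than $CS are hypothetical names introduced for
illustration. It assumes floating-point operations both read $CS (rounding
mode) and write it (sticky exception bits).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register bitmasks for the toy model; only CS comes from the thread. */
enum { REG_CS = 1 << 0, REG_R0 = 1 << 1, REG_R1 = 1 << 2,
       REG_R2 = 1 << 3, REG_R3 = 1 << 4 };

/* One instruction, reduced to the registers it reads and writes
   (explicit operands and implicit Defs/Uses merged into one mask each). */
struct insn {
    const char *name;
    unsigned uses;
    unsigned defs;
};

/* True if `later` must stay after `earlier` because of a register dependency. */
bool must_stay_ordered(const struct insn *earlier, const struct insn *later)
{
    bool raw = (earlier->defs & later->uses) != 0; /* read after write  */
    bool war = (earlier->uses & later->defs) != 0; /* write after read  */
    bool waw = (earlier->defs & later->defs) != 0; /* write after write */
    return raw || war || waw;
}

/* Assumed modeling: SET writes $CS, GET reads it, FP ops do both. */
const struct insn SET_CS = { "set $cs", REG_R0, REG_CS };
const struct insn FADD   = { "fadd", REG_R1 | REG_R2 | REG_CS, REG_R1 | REG_CS };
const struct insn GET_CS = { "get $cs", REG_CS, REG_R0 };
/* Integer add with no implicit operand on $CS. */
const struct insn ADD    = { "add", REG_R3, REG_R3 };

/* Self-check of the model; call it from a test or from main(). */
void check_cs_model(void)
{
    assert(must_stay_ordered(&SET_CS, &FADD));  /* rounding mode set before use   */
    assert(must_stay_ordered(&FADD, &GET_CS));  /* exception bits read after fadd */
    assert(!must_stay_ordered(&SET_CS, &ADD));  /* unrelated add is free to move  */
}
```

With this modeling, SET $CS, fadd, GET $CS form a dependency chain, while an
integer add with no $CS operand stays free to move. That is the ordering the
post-RA scheduler appears to respect and the pre-RA scheduler does not.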
On 1/6/22 9:32 PM, Kevin Neal wrote:
> Correct. You do need to add the required support to your backend.
>
> The X86, PowerPC, and SystemZ backends have basically complete support.
> The PowerPC backend has a fix to not reschedule floating-point
> instructions around function calls if the rounding mode may change. I
> haven't heard that the other two have this fix. AArch64 and RISC-V
> support are both a work in progress, so one of the three fully-supported
> targets is best to examine and emulate.
>
> Also be aware that optimization of strict floating-point is a work in
> progress, so be prepared for not-so-great performance.
>
> Lastly, there's currently no way to have machine-specific llvm intrinsics
> respect "strict" mode. A fix has been proposed, but I don't think anything
> has been implemented.
>
> It might have been clang 12 where a warning was introduced that told you
> that "strict" floating-point doesn't work for that target and is therefore
> disabled. I don't remember exactly which release first had this.
>
> --
> Kevin P. Neal
> SAS/C and SAS/C++ Compiler
> Compute Services
> SAS Institute, Inc.
Thank you for the answer; it confirms what I have been seeing.
I will take a closer look at these backends, especially PowerPC's fix to
not reschedule floating-point instructions above function calls.
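
The constraint behind that kind of fix can be illustrated with a small sketch
in C, assuming any call may change the rounding mode and therefore acts as a
barrier for floating-point instructions. It is not taken from the PowerPC
backend; the `kind` enum, `EXAMPLE_SEQ` and the `fp_window_*` helpers are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum kind { FP_OP, INT_OP, CALL_MAY_SET_ROUNDING };

/* First position the instruction at `idx` may be hoisted to under this rule. */
size_t fp_window_lo(const enum kind *seq, size_t n, size_t idx)
{
    assert(idx < n);
    if (seq[idx] != FP_OP)
        return 0;                 /* the rule only restricts FP operations */
    for (size_t i = idx; i-- > 0; )
        if (seq[i] == CALL_MAY_SET_ROUNDING)
            return i + 1;         /* just after the nearest earlier call */
    return 0;
}

/* One past the last position the instruction at `idx` may be sunk to. */
size_t fp_window_hi(const enum kind *seq, size_t n, size_t idx)
{
    assert(idx < n);
    if (seq[idx] != FP_OP)
        return n;
    for (size_t i = idx + 1; i < n; i++)
        if (seq[i] == CALL_MAY_SET_ROUNDING)
            return i;             /* must stay before the nearest later call */
    return n;
}

/* Two FP operations separated by a call that may change the rounding mode. */
const enum kind EXAMPLE_SEQ[5] = {
    FP_OP, INT_OP, CALL_MAY_SET_ROUNDING, FP_OP, INT_OP
};
```

In `EXAMPLE_SEQ`, the first FP operation may only occupy positions 0 to 1 and
the second only positions 3 to 4, so neither can cross the call, while the
integer operations are not restricted by this rule.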
Cyril S
>
> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
> Cyril Six via llvm-dev
> Sent: Wednesday, January 5, 2022 7:44 PM
> To: llvm-dev <llvm-dev at lists.llvm.org>
> Subject: [llvm-dev] Implicit Defs and Uses are ignored by pre-RA
> schedulers
>
> Hello,
>
> In our Kalray LLVM backend, we have builtins to get and set system
> registers. One of them is $CS, which has sticky bits enforcing
> rounding mode or storing masked floating-point exceptions. The
> equivalent on AArch64 would be FPCR.
>
> In our user code, we would like to preserve the partial ordering
> between a SET to $CS and a floating-point operation, since the SET to
> $CS might be modifying the rounding mode. Similarly, we would like to
> preserve the partial ordering between a GET from $CS and a
> floating-point operation, since user code might want to examine the
> floating-point exception bits right after a given floating-point
> operation.
>
> Another use-case we have is the following: we have a coprocessor that
> is turned on by setting a given bit on a system register. This can be
> accessed by a builtin. Such a SET instruction must happen before any
> coprocessor instruction, and the compiler should not break that
> dependency when reordering instructions.
>
> We have tried to implement this by using implicit Defs and implicit
> Uses in our instruction definitions, using for example `Defs = [CS]
> in` and `Uses = [CS]` where relevant in our Target Description files.
>
> I have been running some experiments, examining the scheduling outputs
> and the dependencies (using VLIWScheduler in pre-RA,
> PostRASchedulerList in post-RA, and a child of VLIWPacketizerList for
> bundling).
>
> I have found that the implicit defs and uses are indeed taken into
> account by the post-RA schedulers. However, they seem to be ignored by
> the pre-RA schedulers. Also, they do not appear as dependencies in the
> SelectionDAG.
>
> If I look at what some other backends did, AArch64 does not seem to
> model anything on FPCR. PowerPC sets MFFS as a scheduling barrier
> (isSchedulingBoundary) to prevent floating-point instructions being
> ordered above it - but isSchedulingBoundary seems to be only used by
> post-RA schedulers; pre-RA schedulers do not seem to care about that.
>
> The bad consequence for us: our programmers have to encapsulate the
> SET instructions (touching system registers) in non-inlined functions
> to keep the compiler from breaking the ordering.
>
> We are looking for advice on how to treat this problem - we have
> possible leads, like modifying the SelectionDAG to recover these
> dependencies, or modifying the schedulers to scan the SelectionDAG and
> enforce the source order when such dependency is detected (maybe by
> having a look at how SourceScheduler works), but we have not yet
> investigated them fully.
>
> Any such advice would be greatly appreciated.
>
> Also, another related issue: it would seem that the flag
> -ffp-exception-behavior=strict does not preserve the exception
> semantics like it says it does. Although the generated IR seems to
> preserve it, there does not seem to be anything in the LLVM backends
> enforcing the "strict" floating-point exception behavior.
>
> That last point can be seen in this piece of code:
> https://godbolt.org/z/e96zP7jET
>
> ```
> long fpcr;
>
> int toto(float a, float b, float c, double d, double e) {
>     float bc = b + c; // first faddd
>     asm("mrs %[result], FPCR" : [result] "=r" (fpcr) : :);
>     float abc = a + bc; // second faddd
>     // fwidenlwd: should not happen before the second faddd
>     float dw = (float) d;
>     float ew = (float) e;
>     int dw_ewl = (int) dw + (int) ew;
>     int abcl_dw_ewl = (int) abc + dw_ewl;
>     return abcl_dw_ewl;
> }
>
> ```
>
> Compiling this piece of code with clang 11.0.0 for ARMv8-a gives the
> following assembly code:
> ```
> toto:
>         fadd    s1, s1, s2
>         fcvt    s2, d3
>         fadd    s0, s1, s0
>         fcvt    s3, d4
>         fcvtzs  w9, s2
>         fcvtzs  w10, s0
>         add     w9, w10, w9
>         fcvtzs  w10, s3
>         add     w0, w9, w10
>         adrp    x9, fpcr
>         //APP
>         mrs     x8, FPCR
>         //NO_APP
>         str     x8, [x9, :lo12:fpcr]
>         ret
> ```
>
> Notice that mrs was moved below all the floating-point instructions,
> which does not seem to preserve the floating-point exception semantics
> of the compiled code.
>
> PS: apologies for the double message, if any; I sent the first to
> llvm-dev-bounces by mistake.
>
> Best regards,
>
> Cyril Six
> Compiler Engineer • Kalray
> csix at kalrayinc.com • www.kalrayinc.com