[cfe-dev] Loop pragma for hardware loops
Sjoerd Meijer via cfe-dev
cfe-dev at lists.llvm.org
Tue Apr 13 06:15:06 PDT 2021
Hello Janek,
It looks like you want a pragma that steers which hardware-loop form is generated by supplying very detailed target information. I think the more typical use of a pragma is to override the cost model or a transformation threshold/argument. In this case, I would have guessed that the new pragma takes precedence over TTI's isHardwareLoopProfitable hook, so I would have expected something as simple as "hwloop(enable|disable)" initially. If you want to bring a hardware loop into a more efficient form, I think that is mainly the responsibility of the hardware-loop pass or a backend pass (see e.g. the ARM backend passes). I think option 1 is a non-starter because it exposes all sorts of internals that we don't want to expose, for different reasons; option 2 looks a lot better but is still very specific.
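To make the precedence idea concrete, here is a rough sketch of how an enable/disable hint could override the cost model. All names in it (HWLoopHint, shouldUseHardwareLoop) are hypothetical and are not LLVM's API; the boolean parameter only stands in for whatever TTI's isHardwareLoopProfitable would return, and a real pass would still need its legality checks, which this ignores.

```cpp
// Hypothetical names throughout; this is not LLVM's actual API.
enum class HWLoopHint { Unspecified, Enable, Disable };

// Decides whether a loop should become a hardware loop.
// An explicit per-loop hint wins; without one we defer to the target's
// cost model (the role played by TTI's isHardwareLoopProfitable).
bool shouldUseHardwareLoop(HWLoopHint hint, bool costModelSaysProfitable) {
  switch (hint) {
  case HWLoopHint::Enable:
    return true;   // user forces the transformation
  case HWLoopHint::Disable:
    return false;  // user forbids the transformation
  case HWLoopHint::Unspecified:
    break;         // no hint: fall through to the cost model
  }
  return costModelSaysProfitable;
}
```

With this shape the pragma carries no target internals at all; it only says whether the existing decision should be overridden.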
Cheers,
Sjoerd.
________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Janek Van Oirschot via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 13 April 2021 13:28
To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>
Subject: [llvm-dev] Loop pragma for hardware loops
Hey all,
I'm looking to extend the current clang loop pragmas to also support hardware loops and allow a user to insert (or completely disable) hardware loop intrinsics on a per-loop basis.
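For context, this is roughly what an existing per-loop clang pragma looks like. The sketch uses the real unroll(disable) hint; the function name sum is only for illustration, and compilers other than clang will typically just ignore the pragma (possibly with a warning).

```cpp
#include <cstddef>

// Sums n integers. The pragma asks clang not to unroll this particular loop;
// a hardware-loop hint would presumably sit in the same position.
int sum(const int *data, std::size_t n) {
  int total = 0;
#pragma clang loop unroll(disable)
  for (std::size_t i = 0; i < n; ++i)
    total += data[i];
  return total;
}
```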
One of the questions I have regarding this is how to go about incorporating the different hardware loop intrinsics in the pragma. A few options we came up with:
1. The pragma incorporates which intrinsic to use for a loop:
#pragma loop hwloop(set_loop_i32)
or
#pragma loop hwloop(/*LivesInReg=*/ true, /*AddTestGuard=*/ true, /*NumBits=*/ 32)
2. The pragma adds some target-specific info (string?) to use in the hwloop TTI hook or in a new hwloop TTI hook:
#pragma loop hwloop(target="bdnz") // PPC example
#pragma loop hwloop(target="bdz") // PPC example
or
#pragma loop hwloop(max-count=42, ...)
Option 1 requires the user to know about LLVM's hardware-loop internals, so I'm leaning more towards option 2, as users are more likely to be aware of target-specific information (such as PPC's bdnz/bdz).
These are just some options we came up with; we would love to hear about other (better) options, if any.
Kind regards,
Janek van Oirschot