[cfe-dev] Loop pragma for hardware loops

Sjoerd Meijer via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 13 06:15:06 PDT 2021


Hello Janek,

It looks like you would like to use a pragma to steer which hardware-loop form gets generated, by supplying very detailed target information. I think the more typical use of a pragma is to override the cost model or a transformation threshold/argument. In this case, I would have guessed that the new pragma takes precedence over TTI's isHardwareLoopProfitable hook, and so I would have expected something as simple as "hwloop(enable|disable)" initially. If the goal is to bring a hardware loop into a more efficient form, then I think that is mainly the responsibility of the hardware-loop pass or a backend pass (see e.g. the ARM backend passes). I think option 1 is a non-starter because it exposes all sorts of internals that we don't want to expose, for different reasons. Option 2 looks a lot better but is still very specific.
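
A minimal sketch of that precedence rule, assuming a tri-state per-loop hint (enable, disable, or no pragma at all). The names HwLoopHint, CostModel and shouldFormHardwareLoop are hypothetical and are not LLVM APIs; the real isHardwareLoopProfitable hook takes the loop and analysis results rather than a stored bool.

```cpp
#include <optional>

// Hypothetical per-loop hint, as a pragma such as hwloop(enable|disable)
// might record it. std::nullopt models a loop that carries no pragma.
enum class HwLoopHint { Enable, Disable };

// Stand-in for the target's cost-model answer, i.e. the role played by
// TTI's isHardwareLoopProfitable hook. Reduced to a bool for illustration.
struct CostModel {
  bool profitable;
};

// The pragma, when present, takes precedence; otherwise the decision
// falls back to the cost model.
bool shouldFormHardwareLoop(std::optional<HwLoopHint> hint,
                            const CostModel &cm) {
  if (hint)
    return *hint == HwLoopHint::Enable;
  return cm.profitable;
}
```

With this shape, an explicit "enable" forces a hardware loop even when the cost model says it is unprofitable, an explicit "disable" suppresses it even when the cost model says it is profitable, and a loop without the pragma behaves exactly as it does today.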

Cheers,
Sjoerd.
________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Janek Van Oirschot via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 13 April 2021 13:28
To: llvm-dev at lists.llvm.org <llvm-dev at lists.llvm.org>; cfe-dev at lists.llvm.org <cfe-dev at lists.llvm.org>
Subject: [llvm-dev] Loop pragma for hardware loops


Hey all,



I'm looking to extend the current clang loop pragmas to also support hardware loops and allow a user to insert (or completely disable) hardware loop intrinsics on a per-loop basis.



One of the questions I have regarding this is how to go about incorporating the different hardware loop intrinsics in the pragma. A few options we came up with:



1. The pragma incorporates which intrinsic to use for a loop:

#pragma loop hwloop(set_loop_i32)

or

#pragma loop hwloop(/*LivesInReg=*/ true, /*AddTestGuard=*/ true, /*NumBits=*/ 32)



2. The pragma adds some target-specific info (a string?) for use in the existing hwloop TTI hook or in a new hwloop TTI hook:

#pragma loop hwloop(target="bdnz") // PPC example

#pragma loop hwloop(target="bdz")  // PPC example

or

#pragma loop hwloop(max-count=42, ...)



Option 1 requires the user to know about LLVM's hardware-loop internals, so I'm leaning more towards option 2, as users are more likely to be aware of target-specific information (such as PPC's bdnz/bdz).

These are just some options we came up with. We would love to hear about other (better) options, if any.



Kind regards,

Janek van Oirschot