[flang-commits] [flang] [flang][MLIR] Hoist `do concurrent` nest bounds/steps outside the nest (PR #114020)
via flang-commits
flang-commits at lists.llvm.org
Tue Oct 29 07:40:58 PDT 2024
================
@@ -2131,18 +2131,33 @@ class FirConverter : public Fortran::lower::AbstractConverter {
llvm::SmallVectorImpl<const Fortran::parser::CompilerDirective *> &dirs) {
assert(!incrementLoopNestInfo.empty() && "empty loop nest");
mlir::Location loc = toLocation();
+
for (IncrementLoopInfo &info : incrementLoopNestInfo) {
- info.loopVariable =
- genLoopVariableAddress(loc, *info.loopVariableSym, info.isUnordered);
- mlir::Value lowerValue = genControlValue(info.lowerExpr, info);
- mlir::Value upperValue = genControlValue(info.upperExpr, info);
- bool isConst = true;
- mlir::Value stepValue = genControlValue(
- info.stepExpr, info, info.isStructured() ? nullptr : &isConst);
- // Use a temp variable for unstructured loops with non-const step.
- if (!isConst) {
- info.stepVariable = builder->createTemporary(loc, stepValue.getType());
- builder->create<fir::StoreOp>(loc, stepValue, info.stepVariable);
+ mlir::Value lowerValue;
+ mlir::Value upperValue;
+ mlir::Value stepValue;
+
+ {
+ mlir::OpBuilder::InsertionGuard guard(*builder);
+
+ // Set the IP before the first loop in the nest so that all nest bounds
+ // and step values are created outside the nest.
+ if (incrementLoopNestInfo[0].doLoop)
+ builder->setInsertionPoint(incrementLoopNestInfo[0].doLoop);
----------------
vdonaldson wrote:
Thanks @harishch4 and @ergawy for working on this improvement.
As is, this implementation applies to all structured increment loops, not just `do concurrent` loops. That should be ok because non-`do concurrent` loops don't have multiple levels. Could you extend it to apply to all increment loops, including unstructured loops, which are loops that contain branches? Test `loops.f90` has several examples of unstructured `do concurrent` loops. You would need to cache the analog of the `doLoop` op for unstructured loops.
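
For illustration only, here is a minimal sketch of the hoisting pattern under discussion, written against toy stand-ins rather than flang or MLIR. Everything in it (`ToyBuilder`, `ToyInsertionGuard`, `ToyLoopInfo`, `lowerNest`, and the op names) is hypothetical. A flat list of strings plays the role of a block, and the cached "anchor" plays the role of `doLoop` for a structured loop or of its analog (assumed here to be something like a loop header) for an unstructured one.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Toy stand-ins, NOT flang or MLIR types. A "block" is an ordered list of
// op names; std::list keeps iterators valid across insertions.
using Block = std::list<std::string>;

struct ToyBuilder {
  Block &block;
  Block::iterator ip; // new ops are inserted just before this position
  explicit ToyBuilder(Block &b) : block(b), ip(b.end()) {}
  void setInsertionPoint(Block::iterator pos) { ip = pos; }
  Block::iterator create(const std::string &op) { return block.insert(ip, op); }
};

// RAII helper that restores the insertion point on scope exit, in the spirit
// of mlir::OpBuilder::InsertionGuard.
class ToyInsertionGuard {
  ToyBuilder &builder;
  Block::iterator saved;

public:
  explicit ToyInsertionGuard(ToyBuilder &b) : builder(b), saved(b.ip) {}
  ~ToyInsertionGuard() { builder.ip = saved; }
};

struct ToyLoopInfo {
  std::string name;
  bool isStructured;
};

// Emits the control values of every loop in the nest before the outermost
// loop, whether that loop is structured or unstructured.
Block lowerNest(const std::vector<ToyLoopInfo> &nest) {
  assert(!nest.empty() && "empty loop nest");
  Block block;
  ToyBuilder builder(block);
  Block::iterator anchor = block.end(); // cached outermost loop entry
  bool haveAnchor = false;

  for (const ToyLoopInfo &info : nest) {
    {
      ToyInsertionGuard guard(builder);
      // Once the outermost loop exists, hoist later bounds in front of it.
      if (haveAnchor)
        builder.setInsertionPoint(anchor);
      builder.create("bounds " + info.name);
    } // guard restores the insertion point here

    // Hypothetical op names: a loop op for structured loops, a header marker
    // for unstructured ones.
    Block::iterator loop = builder.create(
        std::string(info.isStructured ? "do_loop " : "loop_header ") +
        info.name);
    if (!haveAnchor) {
      anchor = loop;
      haveAnchor = true;
    }
  }
  return block;
}
```

In this toy model, calling `lowerNest` on a two-level nest places both `bounds` entries ahead of the outermost loop entry regardless of `isStructured`, which is the behavior the extension would need from whatever is cached for unstructured loops.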
https://github.com/llvm/llvm-project/pull/114020