[flang-commits] [flang] [flang][CodeGen] add nsw to address calculations (PR #74709)

via flang-commits flang-commits at lists.llvm.org
Fri Dec 8 09:32:30 PST 2023


jeanPerier wrote:

> I don't know what "nsw" means, but will just add a reminder that Fortran DO loop iteration counts are calculated before the loop starts. Don't use the current value of the DO loop index variable for loop termination testing -- just increment it each time around.

Thanks for the answer. Lowering does not use the DO loop index for termination testing. My example with `n + 1 .le. 0` had nothing to do with DO loop lowering and was misleading in this discussion.

It is just an example that shows what kind of optimization LLVM can do if `nsw` ("no signed wrap") is set on additions. `n + 1 .le. 0` can be rewritten to `n .le. -1` by LLVM only if `nsw` is set on the add. `nsw` tells LLVM that signed overflow in the addition is illegal, so LLVM does not need to generate code that honors the overflow: it can rewrite arithmetic operations to mathematically equivalent forms, not only to forms that are equivalent under two's complement wraparound.

LLVM example:
```
define dso_local i1 @test_nsw(i32 noundef %0) {
  %2 = add nsw i32 %0, 10
  %3 = icmp sle i32 %2, 0
  ret i1 %3
}

define dso_local i1 @test_no_nsw(i32 noundef %0) {
  %2 = add i32 %0, 10
  %3 = icmp sle i32 %2, 0
  ret i1 %3
}
```

`opt -O3 -S`: the add can be optimized out with `nsw` but not without it:

```
define dso_local i1 @test_nsw(i32 noundef %0) local_unnamed_addr #0 {
  %2 = icmp slt i32 %0, -9
  ret i1 %2
}

define dso_local i1 @test_no_nsw(i32 noundef %0) local_unnamed_addr #0 {
  %2 = add i32 %0, 10
  %3 = icmp slt i32 %2, 1
  ret i1 %3
}
```
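To see why the fold is only legal with `nsw`, here is a small Python model of the two functions above. It is a sketch, not LLVM code: `wrap_i32`, `cmp_wrapping`, and `cmp_folded` are hypothetical helper names, and the 32-bit wraparound is simulated by hand because Python integers do not overflow.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def wrap_i32(x: int) -> int:
    """Reduce x to a signed 32-bit value, as a wrapping i32 add would."""
    return (x - INT32_MIN) % 2**32 + INT32_MIN

def cmp_wrapping(n: int) -> bool:
    """Models `add i32 %0, 10` (no nsw) followed by `icmp sle ..., 0`."""
    return wrap_i32(n + 10) <= 0

def cmp_folded(n: int) -> bool:
    """Models the folded comparison `icmp slt i32 %0, -9`."""
    return n < -9

# Inputs near the top of the i32 range where the two forms disagree.
mismatches = [
    n for n in range(INT32_MAX - 20, INT32_MAX + 1)
    if cmp_wrapping(n) != cmp_folded(n)
]
```

In this model the two forms agree everywhere except for the ten largest inputs, `INT32_MAX - 9` through `INT32_MAX`, where `n + 10` wraps to a negative value. Those are exactly the inputs that `nsw` declares impossible, which is what lets LLVM drop the add.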

Slava's perf issue comes from a DO loop index that is used in user code/addressing after being incremented. We currently do not add the `nsw` flag to the add that increments the DO loop variable inside the loop, so LLVM thinks that it has to honor any DO loop overflow in uses of the DO loop variable after its increment. I think Tom's question is: are we supposed to honor DO loop overflow inside the body of the loop? I think Fortran does not allow integers in user code to overflow, so this flag can be set.
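For readers less familiar with the trip-count scheme, a rough Python model of a Fortran `DO i = m1, m2, m3` loop follows. It is an illustration of the semantics only, not flang's actual lowering; `trunc_div` and `do_loop` are hypothetical names, and the iteration count follows the Fortran standard's formula `MAX(INT((m2 - m1 + m3) / m3), 0)`.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division truncating toward zero, as Fortran integer division does."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

def do_loop(m1: int, m2: int, m3: int = 1):
    """Yield the index values of `DO i = m1, m2, m3`."""
    # The iteration count is computed once, before the loop starts.
    trip_count = max(trunc_div(m2 - m1 + m3, m3), 0)
    i = m1
    while trip_count > 0:
        yield i
        i += m3          # the index increment that would carry nsw
        trip_count -= 1  # termination depends only on this counter
```

For example, `list(do_loop(10, 1, -3))` gives `[10, 7, 4, 1]`. The index is never compared against the bound; it is only incremented, and that increment is the add whose flags are being discussed.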



https://github.com/llvm/llvm-project/pull/74709

