<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/63130>63130</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missing Constant Propagation in OpenMP Parallel Region
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
AntonRydahl
</td>
</tr>
</table>
<pre>
I noticed that `flang-new` does not perform constant propagation as I would have expected. I made two simple test programs in `C` and `FORTRAN` using reductions in OpenMP. The programs are the following:
<table>
<tr>
<td> C </td> <td> FORTRAN </td>
</tr>
<tr>
<td>
```C
#include <stdio.h>
#include <stdlib.h>
int main(void) {
const int length = 1024*1024;
double * arr = (double *) malloc(length*sizeof(double));
double sum = 0.0;
#pragma omp parallel for reduction(+:sum)
for(int i = 0; i < length; i++) {
arr[i] = 1.0/length;
sum += arr[i];
}
printf("The result of sum(arr[0:%d]) is %1.9lf\n",length-1,sum);
free(arr);
return 0;
}
```
</td>
<td>
```FORTRAN
PROGRAM parallel_loop
IMPLICIT NONE
INTEGER(kind=4) :: length, i
REAL(kind=8), allocatable :: arr(:)
REAL(kind=8) :: sumval
length = 1024*1024
allocate (arr(length))
sumval = 0.0
!$omp parallel do reduction(+:sumval)
do i=1,length
arr(i) = 1.0/length
sumval = sumval + arr(i)
end do
!$omp end parallel do
write(*,100) "The result of sum(arr(1:",length,")) is ", sumval
100 format (A,I7,A,e13.6e2)
deallocate(arr)
END PROGRAM parallel_loop
```
</td>
</tr>
</table>
I compiled the two programs with:
```bash
clang -O3 -pthread -fno-dwarf2-cfi-asm -fno-asynchronous-unwind-tables -fopenmp -emit-llvm -S reduction.c -o c_reduction.ll
flang-new -O3 -fopenmp -emit-llvm -S reduction.f90 -o f_reduction.ll
```
In the LLVM IR generated with `clang`, `1.0/length` has been replaced with `0x3EB0000000000000`, which is exactly `1.0/(1024.0*1024.0)`, as one would expect:
```llvm
omp.inner.for.body: ; preds = %omp.inner.for.body.prol.loopexit, %omp.inner.for.body
%indvars.iv = phi i64 [ %indvars.iv.next.3, %omp.inner.for.body ], [ %indvars.iv.unr, %omp.inner.for.body.prol.loopexit ]
%arrayidx = getelementptr inbounds double, ptr %3, i64 %indvars.iv
store double 0x3EB0000000000000, ptr %arrayidx, align 8, !tbaa !11
%11 = load double, ptr %sum1, align 8, !tbaa !11
%add5 = fadd double %11, 0x3EB0000000000000
store double %add5, ptr %sum1, align 8, !tbaa !11
%indvars.iv.next = add nsw i64 %indvars.iv, 1
%arrayidx.1 = getelementptr inbounds double, ptr %3, i64 %indvars.iv.next
store double 0x3EB0000000000000, ptr %arrayidx.1, align 8, !tbaa !11
%12 = load double, ptr %sum1, align 8, !tbaa !11
%add5.1 = fadd double %12, 0x3EB0000000000000
store double %add5.1, ptr %sum1, align 8, !tbaa !11
%indvars.iv.next.1 = add nsw i64 %indvars.iv, 2
%arrayidx.2 = getelementptr inbounds double, ptr %3, i64 %indvars.iv.next.1
store double 0x3EB0000000000000, ptr %arrayidx.2, align 8, !tbaa !11
%13 = load double, ptr %sum1, align 8, !tbaa !11
%add5.2 = fadd double %13, 0x3EB0000000000000
store double %add5.2, ptr %sum1, align 8, !tbaa !11
%indvars.iv.next.2 = add nsw i64 %indvars.iv, 3
%arrayidx.3 = getelementptr inbounds double, ptr %3, i64 %indvars.iv.next.2
store double 0x3EB0000000000000, ptr %arrayidx.3, align 8, !tbaa !11
%14 = load double, ptr %sum1, align 8, !tbaa !11
%add5.3 = fadd double %14, 0x3EB0000000000000
store double %add5.3, ptr %sum1, align 8, !tbaa !11
%indvars.iv.next.3 = add nsw i64 %indvars.iv, 4
%lftr.wideiv.3 = trunc i64 %indvars.iv.next.3 to i32
%exitcond.not.3 = icmp eq i32 %5, %lftr.wideiv.3
br i1 %exitcond.not.3, label %omp.loop.exit, label %omp.inner.for.body
```
In the LLVM IR generated with `flang-new`, the expression `1.0/length` is evaluated in every iteration. That happens in the line `%11 = fdiv contract float 1.000000e+00, %10`, where `%10` is a floating-point conversion of `%9`, which is loaded from `%loadgep_`, which ultimately refers to the statically allocated variable `length`.
```llvm
omp_loop.body: ; preds = %omp_loop.body.lr.ph, %omp_loop.body
%omp_loop.iv68 = phi i32 [ 0, %omp_loop.body.lr.ph ], [ %7, %omp_loop.body ]
%7 = add nuw i32 %omp_loop.iv68, 1
%8 = add i32 %7, %4
%9 = load i32, ptr %loadgep_, align 4, !tbaa !4
%10 = sitofp i32 %9 to float
%11 = fdiv contract float 1.000000e+00, %10
%12 = fpext float %11 to double
store ptr %.unpack, ptr %loadgep_4, align 8, !tbaa !8
store i64 8, ptr %loadgep_4.repack19, align 8, !tbaa !8
store i32 20180515, ptr %loadgep_4.repack21, align 8, !tbaa !8
store i8 1, ptr %loadgep_4.repack23, align 4, !tbaa !8
store i8 28, ptr %loadgep_4.repack25, align 1, !tbaa !8
store i8 2, ptr %loadgep_4.repack27, align 2, !tbaa !8
store i8 0, ptr %loadgep_4.repack29, align 1, !tbaa !8
store i64 1, ptr %loadgep_4.repack31, align 8, !tbaa !8
store i64 %.unpack14.unpack.unpack16, ptr %loadgep_4.repack31.repack33, align 8, !tbaa !8
store i64 8, ptr %loadgep_4.repack31.repack35, align 8, !tbaa !8
%13 = sext i32 %8 to i64
%14 = add nsw i64 %13, -1
%15 = getelementptr double, ptr %.unpack, i64 %14
store double %12, ptr %15, align 8, !tbaa !4
store ptr %.unpack, ptr %loadgep_6, align 8, !tbaa !8
store i64 8, ptr %loadgep_6.repack49, align 8, !tbaa !8
store i32 20180515, ptr %loadgep_6.repack51, align 8, !tbaa !8
store i8 1, ptr %loadgep_6.repack53, align 4, !tbaa !8
store i8 28, ptr %loadgep_6.repack55, align 1, !tbaa !8
store i8 2, ptr %loadgep_6.repack57, align 2, !tbaa !8
store i8 0, ptr %loadgep_6.repack59, align 1, !tbaa !8
store i64 1, ptr %loadgep_6.repack61, align 8, !tbaa !8
store i64 %.unpack14.unpack.unpack16, ptr %loadgep_6.repack61.repack63, align 8, !tbaa !8
store i64 8, ptr %loadgep_6.repack61.repack65, align 8, !tbaa !8
%16 = load double, ptr %1, align 8
%17 = fadd contract double %16, %12
store double %17, ptr %1, align 8
%exitcond.not = icmp eq i32 %omp_loop.iv68, %reass.sub
br i1 %exitcond.not, label %omp_loop.exit, label %omp_loop.body
}
```
Finally, the `flang` IR has not been vectorized like the `clang` IR. Should I file that as a separate bug report? The full LLVM IR and source files can be found here:
https://github.com/AntonRydahl/flangtests/blob/main/ir/CC/loop_reduction.ll
https://github.com/AntonRydahl/flangtests/blob/main/ir/FC/loop_reduction.ll
https://github.com/AntonRydahl/flangtests/blob/main/src/loop_reduction.c
https://github.com/AntonRydahl/flangtests/blob/main/src/loop_reduction.f90
</pre>
UimMIqmBnEti7AnJfSjCd3WOW3lHBjUTKQXnXdPCNJBaj4k9lJIJAynR7hRTyy2bwXREkgigT4YKN6ZOzBbCnpa_RtIVN6wghvJnUDSnStvA21nRhhiWEs6fu4NoBkeiWH0kjv1uirxrrNMdYK4Szl7C48orDz21G-j2adi1s2Oc9AzTZmB0B_6Udo17xicXU5JnfHHR11h1arN85MA5YUk6jUa6NTOsxWW_LtkS65eDLkLdkjA_WxKGMIFfH6OZkXnZ2lva-LlsuSST70nFjr81bCYvLXmrdWpEI9tVdbTaNUPxKlGS9PfE4OYvL3jJGMquTckUgqeoBQ-Wb4cKMWA_SPwoiK4g4iurcTKi_CyB4BpQ-HIMz11LAF8bJY56qOB1qGtIix4Jv4rkX0Navt2neH51psI3TPkAqk-tYN5ctPfxVSvtxRX28M706zGj1zF7TqltITXVmrhtNp5fspazjb1mi7MRvYkm2NgFzxnUYQs1f4meBMPkCa4Mav7Oko8_POdxM9XzTyv5FjF6R_5N13yH9PGa76A-XPMd0odrvkP6eM23UPEfrfneSnvx8Zq_xHxjzcdXziCjWRgU9qI_a3S79aBK43aXnj5EOYTXrQxp-BSZv6A5CEeKEq09Xe2ukvpz4v7rRUp_TvBeer-0YcIS0ZawtzzesuPto3veagfhnrkeaWqkYv_QDDj7TVv5tJf34O-De2q6hZxxWr-qJpZka1oSe1yAXbUHRUupDAo37u1wXnHeHSuIyEDLSqXUIWhIiYAdhdwehOFAFT17ZXwwptTuJcMG4c2emUO181JZILwZvENHeOOGZag2GuHNjssdwpv6neyGKYQ367U9m0hZTjzF_jQjmz9sRKv00kD6h_HzpX-T3YbZMlySG3obxEns-4G_SG4Ot7E_z4J0mWaLPIqDpR-G-WJO5rts4cdhuJzfsFvs49CP_QgH4SJKvGUcRDQL5gtMsnRHMJr7tCCMe_bs40m1v2FaV_Q2DoPQv3Epr92PJzAW9ASuE2GMovsbdWt1Zrtqr9Hc50wb3aMYZji9_cq0-_HCuv25xI_BzyW6nzHAj_aN2CPdMyluKsVvr8ynO6bV_2alkv-lqbEZYF2zc-pc_38AAAD__1_MzlI">