<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/65413>65413</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[flang][hlfir] SPEC CPU2006/437.leslie3d 5% performance regression
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue,
flang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
vzakhari
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
vzakhari
</td>
</tr>
</table>
<pre>
With HLFIR lowering, the benchmark runs 5% slower than with FIR lowering on Icelake (120.5 seconds vs. 114.5 seconds).
The slowdown is related to extra temporaries created for some assignments, e.g.:
```
      SUBROUTINE UPDATE()
...

      DO K = 1, KMAX - 1
         DO J = 1, JMAX - 1

            Q(1:I2,J,K,1,M) = (RNM1 * Q(1:I2,J,K,1,M) +
     >                         Q(1:I2,J,K,1,N) + DU(1:I2,J,K,1)) * RNI
```
`Q` and `DU` are module ALLOCATABLE variables; `RNM1` and `RNI` are local scalars. Because the base memrefs of the designators are produced by box loads, alias analysis currently cannot prove that the `Q` and `DU` accesses do not alias. This prevents the optimized bufferization pass from eliminating the temporary:
```
%42:2 = hlfir.declare %41 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QMles3d_dataEdu"} : (!fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?xf64>>>>) -> (!fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?xf64>>>>, !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?xf64>>>>)
...
%199:2 = hlfir.declare %198 {fortran_attrs = #fir.var_attrs<allocatable>, uniq_name = "_QMles3d_dataEq"} : (!fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>>) -> (!fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>>, !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>>)
...
%339 = fir.load %199#0 : !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>>
%340 = fir.load %82#0 : !fir.ref<i32>
%341 = fir.convert %340 : (i32) -> index
%342 = arith.cmpi sgt, %341, %c0 : index
%343 = arith.select %342, %341, %c0 : index
%344 = fir.load %121#0 : !fir.ref<i32>
%345 = fir.convert %344 : (i32) -> i64
%346 = fir.load %137#0 : !fir.ref<i32>
%347 = fir.convert %346 : (i32) -> i64
%348 = fir.load %157#0 : !fir.ref<i32>
%349 = fir.convert %348 : (i32) -> i64
%350 = fir.shape %343 : (index) -> !fir.shape<1>
%351 = hlfir.designate %339 (%c1:%341:%c1, %345, %347, %c1, %349) shape %350 : (!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>, index, index, index, i64, i64, index, i64, !fir.shape<1>) -> !fir.box<!fir.array<?xf64>>
%352 = fir.load %159#0 : !fir.ref<i32>
%353 = fir.convert %352 : (i32) -> i64
%354 = hlfir.designate %339 (%c1:%341:%c1, %345, %347, %c1, %353) shape %350 : (!fir.box<!fir.heap<!fir.array<?x?x?x?x?xf64>>>, index, index, index, i64, i64, index, i64, !fir.shape<1>) -> !fir.box<!fir.array<?xf64>>
%355 = fir.load %42#0 : !fir.ref<!fir.box<!fir.heap<!fir.array<?x?x?x?xf64>>>>
%356 = hlfir.designate %355 (%c1:%341:%c1, %345, %347, %c1) shape %350 : (!fir.box<!fir.heap<!fir.array<?x?x?x?xf64>>>, index, index, index, i64, i64, index, !fir.shape<1>) -> !fir.box<!fir.array<?xf64>>
%357 = fir.load %226#0 : !fir.ref<f64>
%358 = hlfir.elemental %350 unordered : (!fir.shape<1>) -> !hlfir.expr<?xf64> {
^bb0(%arg4: index):
%568 = hlfir.designate %351 (%arg4) : (!fir.box<!fir.array<?xf64>>, index) -> !fir.ref<f64>
%569 = fir.load %568 : !fir.ref<f64>
%570 = arith.mulf %338, %569 fastmath<fast> : f64
%571 = hlfir.designate %354 (%arg4) : (!fir.box<!fir.array<?xf64>>, index) -> !fir.ref<f64>
%572 = fir.load %571 : !fir.ref<f64>
%573 = arith.addf %570, %572 fastmath<fast> : f64
%574 = hlfir.designate %356 (%arg4) : (!fir.box<!fir.array<?xf64>>, index) -> !fir.ref<f64>
%575 = fir.load %574 : !fir.ref<f64>
%576 = arith.addf %573, %575 fastmath<fast> : f64
%577 = hlfir.no_reassoc %576 : f64
%578 = arith.mulf %577, %357 fastmath<fast> : f64
hlfir.yield_element %578 : f64
}
hlfir.assign %358 to %351 : !hlfir.expr<?xf64>, !fir.box<!fir.array<?xf64>>
```
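
For illustration only, the two evaluation strategies can be sketched in Python. This is not the compiler's implementation: all function names are hypothetical, and plain lists stand in for the array sections. `assign_with_temp` mirrors the current lowering (the whole right-hand side is evaluated into a temporary before anything is stored), while `assign_in_place` mirrors the desired result (each element is stored as soon as it is computed). The two agree for the `UPDATE` assignment because every element of the left-hand side is read only at the index being written. The `shift_*` pair shows the kind of overlap that alias analysis has to rule out before the temporary can be dropped.
```python
def assign_with_temp(q, du, rnm1, rni, m, n, i2):
    """Evaluate the whole right-hand side into a temporary, then copy it to q[m]."""
    tmp = [(rnm1 * q[m][i] + q[n][i] + du[i]) * rni for i in range(i2)]
    for i in range(i2):
        q[m][i] = tmp[i]


def assign_in_place(q, du, rnm1, rni, m, n, i2):
    """Store each element as soon as it is computed (no temporary)."""
    for i in range(i2):
        q[m][i] = (rnm1 * q[m][i] + q[n][i] + du[i]) * rni


def shift_with_temp(a):
    """Array-assignment semantics for a(2:n) = a(1:n-1) + 1: read everything first."""
    tmp = [a[i - 1] + 1.0 for i in range(1, len(a))]
    for i in range(1, len(a)):
        a[i] = tmp[i - 1]


def shift_in_place(a):
    """The same assignment without a temporary: reads see already-updated elements."""
    for i in range(1, len(a)):
        a[i] = a[i - 1] + 1.0
```
In the sketch, `q[m]` and `q[n]` play the role of `Q(1:I2,J,K,1,M)` and `Q(1:I2,J,K,1,N)`, and `du` plays the role of `DU(1:I2,J,K,1)`. The in-place version is safe here only under the assumption that `du` does not share storage with `q`, which is exactly what the alias analysis currently fails to prove.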
</pre>