<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/65426">65426</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[flang][hlfir] Polyhedron/nf 23% performance regression
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue,
flang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
vzakhari
</td>
</tr>
</table>
<pre>
With HLFIR lowering the code runs in 7.4 seconds, versus 6 seconds with FIR lowering.
Some of the overhead comes from extra temporaries at line 261:
```
256 subroutine NF2DPrecon(x,i1,i2) ! 2D NF Preconditioning matrix
257 integer :: i1 , i2
258 real(dpkind),dimension(i2)::x,t
259 integer :: i
260 do i = i1 , i2 , nx
261 if ( i>i1 ) x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)
262 call trisolve(x,i,i+nx-1)
263 enddo
264 do i = i2-2*nx+1 , i1 , -nx
265 t(i:i+nx-1) = au2(i:i+nx-1)*x(i+nx:i+2*nx-1)
266 call trisolve(t,i,i+nx-1)
267 x(i:i+nx-1) = x(i:i+nx-1) - t(i:i+nx-1)
268 enddo
269 end subroutine NF2DPrecon !=========================================
```
`ArrayValueCopy` has special handling for array slices of the form `(i:j)` and `(j+1:k)`, which lets it disambiguate `x(i:i+nx-1)` from `x(i-nx:i-1)`. We can probably do the same in the optimized bufferization pass, or implement something more generic. For example, we could try to use the affine dialect utilities to detect store-load conflicts from the iteration space constraints derived from the slice configurations and from the mapping of the iteration indices to the memory locations (based on the designator indexing).
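
To make the `(i:j)` / `(j+1:k)` idea concrete, here is a minimal sketch of such a disjointness check. It is not flang code: `affine`, `sub`, `provably_positive` and `provably_disjoint` are hypothetical helpers, and the sketch assumes unit-stride slices whose bounds are affine in symbols such as `i` and `nx`. Two slices are treated as disjoint only when the lower bound of one minus the upper bound of the other folds to a positive constant.

```python
def affine(const=0, **coeffs):
    """Hypothetical affine bound: const + sum(coeff * symbol)."""
    return (const, {s: c for s, c in coeffs.items() if c != 0})


def sub(a, b):
    """Return a - b, dropping symbols whose coefficients cancel."""
    coeffs = dict(a[1])
    for sym, c in b[1].items():
        coeffs[sym] = coeffs.get(sym, 0) - c
    return (a[0] - b[0], {s: c for s, c in coeffs.items() if c != 0})


def provably_positive(d):
    """True only if d folded to a constant that is greater than zero."""
    const, coeffs = d
    return not coeffs and const > 0


def provably_disjoint(a, b):
    """a and b are (lower, upper) unit-stride slices of the same array."""
    (a_lb, a_ub), (b_lb, b_ub) = a, b
    # Disjoint if one slice starts strictly after the other one ends.
    return (provably_positive(sub(b_lb, a_ub))
            or provably_positive(sub(a_lb, b_ub)))


# Line 261: x(i:i+nx-1) is written, x(i-nx:i-1) is read.
written = (affine(0, i=1), affine(-1, i=1, nx=1))
read = (affine(0, i=1, nx=-1), affine(-1, i=1))
print(provably_disjoint(written, read))
```

For line 261 the check succeeds because `i - (i-1)` folds to `1`, so no temporary would be needed for that read. Anything that does not fold to a constant is conservatively reported as a possible overlap, which is where a more generic affine-based analysis could do better.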
</pre>