[llvm] [GlobalISel][AArch64] AArch64O0PreLegalizerCombiner: Disable fixed-point iteration (PR #94291)

Tue Jun 4 14:47:10 PDT 2024

tobias-stadler wrote:

 > Is the one round vs fixed-point iteration issue limited to the -O0 combiner or is it a general issue of our combiners?

In my opinion, it's a general issue. The fixed-point iteration unnecessarily burns a lot of compile-time in every combiner. Turning fixed-point iteration off for O0 is inconsequential according to CTMark, and the optnone_combines are very simple, so I don't expect a lot of regressions in other benchmarks. Optimization levels above O0 require a more elaborate discussion. Turning it off for AArch64PreLegalizerCombiner and AArch64PostLegalizerCombiner and running CTMark at O2, I get:
```
Program                                       compile_instructions                        size..text
                                              base-O2              patch2-O2       diff   base-O2    patch2-O2 diff
7zip/7zip-benchmark                           207743211267.00      206832115055.00 -0.44% 824480.00  824636.00 0.02%
Bullet/bullet                                 100680312509.00      100194629825.00 -0.48% 517188.00  517440.00 0.05%
kimwitu++/kc                                   42193871512.00       41963559810.00 -0.55% 461696.00  463252.00 0.34%
SPASS/SPASS                                    44102375151.00       43822705691.00 -0.63% 443008.00  443008.00 0.00%
tramp3d-v4/tramp3d-v4                          67051760706.00       66593264326.00 -0.68% 570940.00  571004.00 0.01%
ClamAV/clamscan                                54709178376.00       54184088001.00 -0.96% 456736.00  456752.00 0.00%
mafft/pairlocalalign                           33428526057.00       33104690017.00 -0.97% 321664.00  321664.00 0.00%
lencod/lencod                                  58348756630.00       57763311078.00 -1.00% 546524.00  546828.00 0.06%
consumer-typeset/consumer-typeset              35813224875.00       35401611592.00 -1.15% 419612.00  419632.00 0.00%
sqlite3/sqlite3                                35943993743.00       35457465790.00 -1.35% 436272.00  436272.00 0.00%
                           Geomean difference                                      -0.82%                      0.05%
```
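For readers who want to check the summary row, here is a minimal sketch of how the geomean difference for compile_instructions can be recomputed from the table above. The function name `geomean_diff_percent` and the dictionary layout are my own assumptions for illustration; this is not the script that produced the table.

```python
import math

# (base-O2, patch2-O2) compile_instructions, copied from the table above.
compile_instructions = {
    "7zip/7zip-benchmark": (207743211267, 206832115055),
    "Bullet/bullet": (100680312509, 100194629825),
    "kimwitu++/kc": (42193871512, 41963559810),
    "SPASS/SPASS": (44102375151, 43822705691),
    "tramp3d-v4/tramp3d-v4": (67051760706, 66593264326),
    "ClamAV/clamscan": (54709178376, 54184088001),
    "mafft/pairlocalalign": (33428526057, 33104690017),
    "lencod/lencod": (58348756630, 57763311078),
    "consumer-typeset/consumer-typeset": (35813224875, 35401611592),
    "sqlite3/sqlite3": (35943993743, 35457465790),
}


def geomean_diff_percent(pairs):
    """Geometric mean of patch/base ratios, expressed as a percent change."""
    # Averaging the logs of the ratios and exponentiating gives the geomean.
    log_ratios = [math.log(patch / base) for base, patch in pairs]
    return (math.exp(sum(log_ratios) / len(log_ratios)) - 1.0) * 100.0


print(f"{geomean_diff_percent(compile_instructions.values()):.2f}%")
```

The result should land at roughly -0.82%, in line with the reported geomean row.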
All the code-size regressions originate from AArch64PreLegalizerCombiner, and I still need to investigate them further. My hope is that a small additional heuristic in Combiner::WorklistMaintainer will fix most of the regressions, but my guess is that there will always be missed optimizations without fixed-point iteration for more complex combines, so the argument for disabling fixed-point iteration for O2 is not as clear-cut as for O0. Maybe O1?
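To illustrate the trade-off, here is a toy model of one round vs. fixed-point iteration. It is only a sketch under simplifying assumptions: the "program" is a list of integers, the single hypothetical combine merges two equal neighbours `x, x` into `x + 1`, and the names `combine_round` and `run_combiner` are made up. It does not model the real GlobalISel Combiner or its worklist.

```python
def combine_round(items):
    """Visit each position once, left to right, applying the toy combine.

    Returns (new_items, changed, visits).
    """
    out = list(items)
    changed = False
    visits = 0
    i = 0
    while i < len(out):
        visits += 1
        # Toy combine: two equal neighbours x, x are merged into x + 1.
        if i + 1 < len(out) and out[i] == out[i + 1]:
            out[i:i + 2] = [out[i] + 1]
            changed = True
        i += 1
    return out, changed, visits


def run_combiner(items, fixed_point):
    """Run one round, or repeat rounds until a round changes nothing."""
    total_visits = 0
    while True:
        items, changed, visits = combine_round(items)
        total_visits += visits
        if not fixed_point or not changed:
            return items, total_visits


# A later merge creates an opportunity at a position that was already visited:
# one round misses it, fixed-point iteration finds it.
print(run_combiner([2, 1, 1], fixed_point=False))  # ([2, 2], 2)
print(run_combiner([2, 1, 1], fixed_point=True))   # ([3], 4)

# No follow-up opportunity: same result, but the final round is pure overhead.
print(run_combiner([1, 1, 5], fixed_point=False))  # ([2, 5], 2)
print(run_combiner([1, 1, 5], fixed_point=True))   # ([2, 5], 4)
```

In this model, fixed-point iteration always pays for at least one extra round that finds nothing, which is the compile-time cost in question, while a single round can miss combines that only become visible after a later rewrite.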

https://github.com/llvm/llvm-project/pull/94291