[PATCH] D132828: Add new optimization pass of Tree Height Reduction
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 30 02:36:46 PDT 2022
fhahn added a comment.
Thanks for updating the patch! Have you considered implementing this as a MachineFunctionPass instead of an LLVM IR pass? Doing the transformation on MachineIR would allow for more precise cost estimates, including more accurate information about register usage, selected instructions and processor resource usage. `MachineCombiner.cpp` might be an interesting example to look at for similar (although simpler) transformations with relatively accurate uarch-driven cost modeling.
> I run C/C++ benchmarks in SPECspeed 2017 on Fujitsu A64FX processor, which has two pipelines for integer operations and SIMD/FP operations each. 600.perlbench_s and 619.lbm_s had 3% improvement. Other benchmarks (602, 605, 620, 623, 625, 631, 641, 644, 657) were within 1% up/down. In these runs, to emphasize the performance improvement, the number of OpenMP threads is limited to one.
It might also be interesting to run SPECrate, rather than only SPECspeed with a single thread.
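To illustrate why processor resource information matters for this kind of transformation, here is a minimal sketch of a toy model. It is not LLVM code: `Op`, `scheduleCycles`, `makeChain` and `makeBalanced8` are hypothetical names, and it assumes unit-latency operations issued greedily on a fixed number of identical pipelines. It compares a left-leaning chain of seven additions with the balanced tree over the same eight operands.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One unit-latency operation; deps holds the indices of the operations it consumes.
struct Op {
    std::vector<std::size_t> deps;
};

// Greedy list scheduling (toy model): in each cycle, issue up to `pipelines`
// operations whose dependencies were all issued in an earlier cycle.
// `ops` must be in topological order. Returns the number of cycles used.
int scheduleCycles(const std::vector<Op>& ops, int pipelines) {
    assert(pipelines >= 1);
    std::vector<int> issuedAt(ops.size(), 0);  // 0 means "not issued yet"
    std::size_t remaining = ops.size();
    int cycle = 0;
    while (remaining > 0) {
        ++cycle;
        int issued = 0;
        for (std::size_t i = 0; i < ops.size() && issued < pipelines; ++i) {
            if (issuedAt[i] != 0) continue;
            bool ready = true;
            for (std::size_t d : ops[i].deps)
                if (issuedAt[d] == 0 || issuedAt[d] >= cycle) ready = false;
            if (ready) {
                issuedAt[i] = cycle;
                ++issued;
                --remaining;
            }
        }
    }
    return cycle;
}

// Left-leaning chain of `adds` additions: ((a + b) + c) + ...
std::vector<Op> makeChain(std::size_t adds) {
    std::vector<Op> ops(adds);
    for (std::size_t i = 1; i < adds; ++i) ops[i].deps = {i - 1};
    return ops;
}

// Balanced tree over eight operands: ((a+b)+(c+d)) + ((e+f)+(g+h)).
std::vector<Op> makeBalanced8() {
    std::vector<Op> ops(7);  // ops 0..3 are the leaf additions
    ops[4].deps = {0, 1};
    ops[5].deps = {2, 3};
    ops[6].deps = {4, 5};
    return ops;
}
```

In this model the chain needs seven cycles regardless of the pipeline count. The balanced tree needs four cycles with two pipelines and three with four pipelines, but gains nothing with a single pipeline, which is the kind of dependence on resources that a MachineIR-level cost model could capture.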
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D132828/new/
https://reviews.llvm.org/D132828