[PATCH] D126700: [MachineScheduler] Order more stores by ascending address
Dave Green via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jun 7 10:52:13 PDT 2022
dmgreen added a comment.
OK I see, thanks. In this version on godbolt I only see ordered STPQ's, perhaps you have some downstream differences that would alter the codegen?
https://godbolt.org/z/1nbr7Eh8r
With this kind of thing we always have two options. Either we can do it unconditionally for all cpus, or we can add a subtarget tuning feature to steer the codegen for specific cpus. There are a number already like FeatureFuseAES that control fusing in the machine scheduler. Adding another to control a more aggressive ordering in AArch64PostRASchedStrategy sounds perfectly sensible to me.
The other option is to do this for all cpus. The general rule there is it should be something that we expect to be beneficial or benign on all/most cpus that would run -mcpu=generic code. I'm not strongly against that for this patch, we need to check that with benchmarks and I did see a case in testing which changed the code from:
STR q1,[sp,#0x40]
LDP q1,q2,[x25,#-0x10]
STR q0,[sp,#0x60]
LDUR q0,[x24,#-0x10]
FDIV v0.2D,v0.2D,v1.2D
STP q2,q1,[sp,#0x20]
...
=>
STR q1,[sp,#0x40]
LDP q1,q2,[x25,#-0x10]
STP q2,q1,[sp,#0x20]
STR q0,[sp,#0x60]
LDUR q0,[x24,#-0x10]
FDIV v0.2D,v0.2D,v1.2D
...
Those clumped-together stores were apparently a little slower. Perhaps it was just unlucky, and it is only one case. I would guess that in-order memory operations are a little easier for prefetchers than ones that bobble around, but I don't have any evidence of that, and prefetchers are usually pretty sophisticated nowadays. It would take a complex case for something like that to matter, and complex cases can easily defy expectation.
So - do you have any other benchmark results, on any other cpus? Or as an alternative, do you think that adding a subtarget feature is an OK solution?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D126700/new/
https://reviews.llvm.org/D126700
More information about the llvm-commits
mailing list