[PATCH] D98230: [LSR] Add reconciliation of unfoldable offsets

Wed May 25 06:33:15 PDT 2022

eopXD added a comment.

In D98230#3536504 <https://reviews.llvm.org/D98230#3536504>, @jonpa wrote:

> In D98230#3536058 <https://reviews.llvm.org/D98230#3536058>, @eopXD wrote:
>
>> Hi Jonas,
>>
>> I am trying to wrap my head around this patch. I would really appreciate
>> if you can tell me more on your approach. (or please correct me if I am
>> saying something stupid)
>>
>> It sounds like you are trying to have an alternative `Formula`-s when two
>> or more `LSRUse` have mutual unfold-able offset. This sounds like what
>> `LSRInstance::GenerateConstantOffsets` is doing. Won't adding more
>> `Formula`-s to an `LSRUse` do the job (rather than overriding the offset)?
>
> Hi Yueh-Ting,
>
> thanks for taking a look :-)
>
> LSR begins with looking over the interesting instructions and sorting them into groups (LSRUse:s) in CollectFixupsAndInitialFormulae(). This is needed to generate efficient code, i.e. to use the same IV register for more than just a single instruction if possible. This is implemented in getUse(): a previous LSRUse with the same SCEV and Kind is searched for, and if found it is used if reconcileNewOffset() succeeds. If the offset does not go with the pre-existing LSRUse a new LSRUse is created instead.
>
> The problem then with your idea of GenerateConstantOffsets() is that per the above the two instructions end up in their own LSRUse with their own set of formulae which will then not work as they will always be separate. With that said, it might be interesting to experiment with GenerateCrossUseConstantOffsets() to see if the problem could be solved that way. There is however an advantage with putting the instructions in the same LSRUse right away as LSR is generating a lot of formulae and then prune the search space for the sake of compile time, so I would say getting something obvious like this right early is better.
>
> Did you try the patch on your target and find any improvements?

Thank you for the clear explanation. The approach make sense to me.
However, I can't successfully apply the patch, may you rebase it?

It looks like LSR can surely find improvement here, at least for the
code-gen I observed from your test case on the RISC-V backend.

Other than the rebase, may you add debug logs to show the changes
for your enhancement, thanks for the work!

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D98230/new/

https://reviews.llvm.org/D98230