[PATCH] D30416: [InstCombine] Redo reduceLoadOpStoreWidth in instcombine for bitfield store optimization.

Wei Mi via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 6 20:50:57 PST 2017


On Mon, Mar 6, 2017 at 4:36 PM, Friedman, Eli <efriedma at codeaurora.org>
wrote:

> On 3/6/2017 11:06 AM, Wei Mi wrote:
>
>> After looking at the benchmark motivating the test, I find it is much
>> harder for codegenprepare to catch all the shrinking opportunities compared
>> with instcombine, because of two things:
>>
>> * loadpre may merge the original load in a shrinking pattern with other
>> load in a very early position, and there maybe other stores in between. The
>> safety check for the codegenprepare now simply scan mayWriteToMemory insns
>> between load and store in the candidate shrinking pattern, and it will fail
>> because of those stores in between.
>>
>
> It would be easy to make the check a lot more powerful using MemorySSA; I
> guess you don't have access to that in the passes where you're doing the
> transform?


Yes, if MemorySSA + alias query can be used, it should solve the problem. I
didn't find existing usage of MemorySSA in instcombine or CGP so I am not
sure whether it can be used.


>
>
> * we may see the following pattern. a.f1 is a bitfield. Ideally, we should
>> do the shrinking for all the three stores. But codegenprepare works in
>> top-down order. After the first store is shrinked, it is hard for the
>> second and the third to know %bf.set is the same as the value loaded from
>> %0. If we do this in instcombine, it is much easier because before
>> load/store elimination, %bf.set will be loaded from %0 respecitively in
>> if.then9 and in if.end41.
>>
>
> I'm not following how it helps to do this in instcombine?  Normally,
> EarlyCSE runs first.  (A complete C testcase might be useful here.)


It is not earlycse. It is also loadpre to cause the load to be replaced by
%bf.set.

Before global value numbering:

entry:
  %0 = bitcast %class.B* %this to i64*
  %bf.load = load i64, i64* %0, align 8
  %dec = add i64 %bf.load, 65535
  %bf.value = and i64 %dec, 65535
  %bf.clear5 = and i64 %bf.load, -65536
  %bf.set = or i64 %bf.value, %bf.clear5
  store i64 %bf.set, i64* %0, align 8
  ...
if.then9:
  %bf.load1 = load i64, i64* %0, align 8                      // %bf.load1
is helpful for the patch to recognize the load-and-or-store pattern.
  %inc79 = add i64 %bf.load1, 281474976710656
  %bf.shl = and i64 %inc79, 71776119061217280
  %bf.clear18 = and i64 %bf.load1, -71776119061217281
  %bf.set19 = or i64 %bf.shl, %bf.clear18
  store i64 %bf.set19, i64* %0, align 8
  ...
if.end41:
  %bf.load2 = load i64, i64* %0, align 8                      // %bf.load2
is helpful for the patch to recognize the load-and-or-store pattern.
  %inc4578 = add i64 %bf.load2, 65536
  %bf.shl48 = and i64 %inc4578, 4294901760
  %bf.clear49 = and i64 %bf.load2, -4294901761
  %bf.set50 = or i64 %bf.shl48, %bf.clear49
  store i64 %bf.set50, i64* %0, align 8

After global value numbering:

entry:
  %0 = bitcast %class.B* %this to i64*
  %bf.load = load i64, i64* %0, align 8
  %dec = add i64 %bf.load, 65535
  %bf.value = and i64 %dec, 65535
  %bf.clear5 = and i64 %bf.load, -65536
  %bf.set = or i64 %bf.value, %bf.clear5
  store i64 %bf.set, i64* %0, align 8
  ...
if.then9:
  %inc79 = add i64 %bf.set, 281474976710656       // we hope to know
%bf.set is the same as load i64, i64* %0, align 8.
  %bf.shl = and i64 %inc79, 71776119061217280
  %bf.clear18 = and i64 %bf.set, -71776119061217281
  %bf.set19 = or i64 %bf.shl, %bf.clear18
  store i64 %bf.set19, i64* %0, align 8
  ...
if.end41:
  %inc4578 = add i64 %bf.set, 65536                        // we hope to
know %bf.set is the same as load i64, i64* %0, align 8.
  %bf.shl48 = and i64 %inc4578, 4294901760
  %bf.clear49 = and i64 %bf.set, -4294901761
  %bf.set50 = or i64 %bf.shl48, %bf.clear49
  store i64 %bf.set50, i64* %0, align 8


>
> -Eli
>
> --
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux
> Foundation Collaborative Project
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170306/f2ec581e/attachment.html>


More information about the llvm-commits mailing list