[llvm-dev] [RFC] bitfield access shrinking

Thu Mar 9 11:19:58 PST 2017

Yeah,.

This seems similar to trying to optimize extract/insert values with real
binary operations.

We do it, but ... historically we only do it by teaching things they look
like things they already know ;)
(IE we teach gvn that when it looks like this, it's really an add of these
two things)

On Thu, Mar 9, 2017 at 10:47 AM, Hal Finkel via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
> On 03/09/2017 12:28 PM, Krzysztof Parzyszek via llvm-dev wrote:
>
>> We could add intrinsics to extract/insert a bitfield, which would
>> simplify a lot of that bitwise logic.
>>
>
> But then you need to teach a bunch of places about how to simply them,
> fold using bitwise logic and other things that reduce demanded bits into
> them, etc. This seems like a difficult tradeoff.
>
>  -Hal
>
>
>
>> -Krzysztof
>>
>>
>> On 3/9/2017 12:14 PM, Wei Mi via llvm-dev wrote:
>>
>>> In http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-
>>> 20120827/063200.html,
>>> consecutive bitfields are wrapped as a group and represented as a
>>> large integer and emits loads stores and bit operations appropriate
>>> for extracting bits from within it. It fixes the problem of violating
>>> C++11 memory model that original widen load/store of bitfield was
>>> facing. It also brings more coalescing opportunities across bitfields.
>>>
>>> If some bitfields are natural aligned and their num of bits can form a
>>> legal integer type that the target supports, it is more efficient to
>>> access them directly than doing a large integer load/store plus a
>>> series of bit operations. We call such reverse transformation legal
>>> type bitfield shrinking. Currently, llvm depends on DAGCombiner to do
>>> such shrinking.
>>>
>>> However, DAGCombiner has the one-basic-block-a-time limitation, so we
>>> started to implement a new shrinking optimization in
>>> https://reviews.llvm.org/D30416, and initially put it in instcombine,
>>> then moved it to CGP because we want to use some TargetLowering
>>> information.
>>>
>>> The initial implementation in D30416 only supports load-and-or-store
>>> pattern matching, and it uses a inst-by-inst scan as a safety check to
>>> make sure there is no other memory write insn between the load and
>>> store (no alias query is done).  After getting the initial
>>> implementation, we found more problems: EarlyCSE, LoadPRE and even
>>> InstCombine itself can do coalescing before the shrinking (LoadPRE
>>> does it the most thoroughly).
>>> The coalescing can move the load many BasicBlocks earlier and make
>>> simple inst-by-inst scan unscalable and oftentimes fail. It also
>>> breaks the load-and-or-store pattern. An example is below:
>>>
>>> Before coalescing done by earlycse/loadpre:
>>> %bf.load = load i64, i64* %0, align 8
>>> %bf.clear = and i64 %bf.load, -65536
>>> %bf.set = or i64 %bf.value, %bf.clear
>>> store i64 %bf.set2, i64* %9, align 8
>>> .....
>>> %bf.load1 = load i64, i64* %0, align 8
>>> %bf.clear1 = and i64 %bf.load1, -4294901761
>>> %bf.set1 = or i64 %bf.value1, %bf.clear1
>>> store i64 %bf.set2, i64* %9, align 8
>>> .....
>>> %bf.load2 = load i64, i64* %0, align 8
>>> %bf.clear2 = and i64 %bf.load2, -4294901761
>>> %bf.set2 = or i64 %bf.value2, %bf.clear2
>>> store i64 %bf.set2, i64* %9, align 8
>>>
>>> After coalescing, it will become:
>>> %bf.load = load i64, i64* %0, align 8
>>> %bf.clear = and i64 %bf.load, -65536
>>> %bf.set = or i64 %bf.value, %bf.clear
>>> .....
>>> %bf.clear1 = and i64 %bf.set, -4294901761
>>> %bf.set1 = or i64 %bf.value1, %bf.clear1
>>> .....
>>> %bf.clear2 = and i64 %bf.set1, -4294901761
>>> %bf.set2 = or i64 %bf.value2, %bf.clear2
>>> store i64 %bf.set2, i64* %9, align 8
>>>
>>> After load-store coalescing, %bf.load now is far away from the store,
>>> and the previous load-and-or-store pattern disappears.
>>>
>>> A simple idea to fix it is to move the shrinking in a very early pass
>>> before the first pass of EarlyCSE. However, if we move shrinking
>>> earlier, it is possible to block the coalescing of other ilegal type
>>> bitfields which can not be shrinked. So for coalescing and shrinking,
>>> no matter which one is first, it will block the other one.
>>>
>>> After some discussions with Eli and Michael, I come up with another
>>> idea to let shrinking stay in the late pipeline. It needs two changes
>>> to the current patch:
>>>
>>> 1. extending the pattern match to handle store(or(and(or(and...))
>>> pattern above. It needs to analyze the bit mask of every and-or pairs.
>>> If the last and-or pair touch different section with the other and-or
>>> pairs, we can split the original store into two, and do the shrinking
>>> for the second store, like below
>>>
>>> %bf.load = load i64, i64* %0, align 8
>>> %bf.clear = and i64 %bf.load, -65536
>>> %bf.set = or i64 %bf.value, %bf.clear
>>> .....
>>>
>>> %bf.clear1 = and i64 %bf.set, -4294901761
>>> %bf.set1 = or i64 %bf.value1, %bf.clear1
>>> store i64 %bf.set1, i64* %0, align 8                         // the
>>> first store.
>>> .....
>>>
>>> %bf.value2.shrinked = shrink_op %bf.value2
>>> store i16 %bf.value2.shrinked, i64* %0, align 8       // shrinked store.
>>>
>>> 2. use memoryssa + alias query to do the safety check. Because
>>> dominator tree is not properly updated in CGP, I have to create a new
>>> pass and put it before CGP, in order to use memoryssa.
>>>
>>> Eli suggested me to ask for more opinions before start writting code.
>>> I think it is a good idea and here is the post. Comments are
>>> appreciated.
>>>
>>> Thanks,
>>> Wei.
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>
> --
> Hal Finkel
> Lead, Compiler Technology and Programming Languages
> Leadership Computing Facility
> Argonne National Laboratory
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170309/61ea2b14/attachment.html>