[PATCH] Xor reassociation

Fri Mar 29 19:07:54 PDT 2013

Hi, Pete:

    Thank you so much for the quick response.

    The purpose of rule 1 is twofold:
   1) when c1 == c2, (x|c1) ^ c2 => x & ~c1; actually, that is the only case I catch.
   2) it is used to prove rule 3

   As to the copy-constructor: there is a non-plain-old-data member in the
class. Going without a copy-constructor is OK for now, but it could become
a problem in the future.

Thank you again! Have a nice weekend!

Shuxin

On 3/29/13 6:24 PM, Peter Cooper wrote:
> Hi Shuxin
>
> At first look I couldn't see a reason for rule 1, as it doesn't reduce the number of instructions. Is it there to canonicalize things, or did you find the & exposed more opportunities to remove unused bits from expressions? Just curious.
>
> My only comment on the patch is that I couldn't see a need for the copy constructor. Otherwise LGTM.
>
> Thanks
> Pete
>
> Sent from my iPhone
>
> On Mar 29, 2013, at 5:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> Sorry, I forgot to add the "RUN:" command to the test *.ll file.
>> Updated patch is attached to this mail.
>>
>> On 3/29/13 5:17 PM, Shuxin Yang wrote:
>>> Hi, There:
>>>
>>>   The attached patch is about xor-reassociation. It is based on the following rules:
>>>
>>>   rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2)
>>>   rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>>>   rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
>>>   rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>>>
>>>   This change reduces the code size (in terms of # of bitcode instructions) of an application
>>> by 8.9% (64673 vs 58893). I have not had a chance to measure the performance/code-size impact on
>>> other benchmarks. So far the testing has focused only on correctness.
>>>
>>>   Thank you for code review.
>>> Shuxin
>>>
>>>
>>> Proof
>>> =====
>>>   Let X[i] be the i-th bit of X, counting from the least significant bit, zero-based.
>>>
>>>   rule 1: (x | c1) ^ c2
>>>            = (x | c1) ^ (c1 ^ c1) ^ c2
>>>            = ((x | c1) ^ c1) ^ (c1^c2)
>>>            = (x & ~c1) ^ (c1^c2)
>>>
>>>   rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>>>     Let E = (x & c1) ^ (x & c2). Divide the bits into 3 disjoint classes:
>>>      1) Those bits corresponding to the "1"s in "c1 & c2"
>>>        E[i] = X[i] ^ X[i] = 0
>>>      2) Those bits corresponding to the "1"s in "c1 ^ c2"
>>>        E[i] = X[i] ^ 0 = X[i]
>>>      3) The remaining bits
>>>        E[i] = 0 ^ 0 = 0
>>>
>>>      Combining the above, we have ...
>>>      Rule 3 can be proved in a similar way.
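
The per-bit case analysis above for rule 2, and the analogous one for rule 3, can also be checked mechanically. The following is only a minimal sketch in C; the function name check_rules_2_and_3_per_bit is made up for this illustration and is not part of the patch. It relies on the fact that &, | and ^ act on each bit position independently, so it is enough to try every combination of single-bit values.

```c
#include <assert.h>

/* Tries all 8 combinations of single-bit x, c1 and c2 and asserts that
 * rules 2 and 3 hold for each. Returns the number of combinations checked.
 * (Hypothetical helper, not part of the patch.) */
int check_rules_2_and_3_per_bit(void) {
    int checked = 0;
    for (unsigned x = 0; x <= 1; x++) {
        for (unsigned c1 = 0; c1 <= 1; c1++) {
            for (unsigned c2 = 0; c2 <= 1; c2++) {
                unsigned c3 = c1 ^ c2;
                /* rule 2: (x & c1) ^ (x & c2) = x & (c1 ^ c2) */
                assert(((x & c1) ^ (x & c2)) == (x & c3));
                /* rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 */
                assert(((x | c1) ^ (x | c2)) == ((x & c3) ^ c3));
                checked++;
            }
        }
    }
    return checked;
}
```
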
>>>
>>>    rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>>>       (x|c1) ^ (x&c2)
>>>      = (x|c1) ^ (x&c2) ^ (c1 ^ c1)
>>>      = ((x|c1) ^ c1) ^ (x & c2) ^ c1
>>>      = (x & ~c1) ^ (x & c2) ^ c1            // rule 1
>>>      = (x & c3) ^ c1, where c3 = ~c1 ^ c2   // rule 2
>>>
>>>    To verify the correctness, I wrote a small C program that runs X from 0 all the way
>>>    to (unsigned)-1, comparing the value of each expression with and without this optimization.
>>>    See the attached offline-test.tar.gz.
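
A brute-force check of this kind might look roughly like the sketch below. This is not the code in offline-test.tar.gz; the function name count_rule_mismatches and its parameters are assumptions made for illustration. It evaluates both sides of each of the four rules for every x in [0, x_max] and counts disagreements.

```c
#include <stdint.h>

/* Counts how often the two sides of any of the four rules disagree
 * for x in [0, x_max], with c1 and c2 fixed.
 * (Hypothetical helper; not the code in offline-test.tar.gz.) */
uint64_t count_rule_mismatches(uint32_t c1, uint32_t c2, uint32_t x_max) {
    uint64_t mismatches = 0;
    for (uint32_t x = 0;; x++) {
        /* rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1 ^ c2) */
        if (((x | c1) ^ c2) != ((x & ~c1) ^ (c1 ^ c2)))
            mismatches++;
        /* rule 2: (x & c1) ^ (x & c2) = x & (c1 ^ c2) */
        if (((x & c1) ^ (x & c2)) != (x & (c1 ^ c2)))
            mismatches++;
        /* rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3, c3 = c1 ^ c2 */
        if (((x | c1) ^ (x | c2)) != ((x & (c1 ^ c2)) ^ (c1 ^ c2)))
            mismatches++;
        /* rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, c3 = ~c1 ^ c2 */
        if (((x | c1) ^ (x & c2)) != ((x & (~c1 ^ c2)) ^ c1))
            mismatches++;
        if (x == x_max)
            break; /* stop here so that x++ never wraps around */
    }
    return mismatches;
}
```

Passing UINT32_MAX as x_max covers the full range described above, at the cost of about four billion iterations; a smaller bound is enough for a quick sanity check.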
>>>
>>> Algorithm
>>> =========
>>>    step 1: classify xor operands into two categories:
>>>      c1: "X & C", (C!= 0)
>>>      c2: "X | C",
>>>         The operand is either an or-expr with a non-zero constant operand, or
>>>         any other expr, e.g. V = x*y can be viewed as "V | 0".
>>>
>>>     step 2: sort operands such that operands sharing the same symbolic
>>>        value cluster together. e.g.
>>>          (x & 123) ^ (y & 456) ^ (x | 789)
>>>        sort into:
>>>          (x & 123) ^ (x | 789) ^ (y & 456)
>>>
>>>     step 3:
>>>         for each operand E in the order of their symbolic value {
>>>           apply-rule1(E);
>>>           apply-rule-2/3/4 to E and its previous operand
>>>         }
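
The three steps can be illustrated with a simplified sketch in C. This is not the patch's implementation: the types XorOpnd and OpndKind and the functions simplify_xor and eval_xor are hypothetical names, symbolic values are reduced to integer ids, and, as a simplifying assumption, every "X | C" operand is first normalized with rule 1 (taking c2 = 0) and operands are then merged with rule 2 only. Rules 3 and 4 are covered indirectly, since each follows from rules 1 and 2.

```c
#include <stdint.h>
#include <stdlib.h>

typedef enum { OPND_AND, OPND_OR } OpndKind; /* "X & C" or "X | C" */

typedef struct {
    int      sym;  /* id of the symbolic value X (illustrative) */
    OpndKind kind;
    uint32_t c;    /* the constant C */
} XorOpnd;

static int cmp_by_sym(const void *a, const void *b) {
    const XorOpnd *pa = a, *pb = b;
    return (pa->sym > pb->sym) - (pa->sym < pb->sym);
}

/* Simplifies ops[0] ^ ... ^ ops[n-1] in place. Returns the new operand
 * count; *konst receives a constant that must be xor-ed into the result.
 * All surviving operands are of kind OPND_AND with distinct sym. */
size_t simplify_xor(XorOpnd *ops, size_t n, uint32_t *konst) {
    size_t out = 0;
    *konst = 0;
    qsort(ops, n, sizeof ops[0], cmp_by_sym); /* step 2: cluster by symbol */
    for (size_t i = 0; i < n; i++) {          /* step 3 */
        XorOpnd e = ops[i];
        if (e.kind == OPND_OR) {
            /* rule 1 with c2 = 0: (x | c) = (x & ~c) ^ c */
            *konst ^= e.c;
            e.c = ~e.c;
            e.kind = OPND_AND;
        }
        if (out > 0 && ops[out - 1].sym == e.sym) {
            /* rule 2: (x & a) ^ (x & b) = x & (a ^ b) */
            ops[out - 1].c ^= e.c;
            if (ops[out - 1].c == 0)
                out--; /* x & 0 contributes nothing */
        } else if (e.c != 0) {
            ops[out++] = e;
        }
    }
    return out;
}

/* Evaluates konst ^ ops[0] ^ ... ^ ops[n-1], where vals[sym] is the
 * concrete value of each symbolic value. Used only for checking. */
uint32_t eval_xor(const XorOpnd *ops, size_t n, const uint32_t *vals,
                  uint32_t konst) {
    uint32_t r = konst;
    for (size_t i = 0; i < n; i++) {
        uint32_t x = vals[ops[i].sym];
        r ^= (ops[i].kind == OPND_AND) ? (x & ops[i].c) : (x | ops[i].c);
    }
    return r;
}
```

For the example in step 2, (x & 123) ^ (y & 456) ^ (x | 789), this sketch leaves two operands, (x & (123 ^ ~789)) and (y & 456), plus the constant 789.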
>> <xor_reassociate.patch>