[PATCH] Xor reassociation
Shuxin Yang
shuxin.llvm at gmail.com
Fri Mar 29 19:07:54 PDT 2013
Hi, Pete:
Thank you so much for the quick response.
The purpose of rule 1 is twofold:
1) when c1 = c2, (x|c1) ^ c2 => x & ~c1; this is actually the only case I catch.
2) it is used to prove rule 3
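
A minimal sketch of that special case, for illustration only (it is not code from the patch, and the function names are made up): when both constants are equal, the trailing constant c1^c2 is zero, so rule 1 leaves a single "and".

```c
#include <assert.h>
#include <stdint.h>

/* Rule 1, left-hand side: the expression as originally written. */
uint32_t rule1_lhs(uint32_t x, uint32_t c1, uint32_t c2) {
    return (x | c1) ^ c2;
}

/* Rule 1, right-hand side: (x & ~c1) ^ (c1 ^ c2). */
uint32_t rule1_rhs(uint32_t x, uint32_t c1, uint32_t c2) {
    return (x & ~c1) ^ (c1 ^ c2);
}

/* Hypothetical helper: with c1 == c2 == c the trailing xor constant is 0,
   so both sides must equal x & ~c. Checked on a sample of x values. */
void check_rule1_collapse(uint32_t c) {
    for (uint32_t i = 0; i < (1u << 16); ++i) {
        uint32_t x = i * 0x9e3779b9u;  /* spread the samples over 32 bits */
        assert(rule1_lhs(x, c, c) == rule1_rhs(x, c, c));
        assert(rule1_rhs(x, c, c) == (x & ~c));
    }
}
```

Calling check_rule1_collapse(0x00ff00f0u) from a main() runs the check; any mismatch would trip an assert.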
As for the copy constructor: the class has a non-plain-old-data member
but no copy constructor. That is OK for now, but it could become a
problem in the future.
Thank you again! Have a nice weekend!
Shuxin
On 3/29/13 6:24 PM, Peter Cooper wrote:
> Hi Shuxin
>
> At first look I couldn't see a reason for rule 1, as it doesn't reduce the number of instructions. Is it there to canonicalize things, or did you find the & exposed more opportunities to remove unused bits from expressions? Just curious.
>
> My only comment on the patch is that I couldn't see a need for the copy constructor. Otherwise LGTM.
>
> Thanks
> Pete
>
> Sent from my iPhone
>
> On Mar 29, 2013, at 5:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
>
>> Sorry, I forgot to add the "RUN:" command to the test *.ll file.
>> Updated patch is attached to this mail.
>>
>> On 3/29/13 5:17 PM, Shuxin Yang wrote:
>>> Hi, There:
>>>
>>> The attached patch is about xor-reassociation. It is based on the following rules:
>>>
>>> rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2)
>>> rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>>> rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
>>> rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>>>
>>> This change reduces the code size (in terms of # of bitcode instructions) of an application
>>> by 8.9% (64673 vs 58893). I have not had a chance to measure the performance/code-size impact on
>>> other benchmarks. So far the testing has focused only on correctness.
>>>
>>> Thank you for code review.
>>> Shuxin
>>>
>>>
>>> Proof
>>> =====
>>> Let X[i] be the i-th bit, counting from the least-significant-bit, zero-based.
>>>
>>> rule 1: (x | c1) ^ c2
>>> = (x | c1) ^ (c1 ^ c1) ^ c2
>>> = ((x | c1) ^ c1) ^ (c1^c2)
>>> = (x & ~c1) ^ (c1^c2)
>>>
>>> rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>>> Let E[i] be the i-th bit of the left-hand side.
>>> Divide the bits into 3 disjoint classes:
>>> 1) Those bits corresponding to the "1"s in "c1 & c2":
>>> E[i] = X[i] ^ X[i] = 0
>>> 2) Those bits corresponding to the "1"s in "c1 ^ c2":
>>> E[i] = X[i] ^ 0 = X[i]
>>> 3) For the remaining bits:
>>> E[i] = 0 ^ 0 = 0
>>>
>>> Combining the above, we have ...
>>> rule 3 can be proved in a similar way.
>>>
>>> rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>>> (x|c1) ^ (x&c2)
>>> = (x|c1) ^ (x&c2) ^ (c1 ^ c1)
>>> = ((x|c1) ^ c1) ^ (x & c2) ^ c1
>>> = (x & ~c1) ^ (x & c2) ^ c1 // rule 1 with c2 = c1
>>> = (x & c3) ^ c1, where c3 = ~c1 ^ c2 // rule 2
>>>
>>> To verify the correctness, I wrote a small C program that runs x from 0 all the way
>>> to (unsigned)-1, comparing the expression value with and without this optimization.
>>> See the attached offline-test.tar.gz.
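
For readers without the attachment, a check in the same spirit can be sketched as follows. This is not the contents of offline-test.tar.gz; it is a hypothetical variant that relies on one assumption worth stating: all four rules are purely bitwise, so each bit position is independent and an exhaustive sweep over 8-bit values of x, c1 and c2 covers every per-bit combination. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Counts the (x, c1, c2) triples, over all 8-bit values, for which any of
   the four rules fails. A return value of 0 means every rule held. */
unsigned long count_rule_mismatches(void) {
    unsigned long bad = 0;
    for (unsigned x = 0; x < 256; ++x)
        for (unsigned c1 = 0; c1 < 256; ++c1)
            for (unsigned c2 = 0; c2 < 256; ++c2) {
                unsigned c3;

                /* rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1 ^ c2) */
                if ((uint8_t)((x | c1) ^ c2) !=
                    (uint8_t)((x & ~c1) ^ (c1 ^ c2)))
                    ++bad;

                /* rule 2: (x & c1) ^ (x & c2) = x & (c1 ^ c2) */
                if ((uint8_t)((x & c1) ^ (x & c2)) !=
                    (uint8_t)(x & (c1 ^ c2)))
                    ++bad;

                /* rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3, c3 = c1 ^ c2 */
                c3 = c1 ^ c2;
                if ((uint8_t)((x | c1) ^ (x | c2)) !=
                    (uint8_t)((x & c3) ^ c3))
                    ++bad;

                /* rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, c3 = ~c1 ^ c2 */
                c3 = ~c1 ^ c2;
                if ((uint8_t)((x | c1) ^ (x & c2)) !=
                    (uint8_t)((x & c3) ^ c1))
                    ++bad;
            }
    return bad;
}

/* Convenience wrapper: aborts if any rule fails. */
void check_all_rules(void) {
    assert(count_rule_mismatches() == 0);
}
```

The sweep is about 16.7 million iterations, so it finishes quickly, unlike a full 32-bit loop over x for every pair of constants.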
>>>
>>> Algorithm
>>> =========
>>> step 1: classify xor operands into two categories:
>>> c1: "X & C", (C != 0)
>>> c2: "X | C",
>>> The operand would be an or-expr with non-zero operands, or
>>> any other expr, e.g. V = x*y can be viewed as "V | 0".
>>>
>>> step 2: sort operands such that operands sharing the same symbolic
>>> value cluster together. e.g.
>>> (x & 123) ^ (y & 456) ^ (x | 789)
>>> sort into:
>>> (x & 123) ^ (x | 789) ^ (y & 456)
>>>
>>> step 3:
>>> for each operand E in the order of their symbolic value {
>>> apply-rule1(E);
>>> apply-rule-2/3/4 to E and its previous operand
>>> }
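
Steps 1 and 2 above can be sketched in C as follows. This is only an illustration of the clustering idea, not the data structure used in the patch: the struct, the enum, and the integer ids standing in for symbolic values are all hypothetical. Because qsort is not guaranteed to be stable, the original position is used as a tie-breaker so that operands sharing a symbolic value keep their relative order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical operand record for "sym & c" or "sym | c". */
enum opnd_kind { OPND_AND, OPND_OR };

struct xor_opnd {
    int sym;              /* id of the symbolic value, e.g. x = 0, y = 1 */
    enum opnd_kind kind;  /* category from step 1 */
    uint32_t c;           /* constant part */
    int pos;              /* original position, filled in before sorting */
};

/* Order by symbolic value first, then by original position. */
static int cmp_opnd(const void *a, const void *b) {
    const struct xor_opnd *l = a;
    const struct xor_opnd *r = b;
    if (l->sym != r->sym)
        return l->sym < r->sym ? -1 : 1;
    return (l->pos > r->pos) - (l->pos < r->pos);
}

/* Step 2: cluster operands that share the same symbolic value. */
void sort_xor_opnds(struct xor_opnd *ops, size_t n) {
    assert(ops != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        ops[i].pos = (int)i;
    qsort(ops, n, sizeof ops[0], cmp_opnd);
}
```

With x = 0 and y = 1, the operands of (x & 123) ^ (y & 456) ^ (x | 789) come out in the order (x & 123), (x | 789), (y & 456), matching the example in step 2. Reordering is safe because xor is commutative and associative.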
>> <xor_reassociate.patch>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits