[PATCH] Xor reassociation
Peter Cooper
peter_cooper at apple.com
Fri Mar 29 18:24:15 PDT 2013
Hi Shuxin
At first look I couldn't see a reason for rule 1, as it doesn't reduce the number of instructions. Is it there to canonicalize things, or did you find the & exposed more opportunities to remove unused bits from expressions? Just curious.
My only comment on the patch is that I couldn't see a need for the copy constructor. Otherwise LGTM.
Thanks
Pete
Sent from my iPhone
On Mar 29, 2013, at 5:37 PM, Shuxin Yang <shuxin.llvm at gmail.com> wrote:
> Sorry, I forgot to add the "RUN:" command to the test *.ll file.
> Updated patch is attached to this mail.
>
> On 3/29/13 5:17 PM, Shuxin Yang wrote:
>> Hi, There:
>>
>> The attached patch is about xor reassociation. It is based on the following rules:
>>
>> rule 1: (x | c1) ^ c2 => (x & ~c1) ^ (c1^c2)
>> rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>> rule 3: (x | c1) ^ (x | c2) = (x & c3) ^ c3 where c3 = c1 ^ c2
>> rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>>
>> This change reduces the code size (in terms of # of bitcode instructions) of an application
>> by 8.9% (64673 vs 58893). I have not had a chance to measure the performance/code-size impact on
>> other benchmarks. So far the testing has only focused on correctness.
>>
>> Thank you for the code review.
>> Shuxin
>>
>>
>> Proof
>> =====
>> Let X[i] be the i-th bit of x, counting from the least-significant bit, zero-based,
>> and let E[i] be the i-th bit of the expression being evaluated.
>>
>> rule 1: (x | c1) ^ c2
>> = (x | c1) ^ (c1 ^ c1) ^ c2
>> = ((x | c1) ^ c1) ^ (c1^c2)
>> = (x & ~c1) ^ (c1^c2)
>>
>> rule 2: (x & c1) ^ (x & c2) = (x & (c1^c2))
>> Divide the bits into 3 disjoint classes:
>> 1) Those bits corresponding to the "1"s in "c1 & c2"
>> E[i] = X[i] ^ X[i] = 0
>> 2) Those bits corresponding to the "1"s in "c1 ^ c2"
>> E[i] = X[i] ^ 0 = X[i]
>> 3) The remaining bits
>> E[i] = 0 ^ 0 = 0
>>
>> Combining the cases above, we have ...
>> Rule 3 can be proved in a similar way.
>>
>> rule 4: (x | c1) ^ (x & c2) => (x & c3) ^ c1, where c3 = ~c1 ^ c2
>> (x|c1) ^ (x&c2)
>> = (x|c1) ^ (x&c2) ^ (c1 ^ c1)
>> = ((x|c1) ^ c1) ^ (x & c2) ^ c1
>> = (x & ~c1) ^ (x & c2) ^ c1 // rule 1
>> = (x & c3) ^ c1, where c3 = ~c1 ^ c2 // rule 2
>>
>> To verify the correctness, I wrote a small C program, running X from 0 all the way
>> to (unsigned)-1, comparing the expr value with and without this opt.
>> See the attached offline-test.tar.gz.
>>
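
A minimal sketch of the kind of exhaustive check described above. This is not the contents of offline-test.tar.gz; the function name count_mismatches and the max_x parameter are assumptions made for illustration. Passing UINT32_MAX as max_x sweeps every 32-bit x, as the description does, while a smaller bound gives a quick spot check.

```c
#include <assert.h>   /* assert, for the caller's check */
#include <stdint.h>

/* Hypothetical helper: counts the values of x in [0, max_x] for which any of
   the four rewrites disagrees with the original expression. */
static uint64_t count_mismatches(uint32_t c1, uint32_t c2, uint32_t max_x) {
    uint64_t bad = 0;
    /* 64-bit counter so the loop terminates even when max_x == UINT32_MAX. */
    for (uint64_t i = 0; i <= max_x; i++) {
        uint32_t x = (uint32_t)i;
        int ok =
            ((x | c1) ^ c2)       == ((x & ~c1) ^ (c1 ^ c2))             /* rule 1 */
         && ((x & c1) ^ (x & c2)) == (x & (c1 ^ c2))                     /* rule 2 */
         && ((x | c1) ^ (x | c2)) == ((x & (c1 ^ c2)) ^ (c1 ^ c2))       /* rule 3 */
         && ((x | c1) ^ (x & c2)) == ((x & (~c1 ^ c2)) ^ c1);            /* rule 4 */
        bad += !ok;
    }
    return bad;
}
```

A caller would typically write assert(count_mismatches(c1, c2, UINT32_MAX) == 0) for a few choices of c1 and c2. Since the rules are bitwise identities, a zero count is expected for any pair of constants.
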
>> Algorithm
>> =========
>> step 1: classify xor operands into two categories:
>> c1: "X & C", (C!= 0)
>> c2: "X | C",
>> The opnd would be an or-expr with non-zero operands, or
>> any other expr, e.g. V = x*y can be viewed as "V | 0".
>>
>> step 2: sort operands such that operands sharing the same symbolic
>> value cluster together. e.g.
>> (x & 123) ^ (y & 456) ^ (x | 789)
>> sort into:
>> (x & 123) ^ (x | 789) ^ (y & 456)
>>
>> step 3:
>> for each operand E in the order of their symbolic value {
>> apply-rule1(E);
>> apply-rule-2/3/4 to E and its previous operand
>> }
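
The three steps above can be sketched as follows. This is a simplified illustration and not the code in the patch: the XorOpnd type, the integer symbol ids, and the simplify_xor function are all hypothetical. It also takes a shortcut by applying rule 1 to every or-operand and then merging neighbours with rule 2 only; rules 3 and 4 fall out as compositions of those two.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical operand model (step 1): "sym & c" or "sym | c",
   where sym is an integer id standing in for the symbolic value. */
typedef enum { OP_AND, OP_OR } OpKind;
typedef struct { int sym; OpKind kind; uint32_t c; } XorOpnd;

static int by_symbol(const void *a, const void *b) {
    const XorOpnd *x = a, *y = b;
    return (x->sym > y->sym) - (x->sym < y->sym);
}

/* Rewrites ops[0..n) in place into at most one "sym & mask" per symbol.
   Returns the new operand count and stores the folded constant in *konst,
   so the whole xor equals (xor of the remaining operands) ^ *konst. */
static size_t simplify_xor(XorOpnd *ops, size_t n, uint32_t *konst) {
    size_t out = 0;
    assert(konst != NULL);
    *konst = 0;
    qsort(ops, n, sizeof *ops, by_symbol);      /* step 2: cluster by symbol */
    for (size_t i = 0; i < n; i++) {            /* step 3 */
        XorOpnd e = ops[i];
        if (e.kind == OP_OR) {                  /* rule 1: (x|c) = (x&~c) ^ c */
            *konst ^= e.c;
            e.c = ~e.c;
            e.kind = OP_AND;
        }
        if (out > 0 && ops[out - 1].sym == e.sym)
            ops[out - 1].c ^= e.c;              /* rule 2: merge with previous */
        else
            ops[out++] = e;
    }
    return out;
}
```

For the example in step 2, (x & 123) ^ (y & 456) ^ (x | 789), this sketch leaves two operands, x & (123 ^ ~789) and y & 456, plus the folded constant 789. It does not drop operands whose merged mask becomes zero; a real implementation would presumably do so.
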
>
> <xor_reassociate.patch>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits