[llvm-commits] [llvm] r61297 - in /llvm/trunk: lib/Transforms/Scalar/SimplifyLibCalls.cpp test/Transforms/SimplifyLibCalls/2008-12-20-StrcmpMemcmp.ll
Török Edwin
edwintorok at gmail.com
Mon Dec 22 07:15:18 PST 2008
On 2008-12-21 10:41, Török Edwin wrote:
> On 2008-12-21 03:03, Nick Lewycky wrote:
>
>> Eli Friedman wrote:
>>
>>
>>> On Sat, Dec 20, 2008 at 4:38 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
>>>
>>>
>>>> On Sat, Dec 20, 2008 at 4:19 PM, Nick Lewycky <nicholas at mxc.ca> wrote:
>>>>
>>>>
>>>>> Author: nicholas
>>>>> Date: Sat Dec 20 18:19:21 2008
>>>>> New Revision: 61297
>>>>>
>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=61297&view=rev
>>>>> Log:
>>>>> Turn strcmp into memcmp, such as strcmp(P, "x") --> memcmp(P, "x", 2).
>>>>>
>>>>>
>>>> I'm pretty sure this isn't safe; take the following testcase:
>>>>
>>>> int foo(char* x) {return strcmp(x, "x", 2) == 0;}
>>>>
>>>>
>>> Oops, messed that up slightly; try the following:
>>> int foo(char* x) {return strcmp(x, "x") == 0;}
>>>
>>>
>> The pathological case is:
>>
>> char x[1] = "\0"; // pretend we don't know its length
>> char y[2] = "x\0"; // we know its length
>> return strcmp(x, y);
>>
>> If we transform it into
>>
>> return memcmp(x, y, 2)
>>
>> then the existing MemCmpOpt will exercise these cases:
>>
>> // memcmp(S1,S2,2) != 0 -> (*(short*)LHS ^ *(short*)RHS) != 0
>> // memcmp(S1,S2,4) != 0 -> (*(int*)LHS ^ *(int*)RHS) != 0
>>
>> resulting in a 2-byte load of x[1], which is illegal.
>>
>> Could someone comment on what part of this transform is wrong? I recall
>> having a discussion about it on IRC, but can't recall the details...
>>
>>
>
>
> You need to check the alignment.
> I think something like:
> MinAlignStr = Minimum Alignment of Str1P, Str2P (max 16)
> MinLen = Minimum length of Str1P and Str2P
> AlignMinLen = AlignOf(MinLen) (max 16)
> If (MinAlignStr >= AlingMinLen) -> transform is safe
>
I found the IRC discussion, we assumed that memcmp won't do a 2 byte
load if unaligned:
dwin: memcmp may do a 2 byte read
(11:12:15 PM) _sabre_: memcmp can't do a 2 byte read
(11:12:19 PM) danchr: sdt: oh, I didn't know memcmp stopped at first
difference
(11:12:19 PM) _sabre_: unless the pointers are aligned
(11:12:38 PM) _sabre_: or it knows it is safe some other way
(11:12:55 PM) baldrick: is that guaranteed?
(11:13:33 PM) edwin: for sure it can't transform strcmp(P,"1234\0") into
memcmp(P,"1234\0",5)
(11:13:39 PM) baldrick: or just what is expected by a quality
implementation :)
(11:13:58 PM) baldrick: edwin: why not?
(11:14:06 PM) edwin: because P may be 4 byte aligned
But try this with llvm-gcc:
#include <string.h>
int foo(const char *s)
{
return !memcmp(s,"x",2);
}
It produces:
foo:
.Leh_func_begin1:
.Llabel1:
cmpw $120, (%rdi)
sete %al
movzbl %al, %eax
ret
So is it a bug in MemCmp optimization?
Best regards,
--Edwin
More information about the llvm-commits
mailing list