[cfe-users] how clang merge strings in .rodata section

Hans Wennborg via cfe-users cfe-users at lists.llvm.org
Fri Jul 6 02:01:25 PDT 2018


On Fri, Jul 6, 2018 at 10:22 AM, Jian, Xu <Xu.Jian at dell.com> wrote:
> Hi Hans,
> We need to check whether the ELF files produced by two builds are identical.
> String merging makes that comparison unreliable.
>
> For example, consider the following lines of code (which may be in different files):
> ---------------------------------------------------------------
> const char *s_array[1] = {"s"};
> const char *first_s = "this first bigger s";
> const char *second_s = "this second bigger s";
> ---------------------------------------------------------------
>
> After clang builds the ELF, the pointer stored in s_array sometimes holds the address of the tail of first_s in the .rodata section, and sometimes that of second_s.
> This leads to a diff in the .data section, since s_array is placed there.
> The ELF files differ, although nothing has changed functionally.

Did the inputs change? If Clang is sometimes using the tail of first_s
and sometimes second_s, for the same input, that's a bug. The
compilation should be deterministic.

Can you provide sample input files and command lines that show this problem?
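
As a starting point for such a test case, here is a minimal sketch that
reports which string the pointer in s_array ends up inside. The helper
names points_into and merge_target are hypothetical, the s_array
initializer is braced so that it compiles, and the result depends on the
toolchain and flags, so treat it as a probe rather than a statement of what
clang does:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Globals from the example in this thread. */
const char *s_array[1] = {"s"};
const char *first_s = "this first bigger s";
const char *second_s = "this second bigger s";

/* Hypothetical helper: returns 1 if p points inside the storage of the
 * NUL-terminated string s (terminator included), 0 otherwise. Addresses
 * are compared as integers so that unrelated pointers are never compared
 * with relational operators directly. */
int points_into(const char *s, const char *p)
{
    assert(s != NULL && p != NULL);
    uintptr_t lo = (uintptr_t)s;
    uintptr_t hi = lo + strlen(s);   /* address of the terminating NUL */
    uintptr_t x = (uintptr_t)p;
    return x >= lo && x <= hi;
}

/* Hypothetical helper: names the string whose storage s_array[0] shares,
 * or "separate" if "s" was given storage of its own. */
const char *merge_target(void)
{
    if (points_into(first_s, s_array[0]))
        return "first_s";
    if (points_into(second_s, s_array[0]))
        return "second_s";
    return "separate";
}
```

Printing merge_target() from main in each of the two builds would show
whether the merge target really changes for identical inputs.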

Thanks,
Hans


> -----Original Message-----
> From: hwennborg at google.com [mailto:hwennborg at google.com] On Behalf Of Hans Wennborg
> Sent: Friday, July 6, 2018 3:54 PM
> To: Jian, Xu
> Cc: cfe-users at lists.llvm.org
> Subject: Re: [cfe-users] how clang merge strings in .rodata section
>
> On Thu, Jul 5, 2018 at 3:18 AM, Jian, Xu via cfe-users <cfe-users at lists.llvm.org> wrote:
>> Hi,
>>
>> Consider the following C source file, abc.c:
>>
>> #include <stdio.h>
>>
>> int g_val=10;
>> const char *g_str="abc";
>> const char *g_str1="c";
>>
>> int main(void)
>> {
>>     printf("%s %s: %d\n",g_str,g_str1,g_val);
>>     return 0;
>> }
>>
>> When it is compiled with “clang abc.c -o abc”, dumping the .rodata section gives:
>>
>> # readelf -p .rodata abc
>>
>> String dump of section '.rodata':
>>   [     0]  abc
>>   [     4]  %s %s: %d
>>
>> When it is compiled with “gcc abc.c -o abc”, dumping the .rodata section gives:
>>
>> $ readelf -p .rodata abc
>>
>> String dump of section '.rodata':
>>   [    10]  abc
>>   [    14]  c
>>   [    16]  %s %s: %d^J
>>
>> clang is able to merge a short string (“c”) into the tail of a longer
>> string (“abc”), while gcc does not.
>>
>> Does anybody know how to disable this behavior (to make clang behave like gcc)?
>
> I don't think there is a way to disable it.
>
> Why do you want to disable this behaviour?
>
>  - Hans
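
For readers following the thread, the tail-merge condition discussed above
can be sketched in a few lines of C: a short string can share storage with
a longer one only if it is a suffix of it, terminator included. The function
name tail_offset is hypothetical, and this only illustrates the condition,
not how clang or any linker implements the merge:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if short_s is a suffix of long_s, return the byte
 * offset inside long_s at which short_s could start when the two share
 * storage; otherwise return -1. */
long tail_offset(const char *long_s, const char *short_s)
{
    assert(long_s != NULL && short_s != NULL);
    size_t n = strlen(long_s);
    size_t m = strlen(short_s);
    if (m > n)
        return -1;
    size_t off = n - m;
    /* Compare m + 1 bytes so that the terminating NUL must match too. */
    return memcmp(long_s + off, short_s, m + 1) == 0 ? (long)off : -1;
}
```

With the strings from the original question, tail_offset("abc", "c") is 2,
which is consistent with the clang dump listing only abc at offset 0 and
the format string at offset 4. Likewise, "s" is a suffix of both
"this first bigger s" and "this second bigger s", so either one is a valid
merge target.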


