[cfe-users] how clang merge strings in .rodata section

Jian, Xu via cfe-users cfe-users at lists.llvm.org
Tue Jul 10 02:26:18 PDT 2018


Hi Hans,
Thank you very much for your support.
It should not be a clang problem.

The problem is that variable strings (the build date and the build host) are embedded in the ELF.
In zile/src/help.c:
      DEFUN ("zile-version", zile_version)
      /*+
      Show the zile version.
      +*/
      {
        minibuf_write ("Zile " VERSION " of " CONFIGURE_DATE " on " CONFIGURE_HOST);

        return TRUE;
      }
      END_DEFUN

This results in a .rodata diff between two builds:
      ***************
      *** 1 ****
      !   [  1d1a]  Zile 2.2.59 of Wed Nov 01 2017 on host-10
      --- 1 ----
      !   [  1d1a]  Zile 2.2.59 of Wed Jul 04 2018 on host-04
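
To illustrate why the whole banner shows up as a single .rodata entry: adjacent string literals are concatenated at compile time, so the date and host become part of one literal. This is a minimal sketch, assuming the three macros are normally supplied by the build system (the values below are copied from the second build in the diff above; zile_banner and same_banner are hypothetical helpers, not part of zile):

```c
#include <string.h>

/* Assumed stand-ins for values the build system would normally define. */
#define VERSION        "2.2.59"
#define CONFIGURE_DATE "Wed Jul 04 2018"
#define CONFIGURE_HOST "host-04"

/* Hypothetical helper: returns the same literal that help.c passes to
   minibuf_write. The compiler joins the adjacent literals into one string. */
static const char *zile_banner(void)
{
    return "Zile " VERSION " of " CONFIGURE_DATE " on " CONFIGURE_HOST;
}

/* Hypothetical helper: returns 1 if another build's banner has the same text. */
static int same_banner(const char *other)
{
    return strcmp(zile_banner(), other) == 0;
}
```

Any change to the date or host macros changes the text of this one literal, which is what the .rodata diff shows.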

"4" is a constant string defined in source code:
In zile/src/variables.c:
      /*
      * Default variables values table.
      */
      static struct var_entry
      {
       char *var;            /* Variable name. */
       char *val;            /* Default value. */
       int local;            /* If true, becomes local when set. */
      } def_vars[] =
      {
      #define X(var, val, local, doc) { var, val, local },
      #include "tbl_vars.h"
      #undef X
      };
In zile/src/tbl_vars.h:
      X ("standard-indent", "4", FALSE, "\

In one build, "4" points at the end of " \F4"; in the other build, it points at the end of "Zile 2.2.59 of Wed Jul 04 2018 on host-04". Because def_vars stores that address, the ELF .data section differs after linking as well.

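The condition behind this can be sketched in C. This is only an illustration under stated assumptions, not how the toolchain implements merging: is_suffix and shares_tail are hypothetical helper names, and the integer-based pointer comparison assumes a flat address space.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the text of `tail` is a suffix of the text of `s`.
   Only in that case can `tail` be stored as the end of `s`. */
static int is_suffix(const char *s, const char *tail)
{
    size_t ls = strlen(s);
    size_t lt = strlen(tail);
    return lt <= ls && strcmp(s + (ls - lt), tail) == 0;
}

/* Returns 1 if `tail` points inside the storage of `s` (terminating NUL
   included), meaning both strings end at the same NUL byte. */
static int shares_tail(const char *s, const char *tail)
{
    uintptr_t begin = (uintptr_t)s;
    uintptr_t end = begin + strlen(s);   /* address of the NUL of s */
    uintptr_t t = (uintptr_t)tail;
    return t >= begin && t <= end;
}
```

With the two banners from the diff, is_suffix(banner, "4") holds only for the one ending in "host-04", so only that build has the banner available as a place for "4" to point into.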
-----Original Message-----
From: hwennborg at google.com [mailto:hwennborg at google.com] On Behalf Of Hans Wennborg
Sent: Friday, July 6, 2018 5:01 PM
To: Jian, Xu
Cc: cfe-users at lists.llvm.org
Subject: Re: [cfe-users] how clang merge strings in .rodata section

On Fri, Jul 6, 2018 at 10:22 AM, Jian, Xu <Xu.Jian at dell.com> wrote:
> Hi Hans,
> We need to compare whether ELF files of two builds are identical.
> Because of string merge, the comparison has some trouble.
>
> For example, take the following lines of code (which may be in different files):
> ---------------------------------------------------------------
> const char *s_array[1] = {"s"};
> const char *first_s = "this first bigger s";
> const char *second_s = "this second bigger s";
> ---------------------------------------------------------------
>
> After clang builds the ELF, the s_array entry sometimes contains the address of the tail of first_s in the .rodata section, and sometimes that of second_s.
> This leads to a .data section diff, since s_array is stored there.
> The ELF files differ, although nothing has changed from a functional point of view.

Did the inputs change? If Clang is sometimes using the tail of first_s and sometimes second_s, for the same input, that's a bug. The compilation should be deterministic.

Can you provide sample input files and command lines that show this problem?

Thanks,
Hans


> -----Original Message-----
> From: hwennborg at google.com [mailto:hwennborg at google.com] On Behalf Of Hans Wennborg
> Sent: Friday, July 6, 2018 3:54 PM
> To: Jian, Xu
> Cc: cfe-users at lists.llvm.org
> Subject: Re: [cfe-users] how clang merge strings in .rodata section
>
> On Thu, Jul 5, 2018 at 3:18 AM, Jian, Xu via cfe-users <cfe-users at lists.llvm.org> wrote:
>> Hi,
>>
>> Given the following C source file, abc.c:
>>
>> #include <stdio.h>
>>
>> int g_val=10;
>> const char *g_str="abc";
>> const char *g_str1="c";
>>
>> int main(void)
>> {
>>     printf("%s %s: %d\n",g_str,g_str1,g_val);
>>     return 0;
>> }
>>
>> When it is compiled with “clang abc.c -o abc”, dumping the .rodata section gives:
>>
>> # readelf -p .rodata abc
>>
>> String dump of section '.rodata':
>>   [     0]  abc
>>   [     4]  %s %s: %d
>>
>> When it is compiled with “gcc abc.c -o abc”, dumping the .rodata section gives:
>>
>> $ readelf -p .rodata abc
>>
>> String dump of section '.rodata':
>>   [    10]  abc
>>   [    14]  c
>>   [    16]  %s %s: %d^J
>>
>> clang is able to merge a short string (“c”) into the tail of a longer
>> string (“abc”), while gcc does not.
>>
>> Does anybody know how to disable this behavior (to make it behave like gcc)?
>
> I don't think there is a way to disable it.
>
> Why do you want to disable this behaviour?
>
>  - Hans
