Warnings and compile-time failure on 458.sjeng

Steven Wu via llvm-commits llvm-commits at lists.llvm.org
Wed May 25 11:42:38 PDT 2016


> On May 25, 2016, at 11:24 AM, Vedant Kumar <vsk at apple.com> wrote:
> 
> Hi David,
> 
> We're seeing another issue that we think is related to the recent static VP node allocation changes. Instrumented programs are hitting a segfault:
> 
> 458.sjeng is fixed but 445.gobmk is broken.
> 
> ```
> LLVM Profile Warning: Unable to track new values: Running out of static counters.  Consider using option -mllvm -vp-counters-per-site=<n> to allocate more value profile counters at compile time.
> /Users/buildslave/jenkins/workspace/Performance_ARM64_SPEC2006_INT-O3_LTO_PGO-master/spec2006/cur_run/nt/build/LNTBased/speccpu2006/int/445.gobmk/tools/timeit-target: error: child terminated by signal 11
> ```
> 
> The backtrace is:
> 
> ```
> * frame #0: 0x0000000100112f10 445.gobmk.simple`__llvm_profile_instrument_target + 132 at InstrProfilingValue.c:137
>    frame #1: 0x000000010007d528 445.gobmk.simple`shapes_callback + 2352
>    frame #2: 0x0000000100035b0c 445.gobmk.simple`matchpat_loop + 2088
>    frame #3: 0x0000000100034b38 445.gobmk.simple`matchpat_goal_anchor + 1124
>    frame #4: 0x000000010007cb0c 445.gobmk.simple`shapes + 384
>    frame #5: 0x0000000100029e4c 445.gobmk.simple`do_genmove + 2344
>    frame #6: 0x00000001000a5a48 445.gobmk.simple`gtp_gg_genmove + 216
>    frame #7: 0x0000000100099a6c 445.gobmk.simple`gtp_main_loop + 660
>    frame #8: 0x000000010009bd00 445.gobmk.simple`main + 7188
> ```
> 
> In __llvm_profile_instrument_target(), it looks like we're failing the condition: CounterIndex < NumVSites.

I don't think these line numbers make sense when I read the assembly. The faulting line is more likely:
 ValueProfNode *CurrentVNode = ValueCounters[CounterIndex];
CurrentVNode is corrupted and points somewhere completely unreasonable. The program segfaults as soon as it gets dereferenced:
 if (TargetValue == CurrentVNode->Value) {
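
To make the failure mode concrete, here is a minimal sketch of a value-profile lookup over statically allocated nodes. It is not the code in InstrProfilingValue.c; it is a simplified model, and the names record_value, NodePool, NodesPerSite, PoolSize and NextFreeNode are hypothetical. It shows the two places the thread is pointing at: the CounterIndex < NumVSites check, and the dereference of CurrentVNode, which is only safe if every list head and Next pointer is either NULL or a valid node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for the runtime's node type (assumption, not compiler-rt). */
typedef struct ValueProfNode {
  uint64_t Value;
  uint64_t Count;
  struct ValueProfNode *Next;
} ValueProfNode;

/* Hypothetical sizes: 2 sites sharing a static pool of 4 nodes. */
enum { NumVSites = 2, NodesPerSite = 2, PoolSize = NumVSites * NodesPerSite };

static ValueProfNode *ValueCounters[NumVSites]; /* one list head per value site */
static ValueProfNode NodePool[PoolSize];        /* statically allocated nodes */
static size_t NextFreeNode;                     /* next unused slot in NodePool */

/* Record TargetValue at site CounterIndex.
 * Returns 1 if the value was counted, 0 if it was dropped. */
static int record_value(uint64_t TargetValue, uint32_t CounterIndex) {
  /* Guard the index before touching ValueCounters. */
  if (CounterIndex >= NumVSites)
    return 0;

  ValueProfNode *Prev = NULL;
  ValueProfNode *CurrentVNode = ValueCounters[CounterIndex];
  while (CurrentVNode) {
    /* If CurrentVNode were a garbage pointer, this read is where we would crash. */
    if (TargetValue == CurrentVNode->Value) {
      CurrentVNode->Count++;
      return 1;
    }
    Prev = CurrentVNode;
    CurrentVNode = CurrentVNode->Next;
  }

  /* Value not seen yet: take a node from the static pool, if any is left. */
  if (NextFreeNode >= PoolSize) {
    fprintf(stderr, "warning: out of static nodes, dropping value\n");
    return 0;
  }
  assert(NextFreeNode < PoolSize);
  ValueProfNode *New = &NodePool[NextFreeNode++];
  New->Value = TargetValue;
  New->Count = 1;
  New->Next = NULL;

  /* Append to the site's list so existing nodes stay reachable. */
  if (Prev)
    Prev->Next = New;
  else
    ValueCounters[CounterIndex] = New;
  return 1;
}
```

In this model, running out of nodes only drops new values and prints a warning. A segfault like the one in the backtrace would need a list head or Next pointer that is neither NULL nor a pointer into the pool, which is why I suspect corruption rather than the bounds check.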

Steven


> 
> Do you mind taking a look?
> 
> thanks,
> vedant
> 
> 
>> On May 23, 2016, at 12:29 PM, Xinliang David Li <davidxl at google.com> wrote:
>> 
>> Fix is on the way.
>> 
>> David
>> 
>> On Mon, May 23, 2016 at 12:27 PM, Vedant Kumar <vsk at apple.com> wrote:
>> 
>>> On May 23, 2016, at 12:19 PM, Xinliang David Li <davidxl at google.com> wrote:
>>> 
>>> 
>>> 
>>> On Mon, May 23, 2016 at 12:15 PM, Vedant Kumar <vsk at apple.com> wrote:
>>> Hi David,
>>> 
>>> I think one of the SPEC2006 tests doesn't have enough statically-allocated VP nodes per site. We're seeing:
>>> 
>>>> "Child terminated by signal 25" (SIGXFSZ) after:
>>> 
>>> Is this related?
>> 
>> I don't think so. I suspect that's happening because the device is overloaded with logging info.
>> 
>> 
>>>> 
>>>> LLVM Profile Warning: Running out of nodes: site_0 at func_12822962448227433604, value=4295054468
>>>> ...
>>>> LLVM Profile Warning: Running out of nodes: site_0 at func_12822962448227433604, value=4295052980
>>>> LLVM Profile Warning: Running out of nodes: site_0 at func_12822962/Users/buildslave/jenkins/workspace/Performance_ARM64_SPEC2006_INT-O3_LTO_PGO-master/spec2006/cur_run/nt/build/LNTBased/speccpu2006/int/458.sjeng
>>> 
>>> It seems like the fix for now is to either tweak vp-counters-per-site for the test or to set -vp-static-alloc=false.
>>> 
>>> In the long term, do you think it's worth adjusting vp-counters-per-site so that we can run SPEC without modifications? If so, is SPEC the right testbed?
>>> 
>>> Rong helped collect SPEC-related statistics, but it looks like something is missing. I will investigate. In the meantime, can you try the workaround?
>> 
>> Yes, we'll do that.
>> 
>> Here's all the logging info from our bot:
>> 
>> 
>> 
>> 
>> 
>> thanks,
>> vedant
>> 
>> 
>>> 
>>> thanks,
>>> 
>>> David
>>> 
>>> 
>>> thanks,
>>> vedant
>>> 
>> 
>> 
>> 
> 


