[polly] r281052 - FlattenAlgo: Ensure we _really_ obtain a param space

Michael Kruse via llvm-commits llvm-commits at lists.llvm.org
Sat Sep 10 15:49:21 PDT 2016


The issue is that libPollyISL.a is loaded twice into the processes'
address space.

Adding the code

class X {
public:
  X() {  printf("isl_id_none loaded at 0x%" PRIXPTR "\n", &isl_id_none); }
} x;

reveals

isl_id_none loaded at 0x7F409D769620
isl_id_none loaded at 0x791880

(I am still wondering why they are loaded so far apart)
Cmake runs the following commands:

[3/6] : && /usr/bin/c++ [...]  -o lib/libPolly.so [...] lib/libPollyISL.a [...]

[5/6] : && /usr/bin/c++ [...]  -o
tools/polly/unittests/Flatten/FlattenTests  [...] lib/libPollyISL.a
[...]

That is, one instance in libPolly.so, another in FlattenTests. The ISL
commands in FlattenTest.cpp (eg. reading a set from string) will use a
different one than the algos FlattenAlgo.cpp, because they are linked
into different dynamic libraries.

At the moment I cannot think of an elegant solution. At very least
cmake should not propagate linking PollyISL into everything that also
links to Polly, ie. replacing

target_link_libraries(Polly PollyISL)

by

target_link_libraries(Polly PRIVATE PollyISL)

However, then we cannot run any ISL test anymore as eg. ISLTest.cpp is
not allowed to call any ISL function anymore. Possible solutions I can
currently think of:
- Disable these unittests with BUILD_SHARED_LIBS=ON
- Don't use -fvisibility=hidden
- Forward the ISL calls we need for testing through libPolly:
  isl_ctx *polly_isl_ctx_alloc() { return isl_ctx_alloc(); }


Michael




2016-09-10 7:59 GMT+02:00 Tobias Grosser <tobias at grosser.es>:
> On Sat, Sep 10, 2016, at 12:10 AM, Michael Kruse via llvm-commits wrote:
>> 2016-09-09 22:22 GMT+02:00 Tobias Grosser via llvm-commits
>> <llvm-commits at lists.llvm.org>:
>> > On Fri, Sep 9, 2016, at 06:50 PM, Tobias Grosser via llvm-commits wrote:
>> >> On Fri, Sep 9, 2016, at 06:37 PM, Michael Kruse via llvm-commits wrote:
>> >> > Thank you for finding a fix.
>> >> >
>> >> > This is somewhat strange. Isn't isl_union_map_get_space defined to
>> >> > return a parameter space?
>> >>
>> >> Right.
>> >>
>> >> > What is different on those machines where it
>> >> > happened?
>> >>
>> >> They are owned by me. Seriously, I have no idea. This is very strange. I
>> >> spent 30 minutes to look into this but was not able to find anything
>> >> useful. :(
>> >
>> > OK. I spent a couple of hours to look into this. No idea what is going
>> > on still. I was first suspecting the new C++ interface to incorrectly
>> > free/deallocate memory. However, I could not find any issue and valgrind
>> > also does not report anything.
>> >
>> > Then I just went to printf debugging. I attach you both my patch and the
>> > typescript.
>> >
>> > The interesting piece is this:
>> >
>> > checkFlatten - SPtr: 18302032
>> > checkFlatten - isParams :  is_params: space 18298960
>> >  a
>> >  b
>> > space->tuple_id[0] 8021840
>> > isl_id_none 8021840
>> >  c
>> >  d
>> > 1
>> > copy
>> > Take ownership: 0
>> > Move That2
>> > copy
>> > flattenSchedule - SchedulePtr: 18302032
>> > flattenSchedule - isParams:  is_params: space 18298960
>> >  a
>> >  b
>> > space->tuple_id[0] 8021840
>> > isl_id_none 140647671602904
>> >
>> > The schedule remains the same when moving into flattenSchedule and the
>> > space as well, even the id in space->tuple_id[0]. However, it seems the
>> > address of the global isl_id_none changes. I tried to run this in gdb
>> > and run 'watch  isl_id_none', but do not observe any change. I would
>> > expect this address to stay constant at all times. I feel as if I missed
>> > something very basic. Any idea what is going on?
>>
>> Thank you for your debugging efforts. This is as puzzling for me as for
>> you.
>>
>> If the code is not compiled using -fpic, "&isl_id_none" should turn
>> into a constant.
>>
>> 0115E1A6  push        offset isl_id_none (01441810h)
>> 0115E1AB  push        1426678h
>> 0115E1B0  call        _printf (0E6E9C5h)
>> 0115E1B5  add         esp,8
>> (MSVC 32-bit disassembly, debug build)
>>
>> With -fpic, it becomes relative to the instruction pointer (%rip)
>>
>>    0x00000000004e73b1 <+129>:   lea    0x2bd2e8(%rip),%rbp        #
>> 0x7a46a0 <isl_id_none>
>>    0x00000000004e73b8 <+136>:   lea    0x86d14(%rip),%rsi        #
>>    0x56e0d3
>>    0x00000000004e73bf <+143>:   xor    %eax,%eax
>>    0x00000000004e73c1 <+145>:   mov    $0x1,%edi
>>    0x00000000004e73c6 <+150>:   mov    %rbp,%rdx
>>    0x00000000004e73c9 <+153>:   callq  0x404730 <__printf_chk at plt>
>> (gdb 64-bit disassembly, release build)
>>
>>
>> Your new address 0x7FEB16811ED8 (140647671602904) would indicate a
>> location on the stack. With this code, I have no idea how this could
>> print different values in the middle of the program's execution
>> (except the loader decided to relocate the .text section while
>> running?!?)
>>
>> If you are interested into looking further into it, you could try to
>> compile without -fpic. Or debug with gdb until before it changes, step
>> forward while observing &isl_id_none, maybe even tracking it through
>> the registers. Does gdb even evaluate it to the same value as printf
>> prints? Btw, 'watch  isl_id_none' would observe isl_id_none's
>> contents, not its address.
>
> Hi Michael,
>
> I just set BUILD_SHARED_LIBS=OFF and the issue disappeared even for the
> original code without my (isl_space_params "fix"). Can you reproduce
> this on linux with BUILD_SHARED_LIBS=ON?
>
> Best,
> Tobias


More information about the llvm-commits mailing list