[llvm-dev] LLJIT: __{math}_finite symbols not resolved ?

Lang Hames via llvm-dev llvm-dev at lists.llvm.org
Fri Oct 9 17:09:18 PDT 2020


Hi Jean-Michaël,

Sorry -- I misread your email earlier:

 When run from *within the container* it works.


Ahh -- I should be looking for success here. I see why the failure is
happening: the testcase doesn't check its Error / Expected<T> return values.
You can't ignore those: their destructors call abort if the value was never
checked. For the purpose of writing minimal tests you can always wrap
your calls in 'cantFail'. E.g.:

auto JIT_e = cantFail(llvm::orc::LLJITBuilder().create());

That will strip Expected<T> / Error return types (to T/void), asserting
that the value is success in each case.
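
To make the "must be checked" behaviour concrete, here is a toy model that
uses only the standard library. It is not llvm::Expected<T> or
llvm::cantFail: CheckedResult, cantFailToy, parsePositive and demo are
hypothetical names invented for this sketch, and the real LLVM classes differ
in detail. It only imitates the two points above: a result that aborts in its
destructor if nobody looked at it, and a helper that asserts success and
unwraps the value.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <utility>

// Toy stand-in for a "must be checked" result. NOT llvm::Expected<T>.
template <typename T>
class CheckedResult {
public:
  static CheckedResult success(T value) {
    return CheckedResult(std::move(value), std::string(), true);
  }
  static CheckedResult failure(std::string message) {
    return CheckedResult(T(), std::move(message), false);
  }

  CheckedResult(CheckedResult &&other)
      : value_(std::move(other.value_)), message_(std::move(other.message_)),
        ok_(other.ok_), checked_(other.checked_) {
    other.checked_ = true; // a moved-from object no longer needs checking
  }
  CheckedResult(const CheckedResult &) = delete;
  CheckedResult &operator=(const CheckedResult &) = delete;

  // Dropping a result that was never tested aborts the program.
  ~CheckedResult() {
    if (!checked_) {
      std::fputs("result was dropped without being checked\n", stderr);
      std::abort();
    }
  }

  // Testing the result marks it as checked.
  explicit operator bool() {
    checked_ = true;
    return ok_;
  }
  T &operator*() { return value_; }
  const std::string &message() const { return message_; }

private:
  CheckedResult(T value, std::string message, bool ok)
      : value_(std::move(value)), message_(std::move(message)), ok_(ok),
        checked_(false) {}

  T value_;
  std::string message_;
  bool ok_;
  bool checked_;
};

// Toy counterpart of cantFail: abort on failure, otherwise unwrap to T.
template <typename T>
T cantFailToy(CheckedResult<T> result) {
  if (!result) {
    std::fprintf(stderr, "unexpected failure: %s\n", result.message().c_str());
    std::abort();
  }
  return std::move(*result);
}

// Hypothetical fallible operation, used only for this illustration.
CheckedResult<int> parsePositive(int x) {
  if (x > 0)
    return CheckedResult<int>::success(x);
  return CheckedResult<int>::failure("value must be positive");
}

int demo() {
  // Explicit check: test the result before using it.
  CheckedResult<int> r = parsePositive(7);
  if (!r)
    return -1;
  // Shortcut for minimal tests: assert success and unwrap in one step.
  return *r + cantFailToy(parsePositive(35));
}
```

Calling parsePositive(7) and discarding the result without testing it would
hit the abort in the destructor, which is the kind of failure described above.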

Once I make those changes I'm seeing the test pass in my Arch Linux
container. Could you share the bitcode that is failing for you? That will
help me pin down where things are going off the rails (or failing to go off
the rails) with my setup.

-- Lang.

On Fri, Oct 9, 2020 at 3:11 PM Lang Hames <lhames at gmail.com> wrote:

> Hi Jean-Michaël,
>
> Sorry for the delayed reply -- The dev meeting kept me pretty busy the
> last couple of days.
>
> When I run your repro.sh script (thank you for taking the time to add all
> the container config code by the way, that's very helpful!) I see:
>
> % ./repro.sh
>  -- Starting docker container --
>
> Using default tag: latest
> latest: Pulling from ossia/score-package-linux
> Digest: sha256:aca4c255d4d5a6926e5cd4f50a1a57f6e262a3c931efaaf94a62066784e5424c
> Status: Image is up to date for ossia/score-package-linux:latest
> docker.io/ossia/score-package-linux:latest
>  -- Compiling example to bc --
>
>  -- Building --
>
> -- The CXX compiler identification is Clang 11.0.0
> -- Detecting CXX compiler ABI info
> -- Detecting CXX compiler ABI info - done
> -- Check for working CXX compiler: /opt/score-sdk/llvm/bin/clang++ - skipped
> -- Detecting CXX compile features
> -- Detecting CXX compile features - done
> -- Found LLVM 11.0.0
> -- Using LLVMConfig.cmake in: /opt/score-sdk/llvm/lib/cmake/llvm
> -- Configuring done
> -- Generating done
> -- Build files have been written to: /repro/build
> Scanning dependencies of target repro_ffastmath_llvm
> [ 50%] Building CXX object CMakeFiles/repro_ffastmath_llvm.dir/main.cpp.o
> [100%] Linking CXX executable repro_ffastmath_llvm
> [100%] Built target repro_ffastmath_llvm
>  -- Running from within the container : ok
>
> /repro/build.sh: line 48:    45 Illegal instruction ./repro_ffastmath_llvm
>  -- Leaving container --
>
>  -- Running from within the host system fails:
>
> ./repro.sh: line 10: ./repro_ffastmath_llvm: cannot execute binary file
>
> ---
>
> Is that the same failure that you're seeing for this reproduction case? I
> was expecting "Symbols not found: [ __log_finite, __exp2_finite ]".
>
> -- Lang.
>
> On Mon, Oct 5, 2020 at 9:23 PM Lang Hames <lhames at gmail.com> wrote:
>
>> Hi Jean-Michaël,
>>
>> Thanks very much for the reproduction case. I'll try this out tomorrow.
>>
>> If you can't take the address of __<F>_finite, then what about defining
>> your own <F>_wrapper functions and using their addresses when defining the
>> absolute symbols? I'm not sure what the performance implications would be
>> for your use-case though.
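
A minimal sketch of that wrapper idea, using only the standard library. The
names log_wrapper, exp2_wrapper, toAddress, AbsoluteEntry and finiteMathTable
are hypothetical; the thread only suggests "<F>_wrapper". The sketch builds a
table that maps the names the JIT'd code asks for to the addresses of ordinary
host functions. Each entry would then presumably be passed to absoluteSymbols
in the same way as in the earlier snippet quoted further down, with
toAddress playing the role of pointerToJITTargetAddress.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <string>

// Plain host functions whose addresses can be taken, unlike __log_finite.
// extern "C" keeps the symbol names unmangled.
extern "C" double log_wrapper(double x) { return std::log(x); }
extern "C" double exp2_wrapper(double x) { return std::exp2(x); }

// Turn a function pointer into an integer address.
template <typename Fn>
std::uint64_t toAddress(Fn *fn) {
  return static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(fn));
}

// One row per symbol the JIT'd code fails to resolve.
struct AbsoluteEntry {
  const char *jitName;   // name requested by the JIT'd code
  std::uint64_t address; // host address that should satisfy it
};

inline std::array<AbsoluteEntry, 2> finiteMathTable() {
  return {{{"__log_finite", toAddress(&log_wrapper)},
           {"__exp2_finite", toAddress(&exp2_wrapper)}}};
}
```

One assumption worth checking: the file containing the wrappers should
probably be built without -ffinite-math-only, since with older glibc headers
the calls inside the wrappers might themselves be redirected to the
__<F>_finite symbols. The extra call through a wrapper is the likely source of
any performance cost.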
>>
>> -- Lang.
>>
>> On Mon, Oct 5, 2020 at 3:23 PM Jean-Michaël Celerier <
>> jeanmichael.celerier at gmail.com> wrote:
>>
>>> Hello,
>>> here is a repro which runs in a docker image.
>>> https://we.tl/t-O1EhIAOeOF
>>>
>>> To see the issue, run repro.sh
>>> It will first download a (big, sorry) centos:7 docker image with my
>>> build of LLVM-11 and build a simple lljit-based example.
>>>
>>> This example is called with some trivial .cpp which calls `cos`.
>>> When run from *within the container* it works.
>>> When the same example, with the same bitcode input, runs from outside
>>> the container, it does not find this symbol, likely because the host's
>>> glibc symbols became versioned (my host is Arch; I think you need at
>>> least glibc-2.31 for that behaviour to be visible).
>>> Removing either the -fmath-errno or -ffinite-math-only flag for the
>>> clang cpp -> bitcode invocation in build.sh fixes the issue
>>> (at the expense of potentially slower code).
>>>
>>> <http://www.jcelerier.name>
>>> Thanks for the hint, sadly it's not possible to take the address of
>>> __log_finite: what happens is that you call the function, e.g. log(),
>>> in your code, and either clang or some magic glibc header transforms
>>> that into __log_finite further down the pipeline
>>> (see e.g. the discussion in https://reviews.llvm.org/D74712). Sadly, in
>>> my case I can't "upgrade" the headers used by my JIT SDK to glibc-2.31+,
>>> as it would mean that only people with very recent distros would be
>>> able to run the code that's being jit-compiled.
>>>
>>> Thanks !
>>>
>>> Jean-Michaël
>>>
>>> On Mon, Oct 5, 2020 at 10:11 PM Lang Hames <lhames at gmail.com> wrote:
>>>
>>>> Hi Jean-Michaël,
>>>>
>>>> Ok -- if you're linking against other symbols without issue then your
>>>> setup sounds good.
>>>>
>>>> My first take is that if you're set up correctly then this should "just
>>>> work", and this failure should be considered a bug, but I need to
>>>> understand more about ELF indirect / versioned symbols before I can say
>>>> that definitively. I usually develop on MacOS, but I'll set up a VM and see
>>>> if I can reproduce this locally to get some more insight here.
>>>>
>>>> In the meantime one workaround would be to define absoluteSymbol
>>>> entries for these functions:
>>>>
>>>> auto Err = J->getMainJITDylib().define(
>>>>   absoluteSymbols({
>>>>     { J->mangleAndIntern("__log_finite"),
>>>>       pointerToJITTargetAddress(&__log_finite) },
>>>>     { J->mangleAndIntern("__exp2_finite"),
>>>>       pointerToJITTargetAddress(&__exp2_finite) }
>>>>   }));
>>>>
>>>> -- Lang.
>>>>
>>>> On Mon, Oct 5, 2020 at 12:31 PM Jean-Michaël Celerier <
>>>> jeanmichael.celerier at gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>> Right now I am just using a Generator to look for symbols in my
>>>>> process (which links dynamically against libc / libm).
>>>>> It seems to have no trouble finding every other libc / libm / libc++ /
>>>>> ... symbol so I assumed that it was not necessary to specifically link
>>>>> against libm where these __finite symbols reside:
>>>>>
>>>>>   $ nm -D /usr/lib/libm.so.6 |  grep finite
>>>>>   0000000000050540 T __acosf128_finite@GLIBC_2.26
>>>>>   0000000000042f70 T __acosf_finite@GLIBC_2.15
>>>>>   0000000000026940 i __acos_finite@GLIBC_2.15
>>>>>   0000000000051000 T __acoshf128_finite@GLIBC_2.26
>>>>>   0000000000043240 T __acoshf_finite@GLIBC_2.15
>>>>> but maybe it needs some help in that regard?
>>>>>
>>>>> Thanks for your quick answer,
>>>>>
>>>>> Jean-Michaël
>>>>>
>>>>>
>>>>> On Mon, Oct 5, 2020 at 7:53 PM Lang Hames <lhames at gmail.com> wrote:
>>>>>
>>>>>> Hi Jean-Michaël,
>>>>>>
>>>>>> How are you trying to provide those symbols to the JIT? Are you using
>>>>>> a DynamicLibrarySearchGenerator to reflect process symbols (or this
>>>>>> specific library's symbols) into the JIT?
>>>>>>
>>>>>> I haven't looked at ELF symbol indirection before -- I'll need to
>>>>>> read up on that before I can provide a sensible answer. It's quite likely
>>>>>> that RuntimeDyld doesn't support it yet though. Depending on what is
>>>>>> required we can either try to implement it there, or aim to fix it in the
>>>>>> newer JITLink linker -- a few people are working on an initial
>>>>>> implementation of that at the moment.
>>>>>>
>>>>>> -- Lang.
>>>>>>
>>>>>> On Mon, Oct 5, 2020 at 12:52 AM Jean-Michaël Celerier via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>> when building code with -Ofast -ffinite-math-only -ffast-math, clang
>>>>>>> generates calls to "finite" variants of math functions.
>>>>>>>
>>>>>>> This has been the source of a fair number of issues in a "normal",
>>>>>>> non-JIT pipeline, which seem to have been fixed over time - a simple fix
>>>>>>> being recompiling the target app against the new glibc.
>>>>>>> - https://bugs.llvm.org/show_bug.cgi?id=44842
>>>>>>> - https://github.com/cms-sw/cmssw/issues/24935
>>>>>>> - https://github.com/google/filament/issues/2146
>>>>>>>
>>>>>>> But when going through LLJIT (tested with LLVM-10 & LLVM-11, on
>>>>>>> ArchLinux, glibc-2.32) I still get
>>>>>>>
>>>>>>>      Symbols not found: [ __log_finite, __exp2_finite ]
>>>>>>>
>>>>>>> when trying to materialize my code.
>>>>>>>
>>>>>>> What could be done about that? "Recompiling" doesn't seem to fix
>>>>>>> anything in this case, so it looks like LLJIT lacks the mechanism to
>>>>>>> understand the ELF symbol indirection.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Jean-Michaël
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> llvm-dev at lists.llvm.org
>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>
>>>>>>