[Openmp-dev] [RFC] Clarify the absence of API stability for the *device runtime* (aka. libomptarget-nvptx-sm_XX.bc)
Johannes Doerfert via Openmp-dev
openmp-dev at lists.llvm.org
Thu Jul 9 22:19:45 PDT 2020
On 7/9/20 8:25 PM, Hal Finkel wrote:
>
> On 7/9/20 6:46 PM, Johannes Doerfert wrote:
>>
>> On 7/9/20 6:22 PM, Hal Finkel wrote:
>>>
>>> On 7/9/20 5:33 PM, Johannes Doerfert wrote:
>>>>
>>>>
>>>> On 7/9/20 5:21 PM, Hal Finkel via Openmp-dev wrote:
>>>>>
>>>>> On 7/9/20 5:17 PM, Michael Kruse wrote:
>>>>>> Correct.
>>>>>>
>>>>>> I inferred from your response that we have no such guarantee yet,
>>>>>> we just haven't broken it yet.
>>>>>>
>>>>>> However, I think that change will break ABI of libomptarget-nvptx.a
>>>>>
>>>>>
>>>>> I think that this is an important point. I interpreted this thread
>>>>> in the context of the API between the runtime and the device-side
>>>>> application code. Not the ABI between the plugin and the
>>>>> device-side runtime. I suspect these two are separable, but we
>>>>> should definitely clarify this.
>>>>>
>>>>
>>>> Neither should be considered stable at this point. We kept
>>>> libomptarget stable while we recently added functions but I would
>>>> not assume it will stay that way. As I mentioned before, for the
>>>> device runtime there is basically no supported way today in which
>>>> we could even guarantee stability.
>>>>
>>>
>>> I think that we should separate these concerns.
>>>
>>> The device-side runtime library is a static library (in IR form),
>>> and multiple different versions can co-exist within one application
>>> (in device code contained in different shared libraries). It seems
>>> completely reasonable to consider that version-locked to the
>>> compiler that compiled the code. It's an internal interface between
>>> two parts of a translation unit compiled by the same compiler.
>>>
>>> libomptarget is a shared library. libomptarget.rtl.cuda.so is a
>>> shared library. Unless we take special care in the naming and
>>> linking, managing of global state, etc. I think that we need to
>>> consider these to have some kind of ABI stability (because you only
>>> get to have one in each process). That doesn't mean that we can't
>>> ever decide to break that ABI, but we would likely decide not to do
>>> so silently. This implies that the device-side runtime should
>>> maintain ABI compatibility with libomptarget.rtl.cuda.so unless we
>>> make a non-silent breaking change. Just having kernels in a shared
>>> library suddenly stop working correctly when a newer version of
>>> libomptarget.rtl.cuda.so is loaded is probably not something we can
>>> considerately do at this point.
>>>
>>
>> I think this would all be well and good if we had a stable and
>> complete setup. I very much doubt we are there yet, and pretending
>> we are hurts us and the users alike.
>>
>> I think this goes in the same direction as Ye's comment. Why do we
>> want to guarantee stability if we don't even know whether all the
>> puzzle pieces are in place?
>>
>> Interestingly, the OpenMP standard has a way out of this, as Ravi
>> hinted towards in another email. libomptarget is loaded on demand. If
>> the version is not a match we can just not load it (or skip it).
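To make the "just not load it" idea concrete, here is a minimal sketch, assuming each offload library reports a single interface version number. `PluginInfo`, `interface_version`, and `select_compatible` are hypothetical names for illustration, not existing libomptarget symbols, and the real loader may work quite differently.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for an offload library discovered at startup.
// The version field is an assumption; nothing like it is claimed to exist today.
struct PluginInfo {
  std::string name;
  std::uint32_t interface_version;
};

// Keep only the libraries whose reported version matches what the host
// side expects. Mismatching ones are skipped rather than loaded.
std::vector<PluginInfo> select_compatible(const std::vector<PluginInfo>& found,
                                          std::uint32_t expected) {
  std::vector<PluginInfo> usable;
  for (const PluginInfo& p : found) {
    if (p.interface_version == expected) {
      usable.push_back(p);
    }
  }
  return usable;
}
```

If the returned list is empty, the program would presumably fall back to host execution, which is the behavior one gets today when no offload library can be loaded.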
>>
>> At the end of the day it is not the only library that you cannot just
>> update and expect it to work. Nor the only one that will not work
>> with a program compiled for a newer version.
>>
>>
>> Long story short: I would strongly suggest not putting false hopes out
>> there that will come back to haunt us. All (openmp) target libraries
>> are bound to the compiler until further notice. There is no stability
>> guarantee until further notice. Update (+recompile) everything or
>> nothing for the time being.
>>
>>
>> ~ Johannes
>
>
> I understand what you're saying, but I don't think it's that simple.
> We understand very well that the current implementation has all
> sorts of usability issues and suboptimalities of various kinds. By
> many measures it's barely usable. Moreover, we're improving all of
> these things, in part, due to feedback from users like Ye. We have a
> lot of applications that want to use OpenMP offload support and can't
> yet. However, the current implementation is not completely unusable,
> and in fact, I think that we must assume that it has users who depend
> on it. I don't think it has very many compared to the number of users
> we'll have after things stabilize a bit. As a result, I think that we
> can prioritize future users over any current ones. However, we should
> still be kind to our current users and communicate with them clearly.
> In my opinion, however, we have limited options for effectively
> communicating with our users, and a mailing-list thread isn't one of
> them. Release notes aren't really either, unfortunately. All of the
> means we have are technical. We can name options with 'experimental'
> in the name (although the ship has sailed already on that one, and
> probably would not have been appropriate anyway). We can bump versions
> of things (symbols, library names, etc.) to prevent linking things
> together that will be broken. We can use dynamic, versioned
> registration checks (i.e., the expected version is embedded into the
> initialization call, and the library aborts or prints a warning if
> provided an unexpected version), but we need to do something.
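A minimal sketch of the versioned registration check described above, assuming the compiler embeds one integer version into the initialization call. `kExpectedInterfaceVersion`, `InitResult`, and `runtime_init` are hypothetical names for illustration only; they are not part of libomptarget or its plugins.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical interface version implemented by this build of the library.
constexpr std::uint32_t kExpectedInterfaceVersion = 2;

enum class InitResult { Ok, VersionMismatch };

// Hypothetical initialization entry point. The caller passes the version it
// was compiled against; on a mismatch the library warns and reports failure.
InitResult runtime_init(std::uint32_t caller_version) {
  if (caller_version != kExpectedInterfaceVersion) {
    std::fprintf(stderr,
                 "runtime: caller expects interface version %u, "
                 "library implements version %u\n",
                 static_cast<unsigned>(caller_version),
                 static_cast<unsigned>(kExpectedInterfaceVersion));
    return InitResult::VersionMismatch;  // a stricter policy could abort here
  }
  return InitResult::Ok;
}

// Usage: a matching caller initializes, an older caller is rejected.
void example_registration() {
  assert(runtime_init(2) == InitResult::Ok);
  assert(runtime_init(1) == InitResult::VersionMismatch);
}
```

Whether a mismatch should warn, abort, or disable offloading is a policy choice. The sketch only reports the mismatch and leaves that decision to the caller.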
>
> In short, while I agree with you that pretending we have a significant
> existing user base for which we need to prioritize stability would be
> a mistake, as we make things better, our number of users will grow.
> We'll have a significant number well before we consider the
> functionality to be stable. In addition, we depend on these users
> submitting bug reports and other feedback in order to improve things.
> Thus, we should use technical measures to make it clear what mixing
> will work and what won't work.
>
> I don't understand, however, whether this is an issue of present
> concern, or only a matter of general policy. My impression, Johannes,
> was that the patch that motivated this RFC does not break the
> libomptarget <-> plugin <-> kernel interface at all. It changes only
> the inward-facing IR-level API of the device runtime. Is that correct?
Yes, this RFC was not about shared libraries (which are the interfaces
you mentioned). I don't really know why we now have a huge discussion
about something else that is theoretical in nature anyway. We always
kept the interface stable, and then we added a feature or fixed a bug
and told everyone to recompile everything anyway.
~ Johannes
>
> Thanks again,
>
> Hal
>
>
>>
>>
>>
>>> -Hal
>>>
>>>
>>>>
>>>>> -Hal
>>>>>
>>>>>
>>>>>>
>>>>>> Just want to clarify what ABI stability guarantees we have.
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 9, 2020 at 17:08, Hal Finkel <hfinkel at anl.gov
>>>>>> <mailto:hfinkel at anl.gov>> wrote:
>>>>>>
>>>>>>
>>>>>> On 7/9/20 3:39 PM, Michael Kruse wrote:
>>>>>> > On Thu, Jul 9, 2020 at 14:08, Hal Finkel
>>>>>> > <hfinkel at anl.gov <mailto:hfinkel at anl.gov>> wrote:
>>>>>> >> Also, if we wanted to change clang so that it linked version-locked
>>>>>> >> versions of these libraries, -lomptarget-11 or whatever, that, in my
>>>>>> >> opinion, would also be a reasonable choice to discuss.
>>>>>> >>
>>>>>> >> One thing worth capturing is the extent to which these things are
>>>>>> >> connected. There is a relationship between libomptarget and its
>>>>>> >> plugins, and the plugins and the device-side runtimes. There is an
>>>>>> >> ABI boundary there somewhere. If we change nothing else, we might
>>>>>> >> need to consider ABI stability of this part of the device-side
>>>>>> >> interface.
>>>>>> > The official distribution apt.llvm.org contains libomptarget.so for
>>>>>> > LLVM 7 to 10, each in a separate directory under
>>>>>> > /usr/lib/llvm-<version>/libomptarget.so. The prebuilt binaries under
>>>>>> > https://releases.llvm.org/download.html put it directly under
>>>>>> > <prefix>/libomptarget.so. If libomptarget is version-locked, users
>>>>>> > need to be careful about pointing LD_LIBRARY_PATH to the right one.
>>>>>> > However, I could not find target device plugins in the distributions
>>>>>> > (such as lib/libomptarget-nvptx-sm_60.bc when built on a machine
>>>>>> > with CUDA). The official Ubuntu repository doesn't contain
>>>>>> > libomptarget at all. Arch Linux contains at least the x86_64 rtl
>>>>>> > (https://www.archlinux.org/packages/extra/x86_64/openmp/files/)
>>>>>> > without any versioning resolution.
>>>>>> >
>>>>>> > Should we make it explicit what the compatibility guarantees for
>>>>>> > libomptarget are, and maybe even discourage OS distributions from
>>>>>> > pre-packaging libomptarget into ldconfig default paths? At least on
>>>>>> > Arch Linux, updating the openmp package will break previously
>>>>>> > compiled-with-offloading binaries.
>>>>>>
>>>>>>
>>>>>> You mean that it will break them *if* we make an ABI-breaking
>>>>>> change in libomptarget. Changing the device-side runtime doesn't
>>>>>> necessarily imply that. Nevertheless, certainly good to know.
>>>>>>
>>>>>> -Hal
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > Michael
>>>>>>
>>>>>> -- Hal Finkel
>>>>>> Lead, Compiler Technology and Programming Languages
>>>>>> Leadership Computing Facility
>>>>>> Argonne National Laboratory
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> Openmp-dev mailing list
>>>>>> Openmp-dev at lists.llvm.org
>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev
>>>