[llvm-dev] bitcode versioning

Fri Dec 11 18:13:35 PST 2015

> On Dec 11, 2015, at 6:13 AM, Martin J. O'Riordan <martin.oriordan at movidius.com> wrote:
> 
> Hi Mehdi and my apologies for the delay in responding - the day job got in the way :-)
> 
> Our target is still out-of-tree so my reasons for extending the IR would be eliminated if we were a proper part of LLVM, which I would like to do when the time is right for us.
> 
> My extensions are quite simple really, and I expect that they will be wanted in the TRUNK sometime anyway.
> 
> At the moment I only have one remaining change which is to add 'v16f16' to the set of IR types.  Previously I had several other FP16 vector types added, but over the past few iterations of LLVM my changes have been gradually made redundant because others have added them formally to the source.  I expect that 'v16f16' will go this way too allowing me to have an unaltered IR.
> 
> But the problem I have faced with making the changes is that my LLVM cannot accept the BC produced by another version (and vice versa), not even the official version, because the placement of the types in the enumeration is very particular and changes the indices for all the subsequent values.
> 
> I had often thought it would be helpful if the BC (and LL for that matter) had a version resource of some kind, that would allow me to see that the incoming IR was produced by the official unchanged LLVM, and then I could have placed a translation in the loader that would remap the indices to the ones expected by my back-end.
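
To make the index shift and the remapping idea concrete, here is a small sketch (Python only for brevity). The type lists are shortened and made up for illustration, they do not reflect the real LLVM enumeration order, and `build_remap` is a hypothetical helper, not an LLVM API. Inserting one entry in the middle of an enumeration renumbers everything after it; a loader that knows which numbering the producer used can translate indices by name.

```python
# Hypothetical, shortened type enumerations; the real LLVM lists are longer
# and ordered differently.
STOCK_TYPES = ["v4f16", "v8f16", "v2f32", "v4f32"]
EXTENDED_TYPES = ["v4f16", "v8f16", "v16f16", "v2f32", "v4f32"]  # 'v16f16' inserted


def build_remap(producer_types, consumer_types):
    """Map each producer index to the consumer index of the same type name."""
    consumer_index = {name: i for i, name in enumerate(consumer_types)}
    return {
        i: consumer_index[name]
        for i, name in enumerate(producer_types)
        if name in consumer_index
    }


# Index 2 means 'v2f32' in the stock numbering but 'v16f16' in the extended
# one, so a stock index of 2 has to be translated before the back-end sees it.
remap = build_remap(STOCK_TYPES, EXTENDED_TYPES)
```

Such a table is only usable if the loader can tell which numbering the incoming file uses, which is where the version information comes in.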
> 
> When you proposed the addition of a version resource, I was thinking that rather than each target adding parsing code for it, it would be better and more transparent for it to appear as a "Version Resource Object" that I could query for simple things like:
> 
>  o  Get the major number
>  o  Get the minor number
>  o  Get the patch number

This would force a specific model for the version, which we didn’t want.

>  o  Is it extended? and if "yes":
>     -  Get the vendor ID (could be a string)
>     -  Get the vendor specific extension number
> 
> And this is really what I mean by an API - essentially a simple object representing the version information.  For IR production/emission, there would need to be a 'setter' interface too.

This is what we do, but using the string only.
The “setter” is compile time (LLVM_VERSION probably); we patch the bitcode writer internally.

> 
> This would allow me to make my extensions, yet be in a position to more robustly accept BC or LL from other sources.  In particular I should be able to remap IR coming from a well-known point-release of LLVM, and also be able to detect, diagnose and reject input from sources I don't recognise (at the moment it just causes a crash).

The string content is predictable: it will begin with “LLVM3.8.0” or “LLVM3.9.0”, etc. So you should be able to do exactly what you want.
The bitcode produced by our binaries has a very different string, and we use this information to identify the producer as well.
What’s missing?
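
As a rough sketch of the kind of client-side check this allows (Python only for brevity): `parse_producer` and `classify` are hypothetical names, not part of LLVM, and the code assumes the producer string has already been read from the bitcode and that stock releases use the "LLVM<major>.<minor>.<patch>" prefix mentioned above.

```python
import re

# Assumption: stock LLVM producer strings begin with
# "LLVM<major>.<minor>.<patch>", as in the "LLVM3.8.0" example.
# Anything after that prefix is ignored here.
_PRODUCER_RE = re.compile(r"^LLVM(\d+)\.(\d+)\.(\d+)")


def parse_producer(producer):
    """Return (major, minor, patch) for a stock LLVM producer string, else None."""
    match = _PRODUCER_RE.match(producer)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def classify(producer, reader_version):
    """Decide how a reader built at `reader_version` should treat `producer`."""
    version = parse_producer(producer)
    if version is None:
        return "reject: unrecognised producer"
    if version > reader_version:
        return "reject: bitcode from the future"
    return "accept"
```

For example, `classify("LLVM3.8.0", (3, 9, 0))` accepts the input, while a string with a different prefix, or a version newer than the reader, is diagnosed and rejected instead of causing a crash later on.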

— 
Mehdi

> 
> From my experience of developing an out-of-tree LLVM backend, I am painfully aware of the downsides of not being "in-tree", and while eventually I expect that I will be able to contribute our work, I am also aware that future out-of-tree developers will run into similar kinds of problems, and a formal version resource would greatly help.
> 
> Thanks,
> 
> 	MartinO - Movidius Ltd.
> 
> -----Original Message-----
> From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com] 
> Sent: 03 December 2015 19:51
> To: Martin J. O'Riordan <martin.oriordan at movidius.com>
> Cc: Manuel Rigger <rigger.manuel at gmail.com>; llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] bitcode versioning
> 
> What kind of API would you expect? The Bitcode Reader exposes the API to get the information in this block. It is up to the client to interpret it.
> 
> Our internal use case is to parse the version string and identify bitcode generated by an Apple released LLVM. If the version is “from the future” the bitcode can be rejected (we’ll do it during LTO).
> 
> — 
> Mehdi
> 
> 
>> On Dec 3, 2015, at 11:48 AM, Martin J. O'Riordan <martin.oriordan at movidius.com> wrote:
>> 
>> Is there going to be a formal interface/API for this version-block information?  I have had to "extend" the IR and bitcode representations several times to address absences/limitations in the handling of various vector types, in particular FP16 vector types; and it would be really useful if I had a "standard" way of doing this, and identifying that my dialect was different.
>> 
>> Thanks,
>> 
>> 	MartinO - Movidius
>> 
>> -----Original Message-----
>> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Mehdi Amini via llvm-dev
>> Sent: 03 December 2015 15:45
>> To: Manuel Rigger <rigger.manuel at gmail.com>
>> Cc: llvm-dev at lists.llvm.org
>> Subject: Re: [llvm-dev] bitcode versioning
>> 
>> 
>>> On Dec 3, 2015, at 4:10 AM, Manuel Rigger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>> 
>>> Hi all,
>>> 
>>> I am implementing an LLVM IR interpreter and have the following problem: I want to support execution of bitcode files targeted towards different LLVM versions. For example, a user of the interpreter should be able to compile a C file with the latest version of Clang, a Fortran file with Dragonegg (targeting LLVM 3.3), and a Haskell file with GHC (targeting LLVM 3.5), and then just feed it to my interpreter without additional arguments.
>>> 
>>> Currently, my parser expects textual representation for a specific LLVM version. I could provide different parsers or parser configurations that support different bitcode versions, but there is no notion of a version field in the textual representation that I could use to determine which parser to use. Anyway, for the long term it is not a good idea to rely on the textual format due to the missing backward compatibility guarantees.
>>> 
>>> Hence, I want to replace the textual format parser with a parser for bitcode, which would also be able to parse the files of my example. But how should I treat bitcode files of major upcoming releases, e.g., of LLVM 4.1? I found a version ID in the bitcode wrapper format, but the documentation states that the ID is currently always 0. Is there a policy that specifies when the ID will be updated? Without having such a policy in place, I would just postpone the problem I currently have with the textual format parser.
>> 
>> The wrapper format is Darwin specific AFAIK. However, starting with 3.8 there will be another version block in the bitcode, which contains a string identifying the producer and an integer that will be bumped when needed (whatever it means).
>> Look for lib/Bitcode/Reader/BitcodeReader.cpp:llvm::getBitcodeProducerString() as a starting point.
>> 
>> — 
>> Mehdi
>> 