[LLVMdev] [RFC] Exhaustive bitcode compatibility tests for IR features

Tue Nov 11 13:07:09 PST 2014

Hi Stephen

Thanks for the feedback. The purpose of my compatibility tests is to catch bitcode-level changes, similar to the function attribute changes, and to make sure we can still read the old formats. The target machine metadata sounds more like a platform portability issue, which will not be covered by this test. I also don’t have a good way to test the metadata, because this kind of test cannot check whether the backend treats the metadata differently.
I think we have agreed that this is a reasonable approach to the problem, at least as a first step. I would just like to know whether anyone is interested in a generated test together with a text file in the repo.
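
For reference, the freeze procedure agreed on earlier in the thread (Rafael's summary, quoted further down) can be sketched roughly as follows. This is only an illustration in Python: `freeze_plan` is a hypothetical helper, the RUN line is an assumed form of "run llvm-dis directly on the 3.6 bitcode", and nothing is executed. The file names and `llvm-as-3.6` come from the quoted summary.

```python
# Sketch only: builds the command and file names for freezing the
# compatibility test at a release. Nothing is executed.

def freeze_plan(version, test_dir="test/Features"):
    """Return the steps for freezing compatibility.ll at `version`."""
    source = f"{test_dir}/compatibility.ll"
    frozen_bc = f"{test_dir}/Input/compatibility-{version}.bc"
    frozen_ll = f"{test_dir}/compatibility-{version}.ll"
    return {
        # Assemble the live test with the release's own assembler.
        "assemble": [f"llvm-as-{version}", source, "-o", frozen_bc],
        # Keep a copy of the CHECK lines next to the frozen bitcode.
        "copy": (source, frozen_ll),
        # Assumed RUN line: disassemble the frozen bitcode and check it
        # against the copied CHECK lines.
        "run_line": f"; RUN: llvm-dis < %S/Input/compatibility-{version}.bc | FileCheck %s",
    }

plan = freeze_plan("3.6")
print(" ".join(plan["assemble"]))
print(plan["run_line"])
```

The point of keeping the plan as data is that the same three steps can be repeated for each release without touching the live compatibility.ll.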

Thanks

Steven

> On Nov 10, 2014, at 2:04 PM, Stephen Hines <srhines at google.com> wrote:
> 
> I haven't had a chance to really go through this proposal, but I can comment quickly about what we have seen go wrong with our use of bitcode as a storage format in the past few years. The largest change we had to adapt to was the rewrite of function attributes (i.e. the changes that added a brand new AttributeSet, etc.). One of the related issues there was the inclusion of Target machine metadata (i.e. this code was compiled for this particular arch, etc.) as a per-function attribute. This caused lots of problems for our IR, so we just decided to drop it completely (we strip it from the files we are emitting). 
> 
> Other issues we have worked around:
> 1) case-range statements: we originally patched these to expand anything that might have been seen, but this has been removed from upstream (and thus also from our code).
> 2) ConstantDataSequential: there was an issue with this in our 2.9 writer back in 2012.
> 3) Removal and reuse of an existing name for a TYPE_*: This has always been a problem, so we just end up writing code to force it to the TYPE_*_OLD instead.
> 
> Steve
> 
> On Fri, Nov 7, 2014 at 7:22 PM, Sean Silva <chisophugis at gmail.com> wrote:
> It sounds like the Android RenderScript guys have the most in-the-trenches experience with bitcode incompatibilities. Stephen Hines (CC'd), what sorts of incompatibilities have you guys seen during the 3.x timeline? Would Steven Wu's proposal catch the sorts of incompatibilities that you guys have seen?
> 
> -- Sean Silva
> 
> On Thu, Nov 6, 2014 at 4:38 PM, Steven Wu <stevenwu at apple.com> wrote:
> Sorry for sending the mail again. Including LLVMdev this time.
> 
> Hi
> 
> I have had a chance to work more on this topic. I was experimenting with the idea of auto-generating all the IR tests, because of the number of intrinsics and their variations that exist in LLVM.
> My attempt is to write a C++ bitcode generator that iterates through all the TableGen and enum structures in LLVM and generates all the bitcode using LLVM APIs. This should be an easy way to generate a bitcode test that is up to date with trunk. I have a first working version that takes care of intrinsics, globals (and most of their features) and some instructions. It also generates all the CHECKs at the same time. However, there are quite a few problems with this approach. First of all, the information encoded in TableGen for intrinsics is not precise enough to generate good tests, and some of it is even wrong (but not exposed; I will bring up the details at the end of the email). Second, the part I have finished is the “easy” part, in which lots of tests can be generated in batch. The rest of the test generation will require more coding and more careful planning. So before I spend lots of time writing more features, I would like to have some feedback from the list.
> 
> I am not sure how many people support writing such a tool and having it in trunk. It uses the LLVM API, which is also subject to change (but also a pain to change). The benefit is that we have IR tests that are up to date, and we can easily obtain a bitcode test from any point in the history. When we make a release, we can simply run this tool to get a frozen version of the bitcode in the repo. It is also good for testing the robustness of the API, which allowed me to patch a few bugs along the way. If people would like to see this tool, I can start a review soon for the part I have finished and try to get it into trunk. I might need some help to finish the rest, because of all the recent changes that involve bitcode. Otherwise, I will finish my bitcode test file by hand-writing the rest of the tests and combining them with my current output and what Michael contributed earlier.
> 
> For people who are interested in why the intrinsics TableGen information is not precise and accurate, the main reason is that we simply ignore all types specified with LLVMAnyPointerType. LLVMAnyPointerType essentially supports all types and encodes the name into the function name. I found that the original intention of adding LLVMAnyPointerType was to specify a pointer to a certain type in any address space, but it seems the type was never checked. What makes it worse is that, since the verifier is not checking the type, many intrinsics actually have the wrong type definition. For example, int_aarch_neon_st2 is defined as (llvm_anyvector_ty, LLVMMatchType<0>, LLVMAnyPointerType<LLVMMatchType<0>>), but in reality it has a type like (v8i8, v8i8, i8*) instead of (v8i8, v8i8, v8i8*). I wrote a patch myself to check all types in LLVMAnyPointerType, and many regression tests failed. I cannot find a way to fix all the test case failures without breaking bitcode compatibility. I currently generate all of these variations, since they are “valid” bitcode for the current version.
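
To make the mismatch above concrete, here is a toy model in Python. It is not the LLVM verifier and not TableGen: types are plain strings in the v8i8 / i8* notation used above, and `pointee_of` and `check_st2_signature` are hypothetical helpers that apply the .td definition literally, i.e. operand 2 must point to the same type as operand 0.

```python
def pointee_of(pointer_type):
    """Strip one trailing '*' from a typed-pointer string such as 'i8*'."""
    if not pointer_type.endswith("*"):
        raise ValueError(f"not a pointer type: {pointer_type}")
    return pointer_type[:-1]

def check_st2_signature(arg_types):
    """Check (anyvector, MatchType<0>, AnyPointer<MatchType<0>>) literally."""
    vec, second, ptr = arg_types
    errors = []
    if second != vec:
        errors.append(f"operand 1 is {second}, expected {vec}")
    if pointee_of(ptr) != vec:
        errors.append(f"operand 2 points to {pointee_of(ptr)}, expected {vec}")
    return errors

# The signature seen in practice versus the one the definition implies.
in_practice = check_st2_signature(("v8i8", "v8i8", "i8*"))
as_declared = check_st2_signature(("v8i8", "v8i8", "v8i8*"))
print(in_practice)
print(as_declared)
```

Under this literal reading the in-practice signature is rejected, which is the same effect the stricter patch had on the regression tests.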
> 
> Steven
> 
>> On Sep 23, 2014, at 6:13 AM, Kuperstein, Michael M <michael.m.kuperstein at intel.com> wrote:
>> 
>> Hi Steven,
>> 
>> I just committed a few more tests I've had.
>> Unfortunately, since they were actually written about half a year ago, a couple of the tests turned out to be broken at LLVM top-of-trunk, because the textual representation changed somewhat.
>> I'll ask the person who originally wrote them to fix them up and I'll commit them later.
>> 
>> Michael
>> 
>> -----Original Message-----
>> From: Steven Wu [mailto:stevenwu at apple.com]
>> Sent: Monday, September 22, 2014 18:50
>> To: Kuperstein, Michael M
>> Cc: Rafael Espíndola; LLVM Developers Mailing List
>> Subject: Re: [LLVMdev] [RFC] Exhaustive bitcode compatibility tests for IR features
>> 
>> Thanks Michael. I noticed your tests, which are a good place to start. My attached test file is just an example for myself, to test how to cleanly lay out all the tests in one file (if that is possible).
>> I think I mainly want to improve two things. One is a systematic way to test bitcode from a certain LLVM version. That is the intention behind merging all the tests into one file.
>> The other is to test the compatibility of all intrinsics, which is currently missing.
>> If you have more tests that you want to commit, please do so because they can help me a lot! 
>> 
>>> On Sep 21, 2014, at 3:34 AM, Kuperstein, Michael M <michael.m.kuperstein at intel.com> wrote:
>>> 
>>> I’ve committed a set of precisely this kind of tests earlier this year, based on LLVM 3.2.
>>> They’re in test/Bitcode (e.g. test/Bitcode/global-variables.3.2.ll); the commits are 197340, 197873, 202262 and 202647 if you want the whole list.
>>> 
>>> Does that cover what you want, more or less? 
>>> The coverage isn't complete, but there are a few more tests that I still have lying around that I unfortunately never got around to committing. I'll do it later this week.
>>> 
>>> -----Original Message-----
>>> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Steven Wu
>>> Sent: Friday, September 19, 2014 20:02
>>> To: Rafael Espíndola
>>> Cc: LLVM Developers Mailing List
>>> Subject: Re: [LLVMdev] [RFC] Exhaustive bitcode compatibility tests for IR features
>>> 
>>> 
>>>> On Sep 19, 2014, at 9:57 AM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
>>>> 
>>>> So the proposal is that during development new features are added to 
>>>> test/Features/compatibility.ll (or some other name). When 3.6 is 
>>>> released, we will
>>>> 
>>>> * assemble the file with llvm-as-3.6.
>>>> * Check in the .bc file as test/Features/Input/compatibility-3.6.bc
>>>> * Copy test/Features/compatibility.ll to 
>>>> test/Features/compatibility-3.6.ll and change it to run llvm-dis 
>>>> directly on the 3.6 bitcode.
>>>> 
>>>> And then when 4.1 is released we have a discussion on what we want to 
>>>> drop from the old .bc files.
>>>> 
>>>> Correct? If so, sounds reasonable to me.
>>> Correct. This is exactly what I mean.
>>> 
>>>> 
>>>> On 18 September 2014 22:01, Steven Wu <stevenwu at apple.com> wrote:
>>>>> From the discussion of bitcode backward compatibility on the list, it seems we lack a systematic way to test every existing IR feature. It is useful to keep a test that exercises all the IR features for the current trunk, and we can freeze that test in the form of bitcode for backward compatibility testing in the future. I am proposing to implement such a test, which should try to accomplish the following goals:
>>>>> 1. Try to keep it in one file so it is easy to freeze and move to the next version.
>>>>> 2. Try to exercise and verify as many features as possible, which should include all the globals, instructions, metadata and intrinsics (and more).
>>>>> 3. The test should be easily maintainable. It should be easy to fix when broken, or to update when the assembly gets updated.
>>>>> I am going to implement such a test as a lengthy LLVM assembly file, in the form of the attachment (which so far only tests global variables). It is going to be long, but someone must do it first. Future updates should be much simpler. In the test, I started with a default global variable and enumerated all the possible attributes by changing them one by one. I try to keep the variable declaration as simple as possible so that it won’t be affected by simple assembly-level changes (like changing the parsing order of some attributes, since this is supposed to be a bitcode compatibility test, not an assembly test). I try to make the tests as thorough as possible while avoiding large duplications. For example, I will test the Linkage attribute in both GlobalVariable and Function, but probably not enumerate all the types I want to test. I will keep the tests for Types in a different section, since it is going to be huge and it is orthogonal to the tests of globals.
>>>>> When making a release or some big changes in IR, we can freeze the test by generating bitcode, changing the RUN line so it runs llvm-dis directly, and modifying the CHECKs that correspond to the change. Then we can move on with a new version of the bitcode tests. This will add some more work for people who would like to make changes to IR (which might be one more reason to discourage them from breaking compatibility). I will make sure to update the docs for changing IR after I add this test.
>>>>> 
>>>>> Currently, there are individual bitcode tests in LLVM which were created when IR or intrinsics got changed. This exhaustive test shouldn’t overlap with the existing ones, since this test focuses on keeping a working, up-to-date version of the IR tests. Both approaches to bitcode tests can coexist. For example, for small updates, we can add specific test cases like the current ones to test auto-upgrade, while updating the exhaustive bitcode test to incorporate the new changes. When making huge upgrades and major releases, we can freeze the exhaustive test for future checks.
>>>>> 
>>>>> For the actual test cases, I think it should be trivial for globals, instructions and types (correct me if I am wrong), but intrinsics can be very tricky. I am not sure how much compatibility is guaranteed for intrinsics, but they cannot be checked by running llvm-as and then llvm-dis. Intrinsics, as far as I know, are encoded like normal functions, globals or metadata. My current plan is to write a separate tool to check which intrinsics are actually supported in the IR or backend. Intrinsic functions might be the easiest, since the supported ones should all be declared in Intrinsics*.td and can be checked by calling getIntrinsicID() after reading the bitcode. Intrinsics encoded as globals (llvm.used) or metadata (llvm.loop) can be trickier. Maybe another .td file with hardcoded intrinsics for these cases should be added just for testing purposes (we can add a new API to it later so that we don’t need to do string compares to figure out these intrinsics). After we have another tool to test intrinsics (which could be merged with llvm-dis to save a RUN command and execution time), the attached test will just need to be updated like the following (checking llvm.global_ctors, for example):
>>>>> ; RUN: verify-intrinsics %s.bc | FileCheck -check-prefix=CHECK-INT %s
>>>>> 
>>>>> %0 = type { i32, void ()*, i8* }
>>>>> @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
>>>>> ; CHECK: @llvm.global_ctors = appending global [1 x %0] [%0 { i32 65535, void ()* @ctor, i8* @data }]
>>>>> ; CHECK-INT: @llvm.global_ctors int_global_ctors
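
As a rough illustration of what the proposed tool might print for intrinsics encoded as globals, here is a small Python sketch. Everything in it is an assumption made for illustration: verify-intrinsics does not exist, the table stands in for the extra .td file suggested above, and the `int_<name>` identifiers simply follow the pattern of the CHECK-INT line.

```python
# Hypothetical table standing in for the extra .td file proposed above.
# Only names mentioned in this thread are listed; the identifiers follow
# the "int_<name>" pattern of the CHECK-INT line and are otherwise made up.
KNOWN_NON_FUNCTION_INTRINSICS = {
    "llvm.global_ctors": "int_global_ctors",
    "llvm.used": "int_used",
}

def report_intrinsic_globals(global_names):
    """Return ('@name id' lines, unrecognised llvm.* names) for the given globals."""
    lines, unknown = [], []
    for name in global_names:
        if not name.startswith("llvm."):
            continue  # ordinary global, nothing to verify
        ident = KNOWN_NON_FUNCTION_INTRINSICS.get(name)
        if ident is None:
            unknown.append(name)
        else:
            lines.append(f"@{name} {ident}")
    return lines, unknown

lines, unknown = report_intrinsic_globals(["llvm.global_ctors", "data", "llvm.bogus"])
print(lines)
print(unknown)
```

Reporting unrecognised llvm.* names separately is one possible design choice: it would let a frozen bitcode test fail loudly if a reserved name stops being recognised.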
>>>>> 
>>>>> Let me know if there is a better proposal.
>>>>> 
>>>>> Steven
>>>>> 
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> Intel Israel (74) Limited
>>> 
>>> This e-mail and any attachments may contain confidential material for 
>>> the sole use of the intended recipient(s). Any review or distribution 
>>> by others is strictly prohibited. If you are not the intended 
>>> recipient, please contact the sender and delete all copies.
>> 
> 
> 
> 
