[LLVMdev] RFC: Machine Level IR text-based serialization format

Wed Apr 29 21:17:18 PDT 2015

> On 2015 Apr 29, at 19:58, Hayden Livingston <halivingston at gmail.com> wrote:
> 
> Thank you.
> 
> So do you expect clients of LLVM to still continue to supply .ll files
> to llvm-as?
> 
> Or will this new format be the new representation?
> 

The LLVM IR is still serialized as .ll and .bc.  This new format is just
for better testing of the backend(s).

(For clarity, `llvm-as` is a developer tool; it shouldn't be part of a
production workflow.  Production tools should use the C++ API.)

> On Wed, Apr 29, 2015 at 7:44 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>> 
>>> On 2015 Apr 29, at 19:13, Hayden Livingston <halivingston at gmail.com> wrote:
>>> 
>>> What is missing in the current textual format that doesn't allow going
>>> all the way to machine code?
>> 
>> Nothing.
>> 
>> What's missing is the ability to serialize the machine level itself.
>> Since many passes have to run to get from .ll to .s, it's currently
>> hard (impossible?) to test individual machine level passes robustly.
>> Having a way to serialize machine IR will let us test each pass in
>> isolation.
>> 
>>> Is the reason for this project that the current .ll format can't
>>> always be converted to bitcode?
>> 
>> Nope, .ll and .bc can represent the same things.
>> 
>>> 
>>> On Wed, Apr 29, 2015 at 3:24 PM, Alex L <arphaman at gmail.com> wrote:
>>>> 
>>>> 
>>>> 2015-04-29 11:40 GMT-07:00 Duncan P. N. Exon Smith <dexonsmith at apple.com>:
>>>> 
>>>>> 
>>>>>> On 2015-Apr-29, at 06:40, Krzysztof Parzyszek <kparzysz at codeaurora.org>
>>>>>> wrote:
>>>>>> 
>>>>>> On 4/28/2015 7:13 PM, Alex L wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 2015-04-28 16:26 GMT-07:00 Matthias Braun <matze at braunis.de>:
>>>>>>> 
>>>>>>>  For that use case it is worth keeping the following things in mind:
>>>>>>>  - Please try to keep the output of the various dump functions, esp.
>>>>>>>  MachineInstr::dump(), MachineOperand::dump(),
>>>>>>>  MachineBasicBlock::dump() as close as possible to the format you use
>>>>>>>  for serializing.
>>>>>>> [...]
>>>>>>> 
>>>>>>> Ideally the new syntax would replace the existing print/dump syntax.
>>>>>>> The new syntax will omit certain information when it can be
>>>>>>> inferred (e.g. the TiedTo and IsEarlyClobber attributes for register
>>>>>>> operands that I mentioned earlier in this thread), so maybe we could
>>>>>>> have some sort of verbose dumping option where absolutely everything
>>>>>>> is dumped.
>>>>>> 
>>>>>> 
>>>>>> I think that the new syntax is less readable than the current format of
>>>>>> the "dump" functions, and in the long term it would be better to have
>>>>>> something more human-friendly.  However, using YAML has the advantage that
>>>>>> it's easier to parse than the direct output of "dump", so it will take
>>>>>> less time to implement a YAML-based solution.  My concern is that you may
>>>>>> run out of time to complete this and the file format is not the most
>>>>>> important thing in this project.  Getting it to work, if only as a proof of
>>>>>> concept, would be very helpful to everyone.  Coming up with a fancier
>>>>>> grammar and implementing a parser for it could be done later on top of the
>>>>>> initial implementation.
>>>>>> 
>>>>>> -Krzysztof
>>>>> 
>>>>> Until I got to this email, I was opposed to using YAML here -- I'd
>>>>> prefer a custom grammar and parser -- but I find Krzysztof's point
>>>>> here pretty convincing.
>>>>> 
>>>>> Starting with a (hybrid) YAML representation seems like a reasonable
>>>>> way to bootstrap a machine IR.  Once it's in place and working, we
>>>>> can come back and strip away the YAML parts until it's human-
>>>>> friendly.  (And since YAML is machine-friendly, upgrade scripts for
>>>>> testcases should be straightforward.)
>>>> 
>>>> 
>>>> I think that this would be a good approach.
>>>> I will work on the proposed YAML hybrid format for now and will begin
>>>> sending out the patches soon. Once it's working, people can evaluate it
>>>> for themselves and see if it suits them or if we need to change it to a
>>>> custom format.
>>>> 
>>>>> 
>>>>> 
>>>>> BTW, we probably need some sort of LangRef document for this.  Maybe
>>>>> docs/MIRLangRef.rst?
>>>> 
>>>> 
>>>> That's fine with me.
>>>> 
>>>> Alex
>>>> 
>>