[llvm-dev] Request for comments on optimizing assembler

Tue May 30 09:58:27 PDT 2017

> On May 30, 2017, at 9:28 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> 
> On 5/30/2017 11:00 AM, Quentin Colombet via llvm-dev wrote:
>> The way I was seeing this happening is by incrementally changing the parser of the MIR format. Basically, I’d like the parser to get smarter and smarter, to the point where it could understand assembly mnemonics and build the MachineFunction. The rest of the infrastructure would stay the same.
> 
> I'm not sure that this is possible. The MIR format was meant to represent the program on the MachineInstr level and is more or less the same for all targets. The "optimizing assembler" would take the .s file as its input, the structure of which will differ from one architecture to the next. Not only that, but the format of an individual assembly instruction may be nontrivial to parse. For example, Hexagon instructions don't follow the typical "mnemonic op, op, ..." format, even though the MIR representation for Hexagon does.

Sure. That being said, the MIR parser could invoke the target MC parser for the target-specific syntax. I admit the line is blurry between an approach that does asm -> MC -> MI and one that does asm -> MIR with such a parser.
The reason I still think it is doable is that MIR already does "mnemonic parsing" for MI opcodes, and doing the same directly on asm mnemonics does not seem that far-fetched.
If that’s really too complicated, we might want a target-specific asm to "MIR asm" kind of converter (e.g., one that transforms op a, b, c into a = op b, c).
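
To make the converter idea concrete, here is a minimal sketch in Python. It is only an illustration, not real MIR syntax and not an existing LLVM tool: the function name asm_to_mir_like is hypothetical, and it assumes a toy assembly dialect in which operands are comma-separated and the first operand is always the one being defined. Real targets (Hexagon in particular) would need their own parsing logic, which is where the MC parser would come in.

```python
def asm_to_mir_like(line: str) -> str:
    """Rewrite 'op a, b, c' as 'a = op b, c'.

    Assumption (hypothetical toy dialect): the first operand is the
    destination and operands are separated by commas.
    """
    line = line.strip()
    if not line:
        return line

    # Split the mnemonic from the operand list at the first whitespace run.
    parts = line.split(None, 1)
    mnemonic = parts[0]
    if len(parts) == 1:
        # No operands at all (e.g. 'nop'): nothing to reorder.
        return mnemonic

    operands = [op.strip() for op in parts[1].split(",")]
    dest, sources = operands[0], operands[1:]
    if not sources:
        # Single operand: still treated as the def under the toy assumption.
        return f"{dest} = {mnemonic}"
    return f"{dest} = {mnemonic} {', '.join(sources)}"


if __name__ == "__main__":
    print(asm_to_mir_like("add r0, r1, r2"))
```

A real converter would have to consult the instruction description to know which operands are defs and which are uses, rather than relying on operand position as this sketch does.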

The bottom line is that, given how much logic those two things seem to share, I think we need to give it more thought before we rule out that they can be the same tool.

> 
> -Krzysztof
> 
> 
> -- 
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation