[llvm-dev] Design issues in LLVM IR

Tue Jun 22 17:35:02 PDT 2021

Sorry for dropping off the face of the earth:

> On Jun 10, 2021, at 3:01 AM, Nicolai Hähnle <nhaehnle at gmail.com> wrote:
> 
> Right. I'd like to see more of the learnings of MLIR make it into LLVM IR. It's quite unfortunate that the introduction of MLIR caused a sort of split in the community. 

I agree, but I was trying to focus this thread on specific problems in LLVM that aren’t really related to MLIR.  Replacing the whole mid-level optimizer with a bunch of MLIR passes would be one solution to this problem, but that isn’t what I’m proposing. :)

> On Jun 10, 2021, at 1:44 PM, Nikita Popov <nikita.ppv at gmail.com> wrote:
> e) Constant Expressions are a disaster.  In addition to the problem identified, there are also many annoying cases to deal with, e.g. when constexprs exist in phi nodes, trapping constexprs, etc.  In my opinion, the fix is to eliminate them entirely, in a few steps:
> 
>     1) Introduce a new “RelocatableConstant” object which is *not* a mirror of all the IR operations in LLVM, but is instead designed to be used in global variables and allows the standard “globalpointer+offset” pattern that object files support, and we should add a new MachoRelocatableConstant class to represent the “(gv1-gv2+offset)” relocations MachO supports.  The presence of this would make codegen and frontends easier to write, and get rid of all the fiddly pattern matching stuff.  I think we need to talk about whether “offset” is a byte offset, or whether it is a series of (constant integer) field indexes in a GEP-like operation.  I would argue for the latter to make interprocedural optimizations easier to write, but it is debatable.
> 
> Something that isn't entirely clear to me is whether these two types of constants cover everything that is supported. LLVM is happy to take something like this:
> 
> @a = global i64 0 
> @g = global i64 sdiv (i64 ptrtoint (i64* getelementptr (i64, i64* @a, i64 1) to i64), i64 3)
> 
> And produce this kind of assembly from it:
> 
> g:
> .quad (a+8)/3
> 
> The code that decides what is accepted in initializers is https://github.com/llvm/llvm-project/blob/aaaeb4b160fe94e0ad3bcd6073eea4807f84a33a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp#L2445 and covers quite a few operations. Did this code just get over-generalized, or is there some reason for the set of operations it supports?

Sort of both, but the underlying problem is that AsmPrinter::lowerConstant was written before anyone really understood the problem (actually, it was written in the MC timeframe, but was refactored from existing logic that didn’t understand it).

Because it predated the MC layer, all of the different things going on with object files were confused, as was the fact that assemblers have constant folding (which is highly irrelevant to the LLVM code generator).

The three things that matter at this level (please correct me if I’m wrong!) are:

1) We need aggregate constants for structs, arrays, etc.
2) We need simple integer/float/vector constants which are bitwise initialized.
3) We need relocatable constants.

For #3, we have a closed set of object file formats we support (ELF, COFF, MachO), so we have a specific set of things to support.
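
To make the three categories concrete, here is a rough C++ sketch of what a closed initializer model could look like. Every name in it is hypothetical; none of these are existing LLVM classes. It also makes two simplifying assumptions: "offset" is a byte offset (one of the two options still under discussion above), and the "(gv1-gv2+offset)" form is treated as MachO-only, following the framing earlier in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <variant>
#include <vector>

// Hypothetical names throughout; this is not an existing LLVM API.

// (2) Integer/float/vector data that is initialized bit for bit.
struct BitwiseConstant {
  std::vector<std::uint8_t> Bytes;
};

// (3) The "globalpointer+offset" pattern.
// Assumption: the offset is a byte offset, not a list of field indexes.
struct SymbolPlusOffset {
  std::string Symbol;
  std::int64_t ByteOffset;
};

// (3) The "(gv1 - gv2 + offset)" pattern.
// Assumption: treated as MachO-only in this sketch.
struct SymbolDifference {
  std::string Minuend;
  std::string Subtrahend;
  std::int64_t ByteOffset;
};

struct AggregateConstant;

// The closed set of things a global initializer may be.
using Initializer =
    std::variant<BitwiseConstant, SymbolPlusOffset, SymbolDifference,
                 std::unique_ptr<AggregateConstant>>;

// (1) Structs and arrays: an ordered list of nested initializers.
struct AggregateConstant {
  std::vector<Initializer> Elements;
};

enum class ObjectFormat { ELF, COFF, MachO };

// Returns true if every leaf of Init is expressible in the given format.
bool isRepresentable(const Initializer &Init, ObjectFormat Format) {
  if (std::holds_alternative<BitwiseConstant>(Init) ||
      std::holds_alternative<SymbolPlusOffset>(Init))
    return true;
  if (std::holds_alternative<SymbolDifference>(Init))
    return Format == ObjectFormat::MachO;
  const auto &Agg = std::get<std::unique_ptr<AggregateConstant>>(Init);
  assert(Agg && "aggregate initializer must not be null");
  for (const Initializer &Element : Agg->Elements)
    if (!isRepresentable(Element, Format))
      return false;
  return true;
}
```

The point of the sketch is that a check like isRepresentable becomes a short walk over a fixed set of cases rather than pattern matching over arbitrary constant expressions. Something like the "(a+8)/3" initializer quoted above simply has no representation in this model.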

-Chris
