[LLVMdev] Multi-dimensional array accesses in LLVM-IR | Thoughts

Fri Sep 21 16:38:56 PDT 2012

Hal Finkel <hfinkel at anl.gov> writes:

> On Wed, 12 Sep 2012 15:56:25 +0100
> Renato Golin <rengolin at systemcall.org> wrote:
>
>> On 12 September 2012 09:38, Tobias Grosser <tobias at grosser.es> wrote:
>> > I personally would first have a look at approach '2'.
>>
>> While I normally argue to leave the IR as it is (since it's a compiler
>> IR, not a magical one), I can see some trends going on that should not
>> be ignored.
>>
>> This is one example, where the front-end bends its knees to generate
>> IR that LLVM understands, so that you can revert it to what should
>> have been in the first place (to analyse parallelism) and convert it
>> back again to "correct" IR. As you mentioned, there will be cases that
>> the analysis won't work, ex. when receiving an array from a function,
>> it could look like an opaque pointer in some architectures.
>
> I agree that this seems suboptimal: information known to the frontend
> is lost, and then must be guessed by the backend. This might also be a
> case where metadata is helpful.
>
> I also think that it is important to keep in mind that this particular
> case is one in which guessing is important. This is because there is a
> lot of existing C/C++ code which does manual indexing of
> multidimensional arrays, and we should try to support iteration-space
> transformations involving those arrays as well.
>
>>
>> Last month, in the Cambridge LLVM Social, David Chisnall asked me
>> about a builder that would validate procedure call standards depending
>> on the target, so that the front-end could use that to build the
>> horrible messes we end up doing in that scenario. Another idea is to
>> create a pass that will convert from high-level function declaration
>> to low-level target-dependent declaration during the validation phase,
>> so the front-end produces "good-looking" calls (with types as-is) and
>> the pass makes them target-valid.
>>
>> This technique could also apply to your case. If the front-end
>> supports multi-dimensional access and produces code conforming to
>> that,
>
> Do you mean that there is some kind of canonical form that all of the
> frontends will use, and that LLVM provides some kind of utility for
> generating this canonical form?
>
>  -Hal
>
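
For concreteness, the manual indexing Hal mentions can be sketched in C
roughly as below. This is only an illustration: the function names
(get_flat, get_2d, check_accessors_agree) are made up for this example
and do not come from the thread or from LLVM. The first accessor hides
the 2-D shape inside the subscript i*m + j, which is what an analysis
would have to guess back; the second states the dimensions in the type
using a C99 variable-length array parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Manual indexing: the data is logically n x m, but the compiler only
   sees a one-dimensional access with the subscript i*m + j. */
static double get_flat(const double *a, size_t m, size_t i, size_t j) {
    return a[i * m + j];
}

/* The same access with the dimensions stated in the parameter type
   (C99 variable-length array parameter). */
static double get_2d(size_t n, size_t m, double a[n][m],
                     size_t i, size_t j) {
    return a[i][j];
}

/* Fill a small grid so that element (i, j) holds 10*i + j, then check
   that both accessors read the same value for every element. */
static void check_accessors_agree(void) {
    enum { N = 3, M = 4 };
    double grid[N][M];
    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < M; ++j)
            grid[i][j] = 10.0 * (double)i + (double)j;

    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < M; ++j)
            assert(get_flat(&grid[0][0], M, i, j) ==
                   get_2d(N, M, grid, i, j));
}
```

Both functions read the same memory location; the difference is only
in how much of the array shape is visible in the source.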

This all looks like a common problem with a single IR; it will happen in
every compiler that uses a single centralised IR. It's usually easier to
describe a problem in terms of a higher-level representation, possibly
one tailored to the domain or to a specific language. Metadata in the IR
works around many of these problems, but not all of them.

Some compilers do employ a strategy of gradual rewrites through several
IRs, but not all of them. LLVM IR hits the sweet spot of being flexible
enough and high-level enough for most cases, but this discussion shows
that we want to grow it either up towards the mid-end or down towards
the machine.

There are an awful lot of different classical IRs: LLVM SSA IR, CPS, C--,
various RTLs, etc. Domain-specific optimisations can be performed on one
of them, and then that IR can be lowered into something else. In the end,
the resulting RTL should be only abstract enough to decouple from the
machine, yet still contain enough information for efficient instruction
selection.

Sounds wonderful, but only in theory!

two cents,

--
Wojciech Meyer
http://danmey.org