[LLVMdev] RFC: Indexing of structs vs arrays in getelementpointer

Thu May 22 21:46:12 PDT 2014

Hal rightly pointed out (after some denseness on my part) that I had
completely misread these two paragraphs. Sorry about that. Got distracted
by understanding your example, and missed that you had already done the
same analysis I had and even come to the exact same conclusion.

On Thu, May 22, 2014 at 4:42 PM, Louis Gerbarg <lgg at apple.com> wrote:

> Well, there are a couple of pragmatic solutions that I could apply. I
> could handle it as a peephole on x86, or do some bit casts in the IR.
> Before I go down that path I wanted to ask a more interesting question: Are
> we missing other optimization opportunities because we are using structs
> for homogenous data that could be expressed in terms of arrays? If we were
> to write an early pass that effectively rewrote structs into arrays would
> we see any improvements (besides the one I mentioned in this email). There
> are a couple of different ways of doing it:
>
> {i32, i32} has the same layout as [2 x i32] or {[2 x i32]}. My gut feeling
> is that [2 x i32] is the most optimizable, but that {[2 x i32]} probably is
> almost and good and potentially eliminates certain edge cases. How about
> heterogenous structs with homogenous runs: Does rewriting {i32, i32, i16}
> as {[2 x i32], i16} allow for better optimizations?
>

I think that *if* this makes sense, it makes sense to phrase the transform
as folding a run of the same type in a struct into an array of that type.
That is, we should never remove a layer of {}s from a type because that
might have alignment implications or other implications I've not thought
of. But within a {}, I think it makes a *lot* of sense to have a single,
canonical way to represent a sequence of N types, and [N x <type>] seems
like the obvious representation, much more so than repeating the type N
times.

I think at least an early pass makes a lot of sense here. I wonder whether
it makes sense to do something even more radical such as having the struct
type for '{i32, i32, i32}' *be* the same type as '{[N x i32]}' for all N !=
1. I think it would be good to do structural canonicalization of this form
quite early on the structural LLVM types, but I haven't even begun to think
about how much code that breaks.

I would suggest implementing your GEP optimization conservatively and
correctly, committing it, and then resuming this thread with some numbers
regarding what performance improvements can be had by doing this kind of
structural canonicalization?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140522/c1c3f712/attachment.html>