<div dir="ltr"><div class="gmail_extra">Hal rightly pointed out (after some denseness on my part) that I had completely misread these two paragraphs. Sorry about that. Got distracted by understanding your example, and missed that you had already done the same analysis I had and even come to the exact same conclusion.</div>
<div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 22, 2014 at 4:42 PM, Louis Gerbarg <span dir="ltr"><<a href="mailto:lgg@apple.com" target="_blank">lgg@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div id=":6oj" class="a3s" style="overflow:hidden">Well, there are a couple of pragmatic solutions that I could apply. I could handle it as a peephole on x86, or do some bit casts in the IR. Before I go down that path I wanted to ask a more interesting question: Are we missing other optimization opportunities because we are using structs for homogenous data that could be expressed in terms of arrays? If we were to write an early pass that effectively rewrote structs into arrays would we see any improvements (besides the one I mentioned in this email). There are a couple of different ways of doing it:<br>
<br>
{i32, i32} has the same layout as [2 x i32] or {[2 x i32]}. My gut feeling is that [2 x i32] is the most optimizable, but that {[2 x i32]} probably is almost and good and potentially eliminates certain edge cases. How about heterogenous structs with homogenous runs: Does rewriting {i32, i32, i16} as {[2 x i32], i16} allow for better optimizations?</div>
</blockquote></div><br></div><div class="gmail_extra">I think that *if* this makes sense, it makes sense to phrase the transform as folding a run of the same type in a struct into an array of that type. That is, we should never remove a layer of {}s from a type because that might have alignment implications or other implications I've not thought of. But within a {}, I think it makes a *lot* of sense to have a single, canonical way to represent a sequence of N types, and [N x <type>] seems like the obvious representation, much more so than repeating the type N times.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">I think at least an early pass makes a lot of sense here. I wonder whether it makes sense to do something even more radical such as having the struct type for '{i32, i32, i32}' *be* the same type as '{[N x i32]}' for all N != 1. I think it would be good to do structural canonicalization of this form quite early on the structural LLVM types, but I haven't even begun to think about how much code that breaks.</div>
<div class="gmail_extra"><br></div><div class="gmail_extra">I would suggest implementing your GEP optimization conservatively and correctly, committing it, and then resuming this thread with some numbers regarding what performance improvements can be had by doing this kind of structural canonicalization?</div>
</div>