[PATCH] fix for PR16393: miscompile with struct byval

Mon Jul 8 15:24:07 PDT 2013

On Jul 8, 2013, at 2:06 PM, Jim Grosbach <grosbach at apple.com> wrote:

> 
> On Jul 8, 2013, at 1:47 PM, Jakob Stoklund Olesen <jolesen at apple.com> wrote:
> 
>> By complexity, do you mean compile time or amount of PEI code required?
> 
> The latter. I think compile time will be fine either way. My simplistic view of PEI is that it just iterates over the instructions in a function, in layout order, replaces frame indices along the way and keeps track of some interesting facts while doing so like which CSRs are clobbered. Anything that moves the algorithm further afield from that makes me nervous.
> 
>> I am not really concerned about compile time. We’re already (at least) linear in the number of CFG edges, and this won’t add much to that term.
> 
> I thought PEI was constant on the CFG and linear on the number of basic blocks unless shrink-wrapping was going on. I confess I haven’t looked at the guts of the algorithm in a long time, though, so I could easily be wrong.

You’re not wrong. I meant the algorithmic complexity of the whole backend is (super-) linear in the CFG edges. PEI is indeed constant.

The difference between O(#blocks) and O(#edges) only shows up when compiling functions with computed-goto, as you mentioned.

> The scavenger is the nasty bit, when it fires, but has the hard-coded cap on how far to scan to prevent things from going too badly. Anyways, I don’t see this as a significant concern for the problem at hand.
> 
>> As for code complexity, this will certainly make PEI more complex, but I think it is comparable to all the places that would need to learn how to maintain the SET_SP_ADJ invariants. I’d prefer to have the mess all in one place.
> 
> Entirely possible. My first thought is that there wouldn’t need to be many modifications elsewhere and the pseudo would just flow on through w/o getting mucked with. That may be overly optimistic, though.

There wouldn’t be too many cases, but it makes me nervous ;-)

I’d compare the SET_SP_ADJ instruction to the physical register live-in lists we keep on basic blocks. When you split a basic block, you need to compute which physical registers are live at the split point, so the live-in lists can be populated correctly. Similarly for SET_SP_ADJ, you would need to check if your split point is bracketed by STACKDOWN/STACKUP instructions.

We currently get this wrong when expanding X86 CMOV pseudo-instructions, this was the cause of the register allocator crash that I recently fixed by shrinking the physreg live ranges in landing pads.

So this comes down to “who gets the monkey?” We can add complexity to PEI, or we can add complexity to the MI intermediate representation. I prefer to simplify MI to have less rules for mutators to follow.

[I would love to ban physical register live ranges while MI is in SSA form, it would avoid the block splitting problem.]

/jakob