[LLVMdev] extractelement causes memory access violation - what to do?

Wed Jul 1 16:48:54 PDT 2015

----- Original Message -----
> From: "Pete Cooper" <peter_cooper at apple.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVMdev" <llvmdev at cs.uiuc.edu>, "Paweł Bylica" <chfast at gmail.com>
> Sent: Wednesday, July 1, 2015 6:42:41 PM
> Subject: Re: [LLVMdev] extractelement causes memory access violation - what	to do?
> 
> 
> > On Jul 1, 2015, at 3:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> > 
> > ----- Original Message -----
> >> From: "Pete Cooper" <peter_cooper at apple.com>
> >> To: "Paweł Bylica" <chfast at gmail.com>
> >> Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVMdev"
> >> <llvmdev at cs.uiuc.edu>
> >> Sent: Wednesday, July 1, 2015 12:08:37 PM
> >> Subject: Re: [LLVMdev] extractelement causes memory access
> >> violation - what	to do?
> >> 
> >> Sorry for chiming in so late in this.
> >> 
> >> So I agree that negative indices are UB, I don’t think thats
> >> contentious.
> >> 
> >> However, I think the issue here is the DAG expansion. That is the
> >> point at which we go from UB which would just be hidden in the
> >> instruction to something which can crash. I think its at that
> >> point
> >> where we should mask to ensure that the in memory expansion
> >> doesn’t
> >> read out of bounds. On architectures which do the variable extract
> >> in an instruction, they won’t be penalized by a mask,
> > 
> > Why do you feel that they won't be penalized by the mask? Or are
> > you assuming will adjust the patterns to match the index plus the
> > mask?
> Ah, should have explained better.  What I meant was that if i can do
> the variable extract in a register without going to memory at all
> (so have suitable legal instructions to do so), then we won’t
> generate a mask at all.  The mask would only be generated when the
> legalizer moves the data to memory which we don’t do if its legal.

Ah, alright.

> > 
> >> only the
> >> memory expansion will be which should be rarer,
> > 
> > On some architectures expansion in memory is not particularly
> > expensive because they have very good store-to-load forwarding.
> > Adding additional masking instructions into the critical path of
> > correct code will not be a welcome change.
> Thats true, so i guess it depends how many architectures need to do
> variable extracts in memory.  I have no idea if any architectures we
> support are able to do a variable extract in a register, or if all
> use memory.

At least on PowerPC, when using QPX, we can do this using instructions.

>  If most use a register, then penalizing the few who do
> use memory by inserting a mask seems reasonable.
> > 
> >> 
> >> The point about speculation at the IR level is interesting.
> >> Personally i’m ok with constant indices being speculated and
> >> variable not. If we later want to find a good way to ask TTI
> >> whether
> >> a variable extract is cheap then we can find a way to do so.
> > 
> > It is not about expense, it is about not introducing UB when
> > speculating the instruction.
> Yeah, I see what you mean here.  So the user could have written
> 
> if (i >= 0) x = extract v[i]
> 
> but if we speculate then we aren’t guarded and have UB.  Having the
> backend insert the mask would fix this, but I agree that someone,
> somewhere needs to put in the mask if we want to allow speculation
> here, and the target can’t do the variable extract in a register.

I'd rather the frontend do this if the language wants it. We can use ComputeKnownBits when we do the speculation check, and so if there is a mask, we'll be fine.

> 
> Pete
> > 
> > -Hal
> > 
> >> 
> >> Anyway, just my 2c.
> >> 
> >> 
> >> Cheers,
> >> Pete
> >> 
> >> 
> >> 
> >> 
> >> On Jun 30, 2015, at 2:07 PM, Paweł Bylica < chfast at gmail.com >
> >> wrote:
> >> 
> >> 
> >> 
> >> 
> >> On Tue, Jun 30, 2015 at 11:03 PM Hal Finkel < hfinkel at anl.gov >
> >> wrote:
> >> 
> >> 
> >> ----- Original Message -----
> >>> From: "Paweł Bylica" < chfast at gmail.com >
> >>> To: "David Majnemer" < david.majnemer at gmail.com >
> >>> Cc: "LLVMdev" < llvmdev at cs.uiuc.edu >
> >>> Sent: Tuesday, June 30, 2015 5:42:24 AM
> >>> Subject: Re: [LLVMdev] extractelement causes memory access
> >>> violation - what to do?
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> On Fri, Jun 26, 2015 at 5:42 PM David Majnemer <
> >>> david.majnemer at gmail.com > wrote:
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> On Fri, Jun 26, 2015 at 7:00 AM, Paweł Bylica < chfast at gmail.com
> >>> >
> >>> wrote:
> >>> 
> >>> 
> >>> 
> >>> Hi,
> >>> 
> >>> 
> >>> Let's have a simple program:
> >>> 
> >>> define i32 @main(i32 %n, i64 %idx) {
> >>> %idxSafe = trunc i64 %idx to i5
> >>> %r = extractelement <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>,
> >>> i64
> >>> %idx
> >>> ret i32 %r
> >>> }
> >>> 
> >>> 
> >>> The assembly of that would be:
> >>> 
> >>> pcmpeqd %xmm0, %xmm0
> >>> movdqa %xmm0, -24(%rsp)
> >>> movl -24(%rsp,%rsi,4), %eax
> >>> retq
> >>> 
> >>> 
> >>> The language reference states that the extractelement instruction
> >>> produces undefined value in case the index argument is invalid
> >>> (our
> >>> case). But the implementation simply dumps the vector to the
> >>> stack
> >>> memory, calculates the memory offset out of the index value and
> >>> tries to access the memory. That causes the crash.
> >>> 
> >>> 
> >>> The workaround is to trunc the index value before extractelement
> >>> (see
> >>> %idxSafe). But what should be the ultimate solution?
> >>> 
> >>> 
> >>> 
> >>> 
> >>> 
> >>> We could fix this by specifying that out of bounds access on an
> >>> extractelement leads to full-on undefined behavior, no need to
> >>> force
> >>> everyone to eat the cost of a mask.
> >>> 
> >>> 
> >>> I don't have preference for any of the solutions.
> >>> 
> >>> 
> >>> I have a side question. It is not stated explicitly in the
> >>> reference
> >>> but I would assume the index of extractelement is processed as an
> >>> unsigned value. However, the DAG Builder extends the index with
> >>> sext. Is it correct?
> >> 
> >> Hrmm. Given that only (small) positive numbers are valid, it
> >> shouldn't matter. Unless we can find a reason that it works better
> >> to be sext, it seems conceptually cleaner to make it zext.
> >> 
> >> 
> >> 
> >> I have tried to change it to zext. 2 Mips test have failed. I
> >> haven't
> >> checked the details though.
> >> sext looks pretty wrong to me because i5 -1 does not mean 31 any
> >> more.
> >> 
> >> 
> >> - PB
> >> 
> >> 
> >> 
> >> -Hal
> >> 
> >>> 
> >>> 
> >>> - PB
> >>> _______________________________________________
> >>> LLVM Developers mailing list
> >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>> 
> >> 
> >> --
> >> Hal Finkel
> >> Assistant Computational Scientist
> >> Leadership Computing Facility
> >> Argonne National Laboratory
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> 
> >> 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory