[llvm-dev] SROA and volatile memcpy/memset
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Wed Nov 11 06:53:47 PST 2015
----- Original Message -----
> From: "Krzysztof Parzyszek via llvm-dev" <llvm-dev at lists.llvm.org>
> To: llvm-dev at lists.llvm.org
> Sent: Tuesday, November 10, 2015 1:22:57 PM
> Subject: Re: [llvm-dev] SROA and volatile memcpy/memset
>
> On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote:
> > On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via
> > llvm-dev wrote:
> >> I have a customer testcase where SROA splits a volatile memcpy and
> >> we end up
> >> generating bad code[1]. While this looks like a bug, simply
> >> preventing SROA
> >> from splitting volatile memory intrinsics causes basictest.ll for
> >> SROA to
> >> fail. Not only that, but it also seems like handling of volatile
> >> memory
> >> transfers was done with some intent.
> >
> > There is no such thing as a volatile memcpy or memset in standard
> > ISO C,
> > so what exactly are you doing and why do you expect it to work that
> > way?
>
> The motivating example has an aggregate copy where the aggregate is
> volatile, followed by a store to one of its members. (This does not
> have
> anything to do with devices.) SROA expanded this into a series of
> volatile loads and stores, which cannot be coalesced back into fewer
> instructions. This is clearly worse than doing the copy and then the
> member overwrite.
>
> --- test.c ---
> typedef struct {
> volatile unsigned int value;
> } atomic_word_t;
>
> typedef union {
> struct {
> unsigned char state;
> unsigned char priority;
> };
> atomic_word_t atomic;
> unsigned int full;
> } mystruct_t;
>
>
> mystruct_t a;
>
> unsigned int foo(void) {
> mystruct_t x;
> mystruct_t y;
>
> x.full = a.atomic.value;
> y = x;
> y.priority = 7;
>
> return y.full;
> }
> --------------
SROA seems to be doing a number of things here. What about if we prevented SROA from generating multiple slices splitting volatile accesses? There might be a significant difference between that and something like this test (test/Transforms/SROA/basictest.ll):
define i32 @test6() {
; CHECK-LABEL: @test6(
; CHECK: alloca i32
; CHECK-NEXT: store volatile i32
; CHECK-NEXT: load i32, i32*
; CHECK-NEXT: ret i32
entry:
%a = alloca [4 x i8]
%ptr = getelementptr [4 x i8], [4 x i8]* %a, i32 0, i32 0
call void @llvm.memset.p0i8.i32(i8* %ptr, i8 42, i32 4, i32 1, i1 true)
%iptr = bitcast i8* %ptr to i32*
%val = load i32, i32* %iptr
ret i32 %val
}
-Hal
>
>
> -Krzysztof
>
>
> --
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by The Linux Foundation
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-dev
mailing list