[LLVMdev] [Patch][RFC] Change R600 data layout
Tom Stellard
tom at stellard.net
Tue Dec 31 20:10:29 PST 2013
On Tue, Dec 31, 2013 at 04:55:51PM -0500, Jon Pry wrote:
> Hi,
>
> I've prepared patches for both LLVM and Clang to change the
> datalayout for R600. This may seem like a bold move, but I think it is
> warranted. R600/SI is a strange architecture in that it uses 64bit
> pointers but does not support 64 bit arithmetic except for load/store
> operations that roughly map onto getelementptr.
>
> The current datalayout for r600 includes n32:64, which is odd
> because r600 cannot actually do any 64bit arith natively. This causes
> particular problems in the optimizer with the following kernel:
>
> __kernel void if_eq(__global unsigned long* out, unsigned arg0)
> {
> int i=0;
> for(i = 0; i < arg0; i++){
> out[i] = i;
> }
> }
>
> Clang decides that instead of adding a sext i32 %i to i64 before
> getelementptr, it would be best to just go ahead and promote the
> variable i to i64. Which would be all good if i64 was actually a
> native type.
>
> By changing the native types to n32 *only*. clang emits better code
> for r600, such as
>
> ; Function Attrs: nounwind
> define void @if_eq(i64 addrspace(1)* nocapture %out, i32 %arg0) #0 {
> entry:
> %cmp4 = icmp eq i32 %arg0, 0
> br i1 %cmp4, label %for.end, label %for.body
>
> for.body: ; preds = %for.body, %entry
> %i.05 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
> %conv = sext i32 %i.05 to i64
> %arrayidx = getelementptr inbounds i64 addrspace(1)* %out, i64 %conv
> store i64 %conv, i64 addrspace(1)* %arrayidx, align 8, !tbaa !2
> %inc = add nsw i32 %i.05, 1
> %exitcond = icmp eq i32 %inc, %arg0
> br i1 %exitcond, label %for.end, label %for.body
>
> for.end: ; preds = %for.body, %entry
> ret void
> }
>
> Another upside to this is that i64 addition on r600 can even be
> enabled as it the lowering code need only be called for arithmetic and
> not pointers, for which it actually works. In the future, r600 will
> require more patches wrt this issue as the old IR generated by clang
> was perfectly valid but crashes llc.
>
> For now, I think this patch is a good solution because it makes better
> code and allows me to compile programs much longer than my previous 3
> line record :) Not so much a bandaid as a something else that needed
> to be done.
>
I'm assuming you are using a Southern Islands GPU. Is this correct?
What errors are you seeing without this patch and with which tests?
SI supports 64-bit operations natively, so I don't think we should be
removing n64 from the DataLayout. I think you could achieve the same
results by adding a target-specific DAG combine that recognizes
some form of the (i64 add (i64 sext a), (i64 1)) pattern and replaces it
with a 32-bit add. I think there used to be a generic LLVM IR
optimization that did something like this (I'm not sure what it was
called, maybe "value range propagation"), so reviving this would be
another option.
It appears you are using tests from my opencl demos repo. If you are
interested, there are more more mature and comprehensive tests in piglit:
http://piglit.freedesktop.org/ take a look at the tests/cl directory.
-Tom
>
> Regards,
>
> Jon Pry
> jonpry at gmail.com
> From 0a2572c1f1bd322d1517e15135033be88afc6cd7 Mon Sep 17 00:00:00 2001
> From: Jon Pry <jonpry at gmail.com>
> Date: Tue, 31 Dec 2013 14:20:21 -0500
> Subject: [PATCH] R600 - Fix pointer arithmetic
>
> ---
> lib/Target/R600/AMDGPUTargetMachine.cpp | 2 +-
> lib/Target/R600/SIISelLowering.cpp | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp b/lib/Target/R600/AMDGPUTargetMachine.cpp
> index 1279665..0e19439 100644
> --- a/lib/Target/R600/AMDGPUTargetMachine.cpp
> +++ b/lib/Target/R600/AMDGPUTargetMachine.cpp
> @@ -58,7 +58,7 @@ static std::string computeDataLayout(const AMDGPUSubtarget &ST) {
> }
>
> Ret += "-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256"
> - "-v512:512-v1024:1024-v2048:2048-n32:64";
> + "-v512:512-v1024:1024-v2048:2048-n32";
>
> return Ret;
> }
> diff --git a/lib/Target/R600/SIISelLowering.cpp b/lib/Target/R600/SIISelLowering.cpp
> index 4fb8444..add3dda 100644
> --- a/lib/Target/R600/SIISelLowering.cpp
> +++ b/lib/Target/R600/SIISelLowering.cpp
> @@ -76,7 +76,7 @@ SITargetLowering::SITargetLowering(TargetMachine &TM) :
> setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16i32, Expand);
> setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16f32, Expand);
>
> - setOperationAction(ISD::ADD, MVT::i64, Legal);
> + setOperationAction(ISD::ADD, MVT::i64, Custom);
> setOperationAction(ISD::ADD, MVT::i32, Legal);
> setOperationAction(ISD::ADDC, MVT::i32, Legal);
> setOperationAction(ISD::ADDE, MVT::i32, Legal);
> --
> 1.7.10.4
>
> From a9ebe3817733d64547eee399d75b16421681b1af Mon Sep 17 00:00:00 2001
> From: Jon Pry <jonpry at gmail.com>
> Date: Tue, 31 Dec 2013 14:20:40 -0500
> Subject: [PATCH] R600 - Change datalayout
>
> ---
> lib/Basic/Targets.cpp | 6 +++---
> test/CodeGen/target-data.c | 6 +++---
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/lib/Basic/Targets.cpp b/lib/Basic/Targets.cpp
> index d698090..8c762e2 100644
> --- a/lib/Basic/Targets.cpp
> +++ b/lib/Basic/Targets.cpp
> @@ -1419,7 +1419,7 @@ static const char *DescriptionStringR600 =
> "-i64:64"
> "-v16:16-v24:32-v32:32-v48:64-v96:128"
> "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
> - "-n32:64";
> + "-n32";
>
> static const char *DescriptionStringR600DoubleOps =
> "e"
> @@ -1427,7 +1427,7 @@ static const char *DescriptionStringR600DoubleOps =
> "-i64:64"
> "-v16:16-v24:32-v32:32-v48:64-v96:128"
> "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
> - "-n32:64";
> + "-n32";
>
> static const char *DescriptionStringSI =
> "e"
> @@ -1435,7 +1435,7 @@ static const char *DescriptionStringSI =
> "-i64:64"
> "-v16:16-v24:32-v32:32-v48:64-v96:128"
> "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
> - "-n32:64";
> + "-n32";
>
> class R600TargetInfo : public TargetInfo {
> /// \brief The GPU profiles supported by the R600 target.
> diff --git a/test/CodeGen/target-data.c b/test/CodeGen/target-data.c
> index a4db2e0..9adde60 100644
> --- a/test/CodeGen/target-data.c
> +++ b/test/CodeGen/target-data.c
> @@ -108,15 +108,15 @@
>
> // RUN: %clang_cc1 -triple r600-unknown -o - -emit-llvm %s | \
> // RUN: FileCheck %s -check-prefix=R600
> -// R600: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
> +// R600: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
>
> // RUN: %clang_cc1 -triple r600-unknown -target-cpu cayman -o - -emit-llvm %s \
> // RUN: | FileCheck %s -check-prefix=R600D
> -// R600D: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
> +// R600D: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
>
> // RUN: %clang_cc1 -triple r600-unknown -target-cpu hawaii -o - -emit-llvm %s \
> // RUN: | FileCheck %s -check-prefix=R600SI
> -// R600SI: target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:32:32-p5:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
> +// R600SI: target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:32:32-p5:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
>
> // RUN: %clang_cc1 -triple aarch64-unknown -o - -emit-llvm %s | \
> // RUN: FileCheck %s -check-prefix=AARCH64
> --
> 1.7.10.4
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
More information about the llvm-dev
mailing list