[LLVMdev] [Patch][RFC] Change R600 data layout

Jon Pry jonpry at gmail.com
Tue Dec 31 13:55:51 PST 2013


Hi,

     I've prepared patches for both LLVM and Clang to change the
datalayout for R600. This may seem like a bold move, but I think it is
warranted.  R600/SI is a strange architecture in that it uses 64bit
pointers but does not support 64 bit arithmetic except for load/store
operations that roughly map onto getelementptr.

    The current datalayout for r600 includes n32:64, which is odd
because r600 cannot actually do any 64bit arith natively. This causes
particular problems in the optimizer with the following kernel:

__kernel void if_eq(__global unsigned long* out, unsigned  arg0)
{
    int i=0;
    for(i = 0; i < arg0; i++){
            out[i] =  i;
       }
}

Clang decides that instead of adding a sext i32 %i to i64 before
getelementptr, it would be best to just go ahead and promote the
variable i to i64. Which would be all good if i64 was actually a
native type.

By changing the native types to n32 *only*. clang emits better code
for r600, such as

; Function Attrs: nounwind
define void @if_eq(i64 addrspace(1)* nocapture %out, i32 %arg0) #0 {
entry:
  %cmp4 = icmp eq i32 %arg0, 0
  br i1 %cmp4, label %for.end, label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i.05 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
  %conv = sext i32 %i.05 to i64
  %arrayidx = getelementptr inbounds i64 addrspace(1)* %out, i64 %conv
  store i64 %conv, i64 addrspace(1)* %arrayidx, align 8, !tbaa !2
  %inc = add nsw i32 %i.05, 1
  %exitcond = icmp eq i32 %inc, %arg0
  br i1 %exitcond, label %for.end, label %for.body

for.end:                                          ; preds = %for.body, %entry
  ret void
}

Another upside to this is that i64 addition on r600 can even be
enabled as it the lowering code need only be called for arithmetic and
not pointers, for which it actually works. In the future, r600 will
require more patches wrt this issue as the old IR generated by clang
was perfectly valid but crashes llc.

For now, I think this patch is a good solution because it makes better
code and allows me to compile programs much longer than my previous 3
line record :)  Not so much a bandaid as a something else that needed
to be done.


Regards,

Jon Pry
jonpry at gmail.com
-------------- next part --------------
From 0a2572c1f1bd322d1517e15135033be88afc6cd7 Mon Sep 17 00:00:00 2001
From: Jon Pry <jonpry at gmail.com>
Date: Tue, 31 Dec 2013 14:20:21 -0500
Subject: [PATCH] R600 - Fix pointer arithmetic

---
 lib/Target/R600/AMDGPUTargetMachine.cpp |    2 +-
 lib/Target/R600/SIISelLowering.cpp      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/Target/R600/AMDGPUTargetMachine.cpp b/lib/Target/R600/AMDGPUTargetMachine.cpp
index 1279665..0e19439 100644
--- a/lib/Target/R600/AMDGPUTargetMachine.cpp
+++ b/lib/Target/R600/AMDGPUTargetMachine.cpp
@@ -58,7 +58,7 @@ static std::string computeDataLayout(const AMDGPUSubtarget &ST) {
   }
 
   Ret += "-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256"
-         "-v512:512-v1024:1024-v2048:2048-n32:64";
+         "-v512:512-v1024:1024-v2048:2048-n32";
 
   return Ret;
 }
diff --git a/lib/Target/R600/SIISelLowering.cpp b/lib/Target/R600/SIISelLowering.cpp
index 4fb8444..add3dda 100644
--- a/lib/Target/R600/SIISelLowering.cpp
+++ b/lib/Target/R600/SIISelLowering.cpp
@@ -76,7 +76,7 @@ SITargetLowering::SITargetLowering(TargetMachine &TM) :
   setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16i32, Expand);
   setOperationAction(ISD::VECTOR_SHUFFLE, MVT::v16f32, Expand);
 
-  setOperationAction(ISD::ADD, MVT::i64, Legal);
+  setOperationAction(ISD::ADD, MVT::i64, Custom);
   setOperationAction(ISD::ADD, MVT::i32, Legal);
   setOperationAction(ISD::ADDC, MVT::i32, Legal);
   setOperationAction(ISD::ADDE, MVT::i32, Legal);
-- 
1.7.10.4

-------------- next part --------------
From a9ebe3817733d64547eee399d75b16421681b1af Mon Sep 17 00:00:00 2001
From: Jon Pry <jonpry at gmail.com>
Date: Tue, 31 Dec 2013 14:20:40 -0500
Subject: [PATCH] R600 - Change datalayout

---
 lib/Basic/Targets.cpp      |    6 +++---
 test/CodeGen/target-data.c |    6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/Basic/Targets.cpp b/lib/Basic/Targets.cpp
index d698090..8c762e2 100644
--- a/lib/Basic/Targets.cpp
+++ b/lib/Basic/Targets.cpp
@@ -1419,7 +1419,7 @@ static const char *DescriptionStringR600 =
   "-i64:64"
   "-v16:16-v24:32-v32:32-v48:64-v96:128"
   "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
-  "-n32:64";
+  "-n32";
 
 static const char *DescriptionStringR600DoubleOps =
   "e"
@@ -1427,7 +1427,7 @@ static const char *DescriptionStringR600DoubleOps =
   "-i64:64"
   "-v16:16-v24:32-v32:32-v48:64-v96:128"
   "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
-  "-n32:64";
+  "-n32";
 
 static const char *DescriptionStringSI =
   "e"
@@ -1435,7 +1435,7 @@ static const char *DescriptionStringSI =
   "-i64:64"
   "-v16:16-v24:32-v32:32-v48:64-v96:128"
   "-v192:256-v256:256-v512:512-v1024:1024-v2048:2048"
-  "-n32:64";
+  "-n32";
 
 class R600TargetInfo : public TargetInfo {
   /// \brief The GPU profiles supported by the R600 target.
diff --git a/test/CodeGen/target-data.c b/test/CodeGen/target-data.c
index a4db2e0..9adde60 100644
--- a/test/CodeGen/target-data.c
+++ b/test/CodeGen/target-data.c
@@ -108,15 +108,15 @@
 
 // RUN: %clang_cc1 -triple r600-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=R600
-// R600: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
+// R600: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
 
 // RUN: %clang_cc1 -triple r600-unknown -target-cpu cayman -o - -emit-llvm %s \
 // RUN: | FileCheck %s -check-prefix=R600D
-// R600D: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
+// R600D: target datalayout = "e-p:32:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
 
 // RUN: %clang_cc1 -triple r600-unknown -target-cpu hawaii -o - -emit-llvm %s \
 // RUN: | FileCheck %s -check-prefix=R600SI
-// R600SI: target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:32:32-p5:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64"
+// R600SI: target datalayout = "e-p:32:32-p1:64:64-p2:64:64-p3:32:32-p4:32:32-p5:64:64-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32"
 
 // RUN: %clang_cc1 -triple aarch64-unknown -o - -emit-llvm %s | \
 // RUN: FileCheck %s -check-prefix=AARCH64
-- 
1.7.10.4



More information about the llvm-dev mailing list