[llvm] r211939 - [NVPTX] Fix handling of ldg/ldu intrinsics.

Jingyue Wu jingyue at google.com
Mon Aug 25 09:30:58 PDT 2014


I agree. That would cut off the issue of metadata being removed by
optimizations.

Does the current NVPTX backend use or check the alignment at all? I tried,
for example, using llvm.nvvm.ldg.global.f.v4f32.p0v4f32 with alignment 4,
and the emitted PTX still uses ld.global.nc.v4.f32 even though the
alignment is not 4 * sizeof(float).
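
A minimal sketch of the kind of check in question, written as standalone C++ rather than actual NVPTX backend code. The helper names are hypothetical, and the rule that a vector access needs alignment equal to its full size (4 * sizeof(float) for v4f32) is the assumption being illustrated:

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): true when `alignBytes` is enough to
// load `numElts` elements of `eltBytes` each as a single vector access.
// Assumption: the access must be aligned to its total size in bytes.
bool isAlignedForVectorLoad(uint64_t alignBytes, uint64_t eltBytes,
                            unsigned numElts) {
  uint64_t required = eltBytes * numElts;
  return alignBytes != 0 && alignBytes % required == 0;
}

// Hypothetical helper: the widest vector width (numElts, then halved each
// step) that the given alignment permits; falls back to 1 (scalar loads).
unsigned widestLegalVectorWidth(uint64_t alignBytes, uint64_t eltBytes,
                                unsigned numElts) {
  for (unsigned w = numElts; w > 1; w /= 2)
    if (isAlignedForVectorLoad(alignBytes, eltBytes, w))
      return w;
  return 1;
}
```

Under that assumption, widestLegalVectorWidth(4, sizeof(float), 4) yields 1, so a v4f32 load with alignment 4 would be split into scalar loads rather than emitted as one v4 access.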

Jingyue

On Fri Aug 22 2014 at 8:54:47 AM Justin Holewinski <jholewinski at nvidia.com>
wrote:

>  Hi Jingyue,
>
>
>  I agree that this is fragile and needs to be reworked.  "Officially",
> the ldu/ldg intrinsics are not meant to be used by front-ends (they do not
> exist in the NVVM IR spec).  In libnvvm, they are generated automatically
> through IR transformations just before PTX code generation (and after all
> of the 'opt'-level transformations).  However, for upstream NVPTX, it makes
> a lot of sense to expose these intrinsics so users can generate them,
> especially since the ldg/ldu transformation passes are not upstream.  I
> believe the proper way to handle these would be to make the alignment an
> explicit argument, similar to llvm.memset/llvm.memcpy/llvm.memmove.  Do
> you agree?
>
>
>
>  ------------------------------
> *From:* Jingyue Wu <jingyue at google.com>
> *Sent:* Thursday, August 21, 2014 6:59 PM
> *To:* Justin Holewinski
> *Cc:* llvm-commits at cs.uiuc.edu; Eli Bendersky; Mark Heffernan; Cong Hou
> *Subject:* Re: [llvm] r211939 - [NVPTX] Fix handling of ldg/ldu
> intrinsics.
>
>  Hi Justin,
>
> The assertion on NVPTXISelLowering.cpp:3305
>
>  assert(I.hasMetadataOtherThanDebugLoc() && "Must have alignment metadata");
>
>  looks a little fragile. !align is extended metadata, and optimization
> passes (e.g., SimplifyCFG) are not guaranteed to preserve it.
>
>  I attached a reduced test case (hoist.ll) to demonstrate this issue. If
> you run "opt -simplifycfg hoist.ll -S", you will see the !align metadata is
> removed by HoistThenElseCodeToIf. Running llc on the simplified output
> then hits the assertion.
>
>  One way to fix this issue is of course to have HoistThenElseCodeToIf
> preserve the align metadata. But before I do that, I am curious whether we
> should assume the align metadata always exists in the first place. Can you
> shed some light on this, Justin?
>
>  Thanks much,
> Jingyue
>
> On Fri Jun 27 2014 at 12:07:55 PM Justin Holewinski <
> jholewinski at nvidia.com> wrote:
>
>> Author: jholewinski
>> Date: Fri Jun 27 13:35:51 2014
>> New Revision: 211939
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=211939&view=rev
>> Log:
>> [NVPTX] Fix handling of ldg/ldu intrinsics.
>>
>> The address space of the pointer must be global (1) for these
>> intrinsics.  There must also be alignment metadata attached to the
>> intrinsic calls, e.g.
>>
>> %val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0
>>
>> !0 = metadata !{i32 4}
>>
>> Added:
>>     llvm/trunk/test/CodeGen/NVPTX/ldu-ldg.ll
>> Modified:
>>     llvm/trunk/include/llvm/IR/IntrinsicsNVVM.td
>>     llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
>>     llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h
>>     llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp
>>     llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td
>>     llvm/trunk/test/CodeGen/NVPTX/ldu-i8.ll
>>     llvm/trunk/test/CodeGen/NVPTX/ldu-reg-plus-offset.ll
>>
>> Modified: llvm/trunk/include/llvm/IR/IntrinsicsNVVM.td
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/IntrinsicsNVVM.td?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/include/llvm/IR/IntrinsicsNVVM.td (original)
>> +++ llvm/trunk/include/llvm/IR/IntrinsicsNVVM.td Fri Jun 27 13:35:51 2014
>> @@ -796,26 +796,25 @@ def llvm_anyi64ptr_ty     : LLVMAnyPoint
>>
>>
>>  // Generated within nvvm. Use for ldu on sm_20 or later
>> -// @TODO: Revisit this, Changed LLVMAnyPointerType to LLVMPointerType
>>  def int_nvvm_ldu_global_i : Intrinsic<[llvm_anyint_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldu.global.i">;
>>  def int_nvvm_ldu_global_f : Intrinsic<[llvm_anyfloat_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldu.global.f">;
>>  def int_nvvm_ldu_global_p : Intrinsic<[llvm_anyptr_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldu.global.p">;
>>
>>  // Generated within nvvm. Use for ldg on sm_35 or later
>>  def int_nvvm_ldg_global_i : Intrinsic<[llvm_anyint_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldg.global.i">;
>>  def int_nvvm_ldg_global_f : Intrinsic<[llvm_anyfloat_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldg.global.f">;
>>  def int_nvvm_ldg_global_p : Intrinsic<[llvm_anyptr_ty],
>> -  [LLVMPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>> +  [LLVMAnyPointerType<LLVMMatchType<0>>], [IntrReadMem, NoCapture<0>],
>>    "llvm.nvvm.ldg.global.p">;
>>
>>  // Use for generic pointers
>>
>> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp (original)
>> +++ llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp Fri Jun 27 13:35:51 2014
>> @@ -141,7 +141,7 @@ SDNode *NVPTXDAGToDAGISel::Select(SDNode
>>    case NVPTXISD::LDGV4:
>>    case NVPTXISD::LDUV2:
>>    case NVPTXISD::LDUV4:
>> -    ResNode = SelectLDGLDUVector(N);
>> +    ResNode = SelectLDGLDU(N);
>>      break;
>>    case NVPTXISD::StoreV2:
>>    case NVPTXISD::StoreV4:
>> @@ -167,6 +167,9 @@ SDNode *NVPTXDAGToDAGISel::Select(SDNode
>>    case ISD::INTRINSIC_WO_CHAIN:
>>      ResNode = SelectIntrinsicNoChain(N);
>>      break;
>> +  case ISD::INTRINSIC_W_CHAIN:
>> +    ResNode = SelectIntrinsicChain(N);
>> +    break;
>>    case NVPTXISD::Tex1DFloatI32:
>>    case NVPTXISD::Tex1DFloatFloat:
>>    case NVPTXISD::Tex1DFloatFloatLevel:
>> @@ -273,6 +276,21 @@ SDNode *NVPTXDAGToDAGISel::Select(SDNode
>>    return SelectCode(N);
>>  }
>>
>> +SDNode *NVPTXDAGToDAGISel::SelectIntrinsicChain(SDNode *N) {
>> +  unsigned IID = cast<ConstantSDNode>(N->getOperand(1))->getZExtValue();
>> +  switch (IID) {
>> +  default:
>> +    return NULL;
>> +  case Intrinsic::nvvm_ldg_global_f:
>> +  case Intrinsic::nvvm_ldg_global_i:
>> +  case Intrinsic::nvvm_ldg_global_p:
>> +  case Intrinsic::nvvm_ldu_global_f:
>> +  case Intrinsic::nvvm_ldu_global_i:
>> +  case Intrinsic::nvvm_ldu_global_p:
>> +    return SelectLDGLDU(N);
>> +  }
>> +}
>> +
>>  static unsigned int getCodeAddrSpace(MemSDNode *N,
>>                                       const NVPTXSubtarget &Subtarget) {
>>    const Value *Src = N->getMemOperand()->getValue();
>> @@ -990,22 +1008,101 @@ SDNode *NVPTXDAGToDAGISel::SelectLoadVec
>>    return LD;
>>  }
>>
>> -SDNode *NVPTXDAGToDAGISel::SelectLDGLDUVector(SDNode *N) {
>> +SDNode *NVPTXDAGToDAGISel::SelectLDGLDU(SDNode *N) {
>>
>>    SDValue Chain = N->getOperand(0);
>> -  SDValue Op1 = N->getOperand(1);
>> +  SDValue Op1;
>> +  MemSDNode *Mem;
>> +  bool IsLDG = true;
>> +
>> +  // If this is an LDG intrinsic, the address is the third operand. If it's an
>> +  // LDG/LDU SD node (from custom vector handling), then it's the second operand
>> +  if (N->getOpcode() == ISD::INTRINSIC_W_CHAIN) {
>> +    Op1 = N->getOperand(2);
>> +    Mem = cast<MemIntrinsicSDNode>(N);
>> +    unsigned IID = cast<ConstantSDNode>(N->getOperand(1))->getZExtValue();
>> +    switch (IID) {
>> +    default:
>> +      return NULL;
>> +    case Intrinsic::nvvm_ldg_global_f:
>> +    case Intrinsic::nvvm_ldg_global_i:
>> +    case Intrinsic::nvvm_ldg_global_p:
>> +      IsLDG = true;
>> +      break;
>> +    case Intrinsic::nvvm_ldu_global_f:
>> +    case Intrinsic::nvvm_ldu_global_i:
>> +    case Intrinsic::nvvm_ldu_global_p:
>> +      IsLDG = false;
>> +      break;
>> +    }
>> +  } else {
>> +    Op1 = N->getOperand(1);
>> +    Mem = cast<MemSDNode>(N);
>> +  }
>> +
>>    unsigned Opcode;
>>    SDLoc DL(N);
>>    SDNode *LD;
>> -  MemSDNode *Mem = cast<MemSDNode>(N);
>>    SDValue Base, Offset, Addr;
>>
>> -  EVT EltVT = Mem->getMemoryVT().getVectorElementType();
>> +  EVT EltVT = Mem->getMemoryVT();
>> +  if (EltVT.isVector()) {
>> +    EltVT = EltVT.getVectorElementType();
>> +  }
>>
>>    if (SelectDirectAddr(Op1, Addr)) {
>>      switch (N->getOpcode()) {
>>      default:
>>        return nullptr;
>> +    case ISD::INTRINSIC_W_CHAIN:
>> +      if (IsLDG) {
>> +        switch (EltVT.getSimpleVT().SimpleTy) {
>> +        default:
>> +          return nullptr;
>> +        case MVT::i8:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i8avar;
>> +          break;
>> +        case MVT::i16:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i16avar;
>> +          break;
>> +        case MVT::i32:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i32avar;
>> +          break;
>> +        case MVT::i64:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i64avar;
>> +          break;
>> +        case MVT::f32:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f32avar;
>> +          break;
>> +        case MVT::f64:
>> +          Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f64avar;
>> +          break;
>> +        }
>> +      } else {
>> +        switch (EltVT.getSimpleVT().SimpleTy) {
>> +        default:
>> +          return nullptr;
>> +        case MVT::i8:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i8avar;
>> +          break;
>> +        case MVT::i16:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i16avar;
>> +          break;
>> +        case MVT::i32:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i32avar;
>> +          break;
>> +        case MVT::i64:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i64avar;
>> +          break;
>> +        case MVT::f32:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f32avar;
>> +          break;
>> +        case MVT::f64:
>> +          Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f64avar;
>> +          break;
>> +        }
>> +      }
>> +      break;
>>      case NVPTXISD::LDGV2:
>>        switch (EltVT.getSimpleVT().SimpleTy) {
>>        default:
>> @@ -1101,6 +1198,55 @@ SDNode *NVPTXDAGToDAGISel::SelectLDGLDUV
>>        switch (N->getOpcode()) {
>>        default:
>>          return nullptr;
>> +      case ISD::INTRINSIC_W_CHAIN:
>> +        if (IsLDG) {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i8ari64;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i16ari64;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i32ari64;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i64ari64;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f32ari64;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f64ari64;
>> +            break;
>> +          }
>> +        } else {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i8ari64;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i16ari64;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i32ari64;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i64ari64;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f32ari64;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f64ari64;
>> +            break;
>> +          }
>> +        }
>> +        break;
>>        case NVPTXISD::LDGV2:
>>          switch (EltVT.getSimpleVT().SimpleTy) {
>>          default:
>> @@ -1190,6 +1336,55 @@ SDNode *NVPTXDAGToDAGISel::SelectLDGLDUV
>>        switch (N->getOpcode()) {
>>        default:
>>          return nullptr;
>> +      case ISD::INTRINSIC_W_CHAIN:
>> +        if (IsLDG) {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i8ari;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i16ari;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i32ari;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i64ari;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f32ari;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f64ari;
>> +            break;
>> +          }
>> +        } else {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i8ari;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i16ari;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i32ari;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i64ari;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f32ari;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f64ari;
>> +            break;
>> +          }
>> +        }
>> +        break;
>>        case NVPTXISD::LDGV2:
>>          switch (EltVT.getSimpleVT().SimpleTy) {
>>          default:
>> @@ -1285,6 +1480,55 @@ SDNode *NVPTXDAGToDAGISel::SelectLDGLDUV
>>        switch (N->getOpcode()) {
>>        default:
>>          return nullptr;
>> +      case ISD::INTRINSIC_W_CHAIN:
>> +        if (IsLDG) {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i8areg64;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i16areg64;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i32areg64;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i64areg64;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f32areg64;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f64areg64;
>> +            break;
>> +          }
>> +        } else {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i8areg64;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i16areg64;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i32areg64;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i64areg64;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f32areg64;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f64areg64;
>> +            break;
>> +          }
>> +        }
>> +        break;
>>        case NVPTXISD::LDGV2:
>>          switch (EltVT.getSimpleVT().SimpleTy) {
>>          default:
>> @@ -1374,6 +1618,55 @@ SDNode *NVPTXDAGToDAGISel::SelectLDGLDUV
>>        switch (N->getOpcode()) {
>>        default:
>>          return nullptr;
>> +      case ISD::INTRINSIC_W_CHAIN:
>> +        if (IsLDG) {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i8areg;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i16areg;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i32areg;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_i64areg;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f32areg;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDG_GLOBAL_f64areg;
>> +            break;
>> +          }
>> +        } else {
>> +          switch (EltVT.getSimpleVT().SimpleTy) {
>> +          default:
>> +            return nullptr;
>> +          case MVT::i8:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i8areg;
>> +            break;
>> +          case MVT::i16:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i16areg;
>> +            break;
>> +          case MVT::i32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i32areg;
>> +            break;
>> +          case MVT::i64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_i64areg;
>> +            break;
>> +          case MVT::f32:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f32areg;
>> +            break;
>> +          case MVT::f64:
>> +            Opcode = NVPTX::INT_PTX_LDU_GLOBAL_f64areg;
>> +            break;
>> +          }
>> +        }
>> +        break;
>>        case NVPTXISD::LDGV2:
>>          switch (EltVT.getSimpleVT().SimpleTy) {
>>          default:
>> @@ -1466,7 +1759,7 @@ SDNode *NVPTXDAGToDAGISel::SelectLDGLDUV
>>    }
>>
>>    MachineSDNode::mmo_iterator MemRefs0 = MF->allocateMemRefsArray(1);
>> -  MemRefs0[0] = cast<MemSDNode>(N)->getMemOperand();
>> +  MemRefs0[0] = Mem->getMemOperand();
>>    cast<MachineSDNode>(LD)->setMemRefs(MemRefs0, MemRefs0 + 1);
>>
>>    return LD;
>>
>> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h (original)
>> +++ llvm/trunk/lib/Target/NVPTX/NVPTXISelDAGToDAG.h Fri Jun 27 13:35:51 2014
>> @@ -59,10 +59,11 @@ private:
>>
>>    SDNode *Select(SDNode *N) override;
>>    SDNode *SelectIntrinsicNoChain(SDNode *N);
>> +  SDNode *SelectIntrinsicChain(SDNode *N);
>>    SDNode *SelectTexSurfHandle(SDNode *N);
>>    SDNode *SelectLoad(SDNode *N);
>>    SDNode *SelectLoadVector(SDNode *N);
>> -  SDNode *SelectLDGLDUVector(SDNode *N);
>> +  SDNode *SelectLDGLDU(SDNode *N);
>>    SDNode *SelectStore(SDNode *N);
>>    SDNode *SelectStoreVector(SDNode *N);
>>    SDNode *SelectLoadParam(SDNode *N);
>>
>> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/NVPTX/NVPTXISelLowering.cpp Fri Jun 27 13:35:51 2014
>> @@ -2363,22 +2363,62 @@ bool NVPTXTargetLowering::getTgtMemIntri
>>
>>    case Intrinsic::nvvm_ldu_global_i:
>>    case Intrinsic::nvvm_ldu_global_f:
>> -  case Intrinsic::nvvm_ldu_global_p:
>> +  case Intrinsic::nvvm_ldu_global_p: {
>>
>>      Info.opc = ISD::INTRINSIC_W_CHAIN;
>>      if (Intrinsic == Intrinsic::nvvm_ldu_global_i)
>>        Info.memVT = getValueType(I.getType());
>> -    else if (Intrinsic == Intrinsic::nvvm_ldu_global_p)
>> +    else if(Intrinsic == Intrinsic::nvvm_ldu_global_p)
>> +      Info.memVT = getPointerTy();
>> +    else
>>        Info.memVT = getValueType(I.getType());
>> +    Info.ptrVal = I.getArgOperand(0);
>> +    Info.offset = 0;
>> +    Info.vol = 0;
>> +    Info.readMem = true;
>> +    Info.writeMem = false;
>> +
>> +    // alignment is available as metadata.
>> +    // Grab it and set the alignment.
>> +    assert(I.hasMetadataOtherThanDebugLoc() && "Must have alignment metadata");
>> +    MDNode *AlignMD = I.getMetadata("align");
>> +    assert(AlignMD && "Must have a non-null MDNode");
>> +    assert(AlignMD->getNumOperands() == 1 && "Must have a single operand");
>> +    Value *Align = AlignMD->getOperand(0);
>> +    int64_t Alignment = cast<ConstantInt>(Align)->getZExtValue();
>> +    Info.align = Alignment;
>> +
>> +    return true;
>> +  }
>> +  case Intrinsic::nvvm_ldg_global_i:
>> +  case Intrinsic::nvvm_ldg_global_f:
>> +  case Intrinsic::nvvm_ldg_global_p: {
>> +
>> +    Info.opc = ISD::INTRINSIC_W_CHAIN;
>> +    if (Intrinsic == Intrinsic::nvvm_ldg_global_i)
>> +      Info.memVT = getValueType(I.getType());
>> +    else if(Intrinsic == Intrinsic::nvvm_ldg_global_p)
>> +      Info.memVT = getPointerTy();
>>      else
>> -      Info.memVT = MVT::f32;
>> +      Info.memVT = getValueType(I.getType());
>>      Info.ptrVal = I.getArgOperand(0);
>>      Info.offset = 0;
>>      Info.vol = 0;
>>      Info.readMem = true;
>>      Info.writeMem = false;
>> -    Info.align = 0;
>> +
>> +    // alignment is available as metadata.
>> +    // Grab it and set the alignment.
>> +    assert(I.hasMetadataOtherThanDebugLoc() && "Must have alignment metadata");
>> +    MDNode *AlignMD = I.getMetadata("align");
>> +    assert(AlignMD && "Must have a non-null MDNode");
>> +    assert(AlignMD->getNumOperands() == 1 && "Must have a single operand");
>> +    Value *Align = AlignMD->getOperand(0);
>> +    int64_t Alignment = cast<ConstantInt>(Align)->getZExtValue();
>> +    Info.align = Alignment;
>> +
>>      return true;
>> +  }
>>
>>    case Intrinsic::nvvm_tex_1d_v4f32_i32:
>>    case Intrinsic::nvvm_tex_1d_v4f32_f32:
>>
>> Modified: llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td (original)
>> +++ llvm/trunk/lib/Target/NVPTX/NVPTXIntrinsics.td Fri Jun 27 13:35:51 2014
>> @@ -1374,67 +1374,33 @@ def INT_PTX_SREG_WARPSIZE : F_SREG<"mov.
>>  // Support for ldu on sm_20 or later
>>  //-----------------------------------
>>
>> -def ldu_i8 : PatFrag<(ops node:$ptr), (int_nvvm_ldu_global_i node:$ptr), [{
>> -  MemIntrinsicSDNode *M = cast<MemIntrinsicSDNode>(N);
>> -  return M->getMemoryVT() == MVT::i8;
>> -}]>;
>> -
>>  // Scalar
>> -// @TODO: Revisit this, Changed imemAny to imem
>> -multiclass LDU_G<string TyStr, NVPTXRegClass regclass, Intrinsic IntOp> {
>> -  def areg: NVPTXInst<(outs regclass:$result), (ins Int32Regs:$src),
>> -               !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp Int32Regs:$src))]>, Requires<[hasLDU]>;
>> -  def areg64: NVPTXInst<(outs regclass:$result), (ins Int64Regs:$src),
>> -               !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp Int64Regs:$src))]>, Requires<[hasLDU]>;
>> - def avar:  NVPTXInst<(outs regclass:$result), (ins imem:$src),
>> -               !strconcat("ldu.global.", TyStr),
>> -                [(set regclass:$result, (IntOp (Wrapper tglobaladdr:$src)))]>,
>> -                Requires<[hasLDU]>;
>> - def ari :  NVPTXInst<(outs regclass:$result), (ins MEMri:$src),
>> -               !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri:$src))]>, Requires<[hasLDU]>;
>> - def ari64 :  NVPTXInst<(outs regclass:$result), (ins MEMri64:$src),
>> -               !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri64:$src))]>, Requires<[hasLDU]>;
>> -}
>> -
>> -multiclass LDU_G_NOINTRIN<string TyStr, NVPTXRegClass regclass, PatFrag IntOp> {
>> +multiclass LDU_G<string TyStr, NVPTXRegClass regclass> {
>>    def areg: NVPTXInst<(outs regclass:$result), (ins Int32Regs:$src),
>>                 !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp Int32Regs:$src))]>, Requires<[hasLDU]>;
>> +                      []>, Requires<[hasLDU]>;
>>    def areg64: NVPTXInst<(outs regclass:$result), (ins Int64Regs:$src),
>>                 !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp Int64Regs:$src))]>, Requires<[hasLDU]>;
>> - def avar:  NVPTXInst<(outs regclass:$result), (ins imem:$src),
>> +                        []>, Requires<[hasLDU]>;
>> + def avar:  NVPTXInst<(outs regclass:$result), (ins imemAny:$src),
>>                 !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp (Wrapper tglobaladdr:$src)))]>,
>> -         Requires<[hasLDU]>;
>> +                      []>, Requires<[hasLDU]>;
>>   def ari :  NVPTXInst<(outs regclass:$result), (ins MEMri:$src),
>>                 !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri:$src))]>, Requires<[hasLDU]>;
>> +                      []>, Requires<[hasLDU]>;
>>   def ari64 :  NVPTXInst<(outs regclass:$result), (ins MEMri64:$src),
>>                 !strconcat("ldu.global.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri64:$src))]>, Requires<[hasLDU]>;
>> +                        []>, Requires<[hasLDU]>;
>>  }
>>
>> -defm INT_PTX_LDU_GLOBAL_i8  : LDU_G_NOINTRIN<"u8 \t$result, [$src];", Int16Regs,
>> -                                             ldu_i8>;
>> -defm INT_PTX_LDU_GLOBAL_i16 : LDU_G<"u16 \t$result, [$src];", Int16Regs,
>> -int_nvvm_ldu_global_i>;
>> -defm INT_PTX_LDU_GLOBAL_i32 : LDU_G<"u32 \t$result, [$src];", Int32Regs,
>> -int_nvvm_ldu_global_i>;
>> -defm INT_PTX_LDU_GLOBAL_i64 : LDU_G<"u64 \t$result, [$src];", Int64Regs,
>> -int_nvvm_ldu_global_i>;
>> -defm INT_PTX_LDU_GLOBAL_f32 : LDU_G<"f32 \t$result, [$src];", Float32Regs,
>> -int_nvvm_ldu_global_f>;
>> -defm INT_PTX_LDU_GLOBAL_f64 : LDU_G<"f64 \t$result, [$src];", Float64Regs,
>> -int_nvvm_ldu_global_f>;
>> -defm INT_PTX_LDU_GLOBAL_p32 : LDU_G<"u32 \t$result, [$src];", Int32Regs,
>> -int_nvvm_ldu_global_p>;
>> -defm INT_PTX_LDU_GLOBAL_p64 : LDU_G<"u64 \t$result, [$src];", Int64Regs,
>> -int_nvvm_ldu_global_p>;
>> +defm INT_PTX_LDU_GLOBAL_i8  : LDU_G<"u8 \t$result, [$src];", Int16Regs>;
>> +defm INT_PTX_LDU_GLOBAL_i16 : LDU_G<"u16 \t$result, [$src];", Int16Regs>;
>> +defm INT_PTX_LDU_GLOBAL_i32 : LDU_G<"u32 \t$result, [$src];", Int32Regs>;
>> +defm INT_PTX_LDU_GLOBAL_i64 : LDU_G<"u64 \t$result, [$src];", Int64Regs>;
>> +defm INT_PTX_LDU_GLOBAL_f32 : LDU_G<"f32 \t$result, [$src];", Float32Regs>;
>> +defm INT_PTX_LDU_GLOBAL_f64 : LDU_G<"f64 \t$result, [$src];", Float64Regs>;
>> +defm INT_PTX_LDU_GLOBAL_p32 : LDU_G<"u32 \t$result, [$src];", Int32Regs>;
>> +defm INT_PTX_LDU_GLOBAL_p64 : LDU_G<"u64 \t$result, [$src];", Int64Regs>;
>>
>>  // vector
>>
>> @@ -1504,65 +1470,40 @@ defm INT_PTX_LDU_G_v4f32_ELE
>>  // Support for ldg on sm_35 or later
>>  //-----------------------------------
>>
>> -def ldg_i8 : PatFrag<(ops node:$ptr), (int_nvvm_ldg_global_i node:$ptr), [{
>> -  MemIntrinsicSDNode *M = cast<MemIntrinsicSDNode>(N);
>> -  return M->getMemoryVT() == MVT::i8;
>> -}]>;
>> -
>> -multiclass LDG_G<string TyStr, NVPTXRegClass regclass, Intrinsic IntOp> {
>> -  def areg: NVPTXInst<(outs regclass:$result), (ins Int32Regs:$src),
>> -               !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp Int32Regs:$src))]>, Requires<[hasLDG]>;
>> -  def areg64: NVPTXInst<(outs regclass:$result), (ins Int64Regs:$src),
>> -               !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp Int64Regs:$src))]>, Requires<[hasLDG]>;
>> - def avar:  NVPTXInst<(outs regclass:$result), (ins imem:$src),
>> -               !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp (Wrapper tglobaladdr:$src)))]>,
>> -         Requires<[hasLDG]>;
>> - def ari :  NVPTXInst<(outs regclass:$result), (ins MEMri:$src),
>> -               !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri:$src))]>, Requires<[hasLDG]>;
>> - def ari64 :  NVPTXInst<(outs regclass:$result), (ins MEMri64:$src),
>> -               !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri64:$src))]>, Requires<[hasLDG]>;
>> -}
>> -
>> -multiclass LDG_G_NOINTRIN<string TyStr, NVPTXRegClass regclass, PatFrag IntOp> {
>> +multiclass LDG_G<string TyStr, NVPTXRegClass regclass> {
>>    def areg: NVPTXInst<(outs regclass:$result), (ins Int32Regs:$src),
>>                 !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp Int32Regs:$src))]>, Requires<[hasLDG]>;
>> +                      []>, Requires<[hasLDG]>;
>>    def areg64: NVPTXInst<(outs regclass:$result), (ins Int64Regs:$src),
>>                 !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp Int64Regs:$src))]>, Requires<[hasLDG]>;
>> - def avar:  NVPTXInst<(outs regclass:$result), (ins imem:$src),
>> +                        []>, Requires<[hasLDG]>;
>> + def avar:  NVPTXInst<(outs regclass:$result), (ins imemAny:$src),
>>                 !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp (Wrapper tglobaladdr:$src)))]>,
>> -        Requires<[hasLDG]>;
>> +                      []>, Requires<[hasLDG]>;
>>   def ari :  NVPTXInst<(outs regclass:$result), (ins MEMri:$src),
>>                 !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri:$src))]>, Requires<[hasLDG]>;
>> +                      []>, Requires<[hasLDG]>;
>>   def ari64 :  NVPTXInst<(outs regclass:$result), (ins MEMri64:$src),
>>                 !strconcat("ld.global.nc.", TyStr),
>> -         [(set regclass:$result, (IntOp ADDRri64:$src))]>, Requires<[hasLDG]>;
>> +                        []>, Requires<[hasLDG]>;
>>  }
>>
>>  defm INT_PTX_LDG_GLOBAL_i8
>> -  : LDG_G_NOINTRIN<"u8 \t$result, [$src];",  Int16Regs, ldg_i8>;
>> +  : LDG_G<"u8 \t$result, [$src];", Int16Regs>;
>>  defm INT_PTX_LDG_GLOBAL_i16
>> -  : LDG_G<"u16 \t$result, [$src];", Int16Regs,   int_nvvm_ldg_global_i>;
>> +  : LDG_G<"u16 \t$result, [$src];", Int16Regs>;
>>  defm INT_PTX_LDG_GLOBAL_i32
>> -  : LDG_G<"u32 \t$result, [$src];", Int32Regs,   int_nvvm_ldg_global_i>;
>> +  : LDG_G<"u32 \t$result, [$src];", Int32Regs>;
>>  defm INT_PTX_LDG_GLOBAL_i64
>> -  : LDG_G<"u64 \t$result, [$src];", Int64Regs,   int_nvvm_ldg_global_i>;
>> +  : LDG_G<"u64 \t$result, [$src];", Int64Regs>;
>>  defm INT_PTX_LDG_GLOBAL_f32
>> -  : LDG_G<"f32 \t$result, [$src];", Float32Regs, int_nvvm_ldg_global_f>;
>> +  : LDG_G<"f32 \t$result, [$src];", Float32Regs>;
>>  defm INT_PTX_LDG_GLOBAL_f64
>> -  : LDG_G<"f64 \t$result, [$src];", Float64Regs, int_nvvm_ldg_global_f>;
>> +  : LDG_G<"f64 \t$result, [$src];", Float64Regs>;
>>  defm INT_PTX_LDG_GLOBAL_p32
>> -  : LDG_G<"u32 \t$result, [$src];", Int32Regs,   int_nvvm_ldg_global_p>;
>> +  : LDG_G<"u32 \t$result, [$src];", Int32Regs>;
>>  defm INT_PTX_LDG_GLOBAL_p64
>> -  : LDG_G<"u64 \t$result, [$src];", Int64Regs,   int_nvvm_ldg_global_p>;
>> +  : LDG_G<"u64 \t$result, [$src];", Int64Regs>;
>>
>>  // vector
>>
>>
>> Modified: llvm/trunk/test/CodeGen/NVPTX/ldu-i8.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/NVPTX/ldu-i8.ll?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/NVPTX/ldu-i8.ll (original)
>> +++ llvm/trunk/test/CodeGen/NVPTX/ldu-i8.ll Fri Jun 27 13:35:51 2014
>> @@ -2,13 +2,15 @@
>>
>>  target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v16:16:16-v32:32:32-v64:64:64-v128:128:128-n16:32:64"
>>
>> -declare i8 @llvm.nvvm.ldu.global.i.i8(i8*)
>> +declare i8 @llvm.nvvm.ldu.global.i.i8.p0i8(i8*)
>>
>>  define i8 @foo(i8* %a) {
>>  ; Ensure we properly truncate off the high-order 24 bits
>>  ; CHECK:        ldu.global.u8
>>  ; CHECK:        cvt.u32.u16
>>  ; CHECK:        and.b32         %r{{[0-9]+}}, %r{{[0-9]+}}, 255
>> -  %val = tail call i8 @llvm.nvvm.ldu.global.i.i8(i8* %a)
>> +  %val = tail call i8 @llvm.nvvm.ldu.global.i.i8.p0i8(i8* %a), !align !0
>>    ret i8 %val
>>  }
>> +
>> +!0 = metadata !{i32 4}
>>
>> Added: llvm/trunk/test/CodeGen/NVPTX/ldu-ldg.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/NVPTX/ldu-ldg.ll?rev=211939&view=auto
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/NVPTX/ldu-ldg.ll (added)
>> +++ llvm/trunk/test/CodeGen/NVPTX/ldu-ldg.ll Fri Jun 27 13:35:51 2014
>> @@ -0,0 +1,40 @@
>> +; RUN: llc < %s -march=nvptx -mcpu=sm_20 | FileCheck %s
>> +
>> +
>> +declare i8 @llvm.nvvm.ldu.global.i.i8.p1i8(i8 addrspace(1)* %ptr)
>> +declare i32 @llvm.nvvm.ldu.global.i.i32.p1i32(i32 addrspace(1)* %ptr)
>> +declare i8 @llvm.nvvm.ldg.global.i.i8.p1i8(i8 addrspace(1)* %ptr)
>> +declare i32 @llvm.nvvm.ldg.global.i.i32.p1i32(i32 addrspace(1)* %ptr)
>> +
>> +
>> +; CHECK: func0
>> +define i8 @func0(i8 addrspace(1)* %ptr) {
>> +; ldu.global.u8
>> +  %val = tail call i8 @llvm.nvvm.ldu.global.i.i8.p1i8(i8 addrspace(1)* %ptr), !align !0
>> +  ret i8 %val
>> +}
>> +
>> +; CHECK: func1
>> +define i32 @func1(i32 addrspace(1)* %ptr) {
>> +; ldu.global.u32
>> +  %val = tail call i32 @llvm.nvvm.ldu.global.i.i32.p1i32(i32 addrspace(1)* %ptr), !align !0
>> +  ret i32 %val
>> +}
>> +
>> +; CHECK: func2
>> +define i8 @func2(i8 addrspace(1)* %ptr) {
>> +; ld.global.nc.u8
>> +  %val = tail call i8 @llvm.nvvm.ldg.global.i.i8.p1i8(i8 addrspace(1)* %ptr), !align !0
>> +  ret i8 %val
>> +}
>> +
>> +; CHECK: func3
>> +define i32 @func3(i32 addrspace(1)* %ptr) {
>> +; ld.global.nc.u32
>> +  %val = tail call i32 @llvm.nvvm.ldg.global.i.i32.p1i32(i32 addrspace(1)* %ptr), !align !0
>> +  ret i32 %val
>> +}
>> +
>> +
>> +
>> +!0 = metadata !{i32 4}
>>
>> Modified: llvm/trunk/test/CodeGen/NVPTX/ldu-reg-plus-offset.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/NVPTX/ldu-reg-plus-offset.ll?rev=211939&r1=211938&r2=211939&view=diff
>> ==============================================================================
>> --- llvm/trunk/test/CodeGen/NVPTX/ldu-reg-plus-offset.ll (original)
>> +++ llvm/trunk/test/CodeGen/NVPTX/ldu-reg-plus-offset.ll Fri Jun 27 13:35:51 2014
>> @@ -7,9 +7,9 @@ define void @reg_plus_offset(i32* %a) {
>>  ; CHECK:        ldu.global.u32  %r{{[0-9]+}}, [%r{{[0-9]+}}+32];
>>  ; CHECK:        ldu.global.u32  %r{{[0-9]+}}, [%r{{[0-9]+}}+36];
>>    %p2 = getelementptr i32* %a, i32 8
>> -  %t1 = call i32 @llvm.nvvm.ldu.global.i.i32(i32* %p2), !align !1
>> +  %t1 = call i32 @llvm.nvvm.ldu.global.i.i32.p0i32(i32* %p2), !align !1
>>    %p3 = getelementptr i32* %a, i32 9
>> -  %t2 = call i32 @llvm.nvvm.ldu.global.i.i32(i32* %p3), !align !1
>> +  %t2 = call i32 @llvm.nvvm.ldu.global.i.i32.p0i32(i32* %p3), !align !1
>>    %t3 = mul i32 %t1, %t2
>>    store i32 %t3, i32* %a
>>    ret void
>> @@ -17,5 +17,5 @@ define void @reg_plus_offset(i32* %a) {
>>
>>  !1 = metadata !{ i32 4 }
>>
>> -declare i32 @llvm.nvvm.ldu.global.i.i32(i32*)
>> +declare i32 @llvm.nvvm.ldu.global.i.i32.p0i32(i32*)
>>  declare i32 @llvm.nvvm.read.ptx.sreg.tid.x()
>>
>>