[llvm-commits] [llvm] r160270 [1/2] - in /llvm/trunk/lib/Target/AMDGPU: ./ MCTargetDesc/ TargetInfo/

Mon Jul 16 10:57:34 PDT 2012

Tom, I have to ask that you revert this.

As we discussed a long time ago, and as I explained in great detail to the
Intel folks working on x32 support[1], we simply cannot accept really
significant additions to the codebase without active, trusted maintainers
who have an established track record of contributing to and maintaining
LLVM's code. For the reasons why this is so important, please read the x32
email.

[1]:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20120709/146241.html

I can't emphasize this enough: established maintainers with a track record.
I really do know how high a barrier to entry this is; it's excruciating.
We've gone through this multiple times. However, without this, the project
and the codebase simply cannot scale.

Next, there is a further problem here: this patch went in without review.
That is unacceptable for a contribution of this magnitude. I realize that
in the past there may have been cases where this rule was not applied
well, but that does not invalidate it or exempt you from it. I'm
particularly frustrated because you *knew* this was a requirement and
committed anyway.

Finally, there are really deep problems with your contribution as posed.
I'll touch on a few of them here, but by no means should you consider this
an exhaustive list:
1) You must have an AsmPrinter. You must properly use and support the MC
layer. This layer is no longer experimental or poorly supported; every new
backend should be expected to implement proper MC support.
2) You need to write tests that follow the prevailing LLVM style. This
includes a textual input and output, with FileCheck to manage detailed and
robust assertions.
3) You need to consistently leverage the modern elements of the
target-independent code generator. I haven't done a deep study of this
backend, but a cursory look indicates that you're not properly integrating
with some of the latest additions to the target-independent pipeline. Adding
a new backend that doesn't support them greatly magnifies the cost of making
changes to this common infrastructure.
4) You must bring the code up to the coding standards and style of LLVM. I
don't know why people find this so challenging. Look at recently added LLVM
code in the backend, look at the patterns and style it follows, and
*exactly* replicate it. You're not even close, considering the patch
contained 'or' instead of '||'.
5) You need high-quality documentation about the backend, the target
platform, and your plans here.
6) You need an active build bot to help developers without access to your
platform debug issues.
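
To make the last part of point 4 concrete, here is a minimal sketch. The
function names are made up for illustration and do not come from your patch.
The only claim is that `or` is a standard C++ alternative token for `||`, so
the two spellings behave identically and the difference is purely one of
matching the surrounding code's style:

```cpp
// Illustrative only: these function names are hypothetical and do not come
// from r160270.
#include <cassert>

// Legal C++: `or` is a standard alternative token for `||`. It compiles, but
// it does not match the prevailing style of the LLVM code base.
inline bool eitherSetAltSpelling(bool A, bool B) { return A or B; }

// The spelling reviewers expect to see.
inline bool eitherSet(bool A, bool B) { return A || B; }

// Check all four input combinations to confirm the two spellings agree, so
// switching from one to the other does not change behavior.
inline void checkSpellingsAgree() {
  for (int A = 0; A < 2; ++A)
    for (int B = 0; B < 2; ++B)
      assert(eitherSetAltSpelling(A != 0, B != 0) == eitherSet(A != 0, B != 0));
}
```

Calling checkSpellingsAgree() from any test driver exercises both versions.
Since the rewrite is mechanical, there is little excuse for not doing it
before asking for review.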

I realize that not every backend meets this bar. That doesn't imply that
your backend doesn't need to meet this bar. We have to raise the quality
bar if we're going to keep LLVM moving forward at a rapid pace. Currently,
we aren't doing that, and it is costing the project a great deal.

====

On a separate note, I truly understand that getting a review for a patch of
this magnitude is hard. You are not alone in wanting a review and not
getting it. However, submitting without review does not solve anything; it
merely takes more time from reviewers to deal with the problems in a rush,
and makes every single reviewer less inclined to actually review your patch
thoroughly.

You need to specifically motivate people to review your patch. There is no
other way you can get it into the tree. There are many ways to do this:
1) Make its code so excellent in quality and familiar in style to the
potential reviewers that they actually enjoy reviewing it.
2) Work tirelessly on fixing and improving the core LLVM infrastructure so
that potential reviewers are grateful and motivated to keep you active in
the project.
3) Talk to developers in the community, showcase amazing things that this
backend will do or let you do when it is in the tree.
4) More of 1, 2, and 3.

The only magic I know of is to submit more patches. To submit so many
patches that other developers simply cannot ignore your presence and will
have to review your code. The currency of patches does actually work in
this project, but you haven't yet invested enough.

====

I truly hope you don't take this to mean that I (or others in the LLVM
project) am uninterested in this backend eventually being in the tree. We
are interested, but it's not ready yet. We need you (or others) to be much
more active in maintaining things in LLVM. We need the quality of the code
and implementation and testing to go up. We need it to go through proper
code review. That is the context in which we are interested.

-Chandler

On Mon, Jul 16, 2012 at 7:17 AM, Tom Stellard <thomas.stellard at amd.com> wrote:

> Author: tstellar
> Date: Mon Jul 16 09:17:08 2012
> New Revision: 160270
>
> URL: http://llvm.org/viewvc/llvm-project?rev=160270&view=rev
> Log:
> AMDGPU: Add core backend files for R600/SI codegen v6
>
> Added:
>     llvm/trunk/lib/Target/AMDGPU/
>     llvm/trunk/lib/Target/AMDGPU/AMDGPU.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPU.td
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td
>     llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h
>     llvm/trunk/lib/Target/AMDGPU/AMDIL.h
>     llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILBase.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILCallingConv.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILCodeEmitter.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILDevice.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILDevice.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILDeviceInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILDeviceInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILDevices.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILEnumeratedTypes.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILEvergreenDevice.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILEvergreenDevice.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILFormats.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILFrameLowering.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILFrameLowering.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILISelDAGToDAG.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILISelLowering.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILISelLowering.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILInstrInfo.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILInstructions.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsicInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsicInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILIntrinsics.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILMultiClass.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILNIDevice.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILNIDevice.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILNodes.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILOperands.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILPatterns.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILPeepholeOptimizer.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILProfiles.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILRegisterInfo.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILSIDevice.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILSIDevice.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILSubtarget.cpp
>     llvm/trunk/lib/Target/AMDGPU/AMDILSubtarget.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILTokenDesc.td
>     llvm/trunk/lib/Target/AMDGPU/AMDILUtilityFunctions.h
>     llvm/trunk/lib/Target/AMDGPU/AMDILVersion.td
>     llvm/trunk/lib/Target/AMDGPU/CMakeLists.txt
>     llvm/trunk/lib/Target/AMDGPU/GENERATED_FILES
>     llvm/trunk/lib/Target/AMDGPU/LLVMBuild.txt
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.h
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.h
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/CMakeLists.txt
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/LLVMBuild.txt
>     llvm/trunk/lib/Target/AMDGPU/MCTargetDesc/Makefile
>     llvm/trunk/lib/Target/AMDGPU/Makefile
>     llvm/trunk/lib/Target/AMDGPU/Processors.td
>     llvm/trunk/lib/Target/AMDGPU/R600CodeEmitter.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600GenRegisterInfo.pl
>     llvm/trunk/lib/Target/AMDGPU/R600HwRegInfo.include
>     llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600ISelLowering.h
>     llvm/trunk/lib/Target/AMDGPU/R600InstrInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600InstrInfo.h
>     llvm/trunk/lib/Target/AMDGPU/R600Instructions.td
>     llvm/trunk/lib/Target/AMDGPU/R600Intrinsics.td
>     llvm/trunk/lib/Target/AMDGPU/R600KernelParameters.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600MachineFunctionInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600MachineFunctionInfo.h
>     llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.h
>     llvm/trunk/lib/Target/AMDGPU/R600RegisterInfo.td
>     llvm/trunk/lib/Target/AMDGPU/R600Schedule.td
>     llvm/trunk/lib/Target/AMDGPU/SIAssignInterpRegs.cpp
>     llvm/trunk/lib/Target/AMDGPU/SICodeEmitter.cpp
>     llvm/trunk/lib/Target/AMDGPU/SIGenRegisterInfo.pl
>     llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp
>     llvm/trunk/lib/Target/AMDGPU/SIISelLowering.h
>     llvm/trunk/lib/Target/AMDGPU/SIInstrFormats.td
>     llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.h
>     llvm/trunk/lib/Target/AMDGPU/SIInstrInfo.td
>     llvm/trunk/lib/Target/AMDGPU/SIInstructions.td
>     llvm/trunk/lib/Target/AMDGPU/SIIntrinsics.td
>     llvm/trunk/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/SIMachineFunctionInfo.h
>     llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.h
>     llvm/trunk/lib/Target/AMDGPU/SIRegisterInfo.td
>     llvm/trunk/lib/Target/AMDGPU/SISchedule.td
>     llvm/trunk/lib/Target/AMDGPU/TargetInfo/
>     llvm/trunk/lib/Target/AMDGPU/TargetInfo/AMDGPUTargetInfo.cpp
>     llvm/trunk/lib/Target/AMDGPU/TargetInfo/CMakeLists.txt
>     llvm/trunk/lib/Target/AMDGPU/TargetInfo/LLVMBuild.txt
>     llvm/trunk/lib/Target/AMDGPU/TargetInfo/Makefile
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPU.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPU.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPU.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPU.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,35 @@
> +//===-- AMDGPU.h - MachineFunction passes hw codegen --------------*- C++ -*-=//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPU_H
> +#define AMDGPU_H
> +
> +#include "AMDGPUTargetMachine.h"
> +#include "llvm/Support/TargetRegistry.h"
> +#include "llvm/Target/TargetMachine.h"
> +
> +namespace llvm {
> +
> +class FunctionPass;
> +class AMDGPUTargetMachine;
> +
> +// R600 Passes
> +FunctionPass* createR600KernelParametersPass(const TargetData* TD);
> +FunctionPass *createR600CodeEmitterPass(formatted_raw_ostream &OS);
> +
> +// SI Passes
> +FunctionPass *createSIAssignInterpRegsPass(TargetMachine &tm);
> +FunctionPass *createSICodeEmitterPass(formatted_raw_ostream &OS);
> +
> +// Passes common to R600 and SI
> +FunctionPass *createAMDGPUConvertToISAPass(TargetMachine &tm);
> +
> +} // End namespace llvm
> +
> +#endif // AMDGPU_H
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPU.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPU.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPU.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPU.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,21 @@
> +//===-- AMDIL.td - AMDIL Tablegen files --*- tablegen -*-------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +
> +// Include AMDIL TD files
> +include "AMDILBase.td"
> +include "AMDILVersion.td"
> +
> +// Include AMDGPU TD files
> +include "R600Schedule.td"
> +include "SISchedule.td"
> +include "Processors.td"
> +include "AMDGPUInstrInfo.td"
> +include "AMDGPUIntrinsics.td"
> +include "AMDGPURegisterInfo.td"
> +include "AMDGPUInstructions.td"
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUConvertToISA.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,63 @@
> +//===-- AMDGPUConvertToISA.cpp - Lower AMDIL to HW ISA --------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This pass lowers AMDIL machine instructions to the appropriate hardware
> +// instructions.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPU.h"
> +#include "AMDGPUInstrInfo.h"
> +#include "llvm/CodeGen/MachineFunctionPass.h"
> +
> +#include <stdio.h>
> +using namespace llvm;
> +
> +namespace {
> +
> +class AMDGPUConvertToISAPass : public MachineFunctionPass {
> +
> +private:
> +  static char ID;
> +  TargetMachine &TM;
> +
> +public:
> +  AMDGPUConvertToISAPass(TargetMachine &tm) :
> +    MachineFunctionPass(ID), TM(tm) { }
> +
> +  virtual bool runOnMachineFunction(MachineFunction &MF);
> +
> +  virtual const char *getPassName() const {return "AMDGPU Convert to ISA";}
> +
> +};
> +
> +} // End anonymous namespace
> +
> +char AMDGPUConvertToISAPass::ID = 0;
> +
> +FunctionPass *llvm::createAMDGPUConvertToISAPass(TargetMachine &tm) {
> +  return new AMDGPUConvertToISAPass(tm);
> +}
> +
> +bool AMDGPUConvertToISAPass::runOnMachineFunction(MachineFunction &MF)
> +{
> +  const AMDGPUInstrInfo * TII =
> +                      static_cast<const AMDGPUInstrInfo*>(TM.getInstrInfo());
> +
> +  for (MachineFunction::iterator BB = MF.begin(), BB_E = MF.end();
> +                                                  BB != BB_E; ++BB) {
> +    MachineBasicBlock &MBB = *BB;
> +    for (MachineBasicBlock::iterator I = MBB.begin(), E = MBB.end();
> +                                                      I != E; ++I) {
> +      MachineInstr &MI = *I;
> +      TII->convertToISA(MI, MF, MBB.findDebugLoc(I));
> +    }
> +  }
> +  return false;
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,393 @@
> +//===-- AMDGPUISelLowering.cpp - AMDGPU Common DAG lowering functions -----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This is the parent TargetLowering class for hardware code gen targets.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPUISelLowering.h"
> +#include "AMDILIntrinsicInfo.h"
> +#include "AMDGPUUtil.h"
> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> +
> +using namespace llvm;
> +
> +AMDGPUTargetLowering::AMDGPUTargetLowering(TargetMachine &TM) :
> +  AMDILTargetLowering(TM)
> +{
> +  // We need to custom lower some of the intrinsics
> +  setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);
> +
> +  setOperationAction(ISD::SELECT_CC, MVT::f32, Custom);
> +  setOperationAction(ISD::SELECT_CC, MVT::i32, Custom);
> +
> +  // Library functions.  These default to Expand, but we have instructions
> +  // for them.
> +  setOperationAction(ISD::FCEIL,  MVT::f32, Legal);
> +  setOperationAction(ISD::FEXP2,  MVT::f32, Legal);
> +  setOperationAction(ISD::FRINT,  MVT::f32, Legal);
> +
> +  setOperationAction(ISD::UDIV, MVT::i32, Expand);
> +  setOperationAction(ISD::UDIVREM, MVT::i32, Custom);
> +  setOperationAction(ISD::UREM, MVT::i32, Expand);
> +}
> +
> +SDValue AMDGPUTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG)
> +    const
> +{
> +  switch (Op.getOpcode()) {
> +  default: return AMDILTargetLowering::LowerOperation(Op, DAG);
> +  case ISD::INTRINSIC_WO_CHAIN: return LowerINTRINSIC_WO_CHAIN(Op, DAG);
> +  case ISD::SELECT_CC: return LowerSELECT_CC(Op, DAG);
> +  case ISD::UDIVREM: return LowerUDIVREM(Op, DAG);
> +  }
> +}
> +
> +SDValue AMDGPUTargetLowering::LowerINTRINSIC_WO_CHAIN(SDValue Op,
> +    SelectionDAG &DAG) const
> +{
> +  unsigned IntrinsicID = cast<ConstantSDNode>(Op.getOperand(0))->getZExtValue();
> +  DebugLoc DL = Op.getDebugLoc();
> +  EVT VT = Op.getValueType();
> +
> +  switch (IntrinsicID) {
> +    default: return Op;
> +    case AMDGPUIntrinsic::AMDIL_abs:
> +      return LowerIntrinsicIABS(Op, DAG);
> +    case AMDGPUIntrinsic::AMDIL_exp:
> +      return DAG.getNode(ISD::FEXP2, DL, VT, Op.getOperand(1));
> +    case AMDGPUIntrinsic::AMDIL_fabs:
> +      return DAG.getNode(ISD::FABS, DL, VT, Op.getOperand(1));
> +    case AMDGPUIntrinsic::AMDGPU_lrp:
> +      return LowerIntrinsicLRP(Op, DAG);
> +    case AMDGPUIntrinsic::AMDIL_fraction:
> +      return DAG.getNode(AMDGPUISD::FRACT, DL, VT, Op.getOperand(1));
> +    case AMDGPUIntrinsic::AMDIL_mad:
> +      return DAG.getNode(AMDILISD::MAD, DL, VT, Op.getOperand(1),
> +                              Op.getOperand(2), Op.getOperand(3));
> +    case AMDGPUIntrinsic::AMDIL_max:
> +      return DAG.getNode(AMDGPUISD::FMAX, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDGPU_imax:
> +      return DAG.getNode(AMDGPUISD::SMAX, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDGPU_umax:
> +      return DAG.getNode(AMDGPUISD::UMAX, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDIL_min:
> +      return DAG.getNode(AMDGPUISD::FMIN, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDGPU_imin:
> +      return DAG.getNode(AMDGPUISD::SMIN, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDGPU_umin:
> +      return DAG.getNode(AMDGPUISD::UMIN, DL, VT, Op.getOperand(1),
> +                                                  Op.getOperand(2));
> +    case AMDGPUIntrinsic::AMDIL_round_nearest:
> +      return DAG.getNode(ISD::FRINT, DL, VT, Op.getOperand(1));
> +    case AMDGPUIntrinsic::AMDIL_round_posinf:
> +      return DAG.getNode(ISD::FCEIL, DL, VT, Op.getOperand(1));
> +  }
> +}
> +
> +///IABS(a) = SMAX(sub(0, a), a)
> +SDValue AMDGPUTargetLowering::LowerIntrinsicIABS(SDValue Op,
> +    SelectionDAG &DAG) const
> +{
> +
> +  DebugLoc DL = Op.getDebugLoc();
> +  EVT VT = Op.getValueType();
> +  SDValue Neg = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, VT),
> +                                              Op.getOperand(1));
> +
> +  return DAG.getNode(AMDGPUISD::SMAX, DL, VT, Neg, Op.getOperand(1));
> +}
> +
> +/// Linear Interpolation
> +/// LRP(a, b, c) = muladd(a,  b, (1 - a) * c)
> +SDValue AMDGPUTargetLowering::LowerIntrinsicLRP(SDValue Op,
> +    SelectionDAG &DAG) const
> +{
> +  DebugLoc DL = Op.getDebugLoc();
> +  EVT VT = Op.getValueType();
> +  SDValue OneSubA = DAG.getNode(ISD::FSUB, DL, VT,
> +                                DAG.getConstantFP(1.0f, MVT::f32),
> +                                Op.getOperand(1));
> +  SDValue OneSubAC = DAG.getNode(ISD::FMUL, DL, VT, OneSubA,
> +                                                    Op.getOperand(3));
> +  return DAG.getNode(AMDILISD::MAD, DL, VT, Op.getOperand(1),
> +                                               Op.getOperand(2),
> +                                               OneSubAC);
> +}
> +
> +SDValue AMDGPUTargetLowering::LowerSELECT_CC(SDValue Op,
> +    SelectionDAG &DAG) const
> +{
> +  DebugLoc DL = Op.getDebugLoc();
> +  EVT VT = Op.getValueType();
> +
> +  SDValue LHS = Op.getOperand(0);
> +  SDValue RHS = Op.getOperand(1);
> +  SDValue True = Op.getOperand(2);
> +  SDValue False = Op.getOperand(3);
> +  SDValue CC = Op.getOperand(4);
> +  ISD::CondCode CCOpcode = cast<CondCodeSDNode>(CC)->get();
> +  SDValue Temp;
> +
> +  // LHS and RHS are guaranteed to be the same value type
> +  EVT CompareVT = LHS.getValueType();
> +
> +  // We need all the operands of SELECT_CC to have the same value type, so if
> +  // necessary we need to convert LHS and RHS to be the same type True and
> +  // False.  True and False are guaranteed to have the same type as this
> +  // SELECT_CC node.
> +
> +  if (CompareVT !=  VT) {
> +    ISD::NodeType ConversionOp = ISD::DELETED_NODE;
> +    if (VT == MVT::f32 && CompareVT == MVT::i32) {
> +      if (isUnsignedIntSetCC(CCOpcode)) {
> +        ConversionOp = ISD::UINT_TO_FP;
> +      } else {
> +        ConversionOp = ISD::SINT_TO_FP;
> +      }
> +    } else if (VT == MVT::i32 && CompareVT == MVT::f32) {
> +      ConversionOp = ISD::FP_TO_SINT;
> +    } else {
> +      // I don't think there will be any other type pairings.
> +      assert(!"Unhandled operand type parings in SELECT_CC");
> +    }
> +    // XXX Check the value of LHS and RHS and avoid creating sequences like
> +    // (FTOI (ITOF))
> +    LHS = DAG.getNode(ConversionOp, DL, VT, LHS);
> +    RHS = DAG.getNode(ConversionOp, DL, VT, RHS);
> +  }
> +
> +  // If True is a hardware TRUE value and False is a hardware FALSE value or
> +  // vice-versa we can handle this with a native instruction (SET* instructions).
> +  if ((isHWTrueValue(True) && isHWFalseValue(False))) {
> +    return DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, True, False, CC);
> +  }
> +
> +  // XXX If True is a hardware TRUE value and False is a hardware FALSE value,
> +  // we can handle this with a native instruction, but we need to swap true
> +  // and false and change the conditional.
> +  if (isHWTrueValue(False) && isHWFalseValue(True)) {
> +  }
> +
> +  // XXX Check if we can lower this to a SELECT or if it is supported by a native
> +  // operation. (The code below does this but we don't have the Instruction
> +  // selection patterns to do this yet.
> +#if 0
> +  if (isZero(LHS) || isZero(RHS)) {
> +    SDValue Cond = (isZero(LHS) ? RHS : LHS);
> +    bool SwapTF = false;
> +    switch (CCOpcode) {
> +    case ISD::SETOEQ:
> +    case ISD::SETUEQ:
> +    case ISD::SETEQ:
> +      SwapTF = true;
> +      // Fall through
> +    case ISD::SETONE:
> +    case ISD::SETUNE:
> +    case ISD::SETNE:
> +      // We can lower to select
> +      if (SwapTF) {
> +        Temp = True;
> +        True = False;
> +        False = Temp;
> +      }
> +      // CNDE
> +      return DAG.getNode(ISD::SELECT, DL, VT, Cond, True, False);
> +    default:
> +      // Supported by a native operation (CNDGE, CNDGT)
> +      return DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, True, False, CC);
> +    }
> +  }
> +#endif
> +
> +  // If we make it this for it means we have no native instructions to handle
> +  // this SELECT_CC, so we must lower it.
> +  SDValue HWTrue, HWFalse;
> +
> +  if (VT == MVT::f32) {
> +    HWTrue = DAG.getConstantFP(1.0f, VT);
> +    HWFalse = DAG.getConstantFP(0.0f, VT);
> +  } else if (VT == MVT::i32) {
> +    HWTrue = DAG.getConstant(-1, VT);
> +    HWFalse = DAG.getConstant(0, VT);
> +  }
> +  else {
> +    assert(!"Unhandled value type in LowerSELECT_CC");
> +  }
> +
> +  // Lower this unsupported SELECT_CC into a combination of two supported
> +  // SELECT_CC operations.
> +  SDValue Cond = DAG.getNode(ISD::SELECT_CC, DL, VT, LHS, RHS, HWTrue, HWFalse, CC);
> +
> +  return DAG.getNode(ISD::SELECT, DL, VT, Cond, True, False);
> +}
> +
> +
> +SDValue AMDGPUTargetLowering::LowerUDIVREM(SDValue Op,
> +    SelectionDAG &DAG) const
> +{
> +  DebugLoc DL = Op.getDebugLoc();
> +  EVT VT = Op.getValueType();
> +
> +  SDValue Num = Op.getOperand(0);
> +  SDValue Den = Op.getOperand(1);
> +
> +  SmallVector<SDValue, 8> Results;
> +
> +  // RCP =  URECIP(Den) = 2^32 / Den + e
> +  // e is rounding error.
> +  SDValue RCP = DAG.getNode(AMDGPUISD::URECIP, DL, VT, Den);
> +
> +  // RCP_LO = umulo(RCP, Den) */
> +  SDValue RCP_LO = DAG.getNode(ISD::UMULO, DL, VT, RCP, Den);
> +
> +  // RCP_HI = mulhu (RCP, Den) */
> +  SDValue RCP_HI = DAG.getNode(ISD::MULHU, DL, VT, RCP, Den);
> +
> +  // NEG_RCP_LO = -RCP_LO
> +  SDValue NEG_RCP_LO = DAG.getNode(ISD::SUB, DL, VT, DAG.getConstant(0, VT),
> +                                                     RCP_LO);
> +
> +  // ABS_RCP_LO = (RCP_HI == 0 ? NEG_RCP_LO : RCP_LO)
> +  SDValue ABS_RCP_LO = DAG.getSelectCC(DL, RCP_HI, DAG.getConstant(0, VT),
> +                                           NEG_RCP_LO, RCP_LO,
> +                                           ISD::SETEQ);
> +  // Calculate the rounding error from the URECIP instruction
> +  // E = mulhu(ABS_RCP_LO, RCP)
> +  SDValue E = DAG.getNode(ISD::MULHU, DL, VT, ABS_RCP_LO, RCP);
> +
> +  // RCP_A_E = RCP + E
> +  SDValue RCP_A_E = DAG.getNode(ISD::ADD, DL, VT, RCP, E);
> +
> +  // RCP_S_E = RCP - E
> +  SDValue RCP_S_E = DAG.getNode(ISD::SUB, DL, VT, RCP, E);
> +
> +  // Tmp0 = (RCP_HI == 0 ? RCP_A_E : RCP_SUB_E)
> +  SDValue Tmp0 = DAG.getSelectCC(DL, RCP_HI, DAG.getConstant(0, VT),
> +                                     RCP_A_E, RCP_S_E,
> +                                     ISD::SETEQ);
> +  // Quotient = mulhu(Tmp0, Num)
> +  SDValue Quotient = DAG.getNode(ISD::MULHU, DL, VT, Tmp0, Num);
> +
> +  // Num_S_Remainder = Quotient * Den
> +  SDValue Num_S_Remainder = DAG.getNode(ISD::UMULO, DL, VT, Quotient, Den);
> +
> +  // Remainder = Num - Num_S_Remainder
> +  SDValue Remainder = DAG.getNode(ISD::SUB, DL, VT, Num, Num_S_Remainder);
> +
> +  // Remainder_GE_Den = (Remainder >= Den ? -1 : 0)
> +  SDValue Remainder_GE_Den = DAG.getSelectCC(DL, Remainder, Den,
> +                                                 DAG.getConstant(-1, VT),
> +                                                 DAG.getConstant(0, VT),
> +                                                 ISD::SETGE);
> +  // Remainder_GE_Zero = (Remainder >= 0 ? -1 : 0)
> +  SDValue Remainder_GE_Zero = DAG.getSelectCC(DL, Remainder,
> +                                                  DAG.getConstant(0, VT),
> +                                                  DAG.getConstant(-1, VT),
> +                                                  DAG.getConstant(0, VT),
> +                                                  ISD::SETGE);
> +  // Tmp1 = Remainder_GE_Den & Remainder_GE_Zero
> +  SDValue Tmp1 = DAG.getNode(ISD::AND, DL, VT, Remainder_GE_Den,
> +                                               Remainder_GE_Zero);
> +
> +  // Calculate Division result:
> +
> +  // Quotient_A_One = Quotient + 1
> +  SDValue Quotient_A_One = DAG.getNode(ISD::ADD, DL, VT, Quotient,
> +                                       DAG.getConstant(1, VT));
> +
> +  // Quotient_S_One = Quotient - 1
> +  SDValue Quotient_S_One = DAG.getNode(ISD::SUB, DL, VT, Quotient,
> +                                       DAG.getConstant(1, VT));
> +
> +  // Div = (Tmp1 == 0 ? Quotient : Quotient_A_One)
> +  SDValue Div = DAG.getSelectCC(DL, Tmp1, DAG.getConstant(0, VT),
> +                                     Quotient, Quotient_A_One, ISD::SETEQ);
> +
> +  // Div = (Remainder_GE_Zero == 0 ? Quotient_S_One : Div)
> +  Div = DAG.getSelectCC(DL, Remainder_GE_Zero, DAG.getConstant(0, VT),
> +                            Quotient_S_One, Div, ISD::SETEQ);
> +
> +  // Calculate Rem result:
> +
> +  // Remainder_S_Den = Remainder - Den
> +  SDValue Remainder_S_Den = DAG.getNode(ISD::SUB, DL, VT, Remainder, Den);
> +
> +  // Remainder_A_Den = Remainder + Den
> +  SDValue Remainder_A_Den = DAG.getNode(ISD::ADD, DL, VT, Remainder, Den);
> +
> +  // Rem = (Tmp1 == 0 ? Remainder : Remainder_S_Den)
> +  SDValue Rem = DAG.getSelectCC(DL, Tmp1, DAG.getConstant(0, VT),
> +                                    Remainder, Remainder_S_Den, ISD::SETEQ);
> +
> +  // Rem = (Remainder_GE_Zero == 0 ? Remainder_A_Den : Rem)
> +  Rem = DAG.getSelectCC(DL, Remainder_GE_Zero, DAG.getConstant(0, VT),
> +                            Remainder_A_Den, Rem, ISD::SETEQ);
> +
> +  DAG.ReplaceAllUsesWith(Op.getValue(0).getNode(), &Div);
> +  DAG.ReplaceAllUsesWith(Op.getValue(1).getNode(), &Rem);
> +
> +  return Op;
> +}
> +
> +//===----------------------------------------------------------------------===//
> +// Helper functions
> +//===----------------------------------------------------------------------===//
> +
> +bool AMDGPUTargetLowering::isHWTrueValue(SDValue Op) const
> +{
> +  if (ConstantFPSDNode * CFP = dyn_cast<ConstantFPSDNode>(Op)) {
> +    return CFP->isExactlyValue(1.0);
> +  }
> +  if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op)) {
> +    return C->isAllOnesValue();
> +  }
> +  return false;
> +}
> +
> +bool AMDGPUTargetLowering::isHWFalseValue(SDValue Op) const
> +{
> +  if (ConstantFPSDNode * CFP = dyn_cast<ConstantFPSDNode>(Op)) {
> +    return CFP->getValueAPF().isZero();
> +  }
> +  if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op)) {
> +    return C->isNullValue();
> +  }
> +  return false;
> +}
> +
> +void AMDGPUTargetLowering::addLiveIn(MachineInstr * MI,
> +    MachineFunction * MF, MachineRegisterInfo & MRI,
> +    const TargetInstrInfo * TII, unsigned reg) const
> +{
> +  AMDGPU::utilAddLiveIn(MF, MRI, TII, reg, MI->getOperand(0).getReg());
> +}
> +
> +#define NODE_NAME_CASE(node) case AMDGPUISD::node: return #node;
> +
> +const char* AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const
> +{
> +  switch (Opcode) {
> +  default: return AMDILTargetLowering::getTargetNodeName(Opcode);
> +
> +  NODE_NAME_CASE(FRACT)
> +  NODE_NAME_CASE(FMAX)
> +  NODE_NAME_CASE(SMAX)
> +  NODE_NAME_CASE(UMAX)
> +  NODE_NAME_CASE(FMIN)
> +  NODE_NAME_CASE(SMIN)
> +  NODE_NAME_CASE(UMIN)
> +  NODE_NAME_CASE(URECIP)
> +  }
> +}
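
For readers unfamiliar with the NODE_NAME_CASE idiom above: the macro relies on the preprocessor's `#` operator to turn each enumerator into its own spelling. Below is a minimal, standalone sketch of the same technique. `DemoISD` and `getNodeName` are hypothetical stand-ins, not names from the patch, and the default case returns a null pointer where the patch defers to the AMDIL base class. As in the quoted switch, BITALIGN is in the enum but has no case, so it takes the default path.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the AMDGPUISD enum; only the spellings mirror the patch.
namespace DemoISD {
enum NodeType { BITALIGN, FRACT, FMAX, URECIP };
}

// '#node' stringifies the macro argument, so each case returns its own name.
#define NODE_NAME_CASE(node) case DemoISD::node: return #node;

// Returns the enumerator's spelling, or nullptr for opcodes without a case.
// (The patch instead forwards unknown opcodes to the base class.)
const char *getNodeName(unsigned Opcode) {
  switch (Opcode) {
  default: return nullptr;
  NODE_NAME_CASE(FRACT)
  NODE_NAME_CASE(FMAX)
  NODE_NAME_CASE(URECIP)
  }
}
```

With this sketch, `getNodeName(DemoISD::FRACT)` yields the string "FRACT", while `getNodeName(DemoISD::BITALIGN)` takes the default branch.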
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,77 @@
> +//===-- AMDGPUISelLowering.h - AMDGPU Lowering Interface --------*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the interface definition of the TargetLowering class
> +// that is common to all AMD GPUs.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPUISELLOWERING_H
> +#define AMDGPUISELLOWERING_H
> +
> +#include "AMDILISelLowering.h"
> +
> +namespace llvm {
> +
> +class AMDGPUTargetLowering : public AMDILTargetLowering
> +{
> +private:
> +  SDValue LowerINTRINSIC_WO_CHAIN(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerSELECT_CC(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerUDIVREM(SDValue Op, SelectionDAG &DAG) const;
> +
> +protected:
> +
> +  /// addLiveIn - This function adds reg to the live-in list of the entry block
> +  /// and emits a copy from reg to MI.getOperand(0).
> +  ///
> +  /// Some registers are loaded with values before the program
> +  /// begins to execute.  The loading of these values is modeled with pseudo
> +  /// instructions which are lowered using this function.
> +  void addLiveIn(MachineInstr * MI, MachineFunction * MF,
> +                 MachineRegisterInfo & MRI, const TargetInstrInfo * TII,
> +                unsigned reg) const;
> +
> +  bool isHWTrueValue(SDValue Op) const;
> +  bool isHWFalseValue(SDValue Op) const;
> +
> +public:
> +  AMDGPUTargetLowering(TargetMachine &TM);
> +
> +  virtual SDValue LowerOperation(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerIntrinsicIABS(SDValue Op, SelectionDAG &DAG) const;
> +  SDValue LowerIntrinsicLRP(SDValue Op, SelectionDAG &DAG) const;
> +  virtual const char* getTargetNodeName(unsigned Opcode) const;
> +
> +};
> +
> +namespace AMDGPUISD
> +{
> +
> +enum
> +{
> +  AMDGPU_FIRST = AMDILISD::LAST_ISD_NUMBER,
> +  BITALIGN,
> +  FRACT,
> +  FMAX,
> +  SMAX,
> +  UMAX,
> +  FMIN,
> +  SMIN,
> +  UMIN,
> +  URECIP,
> +  LAST_AMDGPU_ISD_NUMBER
> +};
> +
> +
> +} // End namespace AMDGPUISD
> +
> +} // End namespace llvm
> +
> +#endif // AMDGPUISELLOWERING_H
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,46 @@
> +//===-- AMDGPUInstrInfo.cpp - Base class for AMD GPU InstrInfo ------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the implementation of the TargetInstrInfo class that is
> +// common to all AMD GPUs.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPUInstrInfo.h"
> +#include "AMDGPURegisterInfo.h"
> +#include "AMDGPUTargetMachine.h"
> +#include "AMDIL.h"
> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> +
> +using namespace llvm;
> +
> +AMDGPUInstrInfo::AMDGPUInstrInfo(AMDGPUTargetMachine &tm)
> +  : AMDILInstrInfo(tm) { }
> +
> +void AMDGPUInstrInfo::convertToISA(MachineInstr & MI, MachineFunction &MF,
> +    DebugLoc DL) const
> +{
> +  MachineRegisterInfo &MRI = MF.getRegInfo();
> +  const AMDGPURegisterInfo & RI = getRegisterInfo();
> +
> +  for (unsigned i = 0; i < MI.getNumOperands(); i++) {
> +    MachineOperand &MO = MI.getOperand(i);
> +    // Convert dst regclass to one that is supported by the ISA
> +    if (MO.isReg() && MO.isDef()) {
> +      if (TargetRegisterInfo::isVirtualRegister(MO.getReg())) {
> +        const TargetRegisterClass * oldRegClass = MRI.getRegClass(MO.getReg());
> +        const TargetRegisterClass * newRegClass = RI.getISARegClass(oldRegClass);
> +
> +        assert(newRegClass);
> +
> +        MRI.setRegClass(MO.getReg(), newRegClass);
> +      }
> +    }
> +  }
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,46 @@
> +//===-- AMDGPUInstrInfo.h - AMDGPU Instruction Information ------*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the definition of a TargetInstrInfo class that is common
> +// to all AMD GPUs.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPUINSTRUCTIONINFO_H_
> +#define AMDGPUINSTRUCTIONINFO_H_
> +
> +#include "AMDGPURegisterInfo.h"
> +#include "AMDILInstrInfo.h"
> +
> +#include <map>
> +
> +namespace llvm {
> +
> +class AMDGPUTargetMachine;
> +class MachineFunction;
> +class MachineInstr;
> +class MachineInstrBuilder;
> +
> +class AMDGPUInstrInfo : public AMDILInstrInfo {
> +
> +public:
> +  explicit AMDGPUInstrInfo(AMDGPUTargetMachine &tm);
> +
> +  virtual const AMDGPURegisterInfo &getRegisterInfo() const = 0;
> +
> +  /// convertToISA - Convert the AMDIL MachineInstr to a supported ISA
> +  /// MachineInstr
> +  virtual void convertToISA(MachineInstr & MI, MachineFunction &MF,
> +    DebugLoc DL) const;
> +
> +};
> +
> +} // End llvm namespace
> +
> +#endif // AMDGPUINSTRUCTIONINFO_H_
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,69 @@
> +//===-- AMDGPUInstrInfo.td - AMDGPU DAG nodes --------------*- tablegen -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains DAG node definitions for the AMDGPU target.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
>
> +//===----------------------------------------------------------------------===//
> +// AMDGPU DAG Profiles
>
> +//===----------------------------------------------------------------------===//
> +
> +def AMDGPUDTIntTernaryOp : SDTypeProfile<1, 3, [
> +  SDTCisSameAs<0, 1>, SDTCisSameAs<0, 2>, SDTCisInt<0>, SDTCisInt<3>
> +]>;
> +
>
> +//===----------------------------------------------------------------------===//
> +// AMDGPU DAG Nodes
> +//
> +
> +// out = (((a << 32) | b) >> c)
> +//
> +// Can be used to optimize rotl:
> +// rotl(a, b) = bitalign(a, a, 32 - b)
> +def AMDGPUbitalign : SDNode<"AMDGPUISD::BITALIGN", AMDGPUDTIntTernaryOp>;
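
The rotate identity in the comment above can be checked with a small host-side model. This is only a sketch: `bitalign`, `rotl32`, and `rotlViaBitalign` are illustrative names, and masking the shift amount to the range 0..31 is an assumption about the hardware that the patch does not state.

```cpp
#include <cassert>
#include <cstdint>

// Model of the DAG node: out = ((a << 32) | b) >> c, truncated to 32 bits.
// Assumption: only the low five bits of c are used.
uint32_t bitalign(uint32_t a, uint32_t b, uint32_t c) {
  uint64_t wide = (static_cast<uint64_t>(a) << 32) | b;
  return static_cast<uint32_t>(wide >> (c & 31));
}

// Reference rotate-left, written to avoid an undefined shift by 32.
uint32_t rotl32(uint32_t a, uint32_t n) {
  n &= 31;
  return n == 0 ? a : (a << n) | (a >> (32 - n));
}

// rotl(a, n) = bitalign(a, a, 32 - n), as the comment in the patch states.
uint32_t rotlViaBitalign(uint32_t a, uint32_t n) {
  return bitalign(a, a, 32 - (n & 31));
}
```

Under the masking assumption, a rotate by zero maps to a shift of 32, which wraps to a shift of 0 and returns `a` unchanged.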
> +
> +// out = a - floor(a)
> +def AMDGPUfract : SDNode<"AMDGPUISD::FRACT", SDTFPUnaryOp>;
> +
> +// out = max(a, b) a and b are floats
> +def AMDGPUfmax : SDNode<"AMDGPUISD::FMAX", SDTFPBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// out = max(a, b) a and b are signed ints
> +def AMDGPUsmax : SDNode<"AMDGPUISD::SMAX", SDTIntBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// out = max(a, b) a and b are unsigned ints
> +def AMDGPUumax : SDNode<"AMDGPUISD::UMAX", SDTIntBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// out = min(a, b) a and b are floats
> +def AMDGPUfmin : SDNode<"AMDGPUISD::FMIN", SDTFPBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// out = min(a, b) a and b are signed ints
> +def AMDGPUsmin : SDNode<"AMDGPUISD::SMIN", SDTIntBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// out = min(a, b) a and b are unsigned ints
> +def AMDGPUumin : SDNode<"AMDGPUISD::UMIN", SDTIntBinOp,
> +  [SDNPCommutative, SDNPAssociative]
> +>;
> +
> +// urecip - This operation is a helper for integer division, it returns the
> +// result of 1 / a as a fractional unsigned integer.
> +// out = (2^32 / a) + e
> +// e is rounding error
> +def AMDGPUurecip : SDNode<"AMDGPUISD::URECIP", SDTIntUnaryOp>;
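
To illustrate how a 2^32 / a reciprocal helps integer division, here is a rough host-side sketch. It is not the lowering from the patch: `urecip` below is a software stand-in that always rounds down, so the quotient estimate is never too large and only an upward correction is needed. The quoted LowerUDIVREM code selects between Remainder - Den and Remainder + Den, presumably because the hardware's rounding error e can go either way. The names `urecip` and `udivViaRecip` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Software stand-in for URECIP: floor(2^32 / a). Requires a >= 2 so the
// result fits in 32 bits. The real instruction's rounding error is unknown here.
uint32_t urecip(uint32_t a) {
  return static_cast<uint32_t>((1ull << 32) / a);
}

// Unsigned division via multiply-by-reciprocal plus a correction step.
// Precondition: den != 0.
uint32_t udivViaRecip(uint32_t num, uint32_t den) {
  if (den == 1)
    return num; // 2^32 / 1 does not fit in 32 bits, handle separately.
  // High 32 bits of num * (2^32 / den) approximate num / den from below.
  uint32_t q = static_cast<uint32_t>(
      (static_cast<uint64_t>(num) * urecip(den)) >> 32);
  uint32_t rem = num - q * den; // q * den <= num, so this cannot wrap.
  // The estimate can be slightly low; bump it until the remainder is in range.
  while (rem >= den) {
    ++q;
    rem -= den;
  }
  return q;
}
```

Because the floor estimate undershoots by only a small amount, the correction loop runs at most a couple of times in this model.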
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUInstructions.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,123 @@
> +//===-- AMDGPUInstructions.td - Common instruction defs ---*- tablegen -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains instruction defs that are common to all hw codegen
> +// targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +class AMDGPUInst <dag outs, dag ins, string asm, list<dag> pattern> : Instruction {
> +  field bits<16> AMDILOp = 0;
> +  field bits<3> Gen = 0;
> +
> +  let Namespace = "AMDGPU";
> +  let OutOperandList = outs;
> +  let InOperandList = ins;
> +  let AsmString = asm;
> +  let Pattern = pattern;
> +  let Itinerary = NullALU;
> +  let TSFlags{42-40} = Gen;
> +  let TSFlags{63-48} = AMDILOp;
> +}
> +
> +class AMDGPUShaderInst <dag outs, dag ins, string asm, list<dag> pattern>
> +    : AMDGPUInst<outs, ins, asm, pattern> {
> +
> +  field bits<32> Inst = 0xffffffff;
> +
> +}
> +
> +class Constants {
> +int TWO_PI = 0x40c90fdb;
> +int PI = 0x40490fdb;
> +int TWO_PI_INV = 0x3e22f983;
> +}
> +def CONST : Constants;
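
The three integers in the Constants class read as IEEE-754 single-precision bit patterns. A quick way to confirm that is to reinterpret the bits on the host, as in the sketch below. `bitsToFloat` and `nearlyEqual` are illustrative helpers, not part of the patch, and the sketch assumes a 32-bit IEEE `float`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float without violating aliasing rules.
float bitsToFloat(uint32_t bits) {
  static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Loose comparison; the tolerance is arbitrary and only for illustration.
bool nearlyEqual(float a, float b) {
  return std::fabs(a - b) < 1e-6f;
}
```

Decoding 0x40490fdb, 0x40c90fdb, and 0x3e22f983 this way gives approximately pi, 2*pi, and 1/(2*pi) respectively, matching the field names.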
> +
> +def FP_ZERO : PatLeaf <
> +  (fpimm),
> +  [{return N->getValueAPF().isZero();}]
> +>;
> +
> +def FP_ONE : PatLeaf <
> +  (fpimm),
> +  [{return N->isExactlyValue(1.0);}]
> +>;
> +
> +let isCodeGenOnly = 1, isPseudo = 1, usesCustomInserter = 1  in {
> +
> +class CLAMP <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "CLAMP $dst, $src0",
> +  [(set rc:$dst, (int_AMDIL_clamp rc:$src0, (f32 FP_ZERO), (f32 FP_ONE)))]
> +>;
> +
> +class FABS <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "FABS $dst, $src0",
> +  [(set rc:$dst, (fabs rc:$src0))]
> +>;
> +
> +class FNEG <RegisterClass rc> : AMDGPUShaderInst <
> +  (outs rc:$dst),
> +  (ins rc:$src0),
> +  "FNEG $dst, $src0",
> +  [(set rc:$dst, (fneg rc:$src0))]
> +>;
> +
> +} // End isCodeGenOnly = 1, isPseudo = 1, usesCustomInserter = 1
> +
> +/* Generic helper patterns for intrinsics */
> +/* -------------------------------------- */
> +
> +class POW_Common <AMDGPUInst log_ieee, AMDGPUInst exp_ieee, AMDGPUInst mul,
> +                  RegisterClass rc> : Pat <
> +  (int_AMDGPU_pow rc:$src0, rc:$src1),
> +  (exp_ieee (mul rc:$src1, (log_ieee rc:$src0)))
> +>;
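
The POW_Common pattern expands pow(x, y) into exp(y * log(x)). A minimal numeric sketch of that identity follows. The base-2 functions are an assumption (the patch only names the instructions `exp_ieee` and `log_ieee`), though the identity holds for any matching base. `powViaExpLog` is a hypothetical name. Note that the expansion is only meaningful for x > 0, since the logarithm of zero or a negative number is not a finite real.

```cpp
#include <cassert>
#include <cmath>

// pow(x, y) rewritten as exp2(y * log2(x)). Valid for x > 0 only.
// Assumption: base 2 on both sides; the patch does not state the base.
float powViaExpLog(float x, float y) {
  return std::exp2(y * std::log2(x));
}
```

The result generally differs from `std::pow` by a few units in the last place, because the rounding errors of the logarithm and the multiply are amplified by the exponential.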
> +
> +/* Other helper patterns */
> +/* --------------------- */
> +
> +/* Extract element pattern */
> +class Extract_Element <ValueType sub_type, ValueType vec_type,
> +                     RegisterClass vec_class, int sub_idx,
> +                     SubRegIndex sub_reg>: Pat<
> +  (sub_type (vector_extract (vec_type vec_class:$src), sub_idx)),
> +  (EXTRACT_SUBREG vec_class:$src, sub_reg)
> +>;
> +
> +/* Insert element pattern */
> +class Insert_Element <ValueType elem_type, ValueType vec_type,
> +                      RegisterClass elem_class, RegisterClass vec_class,
> +                      int sub_idx, SubRegIndex sub_reg> : Pat <
> +
> +  (vec_type (vector_insert (vec_type vec_class:$vec),
> +                           (elem_type elem_class:$elem), sub_idx)),
> +  (INSERT_SUBREG vec_class:$vec, elem_class:$elem, sub_reg)
> +>;
> +
> +// Vector Build pattern
> +class Vector_Build <ValueType vecType, RegisterClass elemClass> : Pat <
> +  (IL_vbuild elemClass:$src),
> +  (INSERT_SUBREG (vecType (IMPLICIT_DEF)), elemClass:$src, sel_x)
> +>;
> +
> +// bitconvert pattern
> +class BitConvert <ValueType dt, ValueType st, RegisterClass rc> : Pat <
> +  (dt (bitconvert (st rc:$src0))),
> +  (dt rc:$src0)
> +>;
> +
> +include "R600Instructions.td"
> +
> +include "SIInstrInfo.td"
> +
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUIntrinsics.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,64 @@
> +//===-- AMDGPUIntrinsics.td - Common intrinsics  -*- tablegen -*-----------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file defines intrinsics that are used by all hw codegen targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +let TargetPrefix = "AMDGPU", isTarget = 1 in {
> +
> +  def int_AMDGPU_load_const : Intrinsic<[llvm_float_ty], [llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_load_imm : Intrinsic<[llvm_v4f32_ty], [llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_reserve_reg : Intrinsic<[], [llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_store_output : Intrinsic<[], [llvm_float_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_swizzle : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty], [IntrNoMem]>;
> +
> +  def int_AMDGPU_arl : Intrinsic<[llvm_i32_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_cndlt : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_cos : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_div : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_dp4 : Intrinsic<[llvm_float_ty], [llvm_v4f32_ty, llvm_v4f32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_floor : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_kill : Intrinsic<[], [llvm_float_ty], []>;
> +  def int_AMDGPU_kilp : Intrinsic<[], [], []>;
> +  def int_AMDGPU_lrp : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_mul : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_pow : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_rcp : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_rsq : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_seq : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_sgt : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_sge : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_sin : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_sle : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_sne : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_ssg : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_mullit : Intrinsic<[llvm_v4f32_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_tex : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_txb : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_txf : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_txq : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_txd : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_v4f32_ty, llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_txl : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_trunc : Intrinsic<[llvm_float_ty], [llvm_float_ty], [IntrNoMem]>;
> +  def int_AMDGPU_ddx : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_ddy : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_imax : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_imin : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_umax : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_umin : Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;
> +  def int_AMDGPU_cube : Intrinsic<[llvm_v4f32_ty], [llvm_v4f32_ty], [IntrNoMem]>;
> +}
> +
> +let TargetPrefix = "TGSI", isTarget = 1 in {
> +
> +  def int_TGSI_lit_z : Intrinsic<[llvm_float_ty], [llvm_float_ty, llvm_float_ty, llvm_float_ty],[]>;
> +}
> +
> +include "SIIntrinsics.td"
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,24 @@
> +//===-- AMDGPURegisterInfo.cpp - AMDGPU Register Information -------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// Parent TargetRegisterInfo class common to all hw codegen targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPURegisterInfo.h"
> +#include "AMDGPUTargetMachine.h"
> +
> +using namespace llvm;
> +
> +AMDGPURegisterInfo::AMDGPURegisterInfo(AMDGPUTargetMachine &tm,
> +    const TargetInstrInfo &tii)
> +: AMDILRegisterInfo(tm, tii),
> +  TM(tm),
> +  TII(tii)
> +  { }
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,42 @@
> +//===-- AMDGPURegisterInfo.h - AMDGPURegisterInfo Interface -*- C++ -*-----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// This file contains the TargetRegisterInfo interface that is implemented
> +// by all hw codegen targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPUREGISTERINFO_H_
> +#define AMDGPUREGISTERINFO_H_
> +
> +#include "AMDILRegisterInfo.h"
> +
> +namespace llvm {
> +
> +class AMDGPUTargetMachine;
> +class TargetInstrInfo;
> +
> +struct AMDGPURegisterInfo : public AMDILRegisterInfo
> +{
> +  AMDGPUTargetMachine &TM;
> +  const TargetInstrInfo &TII;
> +
> +  AMDGPURegisterInfo(AMDGPUTargetMachine &tm, const TargetInstrInfo &tii);
> +
> +  virtual BitVector getReservedRegs(const MachineFunction &MF) const = 0;
> +
> +  /// getISARegClass - rc is an AMDIL reg class.  This function returns the
> +  /// ISA reg class that is equivalent to the given AMDIL reg class.
> +  virtual const TargetRegisterClass *
> +    getISARegClass(const TargetRegisterClass * rc) const = 0;
> +};
> +
> +} // End namespace llvm
> +
> +#endif // AMDGPUREGISTERINFO_H_
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPURegisterInfo.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,22 @@
> +//===-- AMDGPURegisterInfo.td - AMDGPU register info -------*- tablegen -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// Tablegen register definitions common to all hw codegen targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +let Namespace = "AMDGPU" in {
> +  def sel_x : SubRegIndex;
> +  def sel_y : SubRegIndex;
> +  def sel_z : SubRegIndex;
> +  def sel_w : SubRegIndex;
> +}
> +
> +include "R600RegisterInfo.td"
> +include "SIRegisterInfo.td"
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUSubtarget.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,36 @@
> +//=====-- AMDGPUSubtarget.h - Define Subtarget for the AMDIL ---*- C++ -*-====//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//==-----------------------------------------------------------------------===//
> +//
> +// This file declares the AMDGPU specific subclass of TargetSubtarget.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef _AMDGPUSUBTARGET_H_
> +#define _AMDGPUSUBTARGET_H_
> +#include "AMDILSubtarget.h"
> +
> +namespace llvm {
> +
> +class AMDGPUSubtarget : public AMDILSubtarget
> +{
> +  InstrItineraryData InstrItins;
> +
> +public:
> +  AMDGPUSubtarget(StringRef TT, StringRef CPU, StringRef FS) :
> +    AMDILSubtarget(TT, CPU, FS)
> +  {
> +    InstrItins = getInstrItineraryForCPU(CPU);
> +  }
> +
> +  const InstrItineraryData &getInstrItineraryData() const { return InstrItins; }
> +};
> +
> +} // End namespace llvm
> +
> +#endif // _AMDGPUSUBTARGET_H_
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,162 @@
> +//===-- AMDGPUTargetMachine.cpp - TargetMachine for hw codegen targets-----===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +// The AMDGPU target machine contains all of the hardware specific information
> +// needed to emit code for R600 and SI GPUs.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPUTargetMachine.h"
> +#include "AMDGPU.h"
> +#include "R600ISelLowering.h"
> +#include "R600InstrInfo.h"
> +#include "SIISelLowering.h"
> +#include "SIInstrInfo.h"
> +#include "llvm/Analysis/Passes.h"
> +#include "llvm/Analysis/Verifier.h"
> +#include "llvm/CodeGen/MachineFunctionAnalysis.h"
> +#include "llvm/CodeGen/MachineModuleInfo.h"
> +#include "llvm/CodeGen/Passes.h"
> +#include "llvm/MC/MCAsmInfo.h"
> +#include "llvm/PassManager.h"
> +#include "llvm/Support/TargetRegistry.h"
> +#include "llvm/Support/raw_os_ostream.h"
> +#include "llvm/Transforms/IPO.h"
> +#include "llvm/Transforms/Scalar.h"
> +
> +using namespace llvm;
> +
> +extern "C" void LLVMInitializeAMDGPUTarget() {
> +  // Register the target
> +  RegisterTargetMachine<AMDGPUTargetMachine> X(TheAMDGPUTarget);
> +}
> +
> +AMDGPUTargetMachine::AMDGPUTargetMachine(const Target &T, StringRef TT,
> +    StringRef CPU, StringRef FS,
> +  TargetOptions Options,
> +  Reloc::Model RM, CodeModel::Model CM,
> +  CodeGenOpt::Level OptLevel
> +)
> +:
> +  LLVMTargetMachine(T, TT, CPU, FS, Options, RM, CM, OptLevel),
> +  Subtarget(TT, CPU, FS),
> +  DataLayout(Subtarget.getDataLayout()),
> +  FrameLowering(TargetFrameLowering::StackGrowsUp,
> +      Subtarget.device()->getStackAlignment(), 0),
> +  IntrinsicInfo(this),
> +  InstrItins(&Subtarget.getInstrItineraryData()),
> +  mDump(false)
> +
> +{
> +  // TLInfo uses InstrInfo so it must be initialized after.
> +  if (Subtarget.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) {
> +    InstrInfo = new R600InstrInfo(*this);
> +    TLInfo = new R600TargetLowering(*this);
> +  } else {
> +    InstrInfo = new SIInstrInfo(*this);
> +    TLInfo = new SITargetLowering(*this);
> +  }
> +}
> +
> +AMDGPUTargetMachine::~AMDGPUTargetMachine()
> +{
> +}
> +
> +bool AMDGPUTargetMachine::addPassesToEmitFile(PassManagerBase &PM,
> +                                              formatted_raw_ostream &Out,
> +                                              CodeGenFileType FileType,
> +                                              bool DisableVerify,
> +                                              AnalysisID StartAfter,
> +                                              AnalysisID StopAfter) {
> +  // XXX: Hack here addPassesToEmitFile will fail, but this is Ok since we are
> +  // only using it to access addPassesToGenerateCode()
> +  bool fail = LLVMTargetMachine::addPassesToEmitFile(PM, Out, FileType,
> +                                                     DisableVerify);
> +  assert(fail);
> +
> +  const AMDILSubtarget &STM = getSubtarget<AMDILSubtarget>();
> +  std::string gpu = STM.getDeviceName();
> +  if (gpu == "SI") {
> +    PM.add(createSICodeEmitterPass(Out));
> +  } else if (Subtarget.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) {
> +    PM.add(createR600CodeEmitterPass(Out));
> +  } else {
> +    abort();
> +    return true;
> +  }
> +  PM.add(createGCInfoDeleter());
> +
> +  return false;
> +}
> +
> +namespace {
> +class AMDGPUPassConfig : public TargetPassConfig {
> +public:
> +  AMDGPUPassConfig(AMDGPUTargetMachine *TM, PassManagerBase &PM)
> +    : TargetPassConfig(TM, PM) {}
> +
> +  AMDGPUTargetMachine &getAMDGPUTargetMachine() const {
> +    return getTM<AMDGPUTargetMachine>();
> +  }
> +
> +  virtual bool addPreISel();
> +  virtual bool addInstSelector();
> +  virtual bool addPreRegAlloc();
> +  virtual bool addPostRegAlloc();
> +  virtual bool addPreSched2();
> +  virtual bool addPreEmitPass();
> +};
> +} // End of anonymous namespace
> +
> +TargetPassConfig *AMDGPUTargetMachine::createPassConfig(PassManagerBase &PM) {
> +  return new AMDGPUPassConfig(this, PM);
> +}
> +
> +bool
> +AMDGPUPassConfig::addPreISel()
> +{
> +  const AMDILSubtarget &ST = TM->getSubtarget<AMDILSubtarget>();
> +  if (ST.device()->getGeneration() <= AMDILDeviceInfo::HD6XXX) {
> +    addPass(createR600KernelParametersPass(
> +                     getAMDGPUTargetMachine().getTargetData()));
> +  }
> +  return false;
> +}
> +
> +bool AMDGPUPassConfig::addInstSelector() {
> +  addPass(createAMDILPeepholeOpt(*TM));
> +  addPass(createAMDILISelDag(getAMDGPUTargetMachine()));
> +  return false;
> +}
> +
> +bool AMDGPUPassConfig::addPreRegAlloc() {
> +  const AMDILSubtarget &ST = TM->getSubtarget<AMDILSubtarget>();
> +
> +  if (ST.device()->getGeneration() > AMDILDeviceInfo::HD6XXX) {
> +    addPass(createSIAssignInterpRegsPass(*TM));
> +  }
> +  addPass(createAMDGPUConvertToISAPass(*TM));
> +  return false;
> +}
> +
> +bool AMDGPUPassConfig::addPostRegAlloc() {
> +  return false;
> +}
> +
> +bool AMDGPUPassConfig::addPreSched2() {
> +  return false;
> +}
> +
> +bool AMDGPUPassConfig::addPreEmitPass() {
> +  addPass(createAMDILCFGPreparationPass(*TM));
> +  addPass(createAMDILCFGStructurizerPass(*TM));
> +
> +  return false;
> +}
> +
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUTargetMachine.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,76 @@
> +//===-- AMDGPUTargetMachine.h - AMDGPU TargetMachine Interface --*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
>
> +//===----------------------------------------------------------------------===//
> +//
> +//  The AMDGPU TargetMachine interface definition for hw codegen targets.
> +//
>
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPU_TARGET_MACHINE_H
> +#define AMDGPU_TARGET_MACHINE_H
> +
> +#include "AMDGPUInstrInfo.h"
> +#include "AMDGPUSubtarget.h"
> +#include "AMDILFrameLowering.h"
> +#include "AMDILIntrinsicInfo.h"
> +#include "R600ISelLowering.h"
> +#include "llvm/ADT/OwningPtr.h"
> +#include "llvm/Target/TargetData.h"
> +
> +namespace llvm {
> +
> +MCAsmInfo* createMCAsmInfo(const Target &T, StringRef TT);
> +
> +class AMDGPUTargetMachine : public LLVMTargetMachine {
> +
> +  AMDGPUSubtarget Subtarget;
> +  const TargetData DataLayout;
> +  AMDILFrameLowering FrameLowering;
> +  AMDILIntrinsicInfo IntrinsicInfo;
> +  const AMDGPUInstrInfo * InstrInfo;
> +  AMDGPUTargetLowering * TLInfo;
> +  const InstrItineraryData* InstrItins;
> +  bool mDump;
> +
> +public:
> +   AMDGPUTargetMachine(const Target &T, StringRef TT, StringRef FS,
> +                       StringRef CPU,
> +                       TargetOptions Options,
> +                       Reloc::Model RM, CodeModel::Model CM,
> +                       CodeGenOpt::Level OL);
> +   ~AMDGPUTargetMachine();
> +   virtual const AMDILFrameLowering* getFrameLowering() const {
> +     return &FrameLowering;
> +   }
> +   virtual const AMDILIntrinsicInfo* getIntrinsicInfo() const {
> +     return &IntrinsicInfo;
> +   }
> +   virtual const AMDGPUInstrInfo *getInstrInfo() const {return InstrInfo;}
> +   virtual const AMDGPUSubtarget *getSubtargetImpl() const {return &Subtarget; }
> +   virtual const AMDGPURegisterInfo *getRegisterInfo() const {
> +      return &InstrInfo->getRegisterInfo();
> +   }
> +   virtual AMDGPUTargetLowering * getTargetLowering() const {
> +      return TLInfo;
> +   }
> +   virtual const InstrItineraryData* getInstrItineraryData() const {
> +      return InstrItins;
> +   }
> +   virtual const TargetData* getTargetData() const { return &DataLayout; }
> +   virtual TargetPassConfig *createPassConfig(PassManagerBase &PM);
> +   virtual bool addPassesToEmitFile(PassManagerBase &PM,
> +                                              formatted_raw_ostream &Out,
> +                                              CodeGenFileType FileType,
> +                                              bool DisableVerify,
> +                                              AnalysisID StartAfter = 0,
> +                                              AnalysisID StopAfter = 0);
> +};
> +
> +} // End namespace llvm
> +
> +#endif // AMDGPU_TARGET_MACHINE_H
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,139 @@
> +//===-- AMDGPUUtil.cpp - AMDGPU Utility functions -------------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Common utility functions used by hw codegen targets
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "AMDGPUUtil.h"
> +#include "AMDGPURegisterInfo.h"
> +#include "AMDIL.h"
> +#include "llvm/CodeGen/MachineFunction.h"
> +#include "llvm/CodeGen/MachineInstrBuilder.h"
> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> +#include "llvm/Target/TargetInstrInfo.h"
> +#include "llvm/Target/TargetMachine.h"
> +#include "llvm/Target/TargetRegisterInfo.h"
> +
> +using namespace llvm;
> +
> +// Some instructions act as placeholders to emulate operations that the GPU
> +// hardware does automatically. This function can be used to check if
> +// an opcode falls into this category.
> +bool AMDGPU::isPlaceHolderOpcode(unsigned opcode)
> +{
> +  switch (opcode) {
> +  default: return false;
> +  case AMDGPU::RETURN:
> +  case AMDGPU::LOAD_INPUT:
> +  case AMDGPU::LAST:
> +  case AMDGPU::MASK_WRITE:
> +  case AMDGPU::RESERVE_REG:
> +    return true;
> +  }
> +}
> +
> +bool AMDGPU::isTransOp(unsigned opcode)
> +{
> +  switch(opcode) {
> +    default: return false;
> +
> +    case AMDGPU::COS_r600:
> +    case AMDGPU::COS_eg:
> +    case AMDGPU::MULLIT:
> +    case AMDGPU::MUL_LIT_r600:
> +    case AMDGPU::MUL_LIT_eg:
> +    case AMDGPU::EXP_IEEE_r600:
> +    case AMDGPU::EXP_IEEE_eg:
> +    case AMDGPU::LOG_CLAMPED_r600:
> +    case AMDGPU::LOG_IEEE_r600:
> +    case AMDGPU::LOG_CLAMPED_eg:
> +    case AMDGPU::LOG_IEEE_eg:
> +      return true;
> +  }
> +}
> +
> +bool AMDGPU::isTexOp(unsigned opcode)
> +{
> +  switch(opcode) {
> +  default: return false;
> +  case AMDGPU::TEX_LD:
> +  case AMDGPU::TEX_GET_TEXTURE_RESINFO:
> +  case AMDGPU::TEX_SAMPLE:
> +  case AMDGPU::TEX_SAMPLE_C:
> +  case AMDGPU::TEX_SAMPLE_L:
> +  case AMDGPU::TEX_SAMPLE_C_L:
> +  case AMDGPU::TEX_SAMPLE_LB:
> +  case AMDGPU::TEX_SAMPLE_C_LB:
> +  case AMDGPU::TEX_SAMPLE_G:
> +  case AMDGPU::TEX_SAMPLE_C_G:
> +  case AMDGPU::TEX_GET_GRADIENTS_H:
> +  case AMDGPU::TEX_GET_GRADIENTS_V:
> +  case AMDGPU::TEX_SET_GRADIENTS_H:
> +  case AMDGPU::TEX_SET_GRADIENTS_V:
> +    return true;
> +  }
> +}
> +
> +bool AMDGPU::isReductionOp(unsigned opcode)
> +{
> +  switch(opcode) {
> +    default: return false;
> +    case AMDGPU::DOT4_r600:
> +    case AMDGPU::DOT4_eg:
> +      return true;
> +  }
> +}
> +
> +bool AMDGPU::isCubeOp(unsigned opcode)
> +{
> +  switch(opcode) {
> +    default: return false;
> +    case AMDGPU::CUBE_r600:
> +    case AMDGPU::CUBE_eg:
> +      return true;
> +  }
> +}
> +
> +
> +bool AMDGPU::isFCOp(unsigned opcode)
> +{
> +  switch(opcode) {
> +  default: return false;
> +  case AMDGPU::BREAK_LOGICALZ_f32:
> +  case AMDGPU::BREAK_LOGICALNZ_i32:
> +  case AMDGPU::BREAK_LOGICALZ_i32:
> +  case AMDGPU::BREAK_LOGICALNZ_f32:
> +  case AMDGPU::CONTINUE_LOGICALNZ_f32:
> +  case AMDGPU::IF_LOGICALNZ_i32:
> +  case AMDGPU::IF_LOGICALZ_f32:
> +  case AMDGPU::ELSE:
> +  case AMDGPU::ENDIF:
> +  case AMDGPU::ENDLOOP:
> +  case AMDGPU::IF_LOGICALNZ_f32:
> +  case AMDGPU::WHILELOOP:
> +    return true;
> +  }
> +}
> +
> +void AMDGPU::utilAddLiveIn(MachineFunction * MF,
> +                           MachineRegisterInfo & MRI,
> +                           const TargetInstrInfo * TII,
> +                           unsigned physReg, unsigned virtReg)
> +{
> +    if (!MRI.isLiveIn(physReg)) {
> +      MRI.addLiveIn(physReg, virtReg);
> +      MF->front().addLiveIn(physReg);
> +      BuildMI(MF->front(), MF->front().begin(), DebugLoc(),
> +              TII->get(TargetOpcode::COPY), virtReg)
> +                .addReg(physReg);
> +    } else {
> +      MRI.replaceRegWith(virtReg, MRI.getLiveInVirtReg(physReg));
> +    }
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDGPUUtil.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,46 @@
> +//===-- AMDGPUUtil.h - AMDGPU Utility function declarations -----*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// Declarations for utility functions common to all hw codegen targets.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDGPU_UTIL_H
> +#define AMDGPU_UTIL_H
> +
> +namespace llvm {
> +
> +class MachineFunction;
> +class MachineRegisterInfo;
> +class TargetInstrInfo;
> +
> +namespace AMDGPU {
> +
> +bool isPlaceHolderOpcode(unsigned opcode);
> +
> +bool isTransOp(unsigned opcode);
> +bool isTexOp(unsigned opcode);
> +bool isReductionOp(unsigned opcode);
> +bool isCubeOp(unsigned opcode);
> +bool isFCOp(unsigned opcode);
> +
> +// XXX: Move these to AMDGPUInstrInfo.h
> +#define MO_FLAG_CLAMP (1 << 0)
> +#define MO_FLAG_NEG   (1 << 1)
> +#define MO_FLAG_ABS   (1 << 2)
> +#define MO_FLAG_MASK  (1 << 3)
> +
> +void utilAddLiveIn(MachineFunction * MF, MachineRegisterInfo & MRI,
> +    const TargetInstrInfo * TII, unsigned physReg, unsigned virtReg);
> +
> +} // End namespace AMDGPU
> +
> +} // End namespace llvm
> +
> +#endif // AMDGPU_UTIL_H
>
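
On the MO_FLAG_* macros above: the XXX note already says they belong
elsewhere, and typed constants would be easier to review than macros. A
minimal sketch, assuming the flags live in a plain unsigned word (the patch
does not show where they are stored). The enum and helper names are
hypothetical; only the bit values come from the quoted header.

```cpp
#include <cassert>

namespace sketch {
// Hypothetical typed stand-in for the MO_FLAG_* macros. Bit values mirror
// the quoted header; everything else is illustrative.
enum OperandFlag : unsigned {
  FlagClamp = 1u << 0,
  FlagNeg   = 1u << 1,
  FlagAbs   = 1u << 2,
  FlagMask  = 1u << 3
};

// Set one flag bit in a flags word.
inline unsigned addFlag(unsigned Flags, OperandFlag F) { return Flags | F; }

// Clear one flag bit, leaving the others untouched.
inline unsigned clearFlag(unsigned Flags, OperandFlag F) {
  return Flags & ~static_cast<unsigned>(F);
}

// Test whether a flag bit is set.
inline bool hasFlag(unsigned Flags, OperandFlag F) { return (Flags & F) != 0; }
} // namespace sketch
```

Because each flag occupies its own bit, NEG and ABS can be combined on the
same operand and cleared independently.
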
> Added: llvm/trunk/lib/Target/AMDGPU/AMDIL.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDIL.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDIL.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,251 @@
> +//===-- AMDIL.h - Top-level interface for AMDIL representation --*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +//
> +// This file contains the entry points for global functions defined in the LLVM
> +// AMDIL back-end.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#ifndef AMDIL_H_
> +#define AMDIL_H_
> +
> +#include "llvm/CodeGen/MachineFunction.h"
> +#include "llvm/Target/TargetMachine.h"
> +
> +#define AMDIL_MAJOR_VERSION 2
> +#define AMDIL_MINOR_VERSION 0
> +#define AMDIL_REVISION_NUMBER 74
> +#define ARENA_SEGMENT_RESERVED_UAVS 12
> +#define DEFAULT_ARENA_UAV_ID 8
> +#define DEFAULT_RAW_UAV_ID 7
> +#define GLOBAL_RETURN_RAW_UAV_ID 11
> +#define HW_MAX_NUM_CB 8
> +#define MAX_NUM_UNIQUE_UAVS 8
> +#define OPENCL_MAX_NUM_ATOMIC_COUNTERS 8
> +#define OPENCL_MAX_READ_IMAGES 128
> +#define OPENCL_MAX_WRITE_IMAGES 8
> +#define OPENCL_MAX_SAMPLERS 16
> +
> +// The following default IDs can never be zero, as zero is the ID that is
> +// used to assert against.
> +#define DEFAULT_LDS_ID     1
> +#define DEFAULT_GDS_ID     1
> +#define DEFAULT_SCRATCH_ID 1
> +#define DEFAULT_VEC_SLOTS  8
> +
> +// SC->CAL version matchings.
> +#define CAL_VERSION_SC_150               1700
> +#define CAL_VERSION_SC_149               1700
> +#define CAL_VERSION_SC_148               1525
> +#define CAL_VERSION_SC_147               1525
> +#define CAL_VERSION_SC_146               1525
> +#define CAL_VERSION_SC_145               1451
> +#define CAL_VERSION_SC_144               1451
> +#define CAL_VERSION_SC_143               1441
> +#define CAL_VERSION_SC_142               1441
> +#define CAL_VERSION_SC_141               1420
> +#define CAL_VERSION_SC_140               1400
> +#define CAL_VERSION_SC_139               1387
> +#define CAL_VERSION_SC_138               1387
> +#define CAL_APPEND_BUFFER_SUPPORT        1340
> +#define CAL_VERSION_SC_137               1331
> +#define CAL_VERSION_SC_136                982
> +#define CAL_VERSION_SC_135                950
> +#define CAL_VERSION_GLOBAL_RETURN_BUFFER  990
> +
> +#define OCL_DEVICE_RV710        0x0001
> +#define OCL_DEVICE_RV730        0x0002
> +#define OCL_DEVICE_RV770        0x0004
> +#define OCL_DEVICE_CEDAR        0x0008
> +#define OCL_DEVICE_REDWOOD      0x0010
> +#define OCL_DEVICE_JUNIPER      0x0020
> +#define OCL_DEVICE_CYPRESS      0x0040
> +#define OCL_DEVICE_CAICOS       0x0080
> +#define OCL_DEVICE_TURKS        0x0100
> +#define OCL_DEVICE_BARTS        0x0200
> +#define OCL_DEVICE_CAYMAN       0x0400
> +#define OCL_DEVICE_ALL          0x3FFF
> +
> +/// The number of function IDs that are reserved for
> +/// internal compiler usage.
> +const unsigned int RESERVED_FUNCS = 1024;
> +
> +#define AMDIL_OPT_LEVEL_DECL
> +#define  AMDIL_OPT_LEVEL_VAR
> +#define AMDIL_OPT_LEVEL_VAR_NO_COMMA
> +
> +namespace llvm {
> +class AMDILInstrPrinter;
> +class FunctionPass;
> +class MCAsmInfo;
> +class raw_ostream;
> +class Target;
> +class TargetMachine;
> +
> +/// Instruction selection passes.
> +FunctionPass*
> +  createAMDILISelDag(TargetMachine &TM AMDIL_OPT_LEVEL_DECL);
> +FunctionPass*
> +  createAMDILPeepholeOpt(TargetMachine &TM AMDIL_OPT_LEVEL_DECL);
> +
> +/// Pre emit passes.
> +FunctionPass*
> +  createAMDILCFGPreparationPass(TargetMachine &TM AMDIL_OPT_LEVEL_DECL);
> +FunctionPass*
> +  createAMDILCFGStructurizerPass(TargetMachine &TM AMDIL_OPT_LEVEL_DECL);
> +
> +extern Target TheAMDILTarget;
> +extern Target TheAMDGPUTarget;
> +} // end namespace llvm;
> +
> +#define GET_REGINFO_ENUM
> +#include "AMDGPUGenRegisterInfo.inc"
> +#define GET_INSTRINFO_ENUM
> +#include "AMDGPUGenInstrInfo.inc"
> +
> +/// Include device information enumerations
> +#include "AMDILDeviceInfo.h"
> +
> +namespace llvm {
> +/// OpenCL uses address spaces to differentiate between
> +/// various memory regions on the hardware. On the CPU,
> +/// all of the address spaces point to the same memory;
> +/// on the GPU, however, each address space points to
> +/// a separate piece of memory that is distinct from other
> +/// memory locations.
> +namespace AMDILAS {
> +enum AddressSpaces {
> +  PRIVATE_ADDRESS  = 0, // Address space for private memory.
> +  GLOBAL_ADDRESS   = 1, // Address space for global memory (RAT0, VTX0).
> +  CONSTANT_ADDRESS = 2, // Address space for constant memory.
> +  LOCAL_ADDRESS    = 3, // Address space for local memory.
> +  REGION_ADDRESS   = 4, // Address space for region memory.
> +  ADDRESS_NONE     = 5, // Address space for unknown memory.
> +  PARAM_D_ADDRESS  = 6, // Address space for directly addressable parameter memory (CONST0)
> +  PARAM_I_ADDRESS  = 7, // Address space for indirectly addressable parameter memory (VTX1)
> +  USER_SGPR_ADDRESS = 8, // Address space for USER_SGPRS on SI
> +  LAST_ADDRESS     = 9
> +};
> +
> +// This union/struct combination is an easy way to read out the
> +// exact bits that are needed.
> +typedef union ResourceRec {
> +  struct {
> +#ifdef __BIG_ENDIAN__
> +    unsigned short isImage       : 1;  // Reserved for future use/llvm.
> +    unsigned short ResourceID    : 10; // Specifies the resource ID for
> +                                       // the op.
> +    unsigned short HardwareInst  : 1;  // Flag to specify that this instruction
> +                                       // is a hardware instruction.
> +    unsigned short ConflictPtr   : 1;  // Flag to specify that the pointer has a
> +                                       // conflict.
> +    unsigned short ByteStore     : 1;  // Flag to specify if the op is a byte
> +                                       // store op.
> +    unsigned short PointerPath   : 1;  // Flag to specify if the op is on the
> +                                       // pointer path.
> +    unsigned short CacheableRead : 1;  // Flag to specify if the read is
> +                                       // cacheable.
> +#else
> +    unsigned short CacheableRead : 1;  // Flag to specify if the read is
> +                                       // cacheable.
> +    unsigned short PointerPath   : 1;  // Flag to specify if the op is on the
> +                                       // pointer path.
> +    unsigned short ByteStore     : 1;  // Flag to specify if the op is a byte
> +                                       // store op.
> +    unsigned short ConflictPtr   : 1;  // Flag to specify that the pointer has
> +                                       // a conflict.
> +    unsigned short HardwareInst  : 1;  // Flag to specify that this instruction
> +                                       // is a hardware instruction.
> +    unsigned short ResourceID    : 10; // Specifies the resource ID for
> +                                       // the op.
> +    unsigned short isImage       : 1;  // Reserved for future use.
> +#endif
> +  } bits;
> +  unsigned short u16all;
> +} InstrResEnc;
> +
> +} // namespace AMDILAS
> +
> +// Enums corresponding to AMDIL condition codes for IL.  These
> +// values must be kept in sync with the ones in the .td file.
> +namespace AMDILCC {
> +enum CondCodes {
> +  // AMDIL specific condition codes. These correspond to the IL_CC_*
> +  // in AMDILInstrInfo.td and must be kept in the same order.
> +  IL_CC_D_EQ  =  0,   // DEQ instruction.
> +  IL_CC_D_GE  =  1,   // DGE instruction.
> +  IL_CC_D_LT  =  2,   // DLT instruction.
> +  IL_CC_D_NE  =  3,   // DNE instruction.
> +  IL_CC_F_EQ  =  4,   //  EQ instruction.
> +  IL_CC_F_GE  =  5,   //  GE instruction.
> +  IL_CC_F_LT  =  6,   //  LT instruction.
> +  IL_CC_F_NE  =  7,   //  NE instruction.
> +  IL_CC_I_EQ  =  8,   // IEQ instruction.
> +  IL_CC_I_GE  =  9,   // IGE instruction.
> +  IL_CC_I_LT  = 10,   // ILT instruction.
> +  IL_CC_I_NE  = 11,   // INE instruction.
> +  IL_CC_U_GE  = 12,   // UGE instruction.
> +  IL_CC_U_LT  = 13,   // ULT instruction.
> +  // Pseudo IL Comparison instructions here.
> +  IL_CC_F_GT  = 14,   //  GT instruction.
> +  IL_CC_U_GT  = 15,
> +  IL_CC_I_GT  = 16,
> +  IL_CC_D_GT  = 17,
> +  IL_CC_F_LE  = 18,   //  LE instruction
> +  IL_CC_U_LE  = 19,
> +  IL_CC_I_LE  = 20,
> +  IL_CC_D_LE  = 21,
> +  IL_CC_F_UNE = 22,
> +  IL_CC_F_UEQ = 23,
> +  IL_CC_F_ULT = 24,
> +  IL_CC_F_UGT = 25,
> +  IL_CC_F_ULE = 26,
> +  IL_CC_F_UGE = 27,
> +  IL_CC_F_ONE = 28,
> +  IL_CC_F_OEQ = 29,
> +  IL_CC_F_OLT = 30,
> +  IL_CC_F_OGT = 31,
> +  IL_CC_F_OLE = 32,
> +  IL_CC_F_OGE = 33,
> +  IL_CC_D_UNE = 34,
> +  IL_CC_D_UEQ = 35,
> +  IL_CC_D_ULT = 36,
> +  IL_CC_D_UGT = 37,
> +  IL_CC_D_ULE = 38,
> +  IL_CC_D_UGE = 39,
> +  IL_CC_D_ONE = 40,
> +  IL_CC_D_OEQ = 41,
> +  IL_CC_D_OLT = 42,
> +  IL_CC_D_OGT = 43,
> +  IL_CC_D_OLE = 44,
> +  IL_CC_D_OGE = 45,
> +  IL_CC_U_EQ  = 46,
> +  IL_CC_U_NE  = 47,
> +  IL_CC_F_O   = 48,
> +  IL_CC_D_O   = 49,
> +  IL_CC_F_UO  = 50,
> +  IL_CC_D_UO  = 51,
> +  IL_CC_L_LE  = 52,
> +  IL_CC_L_GE  = 53,
> +  IL_CC_L_EQ  = 54,
> +  IL_CC_L_NE  = 55,
> +  IL_CC_L_LT  = 56,
> +  IL_CC_L_GT  = 57,
> +  IL_CC_UL_LE = 58,
> +  IL_CC_UL_GE = 59,
> +  IL_CC_UL_EQ = 60,
> +  IL_CC_UL_NE = 61,
> +  IL_CC_UL_LT = 62,
> +  IL_CC_UL_GT = 63,
> +  COND_ERROR  = 64
> +};
> +
> +} // end namespace AMDILCC
> +} // end namespace llvm
> +#endif // AMDIL_H_
>
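
On InstrResEnc above: the two #ifdef branches rely on implementation-defined
bit-field layout. Explicit shifts and masks avoid that dependency. This is a
sketch only, assuming the layout of the little-endian branch with fields
allocated from the least significant bit upward. All names below are
hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Assumed bit positions, read off the non-__BIG_ENDIAN__ branch of the
// quoted union (LSB-first allocation is an assumption, not a guarantee).
constexpr uint16_t CacheableReadBit = 1u << 0;
constexpr uint16_t PointerPathBit   = 1u << 1;
constexpr uint16_t ByteStoreBit     = 1u << 2;
constexpr uint16_t ConflictPtrBit   = 1u << 3;
constexpr uint16_t HardwareInstBit  = 1u << 4;
constexpr unsigned ResourceIDShift  = 5;     // 10-bit field in bits 5..14
constexpr uint16_t ResourceIDMask   = 0x3FF;
constexpr uint16_t IsImageBit       = 1u << 15;

// Extract the 10-bit resource ID from the packed 16-bit encoding.
inline uint16_t getResourceID(uint16_t Enc) {
  return (Enc >> ResourceIDShift) & ResourceIDMask;
}

// Return Enc with its resource ID replaced by ID (truncated to 10 bits).
inline uint16_t setResourceID(uint16_t Enc, uint16_t ID) {
  unsigned Cleared = Enc & ~(unsigned(ResourceIDMask) << ResourceIDShift);
  unsigned Field = unsigned(ID & ResourceIDMask) << ResourceIDShift;
  return static_cast<uint16_t>(Cleared | Field);
}

// Test the byte-store flag.
inline bool isByteStore(uint16_t Enc) { return (Enc & ByteStoreBit) != 0; }
} // namespace sketch
```

With this form the encoding is the same 16-bit value on every host, so no
endianness #ifdef is needed.
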
> Added: llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,128 @@
> +//===-- AMDIL7XXDevice.cpp - Device Info for 7XX GPUs ---------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +#include "AMDIL7XXDevice.h"
> +#include "AMDILDevice.h"
> +
> +using namespace llvm;
> +
> +AMDIL7XXDevice::AMDIL7XXDevice(AMDILSubtarget *ST) : AMDILDevice(ST)
> +{
> +  setCaps();
> +  std::string name = mSTM->getDeviceName();
> +  if (name == "rv710") {
> +    mDeviceFlag = OCL_DEVICE_RV710;
> +  } else if (name == "rv730") {
> +    mDeviceFlag = OCL_DEVICE_RV730;
> +  } else {
> +    mDeviceFlag = OCL_DEVICE_RV770;
> +  }
> +}
> +
> +AMDIL7XXDevice::~AMDIL7XXDevice()
> +{
> +}
> +
> +void AMDIL7XXDevice::setCaps()
> +{
> +  mSWBits.set(AMDILDeviceInfo::LocalMem);
> +}
> +
> +size_t AMDIL7XXDevice::getMaxLDSSize() const
> +{
> +  if (usesHardware(AMDILDeviceInfo::LocalMem)) {
> +    return MAX_LDS_SIZE_700;
> +  }
> +  return 0;
> +}
> +
> +size_t AMDIL7XXDevice::getWavefrontSize() const
> +{
> +  return AMDILDevice::HalfWavefrontSize;
> +}
> +
> +uint32_t AMDIL7XXDevice::getGeneration() const
> +{
> +  return AMDILDeviceInfo::HD4XXX;
> +}
> +
> +uint32_t AMDIL7XXDevice::getResourceID(uint32_t DeviceID) const
> +{
> +  switch (DeviceID) {
> +  default:
> +    assert(0 && "ID type passed in is unknown!");
> +    break;
> +  case GLOBAL_ID:
> +  case CONSTANT_ID:
> +  case RAW_UAV_ID:
> +  case ARENA_UAV_ID:
> +    break;
> +  case LDS_ID:
> +    if (usesHardware(AMDILDeviceInfo::LocalMem)) {
> +      return DEFAULT_LDS_ID;
> +    }
> +    break;
> +  case SCRATCH_ID:
> +    if (usesHardware(AMDILDeviceInfo::PrivateMem)) {
> +      return DEFAULT_SCRATCH_ID;
> +    }
> +    break;
> +  case GDS_ID:
> +    assert(0 && "GDS UAV ID is not supported on this chip");
> +    if (usesHardware(AMDILDeviceInfo::RegionMem)) {
> +      return DEFAULT_GDS_ID;
> +    }
> +    break;
> +  };
> +
> +  return 0;
> +}
> +
> +uint32_t AMDIL7XXDevice::getMaxNumUAVs() const
> +{
> +  return 1;
> +}
> +
> +AMDIL770Device::AMDIL770Device(AMDILSubtarget *ST): AMDIL7XXDevice(ST)
> +{
> +  setCaps();
> +}
> +
> +AMDIL770Device::~AMDIL770Device()
> +{
> +}
> +
> +void AMDIL770Device::setCaps()
> +{
> +  if (mSTM->isOverride(AMDILDeviceInfo::DoubleOps)) {
> +    mSWBits.set(AMDILDeviceInfo::FMA);
> +    mHWBits.set(AMDILDeviceInfo::DoubleOps);
> +  }
> +  mSWBits.set(AMDILDeviceInfo::BarrierDetect);
> +  mHWBits.reset(AMDILDeviceInfo::LongOps);
> +  mSWBits.set(AMDILDeviceInfo::LongOps);
> +  mSWBits.set(AMDILDeviceInfo::LocalMem);
> +}
> +
> +size_t AMDIL770Device::getWavefrontSize() const
> +{
> +  return AMDILDevice::WavefrontSize;
> +}
> +
> +AMDIL710Device::AMDIL710Device(AMDILSubtarget *ST) : AMDIL7XXDevice(ST)
> +{
> +}
> +
> +AMDIL710Device::~AMDIL710Device()
> +{
> +}
> +
> +size_t AMDIL710Device::getWavefrontSize() const
> +{
> +  return AMDILDevice::QuarterWavefrontSize;
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDIL7XXDevice.h Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,71 @@
> +//==-- AMDIL7XXDevice.h - Define 7XX Device for AMDIL ---*- C++ -*--===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +//
> +// Interface for the subtarget data classes.
> +//
> +//===----------------------------------------------------------------------===//
> +// This file defines the interface that each generation needs to
> +// implement in order to correctly answer queries on the capabilities of the
> +// specific hardware.
> +//===----------------------------------------------------------------------===//
> +#ifndef _AMDIL7XXDEVICEIMPL_H_
> +#define _AMDIL7XXDEVICEIMPL_H_
> +#include "AMDILDevice.h"
> +#include "AMDILSubtarget.h"
> +
> +namespace llvm {
> +class AMDILSubtarget;
> +
> +//===----------------------------------------------------------------------===//
> +// 7XX generation of devices and their respective subclasses
> +//===----------------------------------------------------------------------===//
> +
> +// The AMDIL7XXDevice class represents the generic 7XX device. All 7XX
> +// devices are derived from this class. The AMDIL7XX device will only
> +// support the minimal features that are required to be considered OpenCL 1.0
> +// compliant and nothing more.
> +class AMDIL7XXDevice : public AMDILDevice {
> +public:
> +  AMDIL7XXDevice(AMDILSubtarget *ST);
> +  virtual ~AMDIL7XXDevice();
> +  virtual size_t getMaxLDSSize() const;
> +  virtual size_t getWavefrontSize() const;
> +  virtual uint32_t getGeneration() const;
> +  virtual uint32_t getResourceID(uint32_t DeviceID) const;
> +  virtual uint32_t getMaxNumUAVs() const;
> +
> +protected:
> +  virtual void setCaps();
> +}; // AMDIL7XXDevice
> +
> +// The AMDIL770Device class represents the RV770 chip and its
> +// derivative cards. The difference between this device and the base
> +// class is that this device adds support for double precision
> +// and has a larger wavefront size.
> +class AMDIL770Device : public AMDIL7XXDevice {
> +public:
> +  AMDIL770Device(AMDILSubtarget *ST);
> +  virtual ~AMDIL770Device();
> +  virtual size_t getWavefrontSize() const;
> +private:
> +  virtual void setCaps();
> +}; // AMDIL770Device
> +
> +// The AMDIL710Device class derives from the 7XX base class, but this
> +// class is a smaller derivative, so we need to override some of the
> +// functions in order to correctly specify this information.
> +class AMDIL710Device : public AMDIL7XXDevice {
> +public:
> +  AMDIL710Device(AMDILSubtarget *ST);
> +  virtual ~AMDIL710Device();
> +  virtual size_t getWavefrontSize() const;
> +}; // AMDIL710Device
> +
> +} // namespace llvm
> +#endif // _AMDIL7XXDEVICEIMPL_H_
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDILAlgorithms.tpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,93 @@
> +//===------ AMDILAlgorithms.tpp - AMDIL Template Algorithms Header --------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +// This file provides template algorithms that extend the STL algorithms and
> +// are useful for the AMDIL backend.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +// A template function that loops through the iterators and passes each
> +// element, along with the second argument, to the function. In the 'safe'
> +// variant, if the function returns true, the current iterator is treated as
> +// invalidated and moves back one position before moving forward to the next
> +// iterator; otherwise it moves forward without issue. This is based on the
> +// for_each STL function, but allows a reference to the second argument.
> +template<class InputIterator, class Function, typename Arg>
> +Function binaryForEach(InputIterator First, InputIterator Last, Function F,
> +                       Arg &Second)
> +{
> +  for ( ; First!=Last; ++First ) {
> +    F(*First, Second);
> +  }
> +  return F;
> +}
> +
> +template<class InputIterator, class Function, typename Arg>
> +Function safeBinaryForEach(InputIterator First, InputIterator Last, Function F,
> +                           Arg &Second)
> +{
> +  for ( ; First!=Last; ++First ) {
> +    if (F(*First, Second)) {
> +      --First;
> +    }
> +  }
> +  return F;
> +}
> +
> +// A template function that has two levels of looping before calling the
> +// function with the passed in argument. See binaryForEach for further
> +// explanation
> +template<class InputIterator, class Function, typename Arg>
> +Function binaryNestedForEach(InputIterator First, InputIterator Last,
> +                             Function F, Arg &Second)
> +{
> +  for ( ; First != Last; ++First) {
> +    binaryForEach(First->begin(), First->end(), F, Second);
> +  }
> +  return F;
> +}
> +template<class InputIterator, class Function, typename Arg>
> +Function safeBinaryNestedForEach(InputIterator First, InputIterator Last,
> +                                 Function F, Arg &Second)
> +{
> +  for ( ; First != Last; ++First) {
> +    safeBinaryForEach(First->begin(), First->end(), F, Second);
> +  }
> +  return F;
> +}
> +
> +// Unlike the STL, a pointer to the iterator itself is passed in with the 'safe'
> +// versions of these functions. This allows the function to handle situations
> +// such as invalidated iterators.
> +template<class InputIterator, class Function>
> +Function safeForEach(InputIterator First, InputIterator Last, Function F)
> +{
> +  for ( ; First!=Last; ++First )
> +    F(&First);
> +  return F;
> +}
> +
> +// A template function that has two levels of looping before calling the
> +// function with a pointer to the current iterator. See binaryForEach for
> +// further explanation
> +template<class InputIterator, class SecondIterator, class Function>
> +Function safeNestedForEach(InputIterator First, InputIterator Last,
> +                              SecondIterator S, Function F)
> +{
> +  for ( ; First != Last; ++First) {
> +    SecondIterator sf, sl;
> +    for (sf = First->begin(), sl = First->end();
> +         sf != sl; )  {
> +      if (!F(&sf)) {
> +        ++sf;
> +      }
> +    }
> +  }
> +  return F;
> +}
>
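
On safeBinaryForEach above: stepping back with --First after the callback
has invalidated the iterator is not something the standard containers
guarantee, and it cannot work at all when First is the first element. Below
is a sketch of the usual erase idiom, where the loop takes its next position
from erase(). Note that this swaps in a different contract from the patch
(the predicate only decides, the loop does the erasing), and eraseIf is a
hypothetical name.

```cpp
#include <cassert>
#include <list>

namespace sketch {
// Visit every element of C. When P returns true the element is erased and
// iteration continues from the iterator that erase() returns, so the loop
// never touches an invalidated iterator and never steps before begin().
template <class Container, class Pred>
void eraseIf(Container &C, Pred P) {
  for (auto It = C.begin(); It != C.end();) {
    if (P(*It))
      It = C.erase(It);
    else
      ++It;
  }
}
} // namespace sketch
```

Erasing the very first element is the case that the step-back approach
cannot express, and it works here without special handling.
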
> Added: llvm/trunk/lib/Target/AMDGPU/AMDILBase.td
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILBase.td?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDILBase.td (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDILBase.td Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,113 @@
> +//===- AMDILBase.td - AMDIL Target Machine -------------*- tablegen -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +// Target-independent interfaces which we are implementing
> +//===----------------------------------------------------------------------===//
> +
> +include "llvm/Target/Target.td"
> +
> +// Dummy Instruction itineraries for pseudo instructions
> +def ALU_NULL : FuncUnit;
> +def NullALU : InstrItinClass;
> +
> +//===----------------------------------------------------------------------===//
> +// AMDIL Subtarget features.
> +//===----------------------------------------------------------------------===//
> +def FeatureFP64     : SubtargetFeature<"fp64",
> +        "CapsOverride[AMDILDeviceInfo::DoubleOps]",
> +        "true",
> +        "Enable 64bit double precision operations">;
> +def FeatureByteAddress    : SubtargetFeature<"byte_addressable_store",
> +        "CapsOverride[AMDILDeviceInfo::ByteStores]",
> +        "true",
> +        "Enable byte addressable stores">;
> +def FeatureBarrierDetect : SubtargetFeature<"barrier_detect",
> +        "CapsOverride[AMDILDeviceInfo::BarrierDetect]",
> +        "true",
> +        "Enable duplicate barrier detection (HD5XXX or later).">;
> +def FeatureImages : SubtargetFeature<"images",
> +        "CapsOverride[AMDILDeviceInfo::Images]",
> +        "true",
> +        "Enable image functions">;
> +def FeatureMultiUAV : SubtargetFeature<"multi_uav",
> +        "CapsOverride[AMDILDeviceInfo::MultiUAV]",
> +        "true",
> +        "Generate multiple UAV code (HD5XXX family or later)">;
> +def FeatureMacroDB : SubtargetFeature<"macrodb",
> +        "CapsOverride[AMDILDeviceInfo::MacroDB]",
> +        "true",
> +        "Use internal macrodb, instead of macrodb in driver">;
> +def FeatureNoAlias : SubtargetFeature<"noalias",
> +        "CapsOverride[AMDILDeviceInfo::NoAlias]",
> +        "true",
> +        "Assert that all kernel argument pointers are not aliased">;
> +def FeatureNoInline : SubtargetFeature<"no-inline",
> +        "CapsOverride[AMDILDeviceInfo::NoInline]",
> +        "true",
> +        "Specify whether functions should not be inlined">;
> +
> +def Feature64BitPtr : SubtargetFeature<"64BitPtr",
> +        "mIs64bit",
> +        "false",
> +        "Specify if 64bit addressing should be used.">;
> +
> +def Feature32on64BitPtr : SubtargetFeature<"64on32BitPtr",
> +        "mIs32on64bit",
> +        "false",
> +        "Specify if 64bit sized pointers with 32bit addressing should be used.">;
> +def FeatureDebug : SubtargetFeature<"debug",
> +        "CapsOverride[AMDILDeviceInfo::Debug]",
> +        "true",
> +        "Debug mode is enabled, so disable hardware accelerated address spaces.">;
> +def FeatureDumpCode : SubtargetFeature <"DumpCode",
> +        "mDumpCode",
> +        "true",
> +        "Dump MachineInstrs in the CodeEmitter">;
> +
> +
> +//===----------------------------------------------------------------------===//
> +// Register File, Calling Conv, Instruction Descriptions
> +//===----------------------------------------------------------------------===//
> +
> +
> +include "AMDILRegisterInfo.td"
> +include "AMDILCallingConv.td"
> +include "AMDILInstrInfo.td"
> +
> +def AMDILInstrInfo : InstrInfo {}
> +
> +//===----------------------------------------------------------------------===//
> +// AMDIL processors supported.
> +//===----------------------------------------------------------------------===//
> +//include "Processors.td"
> +
> +//===----------------------------------------------------------------------===//
> +// Declare the target which we are implementing
> +//===----------------------------------------------------------------------===//
> +def AMDILAsmWriter : AsmWriter {
> +    string AsmWriterClassName = "AsmPrinter";
> +    int Variant = 0;
> +}
> +
> +def AMDILAsmParser : AsmParser {
> +    string AsmParserClassName = "AsmParser";
> +    int Variant = 0;
> +
> +    string CommentDelimiter = ";";
> +
> +    string RegisterPrefix = "r";
> +
> +}
> +
> +
> +def AMDIL : Target {
> +  // Pull in Instruction Info:
> +  let InstructionSet = AMDILInstrInfo;
> +  let AssemblyWriters = [AMDILAsmWriter];
> +  let AssemblyParsers = [AMDILAsmParser];
> +}
>
> Added: llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp?rev=160270&view=auto
>
> ==============================================================================
> --- llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp (added)
> +++ llvm/trunk/lib/Target/AMDGPU/AMDILCFGStructurizer.cpp Mon Jul 16 09:17:08 2012
> @@ -0,0 +1,3236 @@
> +//===-- AMDILCFGStructurizer.cpp - CFG Structurizer -----------------------===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//==-----------------------------------------------------------------------===//
> +
> +#define DEBUGME 0
> +#define DEBUG_TYPE "structcfg"
> +
> +#include "AMDIL.h"
> +#include "AMDILInstrInfo.h"
> +#include "AMDILRegisterInfo.h"
> +#include "AMDILUtilityFunctions.h"
> +#include "llvm/ADT/SCCIterator.h"
> +#include "llvm/ADT/SmallVector.h"
> +#include "llvm/ADT/Statistic.h"
> +#include "llvm/Analysis/DominatorInternals.h"
> +#include "llvm/Analysis/Dominators.h"
> +#include "llvm/CodeGen/MachineDominators.h"
> +#include "llvm/CodeGen/MachineDominators.h"
> +#include "llvm/CodeGen/MachineFunction.h"
> +#include "llvm/CodeGen/MachineFunctionAnalysis.h"
> +#include "llvm/CodeGen/MachineFunctionPass.h"
> +#include "llvm/CodeGen/MachineFunctionPass.h"
> +#include "llvm/CodeGen/MachineInstrBuilder.h"
> +#include "llvm/CodeGen/MachineJumpTableInfo.h"
> +#include "llvm/CodeGen/MachineLoopInfo.h"
> +#include "llvm/CodeGen/MachineRegisterInfo.h"
> +#include "llvm/Target/TargetInstrInfo.h"
> +
> +#define FirstNonDebugInstr(A) A->begin()
> +using namespace llvm;
> +
> +// TODO: move-begin.
> +
> +//===----------------------------------------------------------------------===//
> +//
> +// Statistics for CFGStructurizer.
> +//
> +//===----------------------------------------------------------------------===//
> +
> +STATISTIC(numSerialPatternMatch,    "CFGStructurizer number of serial pattern "
> +    "matched");
> +STATISTIC(numIfPatternMatch,        "CFGStructurizer number of if pattern "
> +    "matched");
> +STATISTIC(numLoopbreakPatternMatch, "CFGStructurizer number of loop-break "
> +    "pattern matched");
> +STATISTIC(numLoopcontPatternMatch,  "CFGStructurizer number of loop-continue "
> +    "pattern matched");
> +STATISTIC(numLoopPatternMatch,      "CFGStructurizer number of loop pattern "
> +    "matched");
> +STATISTIC(numClonedBlock,           "CFGStructurizer cloned blocks");
> +STATISTIC(numClonedInstr,           "CFGStructurizer cloned instructions");
> +
> +//===----------------------------------------------------------------------===//
> +//
> +// Miscellaneous utility for CFGStructurizer.
> +//
> +//===----------------------------------------------------------------------===//
> +namespace llvmCFGStruct
> +{
> +#define SHOWNEWINSTR(i) \
> +  if (DEBUGME) errs() << "New instr: " << *i << "\n"
> +
> +#define SHOWNEWBLK(b, msg) \
> +if (DEBUGME) { \
> +  errs() << msg << "BB" << b->getNumber() << "size " << b->size(); \
> +  errs() << "\n"; \
> +}
> +
> +#define SHOWBLK_DETAIL(b, msg) \
> +if (DEBUGME) { \
> +  if (b) { \
> +  errs() << msg << "BB" << b->getNumber() << "size " << b->size(); \
> +  b->print(errs()); \
> +  errs() << "\n"; \
> +  } \
> +}
> +
> +#define INVALIDSCCNUM -1
> +#define INVALIDREGNUM 0
> +
> +template<class LoopinfoT>
> +void PrintLoopinfo(const LoopinfoT &LoopInfo, llvm::raw_ostream &OS) {
> +  for (typename LoopinfoT::iterator iter = LoopInfo.begin(),
> +       iterEnd = LoopInfo.end();
> +       iter != iterEnd; ++iter) {
> +    (*iter)->print(OS, 0);
> +  }
> +}
> +
> +template<class NodeT>
> +void ReverseVector(SmallVector<NodeT *, DEFAULT_VEC_SLOTS> &Src) {
> +  size_t sz = Src.size();
> +  for (size_t i = 0; i < sz/2; ++i) {
> +    NodeT *t = Src[i];
> +    Src[i] = Src[sz - i - 1];
> +    Src[sz - i - 1] = t;
> +  }
> +}
> +
> +} //end namespace llvmCFGStruct
> +
> +
> +//===----------------------------------------------------------------------===//
> +//
> +// MachinePostDominatorTree
> +//
> +//===----------------------------------------------------------------------===//
> +
> +namespace llvm {
> +
> +/// PostDominatorTree Class - Concrete subclass of DominatorTree that is used
> +/// to compute a post-dominator tree.
> +///
> +struct MachinePostDominatorTree : public MachineFunctionPass {
> +  static char ID; // Pass identification, replacement for typeid
> +  DominatorTreeBase<MachineBasicBlock> *DT;
> +  MachinePostDominatorTree() : MachineFunctionPass(ID)
> +  {
> +    DT = new DominatorTreeBase<MachineBasicBlock>(true); //true indicate
> +    // postdominator
> +  }
> +
> +  ~MachinePostDominatorTree();
> +
> +  virtual bool runOnMachineFunction(MachineFunction &MF);
> +
> +  virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> +    AU.setPreservesAll();
> +    MachineFunctionPass::getAnalysisUsage(AU);
> +  }
> +
> +  inline const std::vector<MachineBasicBlock *> &getRoots() const {
> +    return DT->getRoots();
> +  }
> +
> +  inline MachineDomTreeNode *getRootNode() const {
> +    return DT->getRootNode();
> +  }
> +
> +  inline MachineDomTreeNode *operator[](MachineBasicBlock *BB) const {
> +    return DT->getNode(BB);
> +  }
> +
> +  inline MachineDomTreeNode *getNode(MachineBasicBlock *BB) const {
> +    return DT->getNode(BB);
> +  }
> +
> +  inline bool dominates(MachineDomTreeNode *A, MachineDomTreeNode *B) const {
> +    return DT->dominates(A, B);
> +  }
> +
> +  inline bool dominates(MachineBasicBlock *A, MachineBasicBlock *B) const {
> +    return DT->dominates(A, B);
> +  }
> +
> +  inline bool
> +  properlyDominates(const MachineDomTreeNode *A, MachineDomTreeNode *B) const {
> +    return DT->properlyDominates(A, B);
> +  }
> +
> +  inline bool
> +  properlyDominates(MachineBasicBlock *A, MachineBasicBlock *B) const {
> +    return DT->properlyDominates(A, B);
> +  }
> +
> +  inline MachineBasicBlock *
> +  findNearestCommonDominator(MachineBasicBlock *A, MachineBasicBlock *B) {
> +    return DT->findNearestCommonDominator(A, B);
> +  }
> +
> +  virtual void print(llvm::raw_ostream &OS, const Module *M = 0) const {
> +    DT->print(OS);
> +  }
> +};
> +} //end of namespace llvm
> +
> +char MachinePostDominatorTree::ID = 0;
> +static RegisterPass<MachinePostDominatorTree>
> +machinePostDominatorTreePass("machinepostdomtree",
> +                             "MachinePostDominator Tree Construction",
> +                             true, true);
> +
> +//const PassInfo *const llvm::MachinePostDominatorsID
> +//= &machinePostDominatorTreePass;
> +
> +bool MachinePostDominatorTree::runOnMachineFunction(MachineFunction &F) {
> +  DT->recalculate(F);
> +  //DEBUG(DT->dump());
> +  return false;
> +}
> +
> +MachinePostDominatorTree::~MachinePostDominatorTree() {
> +  delete DT;
> +}
> +
> +//===----------------------------------------------------------------------===//
> +//
> +// supporting data structure for CFGStructurizer
> +//
> +//===----------------------------------------------------------------------===//
> +
> +namespace llvmCFGStruct
> +{
> +template<class PassT>
> +struct CFGStructTraits {
> +};
> +
> +template <class InstrT>
> +class BlockInformation {
> +public:
> +  bool isRetired;
> +  int  sccNum;
> +  //SmallVector<InstrT*, DEFAULT_VEC_SLOTS> succInstr;
> +  //Instructions defining the corresponding successor.
> +  BlockInformation() : isRetired(false), sccNum(INVALIDSCCNUM) {}
> +};
> +
> +template <class BlockT, class InstrT, class RegiT>
> +class LandInformation {
> +public:
> +  BlockT *landBlk;
> +  std::set<RegiT> breakInitRegs;  //Registers that need to "reg = 0", before
> +                                  //WHILELOOP(thisloop) init before entering
> +                                  //thisloop.
> +  std::set<RegiT> contInitRegs;   //Registers that need to "reg = 0", after
> +                                  //WHILELOOP(thisloop) init after entering
> +                                  //thisloop.
> +  std::set<RegiT> endbranchInitRegs; //Init before entering this loop, at loop
> +                                     //land block, branch cond on this reg.
> +  std::set<RegiT> breakOnRegs;       //registers that need to "if (reg) break
> +                                     //endif" after ENDLOOP(thisloop) break
> +                                     //outerLoopOf(thisLoop).
> +  std::set<RegiT> contOnRegs;       //registers that need to "if (reg) continue
> +                                    //endif" after ENDLOOP(thisloop) continue on
> +                                    //outerLoopOf(thisLoop).
> +  LandInformation() : landBlk(NULL) {}
> +};
> +
> +} //end of namespace llvmCFGStruct
> +
> +//===----------------------------------------------------------------------===//
> +//
> +// CFGStructurizer
> +//
> +//===----------------------------------------------------------------------===//
> +
> +namespace llvmCFGStruct
> +{
> +// bixia TODO: port it to BasicBlock, not just MachineBasicBlock.
> +template<class PassT>
> +class  CFGStructurizer
> +{
> +public:
> +  typedef enum {
> +    Not_SinglePath = 0,
> +    SinglePath_InPath = 1,
> +    SinglePath_NotInPath = 2
> +  } PathToKind;
> +
> +public:
> +  typedef typename PassT::InstructionType         InstrT;
> +  typedef typename PassT::FunctionType            FuncT;
> +  typedef typename PassT::DominatortreeType       DomTreeT;
> +  typedef typename PassT::PostDominatortreeType   PostDomTreeT;
> +  typedef typename PassT::DomTreeNodeType         DomTreeNodeT;
> +  typedef typename PassT::LoopinfoType            LoopInfoT;
> +
> +  typedef GraphTraits<FuncT *>                    FuncGTraits;
> +  //typedef FuncGTraits::nodes_iterator BlockIterator;
> +  typedef typename FuncT::iterator                BlockIterator;
> +
> +  typedef typename FuncGTraits::NodeType          BlockT;
> +  typedef GraphTraits<BlockT *>                   BlockGTraits;
> +  typedef GraphTraits<Inverse<BlockT *> >         InvBlockGTraits;
> +  //typedef BlockGTraits::succ_iterator InstructionIterator;
> +  typedef typename BlockT::iterator               InstrIterator;
> +
> +  typedef CFGStructTraits<PassT>                  CFGTraits;
> +  typedef BlockInformation<InstrT>                BlockInfo;
> +  typedef std::map<BlockT *, BlockInfo *>         BlockInfoMap;
> +
> +  typedef int                                     RegiT;
> +  typedef typename PassT::LoopType                LoopT;
> +  typedef LandInformation<BlockT, InstrT, RegiT>  LoopLandInfo;
> +        typedef std::map<LoopT *, LoopLandInfo *> LoopLandInfoMap;
> +        //landing info for loop break
> +  typedef SmallVector<BlockT *, 32>               BlockTSmallerVector;
> +
> +public:
> +  CFGStructurizer();
> +  ~CFGStructurizer();
> +
> +  /// Perform the CFG structurization
> +  bool run(FuncT &Func, PassT &Pass, const AMDILRegisterInfo *tri);
> +
> +  /// Perform the CFG preparation
> +  bool prepare(FuncT &Func, PassT &Pass, const AMDILRegisterInfo *tri);
> +
> +private:
> +  void   orderBlocks();
> +  void   printOrderedBlocks(llvm::raw_ostream &OS);
> +  int patternMatch(BlockT *CurBlock);
> +  int patternMatchGroup(BlockT *CurBlock);
> +
> +  int serialPatternMatch(BlockT *CurBlock);
> +  int ifPatternMatch(BlockT *CurBlock);
> +  int switchPatternMatch(BlockT *CurBlock);
> +  int loopendPatternMatch(BlockT *CurBlock);
> +  int loopPatternMatch(BlockT *CurBlock);
> +
> +  int loopbreakPatternMatch(LoopT *LoopRep, BlockT *LoopHeader);
> +  int loopcontPatternMatch(LoopT *LoopRep, BlockT *LoopHeader);
> +  //int loopWithoutBreak(BlockT *);
> +
> +  void handleLoopbreak (BlockT *ExitingBlock, LoopT *ExitingLoop,
> +                        BlockT *ExitBlock, LoopT *exitLoop, BlockT *landBlock);
> +  void handleLoopcontBlock(BlockT *ContingBlock, LoopT *contingLoop,
> +                           BlockT *ContBlock, LoopT *contLoop);
> +  bool isSameloopDetachedContbreak(BlockT *Src1Block, BlockT *Src2Block);
> +  int handleJumpintoIf(BlockT *HeadBlock, BlockT *TrueBlock,
> +                       BlockT *F...
>
> [Message clipped]