[LLVMdev] Casting between address spaces and address space semantics

Mon P Wang wangmp at apple.com
Mon Jul 21 10:54:58 PDT 2008


Hi Matthijs,

Thanks for giving some code so we can discuss this in more concrete  
detail.  In terms of the information we need, I think you have it  
right.  We just need a description of how the different address spaces  
relate and I don't see much of an issue with how you implemented to  
InstructionCombining.

As you also mentioned, I don't like that we pass a reference to  
TargetData with the AddrspacesDescriptor and that we have a static  
default implementation store in the class.  It is unclear to me if  
this is the right class for it as not as it describes the structure  
and alignment while address spaces seems distinct from that.  I feel  
that this information should be part of the TargetMachine as the  
address spaces description is part of a target machine but I may be  
wrong here so if someone has a different opinion here, please chime in.

In terms of text based or not, I don't have a strong opinion.  I don't  
see a problem with a text based format as these relationships are  
pretty simple since there can't be weird overlaps.  With only the  
choice of disjoint, subset, and equivalent, the description shouldn't  
be too long and the tool can compute the transitive closure of the  
description and generate a reasonable efficient table for lookup.  I  
don't see a problem with defining one's own class to store this  
information either.

As an aside, I think that the default relationship for two different  
address spaces should be disjoint.  We should not encourage people to  
define equivalent address spaces as in general, I don't see why  
someone would want to do that except in rare circumstances.

   -- Mon Ping


On Jul 21, 2008, at 8:05 AM, Matthijs Kooijman wrote:

> Hi all,
>
>> If I read the standard correctly, the properties of these address  
>> spaces can
>> be fully captured by defining the relationship between every pair  
>> of address
>> spaces (disjoint, identical, subset/superset).
>>
>> I think it would make sense to make these relationships backend/ 
>> platform
>> specific, but for clang and the optimization passes to properly  
>> work with
>> address spaces, these relationships should be known. It seems to me  
>> that the
>> proper way to handle this, is to include this information in the  
>> TargetData
>> class (or whatever platform-specific class is appropriate, I'm not  
>> really into
>> this currently).
>
> To have something to actually talk about, I thought I'd create some  
> code. I've
> got a patch attached which consists roughly of two parts: it adds  
> the concept
> of address space relations to TargetData, and teaches  
> InstructionCombining to
> remove casts based on those relations.
>
> The instcombine is pretty straightforward: If a bitcast changes the  
> address
> space of a pointer in a way that is redundant, and all of the users  
> of the
> bitcasts can be rewritten to use the original pointer as well  
> (currently this
> means loads, stores and geps only), this is done.
>
> The TargetData part is a lot more tricky. I originally planned to  
> simply add a
> virtual method to TargetData and create a specific subclass for our  
> compiler,
> when I discovered that this is not how TargetData is supposed to work.
> Currently, TargetData consists only of a TargetDescription, which is  
> basically
> only sizes and alignment info for different types.
>
> I solved this by creating a new class AddrspacesDescriptor, which  
> has a single
> virtual method that defines the relationship between address spaces.  
> This
> class contains a default implementation and can be subclassed to get a
> different implementation.
>
> A reference to a AddrspaceDescriptor is passed to TargetData's  
> constructor,
> with a default. This means that all of the current code uses the  
> default
> implementation. I'm not so sure what would be the correct way to  
> make the
> various tool select other configurations, currently.
>
> Also, I'm quite positive that TargetData is not really the right  
> place to put
> something like this. At least not in the way it is currently  
> written, passing
> around references to AddrspacesDescriptor is far from ideal.  
> However, I do not
> think we should be trying to define these relations between address  
> spaces in
> a text-based format similar to the current TargetDescription. In  
> that way, we
> would either loose a lot of flexibility, or end up with a huge
> TargetDescription (with one element for each combination of address
> spaces...).
>
> So, any comments on the patch and/or hints on how to do things  
> better? Any
> thoughts on how selecting a address spaces configuration should work  
> in the
> different tools?
>
> Gr.
>
> Matthijs
>
> Index: lib/Transforms/Scalar/InstructionCombining.cpp
> ===================================================================
> --- lib/Transforms/Scalar/InstructionCombining.cpp	(revision 53716)
> +++ lib/Transforms/Scalar/InstructionCombining.cpp	(working copy)
> @@ -400,6 +400,8 @@
>     unsigned GetOrEnforceKnownAlignment(Value *V,
>                                         unsigned PrefAlign = 0);
>
> +    /// Propagate address spaces through bitcasts
> +    Instruction *PropagateAddressSpace(BitCastInst &CI);
>   };
> }
>
> @@ -7841,6 +7843,81 @@
>   return 0;
> }
>
> +/// Try to propagate the address space from the source of the  
> bitcast into all
> +/// of its uses. CI casts between two pointer types with different  
> address spaces.
> +Instruction *InstCombiner::PropagateAddressSpace(BitCastInst &CI) {
> +  // Determine the new type. This is the old dest type with the  
> address space
> +  // changed. We can't simply use the source type, since the  
> element type can be
> +  // changed in the cast in addition to the address space.
> +  const PointerType *DestTy = cast<PointerType>(CI.getDestTy());
> +  const PointerType *SrcTy  = cast<PointerType>(CI.getSrcTy());
> +  const PointerType *NTy = PointerType::get(DestTy- 
> >getElementType(), SrcTy->getAddressSpace());
> +
> +  // Only propagate the address space when the source and dest  
> address spaces
> +  // are equivalent, or the source address space is a subset of the  
> target
> +  // address space.
> +  AddrspacesDescriptor::AddrspaceRelation Rel;
> +  Rel = TD->getAddrspaceRelation(SrcTy->getAddressSpace(),
> +                                 DestTy->getAddressSpace());
> +  if (Rel != AddrspacesDescriptor::Equivalent
> +      && Rel != AddrspacesDescriptor::Subset)
> +    return 0;
> +
> +  // We can propagate the address space if we can modify the type  
> of CI. This
> +  // requires that any uses of it should be modifiable. For now,  
> we'll just
> +  // propagate if all uses are loads and stores.
> +  for (User::use_iterator I = CI.use_begin(), E = CI.use_end(); I ! 
> = E; ++I) {
> +    if (isa<StoreInst>(*I)) {
> +      if (I.getOperandNo() != StoreInst::getPointerOperandIndex())
> +        return 0; // Can only handle store to this pointer
> +    } else if (isa<GetElementPtrInst>(*I)) {
> +      if (I.getOperandNo() !=  
> GetElementPtrInst::getPointerOperandIndex())
> +        return 0;
> +    } else if (!isa<LoadInst>(*I)) {
> +      // Can't handle other instructions
> +      return 0;
> +    }
> +  }
> +  // Insert a new bitcast, if needed
> +  Value *NewVal = InsertBitCastBefore(CI.getOperand(0), NTy, CI);
> +
> +  for (User::use_iterator I = CI.use_begin(), E = CI.use_end(); I ! 
> = E; ++I) {
> +    Instruction *NewInst;
> +    Instruction *OldInst = cast<Instruction>(*I);
> +    // Declare this here, so we can reuse the dyn_cast
> +    GetElementPtrInst *GEP = NULL;
> +    // Create a new instruction
> +    if (isa<StoreInst>(*I))
> +      NewInst = new StoreInst(I->getOperand(0), NewVal);
> +    else if ((GEP = dyn_cast<GetElementPtrInst>(*I))) {
> +      // Insert a new GEP
> +      std::vector<Value*> Indices(GEP->idx_begin(), GEP->idx_end());
> +      NewInst = GetElementPtrInst::Create(NewVal, Indices.begin(),  
> Indices.end());
> +    } else // Must be LoadInst
> +      NewInst = new LoadInst(NewVal);
> +
> +    // Replace the old instruction
> +    InsertNewInstBefore(NewInst, *OldInst);
> +    NewInst->takeName(OldInst);
> +
> +    // Put this in a Value*, since InsertBitCastBefore could return  
> a Value*
> +    // below.
> +    Value *ReplaceWith = NewInst;
> +    if (GEP) {
> +      // Now insert a bitcast as well. We couldn't do this before,  
> because the
> +      // above two lines don't hold for the GEPs
> +      const Type* GEPTy = PointerType::get(GEP->getType()- 
> >getElementType(), DestTy->getAddressSpace());
> +      ReplaceWith = InsertBitCastBefore(NewInst, GEPTy, *GEP);
> +    }
> +    ReplaceInstUsesWith(*OldInst, ReplaceWith);
> +    EraseInstFromFunction(*OldInst);
> +
> +  }
> +
> +  // Return the old instruction, so it can be removed.
> +  return &CI;
> +}
> +
> Instruction *InstCombiner::visitBitCast(BitCastInst &CI) {
>   // If the operands are integer typed then apply the integer  
> transforms,
>   // otherwise just apply the common ones.
> @@ -7873,7 +7950,9 @@
>     // If the address spaces don't match, don't eliminate the  
> bitcast, which is
>     // required for changing types.
>     if (SrcPTy->getAddressSpace() != DstPTy->getAddressSpace())
> -      return 0;
> +      // If we're casting to address space zero, see if we can't  
> just propagate
> +      // the address space into the uses of CI.
> +      return PropagateAddressSpace(CI);
>
>     // If we are casting a malloc or alloca to a pointer to a type  
> of the same
>     // size, rewrite the allocation instruction to allocate the  
> "right" type.
> Index: lib/Target/TargetData.cpp
> ===================================================================
> --- lib/Target/TargetData.cpp	(revision 53716)
> +++ lib/Target/TargetData.cpp	(working copy)
> @@ -231,8 +231,8 @@
>   }
> }
>
> -TargetData::TargetData(const Module *M)
> -  : ImmutablePass((intptr_t)&ID) {
> +TargetData::TargetData(const Module *M, const AddrspacesDescriptor  
> &Addrspaces)
> +  : ImmutablePass((intptr_t)&ID), Addrspaces(Addrspaces) {
>   init(M->getDataLayout());
> }
>
> @@ -604,3 +604,5 @@
> unsigned TargetData::getPreferredAlignmentLog(const GlobalVariable  
> *GV) const {
>   return Log2_32(getPreferredAlignment(GV));
> }
> +
> +const AddrspacesDescriptor TargetData::DefaultAddrspaces;
> Index: include/llvm/Target/TargetData.h
> ===================================================================
> --- include/llvm/Target/TargetData.h	(revision 53716)
> +++ include/llvm/Target/TargetData.h	(working copy)
> @@ -63,12 +63,41 @@
>   std::ostream &dump(std::ostream &os) const;
> };
>
> +/// This class is used to describe relations between different  
> address spaces.
> +/// The default implementation defines all different address spaces  
> as
> +/// equivalent.
> +///
> +/// Create a subclass and pass it to TargetData's constructors to  
> define
> +/// different behaviour.
> +class AddrspacesDescriptor {
> +public:
> +  typedef enum {
> +    Equivalent,
> +    Disjoint,
> +    Subset,
> +    Superset
> +  } AddrspaceRelation;
> +
> +  /// getAddrspaceRelation - Returns the relation between the given  
> address
> +  /// spaces. Should be read left to right, for example Subset  
> means that
> +  /// Addrspace1 is a subset of Addrspace2.
> +  virtual AddrspaceRelation getAddrspaceRelation(
> +    int Addrspace1,
> +    int Addrspace2) const {
> +      return Equivalent;
> +  }
> +
> +  virtual ~AddrspacesDescriptor() {}
> +};
> +
> class TargetData : public ImmutablePass {
> private:
>   bool          LittleEndian;          ///< Defaults to false
>   unsigned char PointerMemSize;        ///< Pointer size in bytes
>   unsigned char PointerABIAlign;       ///< Pointer ABI alignment
>   unsigned char PointerPrefAlign;      ///< Pointer preferred  
> alignment
> +  const AddrspacesDescriptor &Addrspaces; ///< Address spaces  
> descriptor
> +  static const AddrspacesDescriptor DefaultAddrspaces; ///< Default  
> address spaces descriptor
>
>   //! Where the primitive type alignment data is stored.
>   /*!
> @@ -109,20 +138,23 @@
>   ///
>   /// @note This has to exist, because this is a pass, but it should  
> never be
>   /// used.
> -  TargetData() : ImmutablePass(intptr_t(&ID)) {
> +  TargetData() : ImmutablePass(intptr_t(&ID)),  
> Addrspaces(DefaultAddrspaces) {
>     assert(0 && "ERROR: Bad TargetData ctor used.  "
>            "Tool did not specify a TargetData to use?");
>     abort();
>   }
>
>   /// Constructs a TargetData from a specification string. See init().
> -  explicit TargetData(const std::string &TargetDescription)
> -    : ImmutablePass(intptr_t(&ID)) {
> +  explicit TargetData(const std::string &TargetDescription,
> +                      const AddrspacesDescriptor &Addrspaces
> +                      = DefaultAddrspaces)
> +    : ImmutablePass(intptr_t(&ID)), Addrspaces(Addrspaces) {
>     init(TargetDescription);
>   }
>
>   /// Initialize target data from properties stored in the module.
> -  explicit TargetData(const Module *M);
> +  explicit TargetData(const Module *M, const AddrspacesDescriptor  
> &Addrspaces
> +                                       = DefaultAddrspaces);
>
>   TargetData(const TargetData &TD) :
>     ImmutablePass(intptr_t(&ID)),
> @@ -130,6 +162,7 @@
>     PointerMemSize(TD.PointerMemSize),
>     PointerABIAlign(TD.PointerABIAlign),
>     PointerPrefAlign(TD.PointerPrefAlign),
> +    Addrspaces(TD.Addrspaces),
>     Alignments(TD.Alignments)
>   { }
>
> @@ -240,6 +273,15 @@
>   /// requested alignment (if the global has one).
>   unsigned getPreferredAlignmentLog(const GlobalVariable *GV) const;
>
> +  /// getAddrspaceRelation - Returns the relation between the given  
> address
> +  /// spaces. Should be read left to right, for example Subset  
> means that
> +  /// Addrspace1 is a subset of Addrspace2.
> +  AddrspacesDescriptor::AddrspaceRelation getAddrspaceRelation(int  
> Addrspace1,
> +                                                               int  
> Addrspace2) {
> +
> +    return Addrspaces.getAddrspaceRelation(Addrspace1, Addrspace2);
> +  }
> +
>   static char ID; // Pass identification, replacement for typeid
> };
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev




More information about the llvm-dev mailing list