[llvm-commits] [llvm] r52254 - in /llvm/trunk: include/llvm/Target/TargetLowering.h lib/CodeGen/SelectionDAG/DAGCombiner.cpp lib/CodeGen/SelectionDAG/LegalizeDAG.cpp lib/Target/X86/X86InstrSSE.td test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll

Fri Jun 13 13:50:53 PDT 2008

Holy cow! That's the longest commit message ever! :-)

Is it ever possible to transform volatile instructions? Would it make
sense to preprocess all nodes and add any that should never be
modified to a set? All uses of these nodes could also be added to the
set, and the DAG combiner would never add these nodes to its worklist.
Would this reduce the complexity and/or runtime of the DAG combiner?

Just thinking out loud. The idea is probably overly conservative, but
it would be nice to move all such checks to a central place.
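
To make the idea concrete, here is a rough C++ sketch. It is not LLVM code: `Node`, `computePinned` and `seedWorkList` are made-up names, and it assumes "uses" means the direct users of a volatile node.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy stand-in for an SDNode; NOT the real LLVM class.
struct Node {
  bool IsVolatileMemOp = false;
  std::vector<Node *> Users; // nodes that consume this node's result
};

// Collect every volatile memory op together with its direct users.
std::unordered_set<const Node *>
computePinned(const std::vector<Node *> &All) {
  std::unordered_set<const Node *> Pinned;
  for (const Node *N : All) {
    if (!N->IsVolatileMemOp)
      continue;
    Pinned.insert(N);
    for (const Node *U : N->Users)
      Pinned.insert(U);
  }
  return Pinned;
}

// Seed the combiner worklist with unpinned nodes only.
std::vector<Node *>
seedWorkList(const std::vector<Node *> &All,
             const std::unordered_set<const Node *> &Pinned) {
  std::vector<Node *> WorkList;
  for (Node *N : All)
    if (!Pinned.count(N))
      WorkList.push_back(N);
  return WorkList;
}

// Small usage example: volatile load -> trunc -> add.
void selfCheck() {
  Node Load, Trunc, Add;
  Load.IsVolatileMemOp = true;
  Load.Users.push_back(&Trunc);
  Trunc.Users.push_back(&Add);
  std::vector<Node *> All = {&Load, &Trunc, &Add};
  std::vector<Node *> WL = seedWorkList(All, computePinned(All));
  assert(WL.size() == 1 && WL[0] == &Add);
}
```

The whole volatile policy would then live in one pass over the DAG, at the cost of also skipping combines on the users that would have been safe.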

Evan

On Jun 13, 2008, at 12:07 PM, Duncan Sands wrote:

> Author: baldrick
> Date: Fri Jun 13 14:07:40 2008
> New Revision: 52254
>
> URL: http://llvm.org/viewvc/llvm-project?rev=52254&view=rev
> Log:
> Disable some DAG combiner optimizations that may be
> wrong for volatile loads and stores.  In fact this
> is almost all of them!  There are three types of
> problems: (1) it is wrong to change the width of
> a volatile memory access.  These may be used to
> do memory mapped i/o, in which case a load can have
> an effect even if the result is not used.  Consider
> loading an i32 but only using the lower 8 bits.  It
> is wrong to change this into a load of an i8, because
> you are no longer tickling the other three bytes.  It
> is also unwise to make a load/store wider.  For
> example, changing an i16 load into an i32 load is
> wrong no matter how aligned things are, since the
> fact of loading an additional 2 bytes can have
> i/o side-effects.  (2) it is wrong to change the
> number of volatile load/stores: they may be counted
> by the hardware.  (3) it is wrong to change a volatile
> load/store that requires one memory access into one
> that requires several.  For example on x86-32, you
> can store a double in one processor operation, but to
> store an i64 requires two (two i32 stores).  In a
> multi-threaded program you may want to bitcast an i64
> to a double and store as a double because that will
> occur atomically, and be indivisible to other threads.
> So it would be wrong to convert the store-of-double
> into a store of an i64, because this will become two
> i32 stores - no longer atomic.  My policy here is
> to say that the number of processor operations for
> an illegal operation is undefined.  So it is alright
> to change a store of an i64 (requires at least two
> stores; but could be validly lowered to memcpy for
> example) into a store of double (one processor op).
> In short, if the new store is legal and has the same
> size then I say that the transform is ok.  It would
> also be possible to say that transforms are always
> ok if before they were illegal, whether after they
> are illegal or not, but that's more awkward to do
> and I doubt it buys us anything much.
> However this exposed an interesting thing - on x86-32
> a store of i64 is considered legal!  That is because
> operations are marked legal by default, regardless of
> whether the type is legal or not.  In some ways this
> is clever: before type legalization this means that
> operations on illegal types are considered legal;
> after type legalization there are no illegal types
> so now operations are only legal if they really are.
> But I consider this to be too cunning for mere mortals.
> Better to do things explicitly by testing AfterLegalize.
> So I have changed things so that operations with illegal
> types are considered illegal - indeed they can never
> map to a machine operation.  However this means that
> the DAG combiner is more conservative because before
> it was "accidentally" performing transforms where the
> type was illegal because the operation was nonetheless
> marked legal.  So in a few such places I added a check
> on AfterLegalize, which I suppose was actually just
> forgotten before.  This causes the DAG combiner to do
> slightly more than it used to, which resulted in the X86
> backend blowing up because it got a slightly surprising
> node it wasn't expecting, so I tweaked it.
>
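
The x86-32 example in the log can be written down as a small predicate. This is only a sketch of the stated policy ("the new store is legal and has the same size"); the `Ty` enum and the legality table are made up for illustration and are not the real TargetLowering queries.

```cpp
#include <cassert>

// Toy value types; stand-ins for MVT::i32, MVT::i64 and MVT::f64.
enum class Ty { i32, i64, f64 };

unsigned sizeInBits(Ty T) { return T == Ty::i32 ? 32u : 64u; }

// Assumed legality table for a toy x86-32 target: an i64 store is not
// legal (it needs two i32 stores); i32 and f64 stores take one operation.
bool isStoreLegal(Ty T) { return T != Ty::i64; }

// Policy from the log: a volatile store may be retyped only if the new
// store is legal and has the same size as the old one.
bool mayRetypeVolatileStore(Ty From, Ty To) {
  return sizeInBits(From) == sizeInBits(To) && isStoreLegal(To);
}

void selfCheck() {
  // store double -> store i64 would become two i32 stores: rejected.
  assert(!mayRetypeVolatileStore(Ty::f64, Ty::i64));
  // store i64 -> store double is one processor operation: allowed.
  assert(mayRetypeVolatileStore(Ty::i64, Ty::f64));
}
```
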
> Added:
>    llvm/trunk/test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll
>    llvm/trunk/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll
> Modified:
>    llvm/trunk/include/llvm/Target/TargetLowering.h
>    llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
>    llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
>    llvm/trunk/lib/Target/X86/X86InstrSSE.td
>
> Modified: llvm/trunk/include/llvm/Target/TargetLowering.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Target/TargetLowering.h?rev=52254&r1=52253&r2=52254&view=diff
>
> ======================================================================
> --- llvm/trunk/include/llvm/Target/TargetLowering.h (original)
> +++ llvm/trunk/include/llvm/Target/TargetLowering.h Fri Jun 13 14:07:40 2008
> @@ -132,7 +132,7 @@
>     assert(RC && "This value type is not natively supported!");
>     return RC;
>   }
> -
> +
>   /// isTypeLegal - Return true if the target has native support for the
>   /// specified value type.  This means that it has a register that directly
>   /// holds it without promotions or expansions.
> @@ -179,7 +179,7 @@
>   const ValueTypeActionImpl &getValueTypeActions() const {
>     return ValueTypeActions;
>   }
> -
> +
>   /// getTypeAction - Return how we should legalize values of this type, either
>   /// it is already legal (return 'Legal') or we need to promote it to a larger
>   /// type (return 'Promote'), or we need to expand it into multiple registers
> @@ -291,15 +291,15 @@
>            "Table isn't big enough!");
>     return (LegalizeAction)((OpActions[Op] >> (2*VT.getSimpleVT())) & 3);
>   }
> -
> +
>   /// isOperationLegal - Return true if the specified operation is legal on this
>   /// target.
>   bool isOperationLegal(unsigned Op, MVT VT) const {
> -    return VT.isSimple() &&
> +    return (VT == MVT::Other || isTypeLegal(VT)) &&
>       (getOperationAction(Op, VT) == Legal ||
>        getOperationAction(Op, VT) == Custom);
>   }
> -
> +
>   /// getLoadXAction - Return how this load with extension should be treated:
>   /// either it is legal, needs to be promoted to a larger size, needs to be
>   /// expanded to some other code sequence, or the target has a custom expander
> @@ -335,7 +335,7 @@
>   /// isTruncStoreLegal - Return true if the specified store with truncation is
>   /// legal on this target.
>   bool isTruncStoreLegal(MVT ValVT, MVT MemVT) const {
> -    return MemVT.isSimple() &&
> +    return isTypeLegal(ValVT) && MemVT.isSimple() &&
>       (getTruncStoreAction(ValVT, MemVT) == Legal ||
>        getTruncStoreAction(ValVT, MemVT) == Custom);
>   }
> @@ -373,7 +373,7 @@
>     return (LegalizeAction)((IndexedModeActions[1][IdxMode] >>
>                              (2*VT.getSimpleVT())) & 3);
>   }
> -
> +
>   /// isIndexedStoreLegal - Return true if the specified indexed load is legal
>   /// on this target.
>   bool isIndexedStoreLegal(unsigned IdxMode, MVT VT) const {
> @@ -398,7 +398,7 @@
>   /// isConvertLegal - Return true if the specified conversion is legal
>   /// on this target.
>   bool isConvertLegal(MVT FromVT, MVT ToVT) const {
> -    return FromVT.isSimple() && ToVT.isSimple() &&
> +    return isTypeLegal(FromVT) && isTypeLegal(ToVT) &&
>       (getConvertAction(FromVT, ToVT) == Legal ||
>        getConvertAction(FromVT, ToVT) == Custom);
>   }
>
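
To see what the isOperationLegal change does to the "store of i64 on x86-32" case from the log, here is a toy model. The `OpOnType` struct and both helper functions are hypothetical; the real code also has a VT.isSimple() test (old) and an MVT::Other special case (new), which are left out here.

```cpp
#include <cassert>

enum class Action { Legal, Promote, Expand, Custom };

// One (operation, type) pair on a toy target.
struct OpOnType {
  bool TypeIsLegal; // what isTypeLegal(VT) would report
  Action OpAction;  // what getOperationAction(Op, VT) would report
};

bool actionOk(Action A) { return A == Action::Legal || A == Action::Custom; }

// Old behaviour: only the per-operation action is consulted.
bool isOperationLegalOld(const OpOnType &Q) { return actionOk(Q.OpAction); }

// New behaviour: an operation on an illegal type is never legal.
bool isOperationLegalNew(const OpOnType &Q) {
  return Q.TypeIsLegal && actionOk(Q.OpAction);
}

void selfCheck() {
  // Store of i64 on x86-32: the type is illegal, but the action was left
  // at its default of Legal.
  OpOnType StoreI64{false, Action::Legal};
  assert(isOperationLegalOld(StoreI64));
  assert(!isOperationLegalNew(StoreI64));
}
```
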
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp?rev=52254&r1=52253&r2=52254&view=diff
>
> ======================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp Fri Jun 13 14:07:40 2008
> @@ -1535,7 +1535,8 @@
>     AddToWorkList(Lo.Val);
>     SDOperand LoOpt = combine(Lo.Val);
>     if (LoOpt.Val && LoOpt.Val != Lo.Val &&
> -        TLI.isOperationLegal(LoOpt.getOpcode(), LoOpt.getValueType()))
> +        (!AfterLegalize ||
> +         TLI.isOperationLegal(LoOpt.getOpcode(), LoOpt.getValueType())))
>       return CombineTo(N, LoOpt, LoOpt);
>   }
>
> @@ -1545,7 +1546,8 @@
>     AddToWorkList(Hi.Val);
>     SDOperand HiOpt = combine(Hi.Val);
>     if (HiOpt.Val && HiOpt != Hi &&
> -        TLI.isOperationLegal(HiOpt.getOpcode(), HiOpt.getValueType()))
> +        (!AfterLegalize ||
> +         TLI.isOperationLegal(HiOpt.getOpcode(), HiOpt.getValueType())))
>       return CombineTo(N, HiOpt, HiOpt);
>   }
>   return SDOperand();
> @@ -1736,7 +1738,8 @@
>     unsigned BitWidth = N1.getValueSizeInBits();
>     if (DAG.MaskedValueIsZero(N1, APInt::getHighBitsSet(BitWidth,
>                                      BitWidth - EVT.getSizeInBits())) &&
> -        (!AfterLegalize || TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT))) {
> +        ((!AfterLegalize && !LN0->isVolatile()) ||
> +         TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT))) {
>       SDOperand ExtLoad = DAG.getExtLoad(ISD::ZEXTLOAD, VT, LN0->getChain(),
>                                          LN0->getBasePtr(), LN0->getSrcValue(),
>                                          LN0->getSrcValueOffset(), EVT,
> @@ -1757,7 +1760,8 @@
>     unsigned BitWidth = N1.getValueSizeInBits();
>     if (DAG.MaskedValueIsZero(N1, APInt::getHighBitsSet(BitWidth,
>                                      BitWidth - EVT.getSizeInBits())) &&
> -        (!AfterLegalize || TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT))) {
> +        ((!AfterLegalize && !LN0->isVolatile()) ||
> +         TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT))) {
>       SDOperand ExtLoad = DAG.getExtLoad(ISD::ZEXTLOAD, VT, LN0->getChain(),
>                                          LN0->getBasePtr(), LN0->getSrcValue(),
>                                          LN0->getSrcValueOffset(), EVT,
> @@ -1774,18 +1778,19 @@
>   if (N1C && N0.getOpcode() == ISD::LOAD) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     if (LN0->getExtensionType() != ISD::SEXTLOAD &&
> -        LN0->isUnindexed() && N0.hasOneUse()) {
> +        LN0->isUnindexed() && N0.hasOneUse() &&
> +        // Do not change the width of a volatile load.
> +        !LN0->isVolatile()) {
>       MVT EVT = MVT::Other;
>       uint32_t ActiveBits = N1C->getAPIntValue().getActiveBits();
>       if (ActiveBits > 0 && APIntOps::isMask(ActiveBits, N1C->getAPIntValue()))
>         EVT = MVT::getIntegerVT(ActiveBits);
>
>       MVT LoadedVT = LN0->getMemoryVT();
> -      if (EVT != MVT::Other && LoadedVT.bitsGT(EVT) &&
> -          // Loading a non-byte sized integer is only valid if the extra bits
> -          // in memory that complete the byte are zero, which is not known here.
> -          // TODO: remove isSimple check when apint codegen support lands.
> -          EVT.isSimple() && EVT.isByteSized() &&
> +      // Do not generate loads of extended integer types since these can be
> +      // expensive (and would be wrong if the type is not byte sized).
> +      if (EVT != MVT::Other && LoadedVT.bitsGT(EVT) && EVT.isSimple() &&
> +          EVT.isByteSized() && // Exclude MVT::i1, which is simple.
>           (!AfterLegalize || TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT))) {
>         MVT PtrType = N0.getOperand(1).getValueType();
>         // For big endian targets, we need to add an offset to the pointer to
> @@ -1957,7 +1962,7 @@
> // idioms for rotate, and if the target supports rotation instructions, generate
> // a rot[lr].
> SDNode *DAGCombiner::MatchRotate(SDOperand LHS, SDOperand RHS) {
> -  // Must be a legal type.  Expanded an promoted things won't work with rotates.
> +  // Must be a legal type.  Expanded 'n promoted things won't work with rotates.
>   MVT VT = LHS.getValueType();
>   if (!TLI.isTypeLegal(VT)) return 0;
>
> @@ -1965,7 +1970,7 @@
>   bool HasROTL = TLI.isOperationLegal(ISD::ROTL, VT);
>   bool HasROTR = TLI.isOperationLegal(ISD::ROTR, VT);
>   if (!HasROTL && !HasROTR) return 0;
> -
> +
>   // Match "(X shl/srl V1) & V2" where V2 may not be present.
>   SDOperand LHSShift;   // The shift.
>   SDOperand LHSMask;    // AND value if any.
> @@ -2385,13 +2390,12 @@
>   if (N1C && N0.getOpcode() == ISD::SHL && N1 == N0.getOperand(1)) {
>     unsigned LowBits = VT.getSizeInBits() - (unsigned)N1C->getValue();
>     MVT EVT = MVT::getIntegerVT(LowBits);
> -    // TODO: turn on when apint codegen support lands.
> -    // if (!AfterLegalize || TLI.isOperationLegal(ISD::SIGN_EXTEND_INREG, EVT))
> -    if (EVT.isSimple() && TLI.isOperationLegal(ISD::SIGN_EXTEND_INREG, EVT))
> +    if (EVT.isSimple() && // TODO: remove when apint codegen support lands.
> +        (!AfterLegalize || TLI.isOperationLegal(ISD::SIGN_EXTEND_INREG, EVT)))
>       return DAG.getNode(ISD::SIGN_EXTEND_INREG, VT, N0.getOperand(0),
>                          DAG.getValueType(EVT));
>   }
> -
> +
>   // fold (sra (sra x, c1), c2) -> (sra x, c1+c2)
>   if (N1C && N0.getOpcode() == ISD::SRA) {
>     if (ConstantSDNode *C1 = dyn_cast<ConstantSDNode>(N0.getOperand(1))) {
> @@ -2417,13 +2421,12 @@
>         MVT::getIntegerVT(VTValSize - N1C->getValue());
>       // Determine the residual right-shift amount.
>       unsigned ShiftAmt = N1C->getValue() - N01C->getValue();
> -
> +
>       // If the shift is not a no-op (in which case this should be just a sign
>       // extend already), the truncated to type is legal, sign_extend is legal
>       // on that type, and the the truncate to that type is both legal and free,
>       // perform the transform.
>       if (ShiftAmt &&
> -          TLI.isTypeLegal(TruncVT) &&
>           TLI.isOperationLegal(ISD::SIGN_EXTEND, TruncVT) &&
>           TLI.isOperationLegal(ISD::TRUNCATE, VT) &&
>           TLI.isTruncateFree(VT, TruncVT)) {
> @@ -2633,7 +2636,7 @@
>   // If we can fold this based on the true/false value, do so.
>   if (SimplifySelectOps(N, N1, N2))
>     return SDOperand(N, 0);  // Don't revisit N.
> -
> +
>   // fold selects based on a setcc into other things, such as min/max/abs
>   if (N0.getOpcode() == ISD::SETCC) {
>     // FIXME:
> @@ -2821,7 +2824,8 @@
>
>   // fold (sext (load x)) -> (sext (truncate (sextload x)))
>   if (ISD::isNON_EXTLoad(N0.Val) &&
> -      (!AfterLegalize||TLI.isLoadXLegal(ISD::SEXTLOAD, N0.getValueType()))){
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::SEXTLOAD, N0.getValueType()))) {
>     bool DoXform = true;
>     SmallVector<SDNode*, 4> SetCCs;
>     if (!N0.hasOneUse())
> @@ -2862,7 +2866,8 @@
>       ISD::isUNINDEXEDLoad(N0.Val) && N0.hasOneUse()) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     MVT EVT = LN0->getMemoryVT();
> -    if (!AfterLegalize || TLI.isLoadXLegal(ISD::SEXTLOAD, EVT)) {
> +    if ((!AfterLegalize && !LN0->isVolatile()) ||
> +        TLI.isLoadXLegal(ISD::SEXTLOAD, EVT)) {
>       SDOperand ExtLoad = DAG.getExtLoad(ISD::SEXTLOAD, VT, LN0->getChain(),
>                                          LN0->getBasePtr(), LN0->getSrcValue(),
>                                          LN0->getSrcValueOffset(), EVT,
> @@ -2944,7 +2949,8 @@
>
>   // fold (zext (load x)) -> (zext (truncate (zextload x)))
>   if (ISD::isNON_EXTLoad(N0.Val) &&
> -      (!AfterLegalize||TLI.isLoadXLegal(ISD::ZEXTLOAD, N0.getValueType()))) {
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::ZEXTLOAD, N0.getValueType()))) {
>     bool DoXform = true;
>     SmallVector<SDNode*, 4> SetCCs;
>     if (!N0.hasOneUse())
> @@ -2985,15 +2991,18 @@
>       ISD::isUNINDEXEDLoad(N0.Val) && N0.hasOneUse()) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     MVT EVT = LN0->getMemoryVT();
> -    SDOperand ExtLoad = DAG.getExtLoad(ISD::ZEXTLOAD, VT, LN0->getChain(),
> -                                       LN0->getBasePtr(), LN0->getSrcValue(),
> -                                       LN0->getSrcValueOffset(), EVT,
> -                                       LN0->isVolatile(),
> -                                       LN0->getAlignment());
> -    CombineTo(N, ExtLoad);
> -    CombineTo(N0.Val, DAG.getNode(ISD::TRUNCATE, N0.getValueType(), ExtLoad),
> -              ExtLoad.getValue(1));
> -    return SDOperand(N, 0);   // Return N so it doesn't get rechecked!
> +    if ((!AfterLegalize && !LN0->isVolatile()) ||
> +        TLI.isLoadXLegal(ISD::ZEXTLOAD, EVT)) {
> +      SDOperand ExtLoad = DAG.getExtLoad(ISD::ZEXTLOAD, VT, LN0->getChain(),
> +                                         LN0->getBasePtr(), LN0->getSrcValue(),
> +                                         LN0->getSrcValueOffset(), EVT,
> +                                         LN0->isVolatile(),
> +                                         LN0->getAlignment());
> +      CombineTo(N, ExtLoad);
> +      CombineTo(N0.Val, DAG.getNode(ISD::TRUNCATE, N0.getValueType(), ExtLoad),
> +                ExtLoad.getValue(1));
> +      return SDOperand(N, 0);   // Return N so it doesn't get rechecked!
> +    }
>   }
>
>   // zext(setcc x,y,cc) -> select_cc x, y, 1, 0, cc
> @@ -3061,7 +3070,8 @@
>
>   // fold (aext (load x)) -> (aext (truncate (extload x)))
>   if (ISD::isNON_EXTLoad(N0.Val) && N0.hasOneUse() &&
> -      (!AfterLegalize||TLI.isLoadXLegal(ISD::EXTLOAD, N0.getValueType()))) {
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::EXTLOAD, N0.getValueType()))) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     SDOperand ExtLoad = DAG.getExtLoad(ISD::EXTLOAD, VT, LN0->getChain(),
>                                        LN0->getBasePtr(), LN0->getSrcValue(),
> @@ -3177,11 +3187,12 @@
>     }
>   }
>
> -  if (ISD::isNON_EXTLoad(N0.Val) && N0.hasOneUse() &&
> -      // Do not allow folding to a non-byte-sized integer here.  These only
> -      // load correctly if the extra bits in memory that complete the byte
> -      // are zero, which is not known here.
> -      VT.isByteSized()) {
> +  // Do not generate loads of extended integer types since these can be
> +  // expensive (and would be wrong if the type is not byte sized).
> +  if (ISD::isNON_EXTLoad(N0.Val) && N0.hasOneUse() && VT.isSimple() &&
> +      VT.isByteSized() && // Exclude MVT::i1, which is simple.
> +      // Do not change the width of a volatile load.
> +      !cast<LoadSDNode>(N0)->isVolatile()) {
>     assert(N0.getValueType().getSizeInBits() > EVTBits &&
>            "Cannot truncate to larger type!");
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
> @@ -3281,7 +3292,8 @@
>   if (ISD::isEXTLoad(N0.Val) &&
>       ISD::isUNINDEXEDLoad(N0.Val) &&
>       EVT == cast<LoadSDNode>(N0)->getMemoryVT() &&
> -      (!AfterLegalize || TLI.isLoadXLegal(ISD::SEXTLOAD, EVT))) {
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::SEXTLOAD, EVT))) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     SDOperand ExtLoad = DAG.getExtLoad(ISD::SEXTLOAD, VT, LN0->getChain(),
>                                        LN0->getBasePtr(), LN0->getSrcValue(),
> @@ -3296,7 +3308,8 @@
>   if (ISD::isZEXTLoad(N0.Val) && ISD::isUNINDEXEDLoad(N0.Val) &&
>       N0.hasOneUse() &&
>       EVT == cast<LoadSDNode>(N0)->getMemoryVT() &&
> -      (!AfterLegalize || TLI.isLoadXLegal(ISD::SEXTLOAD, EVT))) {
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::SEXTLOAD, EVT))) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     SDOperand ExtLoad = DAG.getExtLoad(ISD::SEXTLOAD, VT, LN0->getChain(),
>                                        LN0->getBasePtr(), LN0->getSrcValue(),
> @@ -3372,16 +3385,20 @@
>   const MachineFrameInfo *MFI = DAG.getMachineFunction().getFrameInfo();
>   if (ISD::isNON_EXTLoad(LD2) &&
>       LD2->hasOneUse() &&
> +      // If both are volatile this would reduce the number of volatile loads.
> +      // If one is volatile it might be ok, but play conservative and bail out.
> +      !cast<LoadSDNode>(LD1)->isVolatile() &&
> +      !cast<LoadSDNode>(LD2)->isVolatile() &&
>       TLI.isConsecutiveLoad(LD2, LD1, LD1VT.getSizeInBits()/8, 1, MFI)) {
>     LoadSDNode *LD = cast<LoadSDNode>(LD1);
>     unsigned Align = LD->getAlignment();
>     unsigned NewAlign = TLI.getTargetMachine().getTargetData()->
>       getABITypeAlignment(VT.getTypeForMVT());
> -    if ((!AfterLegalize || TLI.isTypeLegal(VT)) &&
> -        TLI.isOperationLegal(ISD::LOAD, VT) && NewAlign <= Align)
> +    if (NewAlign <= Align &&
> +        (!AfterLegalize || TLI.isOperationLegal(ISD::LOAD, VT)))
>       return DAG.getLoad(VT, LD->getChain(), LD->getBasePtr(),
>                          LD->getSrcValue(), LD->getSrcValueOffset(),
> -                         LD->isVolatile(), Align);
> +                         false, Align);
>   }
>   return SDOperand();
> }
> @@ -3426,7 +3443,9 @@
>   // fold (conv (load x)) -> (load (conv*)x)
>   // If the resultant load doesn't need a higher alignment than the original!
>   if (ISD::isNormalLoad(N0.Val) && N0.hasOneUse() &&
> -      TLI.isOperationLegal(ISD::LOAD, VT)) {
> +      // Do not change the width of a volatile load.
> +      !cast<LoadSDNode>(N0)->isVolatile() &&
> +      (!AfterLegalize || TLI.isOperationLegal(ISD::LOAD, VT))) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     unsigned Align = TLI.getTargetMachine().getTargetData()->
>       getABITypeAlignment(VT.getTypeForMVT());
> @@ -3441,7 +3460,7 @@
>       return Load;
>     }
>   }
> -
> +
>   // Fold bitconvert(fneg(x)) -> xor(bitconvert(x), signbit)
>   // Fold bitconvert(fabs(x)) -> and(bitconvert(x), ~signbit)
>   // This often reduces constant pool loads.
> @@ -3946,7 +3965,8 @@
>
>   // fold (fpext (load x)) -> (fpext (fptrunc (extload x)))
>   if (ISD::isNON_EXTLoad(N0.Val) && N0.hasOneUse() &&
> -      (!AfterLegalize||TLI.isLoadXLegal(ISD::EXTLOAD, N0.getValueType()))) {
> +      ((!AfterLegalize && !cast<LoadSDNode>(N0)->isVolatile()) ||
> +       TLI.isLoadXLegal(ISD::EXTLOAD, N0.getValueType()))) {
>     LoadSDNode *LN0 = cast<LoadSDNode>(N0);
>     SDOperand ExtLoad = DAG.getExtLoad(ISD::EXTLOAD, VT, LN0->getChain(),
>                                        LN0->getBasePtr(), LN0->getSrcValue(),
> @@ -3960,8 +3980,7 @@
>               ExtLoad.getValue(1));
>     return SDOperand(N, 0);   // Return N so it doesn't get rechecked!
>   }
> -
> -
> +
>   return SDOperand();
> }
>
> @@ -4500,7 +4519,7 @@
>                                  ST->isVolatile(), Align);
>     }
>   }
> -
> +
>   // If this is a store of a bit convert, store the input value if the
>   // resultant store does not need a higher alignment than the original.
>   if (Value.getOpcode() == ISD::BIT_CONVERT && !ST->isTruncatingStore() &&
> @@ -4509,13 +4528,19 @@
>     MVT SVT = Value.getOperand(0).getValueType();
>     unsigned OrigAlign = TLI.getTargetMachine().getTargetData()->
>       getABITypeAlignment(SVT.getTypeForMVT());
> -    if (Align <= OrigAlign && TLI.isOperationLegal(ISD::STORE, SVT))
> +    if (Align <= OrigAlign &&
> +        ((!AfterLegalize && !ST->isVolatile()) ||
> +         TLI.isOperationLegal(ISD::STORE, SVT)))
>       return DAG.getStore(Chain, Value.getOperand(0), Ptr, ST->getSrcValue(),
>                           ST->getSrcValueOffset(), ST->isVolatile(), Align);
>   }
> -
> +
>   // Turn 'store float 1.0, Ptr' -> 'store int 0x12345678, Ptr'
>   if (ConstantFPSDNode *CFP = dyn_cast<ConstantFPSDNode>(Value)) {
> +    // NOTE: If the original store is volatile, this transform must not increase
> +    // the number of stores.  For example, on x86-32 an f64 can be stored in one
> +    // processor operation but an i64 (which is not legal) requires two.  So the
> +    // transform should not be done in this case.
>     if (Value.getOpcode() != ISD::TargetConstantFP) {
>       SDOperand Tmp;
>       switch (CFP->getValueType(0).getSimpleVT()) {
> @@ -4525,7 +4550,8 @@
>       case MVT::ppcf128:
>         break;
>       case MVT::f32:
> -        if (!AfterLegalize || TLI.isTypeLegal(MVT::i32)) {
> +        if ((!AfterLegalize && !ST->isVolatile()) ||
> +            TLI.isOperationLegal(ISD::STORE, MVT::i32)) {
>           Tmp = DAG.getConstant((uint32_t)CFP->getValueAPF().
>                               convertToAPInt().getZExtValue(), MVT::i32);
>           return DAG.getStore(Chain, Tmp, Ptr, ST->getSrcValue(),
> @@ -4534,13 +4560,15 @@
>         }
>         break;
>       case MVT::f64:
> -        if (!AfterLegalize || TLI.isTypeLegal(MVT::i64)) {
> +        if ((!AfterLegalize && !ST->isVolatile()) ||
> +            TLI.isOperationLegal(ISD::STORE, MVT::i64)) {
>           Tmp = DAG.getConstant(CFP->getValueAPF().convertToAPInt().
>                                   getZExtValue(), MVT::i64);
>           return DAG.getStore(Chain, Tmp, Ptr, ST->getSrcValue(),
>                               ST->getSrcValueOffset(), ST->isVolatile(),
>                               ST->getAlignment());
> -        } else if (TLI.isTypeLegal(MVT::i32)) {
> +        } else if (!ST->isVolatile() &&
> +                   TLI.isOperationLegal(ISD::STORE, MVT::i32)) {
>           // Many FP stores are not made apparent until after legalize, e.g. for
>           // argument passing.  Since this is so common, custom legalize the
>           // 64-bit integer store into two 32-bit stores.
> @@ -4638,19 +4666,18 @@
>       return Chain;
>     }
>   }
> -
> +
>   // If this is an FP_ROUND or TRUNC followed by a store, fold this into a
>   // truncating store.  We can do this even if this is already a truncstore.
>   if ((Value.getOpcode() == ISD::FP_ROUND || Value.getOpcode() == ISD::TRUNCATE)
> -      && TLI.isTypeLegal(Value.getOperand(0).getValueType()) &&
> -      Value.Val->hasOneUse() && ST->isUnindexed() &&
> +      && Value.Val->hasOneUse() && ST->isUnindexed() &&
>       TLI.isTruncStoreLegal(Value.getOperand(0).getValueType(),
>                             ST->getMemoryVT())) {
>     return DAG.getTruncStore(Chain, Value.getOperand(0), Ptr, ST->getSrcValue(),
>                              ST->getSrcValueOffset(), ST->getMemoryVT(),
>                              ST->isVolatile(), ST->getAlignment());
>   }
> -
> +
>   return SDOperand();
> }
>
> @@ -4731,7 +4758,8 @@
>       // original load.
>       unsigned NewAlign = TLI.getTargetMachine().getTargetData()->
>         getABITypeAlignment(LVT.getTypeForMVT());
> -      if (!TLI.isOperationLegal(ISD::LOAD, LVT) || NewAlign > Align)
> +      if (NewAlign > Align ||
> +          (AfterLegalize && !TLI.isOperationLegal(ISD::LOAD, LVT)))
>         return SDOperand();
>       Align = NewAlign;
>     }
> @@ -5136,6 +5164,9 @@
>     // This triggers in things like "select bool X, 10.0, 123.0" after the FP
>     // constants have been dropped into the constant pool.
>     if (LHS.getOpcode() == ISD::LOAD &&
> +        // Do not let this transformation reduce the number of volatile loads.
> +        !cast<LoadSDNode>(LHS)->isVolatile() &&
> +        !cast<LoadSDNode>(RHS)->isVolatile() &&
>         // Token chains must be identical.
>         LHS.getOperand(0) == RHS.getOperand(0)) {
>       LoadSDNode *LLD = cast<LoadSDNode>(LHS);
>
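
The guard that recurs through the extending-load folds above can be compared with the old one using plain booleans. This is just a truth-table sketch; `oldGuard`, `newGuard` and `countDifferences` are illustrative names, and the three flags stand in for AfterLegalize, isVolatile() and isLoadXLegal().

```cpp
#include <cassert>

// Guard used before this revision for the extending-load folds.
bool oldGuard(bool AfterLegalize, bool ExtLoadLegal) {
  return !AfterLegalize || ExtLoadLegal;
}

// Guard after this revision: a volatile load is only folded when the
// extending load is legal, even before legalization.
bool newGuard(bool AfterLegalize, bool IsVolatile, bool ExtLoadLegal) {
  return (!AfterLegalize && !IsVolatile) || ExtLoadLegal;
}

// Count the input combinations on which the two guards disagree.
int countDifferences() {
  int N = 0;
  for (int AL = 0; AL < 2; ++AL)
    for (int V = 0; V < 2; ++V)
      for (int L = 0; L < 2; ++L)
        if (oldGuard(AL, L) != newGuard(AL, V, L))
          ++N;
  return N;
}

void selfCheck() {
  // The guards disagree on exactly one case: before legalization,
  // volatile load, extending load not legal.
  assert(countDifferences() == 1);
  assert(oldGuard(false, false) && !newGuard(false, true, false));
}
```
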
> Modified: llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp?rev=52254&r1=52253&r2=52254&view=diff
>
> ======================================================================
> --- llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp (original)
> +++ llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp Fri Jun 13 14:07:40 2008
> @@ -2384,7 +2384,7 @@
>             Result = DAG.getStore(Tmp1, Tmp3, Tmp2, ST->getSrcValue(),
>                                   SVOffset, isVolatile, Alignment);
>             break;
> -          } else if (getTypeAction(MVT::i32) == Legal) {
> +          } else if (getTypeAction(MVT::i32) == Legal && !ST->isVolatile()) {
>             // Otherwise, if the target supports 32-bit registers, use 2 32-bit
>             // stores.  If the target supports neither 32- nor 64-bits, this
>             // xform is certainly not worth it.
>
> Modified: llvm/trunk/lib/Target/X86/X86InstrSSE.td
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86InstrSSE.td?rev=52254&r1=52253&r2=52254&view=diff
>
> ======================================================================
> --- llvm/trunk/lib/Target/X86/X86InstrSSE.td (original)
> +++ llvm/trunk/lib/Target/X86/X86InstrSSE.td Fri Jun 13 14:07:40 2008
> @@ -2384,6 +2384,8 @@
>             (MOVZDI2PDIrm addr:$src)>;
> def : Pat<(v4i32 (X86vzmovl (bc_v4i32 (loadv4f32 addr:$src)))),
>             (MOVZDI2PDIrm addr:$src)>;
> +def : Pat<(v4i32 (X86vzmovl (bc_v4i32 (loadv2i64 addr:$src)))),
> +            (MOVZDI2PDIrm addr:$src)>;
>
> def MOVZQI2PQIrm : I<0x7E, MRMSrcMem, (outs VR128:$dst), (ins i64mem:$src),
>                      "movq\t{$src, $dst|$dst, $src}",
>
> Added: llvm/trunk/test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll?rev=52254&view=auto
>
> ======================================================================
> --- llvm/trunk/test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/2008-06-13-NotVolatileLoadStore.ll Fri Jun 13 14:07:40 2008
> @@ -0,0 +1,23 @@
> +; RUN: llvm-as < %s | llc -march=x86 | not grep movsd
> +; RUN: llvm-as < %s | llc -march=x86 | grep movw
> +; RUN: llvm-as < %s | llc -march=x86 | grep addw
> +; These transforms are turned off for volatile loads and stores.
> +; Check that they weren't turned off for all loads and stores!
> +
> +@atomic = global double 0.000000e+00		; <double*> [#uses=1]
> +@atomic2 = global double 0.000000e+00		; <double*> [#uses=1]
> +@ioport = global i32 0		; <i32*> [#uses=1]
> +@ioport2 = global i32 0		; <i32*> [#uses=1]
> +
> +define i16 @f(i64 %x) {
> +	%b = bitcast i64 %x to double		; <double> [#uses=1]
> +	store double %b, double* @atomic
> +	store double 0.000000e+00, double* @atomic2
> +	%l = load i32* @ioport		; <i32> [#uses=1]
> +	%t = trunc i32 %l to i16		; <i16> [#uses=1]
> +	%l2 = load i32* @ioport2		; <i32> [#uses=1]
> +	%tmp = lshr i32 %l2, 16		; <i32> [#uses=1]
> +	%t2 = trunc i32 %tmp to i16		; <i16> [#uses=1]
> +	%f = add i16 %t, %t2		; <i16> [#uses=1]
> +	ret i16 %f
> +}
>
> Added: llvm/trunk/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll?rev=52254&view=auto
>
> ======================================================================
> --- llvm/trunk/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll (added)
> +++ llvm/trunk/test/CodeGen/X86/2008-06-13-VolatileLoadStore.ll Fri Jun 13 14:07:40 2008
> @@ -0,0 +1,22 @@
> +; RUN: llvm-as < %s | llc -march=x86 | grep movsd | count 5
> +; RUN: llvm-as < %s | llc -march=x86 | grep movl | count 2
> +
> +@atomic = global double 0.000000e+00		; <double*> [#uses=1]
> +@atomic2 = global double 0.000000e+00		; <double*> [#uses=1]
> +@anything = global i64 0		; <i64*> [#uses=1]
> +@ioport = global i32 0		; <i32*> [#uses=2]
> +
> +define i16 @f(i64 %x, double %y) {
> +	%b = bitcast i64 %x to double		; <double> [#uses=1]
> +	volatile store double %b, double* @atomic ; one processor operation only
> +	volatile store double 0.000000e+00, double* @atomic2 ; one processor operation only
> +	%b2 = bitcast double %y to i64		; <i64> [#uses=1]
> +	volatile store i64 %b2, i64* @anything ; may transform to store of double
> +	%l = volatile load i32* @ioport		; must not narrow
> +	%t = trunc i32 %l to i16		; <i16> [#uses=1]
> +	%l2 = volatile load i32* @ioport		; must not narrow
> +	%tmp = lshr i32 %l2, 16		; <i32> [#uses=1]
> +	%t2 = trunc i32 %tmp to i16		; <i16> [#uses=1]
> +	%f = add i16 %t, %t2		; <i16> [#uses=1]
> +	ret i16 %f
> +}
>
>