[LLVMdev] Problems expanding fcmp to a libcall

Richard Osborne rlsosborne at googlemail.com
Thu Jul 3 15:07:33 PDT 2008


Evan Cheng wrote:
> On Jul 1, 2008, at 3:42 PM, Richard Osborne wrote:
>
>   
>> Evan Cheng wrote:
>>     
>>> On Jun 25, 2008, at 5:13 AM, Richard Osborne wrote:
>>>
>>>
>>>       
>>>> Evan Cheng wrote:
>>>>
>>>>         
>>>>> On Jun 23, 2008, at 5:35 AM, Richard Osborne wrote:
>>>>>
>>>>>
>>>>>           
>>>>>> I'm trying to write a backend for a target with no hardware  
>>>>>> floating
>>>>>> point support. I've added a single i32 register class. I'm  
>>>>>> wanting  all
>>>>>> floating point operations to be lowered to library function  
>>>>>> calls.  For
>>>>>> the most part LLVM seems to get this right. For example
>>>>>>
>>>>>> define double @div(double %a, double %b) {
>>>>>> %result = fdiv double %a, %b
>>>>>> ret double %result
>>>>>> }
>>>>>>
>>>>>> is expanded to an ISD::CALL of __divdf3 which is then lowered via  
>>>>>> the
>>>>>> LowerOperation hook of my backend.
>>>>>>
>>>>>> However I run into problems with fcmp. With the following code:
>>>>>>
>>>>>> define i1 @fmp(double %a) {
>>>>>> %result = fcmp uno double %a, 0.000000e+00
>>>>>> ret i1 %result
>>>>>> }
>>>>>>
>>>>>> the fcmp is expanded to a call to __unorddf2 which is then
>>>>>> lowered via the LowerOperation hook of my backend. However, for
>>>>>> some reason there remains an ISD::CALL node with __unorddf2 in the
>>>>>> DAG after legalization. This then causes selection to fail with
>>>>>>
>>>>>> Cannot yet select: 0x13b7cc0: i32,i32,ch = call 0x13b76e0,   
>>>>>> 0x13b7800,
>>>>>> 0x13b7800, 0x13b7800, 0x13b77a0, 0x13b78f0, 0x13b79a0, 0x13b80d0,
>>>>>> 0x13b7a00, 0x13b78f0, 0x13b79a0, 0x13b80d0, 0x13b7a00
>>>>>>
>>>>>> Are there any additional steps I need to take in my target, or  
>>>>>> could
>>>>>> this be a bug in the Legalization phase?
>>>>>>
>>>>>>
>>>>>>             
>>>>> This sounds like a bug in your target. Why not custom lower the f32
>>>>> setcc nodes directly to the desired target nodes rather than doing
>>>>> this  two stage lowering?
>>>>>
>>>>> Evan
>>>>>
>>>>>
>>>>>           
>>>> At the moment I'm not doing any custom lowering in my target - the
>>>> lowering I was describing was what I observed the SelectionDAG was
>>>> doing.
>>>> I was under the impression that LLVM's soft float support meant  
>>>> that  if
>>>> I didn't call addRegisterClass() with any FP types then floating  
>>>> point
>>>> operations would be expanded into libcalls and it would all just
>>>> work(tm). And for the most part it does work - addition, division,  
>>>> etc
>>>> on floating point types are all lowered correctly by the SelectionDAG
>>>> without any further intervention.
>>>>
>>>> However it fails fcmp. I was wanting to understand if this was   
>>>> expected
>>>> and if so what I should do about it. It sounds like I need to custom
>>>> lower the nodes directly. It would certainly be nice if this wasn't
>>>> necessary.
>>>>
>>>>         
>>> Ok, I am not sure I understand your original question then.  
>>> Legalizer  is converting the setcc node into a call to __unorddf2.  
>>> Is that what  you want?
>>>       
>> Yes, this is exactly what I want
>>     
>>> But you also stated:
>>>
>>>
>>>       
>>>>>> the fcmp is expanded to a call to __unorddf2 which is then
>>>>>> lowered via the LowerOperation hook of my backend.
>>>>>>
>>>>>>             
>>> Does that mean you are then lowering the call to some other   
>>> operations?  That means your lowering code is somehow not removing  
>>> the  call code. Perhaps you are not updating all the uses.  
>>> Nevertheless  this is not the right approach, you should instead  
>>> custom lower the  setcc node directly to the target specific node.
>>>
>>>       
>> When the LowerOperation method is called with a call node I create a
>> new chain of operations and then return a new operand. However, as
>> you say, printing the DAG shows that not all the uses are replaced.
>>     
>
> Are you examining the DAG before you lower the ISD::CALL node?
>   
Right before the call to LowerOperation for the ISD::CALL node the DAG 
looks like this:

SelectionDAG has 23 nodes:
  0x97a3c78: ch = EntryToken
    0x97a43e0: <multiple use>
    0x97a41a0: <multiple use>
  0x97a3cd8: i32 = extract_element 0x97a43e0, 0x97a41a0
  0x97a3d30: ch = ArgFlags < zext orig-align:8 >
  0x97a3d68: ch = ArgFlags < zext orig-align:1 >
    0x97a3c78: <multiple use>
    0x97a41a0: <multiple use>
    0x97a41a0: <multiple use>
    0x97a41a0: <multiple use>
    0x97a4168: i32 = ExternalSymbol '__unorddf2'
    0x97a3cd8: <multiple use>
    0x97a3d30: <multiple use>
    0x97a4428: <multiple use>
    0x97a3d68: <multiple use>
    0x97a3cd8: <multiple use>
    0x97a3d30: <multiple use>
    0x97a4428: <multiple use>
    0x97a3d68: <multiple use>
  0x97a3e58: i32,i32,ch = call 0x97a3c78, 0x97a41a0, 0x97a41a0, 
0x97a41a0, 0x97a4168, 0x97a3cd8, 0x97a3d30, 0x97a4428, 0x97a3d68, 
0x97a3cd8, 0x97a3d30, 0x97a4428, 0x97a3d68
    0x97a4200: <multiple use>
    0x97a4200: <multiple use>
  0x97a3ff0: f64 = merge_values 0x97a4200, 0x97a4200
  0x97a41a0: i32 = Constant <0>
      0x97a3c78: <multiple use>
      0x97a42a8: i32 = Register  #1024
    0x97a42e0: i32,ch = CopyFromReg 0x97a3c78, 0x97a42a8
      0x97a3c78: <multiple use>
      0x97a4350: i32 = Register  #1025
    0x97a4388: i32,ch = CopyFromReg 0x97a3c78, 0x97a4350
  0x97a4200: f64 = build_pair 0x97a42e0, 0x97a4388
    0x97a4200: <multiple use>
  0x97a43e0: i64 = bit_convert 0x97a4200
    0x97a43e0: <multiple use>
    0x97a3da8: i32 = Constant <1>
  0x97a4428: i32 = extract_element 0x97a43e0, 0x97a3da8
  0x97a4490: ch = setuo
    0x97a4200: <multiple use>
    0x97a4200: <multiple use>
    0x97a4490: <multiple use>
  0x97a44c8: i32 = setcc 0x97a4200, 0x97a4200, 0x97a4490
    0x97a3e58: <multiple use>
    0x97a3e58: <multiple use>
  0x97a53f8: f64 = build_pair 0x97a3e58, 0x97a3e58:1
    0x97a3c78: <multiple use>
        0x97a4200: <multiple use>
        0x97a4200: <multiple use>
        0x97a4490: <multiple use>
      0x97a3de8: i1 = setcc 0x97a4200, 0x97a4200, 0x97a4490
    0x97a40f0: i32 = any_extend 0x97a3de8
    0x97a4538: ch = ArgFlags < >
  0x97a4570: ch = ret 0x97a3c78, 0x97a40f0, 0x97a4538

There is only one call to LowerOperation with an ISD::CALL node. At the 
end of legalization the DAG looks like this:

SelectionDAG has 32 nodes:
  0x9c06c78: ch = EntryToken
    0x9c073e0: <multiple use>
    0x9c071a0: <multiple use>
  0x9c06ed8: i32 = extract_element 0x9c073e0, 0x9c071a0
  0x9c06f30: ch = ArgFlags < zext orig-align:8 >
  0x9c06f68: ch = ArgFlags < zext orig-align:1 >
  0x9c071a0: i32 = Constant <0>
    0x9c06c78: <multiple use>
    0x9c072a8: i32 = Register  #1024
  0x9c072e0: i32,ch = CopyFromReg 0x9c06c78, 0x9c072a8
    0x9c06c78: <multiple use>
    0x9c07350: i32 = Register  #1025
  0x9c07388: i32,ch = CopyFromReg 0x9c06c78, 0x9c07350
      0x9c072e0: <multiple use>
      0x9c07388: <multiple use>
    0x9c07200: f64 = build_pair 0x9c072e0, 0x9c07388
  0x9c073e0: i64 = bit_convert 0x9c07200
    0x9c073e0: <multiple use>
    0x9c06e28: i32 = Constant <1>
  0x9c07428: i32 = extract_element 0x9c073e0, 0x9c06e28
    0x9c08620: <multiple use>
    0x9c08690: <multiple use>
    0x9c07388: <multiple use>
    0x9c08620: <multiple use>
  0x9c08520: ch,flag = CopyToReg 0x9c08620, 0x9c08690, 0x9c07388, 
0x9c08620:1
  0x9c08550: i32 = Constant <4>
  0x9c085e8: i32 = Register  r0
      0x9c06c78: <multiple use>
      0x9c08550: <multiple use>
    0x9c08590: ch,flag = callseq_start 0x9c06c78, 0x9c08550
    0x9c085e8: <multiple use>
    0x9c072e0: <multiple use>
  0x9c08620: ch,flag = CopyToReg 0x9c08590, 0x9c085e8, 0x9c072e0
  0x9c08690: i32 = Register  r1
  0x9c08720: i32 = Register  r2
    0x9c08520: <multiple use>
    0x9c08720: <multiple use>
    0x9c072e0: <multiple use>
    0x9c08520: <multiple use>
  0x9c08758: ch,flag = CopyToReg 0x9c08520, 0x9c08720, 0x9c072e0, 
0x9c08520:1
  0x9c087e0: i32 = Register  r3
    0x9c08758: <multiple use>
    0x9c087e0: <multiple use>
    0x9c07388: <multiple use>
    0x9c08758: <multiple use>
  0x9c08818: ch,flag = CopyToReg 0x9c08758, 0x9c087e0, 0x9c07388, 
0x9c08758:1
    0x9c08818: <multiple use>
    0x9c088a0: i32 = TargetExternalSymbol '__unorddf2'
    0x9c085e8: <multiple use>
    0x9c08690: <multiple use>
    0x9c08720: <multiple use>
    0x9c087e0: <multiple use>
    0x9c08818: <multiple use>
  0x9c088d8: ch,flag = BL 0x9c08818, 0x9c088a0, 0x9c085e8, 0x9c08690, 
0x9c08720, 0x9c087e0, 0x9c08818:1
      0x9c088d8: <multiple use>
      0x9c08550: <multiple use>
      0x9c071a0: <multiple use>
      0x9c088d8: <multiple use>
    0x9c08998: ch,flag = callseq_end 0x9c088d8, 0x9c08550, 0x9c071a0, 
0x9c088d8:1
    0x9c085e8: <multiple use>
        0x9c06c78: <multiple use>
        0x9c071a0: <multiple use>
        0x9c071a0: <multiple use>
        0x9c071a0: <multiple use>
        0x9c07168: i32 = ExternalSymbol '__unorddf2'
        0x9c06ed8: <multiple use>
        0x9c06f30: <multiple use>
        0x9c07428: <multiple use>
        0x9c06f68: <multiple use>
        0x9c06ed8: <multiple use>
        0x9c06f30: <multiple use>
        0x9c07428: <multiple use>
        0x9c06f68: <multiple use>
      0x9c06fa0: i32,i32,ch = call 0x9c06c78, 0x9c071a0, 0x9c071a0, 
0x9c071a0, 0x9c07168, 0x9c06ed8, 0x9c06f30, 0x9c07428, 0x9c06f68, 
0x9c06ed8, 0x9c06f30, 0x9c07428, 0x9c06f68
      0x9c071a0: <multiple use>
      0x9c08498: ch = setne
    0x9c074c8: i32 = setcc 0x9c06fa0, 0x9c071a0, 0x9c08498
  0x9c08c98: ch,flag = CopyToReg 0x9c08998, 0x9c085e8, 0x9c074c8
    0x9c08c98: <multiple use>
    0x9c071a0: <multiple use>
    0x9c08c98: <multiple use>
  0x9c08d08: ch = RETSP 0x9c08c98, 0x9c071a0, 0x9c08c98:1

Where BL and RETSP are target-specific nodes for a call and a return,
respectively.
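
To make the tail of the DAG above concrete: the remaining setcc compares the
result of the __unorddf2 call against Constant <0> using setne. The sketch
below shows what that expansion computes. It is only an illustration: the
name unorddf2_model is hypothetical and is written with std::isnan for
clarity, whereas a real soft-float routine works on the raw bits. It assumes
the usual soft-float convention that __unorddf2 returns nonzero when either
operand is NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-in for the soft-float routine __unorddf2.
// Assumed convention: nonzero if either operand is NaN, otherwise 0.
static int unorddf2_model(double a, double b) {
    return (std::isnan(a) || std::isnan(b)) ? 1 : 0;
}

// Models the expanded comparison: setcc(call __unorddf2(a, b), 0, setne).
static bool fcmp_uno_via_libcall(double a, double b) {
    return unorddf2_model(a, b) != 0;
}

// Small usage example for the IR in the original question:
//   %result = fcmp uno double %a, 0.000000e+00
static void demo() {
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(fcmp_uno_via_libcall(nan, 0.0));   // NaN is unordered with 0.0
    assert(!fcmp_uno_via_libcall(1.5, 0.0));  // ordinary values are ordered
}
```

In the dumps both setcc operands are the same node, so the libcall is passed
the same value twice. Under the assumed convention this gives the same answer
as comparing against 0.0, since 0.0 is never NaN.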
>   
>> I tracked this down to SelectionDAGLegalize::ExpandLibCall().
>> Here the TargetLowering's LowerCallTo is called to create the call  
>> node,
>> which returns a pair of operands for the chain and the result. The  
>> chain
>> is legalized but the result isn't. Modifying the code to call  
>> LegalizeOp on
>> the result seems to fix the problem I was having: the code
>> compiles and I see a call to __unorddf2 in the assembly.
>>
>> I'm not entirely sure if this fix is correct but it seems to work so  
>> far for
>> my target. See the attached diff.
>
> This seems to break the convention. It should be the responsibility of  
> the caller to further legalize the results.
>
> Evan
That makes sense. In that case I believe
SelectionDAGLegalize::LegalizeSetCCOperands should be legalizing the
result. The description of this function says it tries to create a legal
LHS and RHS, but in this case it fails to return a legal LHS. The
following patch allows me to compile my original file.
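
To illustrate the convention being discussed (the attached patch itself is
not reproduced here), below is a toy model in C++. Nothing in it is LLVM's
API: Node, legalize, expandLibCall and legalizeSetCCOperands are hypothetical
names that only mimic the shape of the problem. The libcall expansion hands
back an unlegalized generic "call" node, and the caller has to legalize that
result before using it as the LHS of the new setcc.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy node: an opcode name plus operands. Not LLVM's SDNode.
struct Node {
    std::string op;
    std::vector<std::shared_ptr<Node>> operands;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr make(std::string op, std::vector<NodePtr> ops = {}) {
    return std::make_shared<Node>(Node{std::move(op), std::move(ops)});
}

// Toy legalizer: rewrites the generic "call" into the target node "BL"
// and legalizes operands recursively.
NodePtr legalize(const NodePtr& n) {
    std::vector<NodePtr> ops;
    for (const auto& o : n->operands) ops.push_back(legalize(o));
    return make(n->op == "call" ? "BL" : n->op, std::move(ops));
}

// Toy libcall expansion: by convention the returned result is NOT
// legalized; that is left to the caller.
NodePtr expandLibCall(const std::string& name) {
    return make("call", {make("sym:" + name)});
}

// Toy setcc-operand legalization. The flag switches between the behaviour
// described as failing (false) and the proposed behaviour (true).
NodePtr legalizeSetCCOperands(bool legalizeResult) {
    NodePtr lhs = expandLibCall("__unorddf2");
    if (legalizeResult) lhs = legalize(lhs);
    return make("setcc", {lhs, make("const:0"), make("setne")});
}

// Returns true if any node in the tree rooted at n has opcode op.
bool containsOp(const NodePtr& n, const std::string& op) {
    if (n->op == op) return true;
    for (const auto& o : n->operands)
        if (containsOp(o, op)) return true;
    return false;
}
```

With legalizeResult set to false the resulting setcc still has a generic
"call" underneath it, which mirrors the leftover call node in the final DAG
above. With it set to true only the target node remains.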

-------------- next part --------------
A non-text attachment was scrubbed...
Name: legalize.diff
Type: text/x-patch
Size: 335 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20080703/8a855025/attachment.bin>