[llvm-dev] Specify special cases of delay slots in the back end

Sat Feb 11 04:39:29 PST 2017

   Hello.
     Hal, the problem I have is that the scheduler doesn't advance to the next available
instruction - it keeps getting the same store. This might be because I did not specify the
functional units, processor and instruction itineraries in a file like [Target]Schedule.td.
     Regarding the Stalls argument to my method
[Target]DispatchGroupSBHazardRecognizer::getHazardType(): I always get Stalls = 0. This is
no surprise, since PostRASchedulerList.cpp has only one call to it, in method
SchedulePostRATDList::ListScheduleTopDown():
       ScheduleHazardRecognizer::HazardType HT =
         HazardRec->getHazardType(CurSUnit, 0/*no stalls*/);


    Let me state what I have added to my back end to enable scheduling with hazards:
       - taking inspiration from lib/Target/PowerPC/PPCHazardRecognizers.h, I have created a
class [Target]DispatchGroupSBHazardRecognizer : public ScoreboardHazardRecognizer (I use
ScoreboardHazardRecognizer because I hope in the near future to make my class handle my data
hazards by scheduling USEFUL program instructions "out of order" instead of inserting NOPs),
implementing only one method for it:
              HazardType getHazardType(SUnit *SU, int Stalls);
          In this method I check whether the current SU is a vector store and the previous
instruction updates the register used by the store, which in my processor is a data
hazard, in which case I do:
               return NoopHazard;
           and otherwise I do:
               return ScoreboardHazardRecognizer::getHazardType(SU, Stalls);

       - I implemented two more methods in [Target]InstrInfo.cpp:
            - CreateTargetPostRAHazardRecognizer(), to register the
[Target]DispatchGroupSBHazardRecognizer()
            - insertNoop(), which inserts the target's NOP

       - note that my vector (and scalar) instructions are modeled on the Mips back end,
which has MSAInst (and MipsInst) with the NoItinerary InstrItinClass. Currently I am not
using a [Target]Schedule.td specifying functional units, processor and instruction
itineraries. This might be a problem - I guess ScoreboardHazardRecognizer relies on this
information.
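
     A possible cause of the endless loop, stated here only as an assumption (nothing in
this thread confirms it): if getHazardType() decides purely from the instruction that
precedes the store in the original program order, its answer never changes, however many
NOPs the scheduler emits. ScheduleHazardRecognizer also has the hooks EmitInstruction(),
EmitNoop(), AdvanceCycle() and Reset(); a recognizer that keeps its own state would
normally update that state there. The sketch below is a self-contained toy model of that
idea. It does not use any LLVM header, and every name in it (ToyInstr,
ToyHazardRecognizer, scheduleInOrder) is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; none of these are LLVM types.
enum class HazardType { NoHazard, NoopHazard };

struct ToyInstr {
  std::string Name;
  int DefReg;         // register written by the instruction, or -1 for none
  int UseReg;         // register read by the instruction, or -1 for none
  bool IsVectorStore; // only vector stores are subject to the hazard here
};

// Remembers which register was written in the previous issue slot.
class ToyHazardRecognizer {
  int LastDefReg = -1;

public:
  // Hazard: a vector store reads the register written one slot earlier.
  HazardType getHazardType(const ToyInstr &I) const {
    if (I.IsVectorStore && I.UseReg != -1 && I.UseReg == LastDefReg)
      return HazardType::NoopHazard;
    return HazardType::NoHazard;
  }

  // Plays the role of EmitInstruction(): record what was just issued.
  void emitInstruction(const ToyInstr &I) { LastDefReg = I.DefReg; }

  // Plays the role of EmitNoop(): the NOP fills the slot, so the hazard is gone.
  void emitNoop() { LastDefReg = -1; }
};

// Simplified in-order loop: emit a NOP whenever the next instruction has a
// NoopHazard, then ask again. It terminates because emitNoop() clears the state.
std::vector<std::string> scheduleInOrder(const std::vector<ToyInstr> &Block) {
  ToyHazardRecognizer HR;
  std::vector<std::string> Out;
  std::size_t Next = 0;
  while (Next < Block.size()) {
    const ToyInstr &I = Block[Next];
    if (HR.getHazardType(I) == HazardType::NoopHazard) {
      HR.emitNoop();
      Out.push_back("NOP");
      continue; // retry the same instruction; the recognizer state has changed
    }
    HR.emitInstruction(I);
    Out.push_back(I.Name);
    ++Next;
  }
  return Out;
}
```

     In this toy model, a block consisting of "R1 = R2 + R3" followed by a vector store of
R1 is scheduled as add, NOP, store. If emitNoop() left LastDefReg untouched, the loop would
ask about the same store forever, which resembles the behavior I am seeing. Whether this is
what actually happens in my back end still has to be checked against
PostRASchedulerList.cpp.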

     In principle, should I use the post-RA MI-scheduler instead of the standard post-RA
scheduler (maybe also
http://llvm.org/docs/doxygen/html/classllvm_1_1MachineSchedStrategy.html ) to deal with my
hazards?
According to http://llvm.org/devmtg/2014-10/Slides/Estes-MISchedulerTutorial.pdf, the
MI-scheduler also handles hazards, but I guess it's less documented, although the AArch64
back end uses it.

   Thank you,
     Alex


On 2/10/2017 11:33 PM, Hal Finkel wrote:
> Hi Alex,
>
> All of this makes sense, but are you correctly handling the Stalls argument to
> getHazardType? What are you doing with it?
>
>  -Hal
>
>
> On 02/10/2017 02:42 PM, Alex Susu via llvm-dev wrote:
>>   Hello.
>>    I am making slow progress with the post-RA scheduler
>> (PostRASchedulerList.cpp with ScoreboardHazardRecognizer) - the problem I have is that
>> it doesn't advance to the next available instruction when the overridden
>> ScoreboardHazardRecognizer::getHazardType() method returns NoopHazard; it gets stuck
>> at the same instruction (a store in my runs).
>>
>>    Just to make sure: I am trying to use the post-RA (Register Allocation) scheduler to
>> avoid data hazards by inserting, if possible, other USEFUL instructions from the program
>> instead of (just) NOPs. Does this kind of out-of-order scheduling (e.g., using the
>> ScoreboardHazardRecognizer), which employs useful program instructions instead of NOPs,
>> work well with the post-RA scheduler?
>>     Otherwise, if the post-RA scheduler only inserts NOPs, since I have issues using it,
>> I could just as well insert the NOPs in the [Target]AsmPrinter.cpp module.
>>
>>   Thank you,
>>     Alex
>>
>> On 2/10/2017 1:42 AM, Hal Finkel wrote:
>>>
>>> On 02/09/2017 04:46 PM, Alex Susu via llvm-dev wrote:
>>>>   Hello.
>>>>     Hal, thank you for the information.
>>>>     I took inspiration from PPCHazardRecognizers.cpp and created my own very simple
>>>> [Target]HazardRecognizers.cpp, with a class that is also derived from
>>>> ScoreboardHazardRecognizer. My class only implements the method getHazardType(), which
>>>> checks whether, as stated in my first email, a store instruction is storing the value
>>>> updated by the instruction immediately above it. This is NOT ok, since for my processor
>>>> it is a data hazard, and in this case I have to insert a NOP in between by making
>>>> getHazardType() do:
>>>>       return NoopHazard; // this basically emits noop
>>>>
>>>>     However, to my surprise, my very simple post-RA scheduler (using my class derived
>>>> from ScoreboardHazardRecognizer) cycles FOREVER after this return NoopHazard, calling
>>>> getHazardType() again and again for the SAME store instruction in which I first found
>>>> the data hazard. So llc no longer finishes - I have to stop the process because of this
>>>> strange behavior.
>>>>     I was expecting that, after the first call to getHazardType() with the respective
>>>> store instruction (and return NoopHazard), the scheduler would move forward to the
>>>> other instructions in the DAG/basic block.
>>>
>>> It should emit a nop if all available instructions return NoopHazard.
>>>
>>>>
>>>>     Do you have an idea what I can do to fix this problem?
>>>
>>> I'm not sure. I recall running into a situation like this years ago, but I don't recall
>>> now how I resolved it. Are you correctly handling the Stalls argument to getHazardType?
>>>
>>>  -Hal
>>>
>>>>
>>>>   Thank you very much,
>>>>     Alex
>>>>
>>>> On 2/3/2017 10:25 PM, Hal Finkel wrote:
>>>>> Hi Alex,
>>>>>
>>>>> You can program a post-RA scheduler which will return NoopHazard in the appropriate
>>>>> circumstances. You can look at the PowerPC target (e.g.
>>>>> lib/Target/PowerPC/PPCHazardRecognizers.cpp) as an example.
>>>>>
>>>>>  -Hal
>>>>>
>>>>>
>>>>> On 02/02/2017 05:03 PM, Alex Susu via llvm-dev wrote:
>>>>>>   Hello.
>>>>>>     I see there is little information on specifying instructions with delay slots.
>>>>>>     So could you please tell me how I can insert NOPs (BEFORE or after an instruction)
>>>>>> or how to make the instruction scheduler aware of delay slots, in order to avoid
>>>>>> miscalculations due to the delay slot effect?
>>>>>>
>>>>>>     More exactly, I have the following constraints on my (SIMD) processor:
>>>>>>       - certain stores or loads must be executed 1 cycle after the instruction
>>>>>> generating their input operands ends. For example, if I have:
>>>>>>          R1 = R2 + R3
>>>>>>          LS[R10] = R1 // this will not produce the correct result because it does not
>>>>>> see the updated value of R1 from the previous instruction
>>>>>>        To make this code execute correctly we need to insert a NOP:
>>>>>>          R1 = R2 + R3
>>>>>>          NOP // or other instruction to fill the delay slot
>>>>>>          LS[R10] = R1
>>>>>>
>>>>>>       - a compare instruction requires a NOP after it, before the predicated
>>>>>> block (something like a conditional JMP instruction) starts.
>>>>>>
>>>>>>
>>>>>>   Thank you,
>>>>>>     Alex
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>
>