[llvm-dev] Specify special cases of delay slots in the back end
Hal Finkel via llvm-dev
llvm-dev at lists.llvm.org
Fri Feb 10 13:33:01 PST 2017
Hi Alex,
All of this makes sense, but are you correctly handling the Stalls
argument to getHazardType? What are you doing with it?
-Hal
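
For reference, here is a minimal self-contained sketch of the behavior in question. It does not use LLVM: Instr, ToyHazardRecognizer and scheduleInOrder are hypothetical names that only mirror the shape of getHazardType(SU, Stalls), EmitInstruction() and EmitNoop(). It assumes the hazard is "a store reads the register written in the previous issue slot", and that a query with a nonzero Stalls asks about a later cycle, where that condition no longer applies. The recognizer keeps state about what was actually issued and clears it when a noop is emitted, so the same store is not reported as a NoopHazard a second time:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class HazardType { NoHazard, Hazard, NoopHazard };

// Hypothetical instruction record; not an LLVM type.
struct Instr {
  std::string Name;
  int Def;               // register written, or -1 if none
  std::vector<int> Uses; // registers read
  bool IsStore;
};

class ToyHazardRecognizer {
  int LastDef = -1; // register written in the previous issue slot, or -1

public:
  HazardType getHazardType(const Instr &I, int Stalls) const {
    // Assumption: only a current-cycle query (Stalls == 0) can see the
    // previous slot; a stalled issue already has a gap in front of it.
    if (Stalls != 0 || !I.IsStore)
      return HazardType::NoHazard;
    for (int R : I.Uses)
      if (R == LastDef)
        return HazardType::NoopHazard;
    return HazardType::NoHazard;
  }
  void emitInstruction(const Instr &I) { LastDef = I.Def; }
  void emitNoop() { LastDef = -1; } // the noop fills the slot, hazard cleared
};

// In-order issue loop: emit a NOP whenever the next instruction reports
// NoopHazard, then ask again for the same instruction.
std::vector<std::string> scheduleInOrder(const std::vector<Instr> &Block) {
  ToyHazardRecognizer HR;
  std::vector<std::string> Out;
  std::size_t Next = 0;
  const std::size_t MaxSlots = 4 * Block.size() + 4; // guard against looping
  while (Next < Block.size()) {
    assert(Out.size() < MaxSlots && "recognizer never cleared its hazard");
    if (HR.getHazardType(Block[Next], /*Stalls=*/0) == HazardType::NoopHazard) {
      HR.emitNoop();
      Out.push_back("NOP");
      continue;
    }
    HR.emitInstruction(Block[Next]);
    Out.push_back(Block[Next].Name);
    ++Next;
  }
  return Out;
}
```

For the two-instruction example from the original question (R1 = R2 + R3 followed by LS[R10] = R1), scheduleInOrder yields ADD, NOP, STORE. If emitNoop() did not clear LastDef, the loop would keep asking about the same store, which is the kind of behavior described in this thread.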
On 02/10/2017 02:42 PM, Alex Susu via llvm-dev wrote:
> Hello.
> I am making slow progress with the post-RA scheduler
> (PostRASchedulerList.cpp with ScoreboardHazardRecognizer). The
> problem is that it doesn't advance to the next available
> instruction when the overridden
> ScoreboardHazardRecognizer::getHazardType() method returns NoopHazard;
> it gets stuck on the same instruction (a store in my runs).
>
> Just to make sure: I am trying to use the post-RA (Register
> Allocation) scheduler to avoid data hazards by inserting, where
> possible, other USEFUL instructions from the program instead of
> (just) NOPs. Does this out-of-order scheduling (e.g., using the
> ScoreboardHazardRecognizer), which employs useful program
> instructions instead of NOPs, work well with the post-RA scheduler?
> Otherwise, if the post-RA scheduler only inserts NOPs, then given the
> issues I have using it, I might as well insert the NOPs in the
> [Target]AsmPrinter.cpp module.
>
> Thank you,
> Alex
>
> On 2/10/2017 1:42 AM, Hal Finkel wrote:
>>
>> On 02/09/2017 04:46 PM, Alex Susu via llvm-dev wrote:
>>> Hello.
>>> Hal, thank you for the information.
>>> I took inspiration from PPCHazardRecognizers.cpp and created my own
>>> very simple [Target]HazardRecognizers.cpp, with a class that is also
>>> derived from ScoreboardHazardRecognizer.
>>> My class implements only the method getHazardType(). As in the
>>> example from my first email, it checks whether a store instruction
>>> stores the value updated by the instruction immediately above it.
>>> That is NOT OK, since on my processor this is a data hazard, so in
>>> this case I have to insert a NOP in between by making
>>> getHazardType() do:
>>> return NoopHazard; // this basically emits noop
>>>
>>> However, to my surprise, my very simple post-RA scheduler (using
>>> my class derived from ScoreboardHazardRecognizer) cycles FOREVER
>>> after this return NoopHazard, calling getHazardType() again and
>>> again for the SAME store instruction that had the data hazard in
>>> the first place. As a result, llc never finishes, and I have to
>>> stop the process because of this strange behavior.
>>> After the first call to getHazardType() for that store instruction
>>> (returning NoopHazard), I was expecting the scheduler to move on to
>>> the other instructions in the DAG/basic block.
>>
>> It should emit a nop if all available instructions return NoopHazard.
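
The selection rule just stated can be sketched outside LLVM as a toy top-down list scheduler. Everything here is hypothetical (Node, needsGap, listSchedule are illustrative names, not LLVM APIs), and it assumes the only hazard is a store reading the register written in the previous issue slot, and that the Preds edges form a DAG. On each slot it issues the first ready node that reports no hazard; only when every ready node is blocked does it emit a NOP:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical scheduling node; not an LLVM SUnit.
struct Node {
  std::string Name;
  int Def;                        // register written, or -1 if none
  std::vector<int> Uses;          // registers read
  bool IsStore;
  std::vector<std::size_t> Preds; // nodes that must be issued first
};

// True if N must not be issued right after the write to LastDef.
bool needsGap(const Node &N, int LastDef) {
  if (!N.IsStore || LastDef < 0)
    return false;
  for (int R : N.Uses)
    if (R == LastDef)
      return true;
  return false;
}

std::vector<std::string> listSchedule(const std::vector<Node> &Nodes) {
  std::vector<bool> Done(Nodes.size(), false);
  std::vector<std::string> Out;
  int LastDef = -1; // register written in the previous issue slot
  std::size_t Remaining = Nodes.size();
  while (Remaining > 0) {
    bool Issued = false;
    for (std::size_t I = 0; I < Nodes.size() && !Issued; ++I) {
      if (Done[I])
        continue;
      bool Ready = true;
      for (std::size_t P : Nodes[I].Preds) {
        assert(P < Nodes.size() && "predecessor index out of range");
        Ready = Ready && Done[P];
      }
      if (!Ready || needsGap(Nodes[I], LastDef))
        continue; // try another instruction for this slot
      Done[I] = true;
      --Remaining;
      LastDef = Nodes[I].Def;
      Out.push_back(Nodes[I].Name);
      Issued = true;
    }
    if (!Issued) { // every ready node was blocked: fill the slot with a NOP
      LastDef = -1;
      Out.push_back("NOP");
    }
  }
  return Out;
}
```

With an ADD writing R1, a dependent STORE of R1, and an independent MUL, this yields ADD, MUL, STORE: the useful instruction fills the slot. Without the MUL it falls back to ADD, NOP, STORE.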
>>
>>>
>>> Do you have any idea what I can do to fix this problem?
>>
>> I'm not sure. I recall running into a situation like this years ago,
>> but I don't recall
>> now how I resolved it. Are you correctly handling the Stalls argument
>> to getHazardType?
>>
>> -Hal
>>
>>>
>>> Thank you very much,
>>> Alex
>>>
>>> On 2/3/2017 10:25 PM, Hal Finkel wrote:
>>>> Hi Alex,
>>>>
>>>> You can program a post-RA scheduler which will return NoopHazard in
>>>> the appropriate
>>>> circumstances. You can look at the PowerPC target (e.g.
>>>> lib/Target/PowerPC/PPCHazardRecognizers.cpp) as an example.
>>>>
>>>> -Hal
>>>>
>>>>
>>>> On 02/02/2017 05:03 PM, Alex Susu via llvm-dev wrote:
>>>>> Hello.
>>>>> I see there is little information on specifying instructions
>>>>> with delay slots.
>>>>> Could you please tell me how I can insert NOPs (BEFORE or after
>>>>> an instruction), or how to make the instruction scheduler aware
>>>>> of delay slots, in order to avoid miscalculations due to the
>>>>> delay slot effect?
>>>>>
>>>>> More exactly, I have the following constraints on my (SIMD)
>>>>> processor:
>>>>> - certain stores or loads must be executed 1 cycle after the
>>>>> instruction generating their input operands ends. For example,
>>>>> if I have:
>>>>> R1 = R2 + R3
>>>>> LS[R10] = R1 // this will not produce the correct result
>>>>> because it does not
>>>>> see the updated value of R1 from the previous instruction
>>>>> To make this code execute correctly we need to insert a NOP:
>>>>> R1 = R2 + R3
>>>>> NOP // or other instruction to fill the delay slot
>>>>> LS[R10] = R1
>>>>>
>>>>> - a compare instruction requires a NOP after it, before the
>>>>> predicated block (something like a conditional JMP instruction)
>>>>> starts.
>>>>>
>>>>>
>>>>> Thank you,
>>>>> Alex
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory