[PATCH] D146774: [AMDGPU][IGLP] WIP/Demo: Add rules to SchedGroups

Tue Apr 4 12:42:47 PDT 2023

jrbyrnes added a comment.

In D146774#4239607 <https://reviews.llvm.org/D146774#4239607>, @kerbowa wrote:

> I like the general approach. It seems like things could get unwieldy with larger SchedGroups. You would need to have lots of checks vs Collection.size() which could be somewhat hard to work with.
>
> Maybe allowing rules to look into other already built SchedGroups could be helpful? That way you could avoid some of the Collection.size() stuff.
>
> Have you tried to implement Po Yen's proposal, or do you have a general idea about how this patch may try to implement it?

Thanks for the thoughts Austin.

First -- regarding Po Yen's proposal.
I have a general idea of how this patch might implement it. Based on his presentation, it seems like we may want the following sched groups:

A. (1) ds_write, (1) buffer_load, (1) mfma

- WAR dep on ds_write and buffer_load

- no dep on buffer_load and mfma

- sched group is interlinked (add edges between ds_write and buffer_load, and between buffer_load and mfma) (already partially linked)

B. (3) ds_read, (2) mfma

- RAW deps on the ds_reads and mfmas

- sched group is interlinked (already linked)

The A and B sched groups would be in different sync_pipelines (each sync_pipeline is just its sched group repeated (# MFMAs in the DAG) / (# MFMAs in the sched group) times). We may want to include some MFMA sched groups to preserve their initial ordering (the more rules we have on SchedGroups, the fewer legal permutations for PipelineSolver).
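
To make the repeat count concrete, here is a rough sketch of the two groups as plain data plus the division described above. Everything in it (GroupEntry, GroupTemplate, repeatCount, makeGroupA, makeGroupB) is a hypothetical stand-in and not the SchedGroup class from the patch. Rounding down when the MFMA count does not divide evenly is an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical description of one sched group: which instruction kinds it
// holds and how many of each. Not the patch's SchedGroup class.
struct GroupEntry {
  std::string Kind;
  unsigned Count;
};

struct GroupTemplate {
  std::string Name;
  std::vector<GroupEntry> Entries;

  // Total number of instructions of the given kind in this group.
  unsigned countOf(const std::string &Kind) const {
    unsigned N = 0;
    for (const GroupEntry &E : Entries)
      if (E.Kind == Kind)
        N += E.Count;
    return N;
  }
};

// (# MFMAs in the DAG) / (# MFMAs in the sched group), rounded down
// (assumption). Returns 0 for a group that holds no MFMA.
unsigned repeatCount(const GroupTemplate &G, unsigned DagMfmaCount) {
  unsigned PerGroup = G.countOf("mfma");
  return PerGroup == 0 ? 0 : DagMfmaCount / PerGroup;
}

// Group A: (1) ds_write, (1) buffer_load, (1) mfma
GroupTemplate makeGroupA() {
  return {"A", {{"ds_write", 1u}, {"buffer_load", 1u}, {"mfma", 1u}}};
}

// Group B: (3) ds_read, (2) mfma
GroupTemplate makeGroupB() {
  return {"B", {{"ds_read", 3u}, {"mfma", 2u}}};
}
```

With 8 MFMAs in the DAG, this gives 8 repeats for the A pipeline and 4 for the B pipeline.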

This is just a general idea of how it could work. For the actual optimization, I would want to look at the assembly produced and encode that into sched groups. Moreover, I would want to look into regressions.

Second -- regarding the Collection.size() issue:

Yes, the Collection.size() calls will probably become hard to manage. However, I think that without grouping together the schedgroups that are related by a rule, we would have the opposite problem: finding the schedgroup that a rule refers to. Both issues can likely be resolved with composite SchedGroups, or by having the collection be an aggregate of structs, each with its own size / mask / rule(s). This way we can group together related collections (each easily sized), and allow cross inspection via rules.
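
A minimal sketch of what such an aggregate could look like, assuming each sub-group carries its own size, mask and rules, and each rule receives the whole composite so it can inspect sibling groups. All names here (Kind, Instr, SubGroup, CompositeGroup, Rule, tryAdd, makeExample) are hypothetical and do not come from the patch or from LLVM. The "ds_reads must be full first" rule is only there to illustrate cross inspection.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical instruction kinds, used as mask bits.
enum Kind : uint32_t { DSRead = 1u << 0, MFMA = 1u << 1 };

struct Instr {
  Kind K;
  int Id;
};

struct CompositeGroup;

// A rule sees the candidate and the whole composite, so it can look into
// sibling sub-groups instead of indexing into one flat collection.
using Rule = std::function<bool(const Instr &, const CompositeGroup &)>;

struct SubGroup {
  uint32_t Mask;            // which instruction kinds are accepted
  std::size_t MaxSize;      // each sub-group is sized on its own
  std::vector<Rule> Rules;  // all rules must pass for a candidate
  std::vector<Instr> Members;

  bool isFull() const { return Members.size() >= MaxSize; }
};

struct CompositeGroup {
  std::vector<SubGroup> Subs;

  // Try to place I into sub-group Idx; returns false if the sub-group is
  // full, the mask rejects the kind, or any rule rejects the candidate.
  bool tryAdd(std::size_t Idx, const Instr &I) {
    SubGroup &SG = Subs[Idx];
    if (SG.isFull() || !(SG.Mask & I.K))
      return false;
    for (const Rule &R : SG.Rules)
      if (!R(I, *this))
        return false;
    SG.Members.push_back(I);
    return true;
  }
};

// Sub-group 0: up to 3 ds_reads, no rules.
// Sub-group 1: up to 2 mfmas, accepted only once sub-group 0 is full.
CompositeGroup makeExample() {
  CompositeGroup G;
  G.Subs.push_back(SubGroup{DSRead, 3, {}, {}});
  Rule ReadsFirst = [](const Instr &, const CompositeGroup &C) {
    return C.Subs[0].isFull();
  };
  G.Subs.push_back(SubGroup{MFMA, 2, {ReadsFirst}, {}});
  return G;
}
```

The rule refers to its sibling by position (Subs[0]) and asks isFull(), so no size arithmetic on a flat collection is needed.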

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146774/new/

https://reviews.llvm.org/D146774