[PATCH] D116465: [SPIRV 6/6] Add 2 essential passes and the simplest tests

Wed Feb 2 10:10:25 PST 2022

rengolin added a comment.

In D116465#3290957 <https://reviews.llvm.org/D116465#3290957>, @zuban32 wrote:

> Then let me describe the problem in more details, I'd like us all to be on the same page with this.

Awesome! This is much clearer to me, thanks!

> The latter seems to be a minor problem - the reason for it to exist is SPIRV binary format requiring global numbering of all the defined values, i.e. if there're 2 functions each containing only 1 instruction then these instructions can't be both numbered %0.

Do you have a maximum ID? If not, than in theory, you just need a static global auto-increment ID (like a get_next_id() on some target emission class) and off you go emitting globally unique IDs.

> The former one attempts to solve a bigger problem. Basically the SPIRV binary is structured as following:
>
>   ...
>   types, constants
>   ...
>   functions definitions

I see, so your "global" function contain the type and constant declarations, that the functions use in their own declarations.

So, you're not "moving" anything, just creating a placeholder for those definitions and indexing them.

> in our approach each MachineFunction has its own copies of type decls and constants (and other stuff like that) during the whole translation process, but eventually they need to be deduplicated and 'globalized'. That can be done either on MIR level or during binary emission, but it has done be done anyway.

IIUC, they're in each function because you generate for each function you pass through, so:

  func1
    %0 = OpTypeInt 32
    %1 = OpTypeInt 64
    ...
  func2
    %0 = OpTypeFloat 32
    %1 = OpTypeFloat 64
    ...

and then you need to get all types of the same "type" and call a single ID.

So, these passes seem *necessary* if you want a lean and tidy representation, but not necessary if you can cope with duplicate definitions while still keeping unique IDs.

For example, using the auto-increment proposal above, we could have:

  func1
    %0 = OpTypeInt 32
    %1 = OpTypeInt 64
    ...
  func2
    %2 = OpTypeFloat 32
    %3 = OpTypeInt 64
    ...

Note two definitions of `Int 64` but with two different IDs. Would that be a binary format error for (native?) types? Would that be an error for constants, too?

If each type can only be defined once, you could have a list of all possible types and defined them in the beginning (in your "global" function), from %0 to %N. Then, while emitting each function, you just use those IDs and you don't need to deduplicate.

Of course, this doesn't work with constants, but I'm guessing it's more likely that, if there is a rule of single-declaration for types, there may not be for constants (otherwise compilers would suffer regardless of representation).

In that sense, you wouldn't need either pass, just initialisation of the known types, declaration of the global function with them, and reuse in each function, using the global auto-increment for all the IDs.

Does that fit with the spec?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116465/new/

https://reviews.llvm.org/D116465