[PATCH] D99123: [SampleFDO] Flow Sensitive Sample FDO (FSAFDO)

Mon Mar 29 10:38:37 PDT 2021

davidxl added inline comments.

================
Comment at: llvm/include/llvm/Support/FSAFDODiscriminator.h:19
+
+#define PASS_1_DIS_BIT_BEG 8
+#define PASS_1_DIS_BIT_END 13
----------------
snehasish wrote:
> A few questions about the discriminator bits:
> 
> * Depending on the transformation in the target pass the requirement of bits may be different, i.e. 5 bits for each may be too many or too few. Do you have any data to share about how many bits are used by each?
> 
> * How do we alert authors of new target optimizations (or code refactoring) additional discriminator bits are needed to disambiguate? Would a late stage analysis only pass which enumerates different instructions with the same debug+discriminator info be useful to commit?
> 
> * If I understand correctly, we bump the bit for each level of cloning. This seems to be a less efficient coding scheme, max 5 bits where by enumeration you could identify 31 clones? Have you considered other coding schemes?
> A few questions about the discriminator bits:
> 
> * Depending on the transformation in the target pass the requirement of bits may be different, i.e. 5 bits for each may be too many or too few. Do you have any data to share about how many bits are used by each?

I assume most of the transformations produce few clones except for unrolling (which depends on unroll factor).

> 
> * How do we alert authors of new target optimizations (or code refactoring) additional discriminator bits are needed to disambiguate? Would a late stage analysis only pass which enumerates different instructions with the same debug+discriminator info be useful to commit?

The problem with this is that the authors won't have any means to change anything.  Rong and I discussed about this. Longer term when this becomes and issue, increasing the size of the discriminator container type will be the way to go.

> 
> * If I understand correctly, we bump the bit for each level of cloning. This seems to be a less efficient coding scheme, max 5 bits where by enumeration you could identify 31 clones? Have you considered other coding schemes?

The biggest advantage of fixed width is simplicity,  I think.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99123/new/

https://reviews.llvm.org/D99123