[PATCH] D99123: [SampleFDO] Flow Sensitive Sample FDO (FSAFDO)
David Li via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 29 10:38:37 PDT 2021
davidxl added inline comments.
================
Comment at: llvm/include/llvm/Support/FSAFDODiscriminator.h:19
+
+#define PASS_1_DIS_BIT_BEG 8
+#define PASS_1_DIS_BIT_END 13
----------------
snehasish wrote:
> A few questions about the discriminator bits:
>
> * Depending on the transformation in the target pass the requirement of bits may be different, i.e. 5 bits for each may be too many or too few. Do you have any data to share about how many bits are used by each?
I assume most of the transformations produce few clones, except for unrolling (where the count depends on the unroll factor).
>
> * How do we alert authors of new target optimizations (or code refactoring) additional discriminator bits are needed to disambiguate? Would a late stage analysis only pass which enumerates different instructions with the same debug+discriminator info be useful to commit?
The problem with this is that the authors won't have any means to change anything. Rong and I discussed this. Longer term, when this becomes an issue, increasing the size of the discriminator container type will be the way to go.
>
> * If I understand correctly, we bump the bit for each level of cloning. This seems to be a less efficient coding scheme, max 5 bits where by enumeration you could identify 31 clones? Have you considered other coding schemes?
The biggest advantage of fixed width is simplicity, I think.
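To make the trade-off concrete, here is a minimal sketch of a fixed-width discriminator field, not the code in the patch. The helper names (`fieldMask`, `getField`, `setField`, and the two capacity functions) are hypothetical, and it assumes the end bit is exclusive, so that bits 8 through 12 form the 5-bit field mentioned in the question. If the end bit is meant to be inclusive, the field is one bit wider.

```cpp
#include <cassert>
#include <cstdint>

using std::uint32_t;

// Hypothetical stand-ins for PASS_1_DIS_BIT_BEG / PASS_1_DIS_BIT_END.
// Assumption: the end bit is exclusive, so the field is END - BEG = 5 bits.
constexpr unsigned Pass1BitBeg = 8;
constexpr unsigned Pass1BitEnd = 13;

// Mask with ones in bits [Beg, End) of a 32-bit discriminator.
inline uint32_t fieldMask(unsigned Beg, unsigned End) {
  assert(Beg < End && End <= 32 && "invalid bit range");
  unsigned Width = End - Beg;
  uint32_t Low = Width == 32 ? ~0u : ((1u << Width) - 1u);
  return Low << Beg;
}

// Extract the value stored in bits [Beg, End).
inline uint32_t getField(uint32_t Disc, unsigned Beg, unsigned End) {
  return (Disc & fieldMask(Beg, End)) >> Beg;
}

// Return Disc with bits [Beg, End) replaced by Value; other bits are kept.
// The assert fires when a pass needs more values than the field can hold.
inline uint32_t setField(uint32_t Disc, unsigned Beg, unsigned End,
                         uint32_t Value) {
  uint32_t Mask = fieldMask(Beg, End);
  assert(Value <= (Mask >> Beg) && "value does not fit in the field");
  return (Disc & ~Mask) | (Value << Beg);
}

// Distinct non-zero markings available in a Width-bit field if one bit is
// spent per level of cloning (the scheme as described in the question).
inline unsigned oneBitPerLevelCapacity(unsigned Width) { return Width; }

// Distinct non-zero markings if clones are simply numbered instead.
inline unsigned enumerationCapacity(unsigned Width) {
  return Width >= 32 ? ~0u : (1u << Width) - 1u;
}
```

With a 5-bit field, spending one bit per level gives 5 distinct markings while numbering the clones gives 31. The fixed-width layout keeps reading and writing a pass's bits down to a mask and a shift, which is the simplicity argument.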
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D99123/new/
https://reviews.llvm.org/D99123