[PATCH] D153993: [Headers][doc] Add load/store/cmp/cvt intrinsic descriptions to avx2intrin.h

Paul Robinson via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jun 30 07:16:05 PDT 2023


probinson added inline comments.


================
Comment at: clang/lib/Headers/avx2intrin.h:3474
+///   IF __M[j+31] == 1
+///     result[j+31:j] := Load32(__X+(i*4))
+///   ELSE
----------------
pengfei wrote:
> probinson wrote:
> > pengfei wrote:
> > > probinson wrote:
> > > > pengfei wrote:
> > > > > A more intrinsic guide format is `MEM[__X+j:j]`
> > > > LoadXX is the syntax in the gather intrinsics, e.g. _mm_mask_i32gather_pd. I'd prefer to be consistent.
> > > I think the problem here is the measurement is easily confusing.
> > > From C point of view, `__X` is a `int` pointer, so we should `+ i` rather than `i * 4`
> > > From the other part of the code, we are measuring in bits, but here `i * 4` is a byte offset.
> > Well, the pseudo-code is clearly not C. If you look at the gather code, it computes a byte address using an offset multiplied by an explicit scale factor. I am doing exactly the same here.
> > 
> > The syntax `MEM[__X+j:j]` is mixing a byte address with a bit offset, which I think is more confusing. To be fully consistent, using `[]` with bit offsets only, it should be
> > ```
> > k := __X*8 + i*32
> > result[j+31:j] := MEM[k+31:k]
> > ```
> > which I think obscures more than it explains.
> Yeah, it's not C code here. But we are easy to fall into C concepts, e.g., why assuming __X is measuring in bytes?
> That's why I think it's clear to make both in bits.
> I made a mistake here, I wanted to propose `MEM[__X+j+31: __X+j]`. It matches with [[ https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#ig_expand=4057,4058,4059,4053,665,3890,5959,5910,3870,4280&text=_mm256_maskload_epi32 | Intrinsic Guide ]].
> 
We assume `__X` is in bytes because that's how addresses work on X86. Adding a bit offset to a byte address makes no sense. I see that is how existing Intel documentation works, which does not make it correct.

To "make both in bits" means multiplying `__X` by 8, as in the example in my previous comment. Or coming up with a different syntax that makes the difference clear.
`MEM(__X)[j+31:j]` or even `MEM[__X][j+31:j]` would be preferable.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D153993/new/

https://reviews.llvm.org/D153993



More information about the cfe-commits mailing list