[PATCH] D147713: [RISCV] Combine concat_vectors of loads into strided loads
Luke Lau via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 7 07:40:13 PDT 2023
luke added inline comments.
================
Comment at: llvm/lib/Target/RISCV/RISCVISelLowering.cpp:11475-11477
+ if (!allowsMemoryAccessForAlignment(*DAG.getContext(), DAG.getDataLayout(),
+ WideVecVT, *MMO))
+ break;
----------------
luke wrote:
> Is it legal to increase the alignment here?
> E.g. for these loads
>
> ```
> %0 = load <4 x i8>, ptr %pix1, align 1
> %add.ptr = getelementptr inbounds i8, ptr %pix1, i64 %idx.ext
> %2 = load <4 x i8>, ptr %add.ptr, align 1
> ```
>
> Can we use an align of 4 * 1:
>
> ```
> %0 = call <2 x i32> @llvm.riscv.strided.load ptr %pix1, i64 %idx.ext, align 4
> ```
>
I have a feeling the answer is no, which would mean that we can't apply this combine to the x264 SAD kernel:
```c
#include <stdint.h>
#include <stdlib.h>
typedef uint8_t pixel;
#define PIXEL_SAD_C( name, lx, ly ) \
int name( pixel *pix1, intptr_t i_stride_pix1, \
          pixel *pix2, intptr_t i_stride_pix2 ) \
{ \
    int i_sum = 0; \
    for( int y = 0; y < ly; y++ ) \
    { \
        for( int x = 0; x < lx; x++ ) \
        { \
            i_sum += abs( pix1[x] - pix2[x] ); \
        } \
        pix1 += i_stride_pix1; \
        pix2 += i_stride_pix2; \
    } \
    return i_sum; \
}
PIXEL_SAD_C(x264_pixel_sad_4x4, 4, 4)
```
There's no guarantee here that the pointers `pix1`/`pix2` or the strides `i_stride_pix1`/`i_stride_pix2` are word aligned, so we can't use vlse32. Unless we know the target has fast unaligned access?
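To make the concern concrete, here is a minimal sketch in C. The helper names `rows_word_aligned` and `load_row_u32` are hypothetical and come from neither x264 nor LLVM. The first checks whether every row start is 4-byte aligned, which is what a 32-bit-element strided load would need on a target without unaligned access support. The second shows the alignment-agnostic way to read one 4-pixel row, which is all the original `align 1` loads promise.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 only if each of the `rows` row starts
 * (base, base + stride, base + 2*stride, ...) is 4-byte aligned. */
static int rows_word_aligned(const uint8_t *base, intptr_t stride, int rows)
{
    for (int y = 0; y < rows; y++) {
        uintptr_t addr = (uintptr_t)(base + y * stride);
        if (addr % 4 != 0)
            return 0; /* this row would be a misaligned 32-bit access */
    }
    return 1;
}

/* Hypothetical helper: reads one 4-pixel row as a 32-bit value without
 * assuming any alignment, by copying byte-wise through memcpy. */
static uint32_t load_row_u32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With a word-aligned base, a stride of 8 keeps every row aligned, but a stride of 5 or a base offset by one byte does not. Since the SAD signature lets callers pass any pointer and stride, the combine could only assume `align 4` if it could prove both are multiples of 4.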
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147713/new/
https://reviews.llvm.org/D147713