[llvm] [AArch64][GlobalISel] Expand abs.v4i8 to v4i16 and abs.v2s16 to v2s32 (PR #81231)

Thu Feb 29 04:51:29 PST 2024

davemgreen wrote:

Hi - sorry, I did mean to get back to this earlier. @chuongg3 committed support for v4i8 load and store yesterday. They get bitcast to the correct type, so they naturally end up in the first 4 elements of the vector. The codegen is still pretty awful to be honest: https://godbolt.org/z/nrc6553xq, but that can hopefully be improved with some better combines. It should just be a `load s`, `abs .8b` and `store s`, but you can see the expansion of the input isn't being cleaned up, and the regbank probably isn't being selected in the load/store yet.
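
For reference, a minimal sketch of the kind of input in question. The function name and pointer arguments here are assumptions for illustration and are not copied from the godbolt link; the assembly in the comments is only a rendering of the ideal `load s`, `abs .8b`, `store s` sequence described above, not current compiler output.

```llvm
; Hypothetical reduced test case: load a <4 x i8>, take abs, store it back.
define void @abs_v4i8(ptr %src, ptr %dst) {
  %v = load <4 x i8>, ptr %src
  ; The second operand (i1 false) means INT_MIN is not poison.
  %a = call <4 x i8> @llvm.abs.v4i8(<4 x i8> %v, i1 false)
  store <4 x i8> %a, ptr %dst
  ret void
}

declare <4 x i8> @llvm.abs.v4i8(<4 x i8>, i1)

; Ideal AArch64 output, roughly (register choice assumed):
;   ldr s0, [x0]          ; 32-bit load puts the 4 bytes in the low lanes
;   abs v0.8b, v0.8b      ; lane-wise abs; the upper 4 lanes are don't-care
;   str s0, [x1]          ; 32-bit store writes back only the low 4 bytes
;   ret
```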

I still think the illegal ANYEXT is coming from elsewhere (the returned value), and would be best fixed by legalizing the ANYEXT somehow. I believe the combine in matchScalarizeVectorUnmerge should ideally only be generating extracts from legal operations.

If anyone like Amara wants to review this and push it anyway, then this won't be the first thing that we widenScalar as opposed to moreElements, but I'm still hoping that we can make moreElements work better in the general case.

https://github.com/llvm/llvm-project/pull/81231