[PATCH] D67085: [ARM] Fix loads and stores for v4i1 and v8i1

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 2 12:34:49 PDT 2019


dmgreen created this revision.
dmgreen added reviewers: t.p.northover, samparker, simon_tatham, SjoerdMeijer, ostannard.
Herald added subscribers: hiraditya, kristof.beyls, javed.absar.
Herald added a project: LLVM.

These predicate vectors can usually be loaded and stored with a single instruction, a VSTR_P0. However, this instruction stores the entire P0 predicate: 16 bits, with each lane of a v4i1/v8i1 occupying 4/2 of those bits respectively.

As far as I understand, when LLVM says "store this v4i1", it really does need to store 4 bits (or 8, that being the size of a byte, with the bottom 4 as the interesting bits). For example, a bitcast from a v8i1 to an i8 is defined as a store followed by a load, which is how the code is expanded.

So this instead lowers the v4i1/v8i1 load/store through some shuffles to get the bits into the correct positions. This, as you might imagine, is not as efficient as a single instruction, but I believe it is needed for correctness. v16i1 can still use the VSTR_P0, and stack loads/stores still use the VSTR_P0 even for v4i1/v8i1 (as can be seen by the test not changing). This is fine as they are self-consistent; it is only "externally observable loads/stores" (from our point of view) that need to be corrected.
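
To make the bit positions concrete, here is a minimal C++ sketch of the mismatch described above. It is only a model and not the code in the patch: P0 is treated as a plain 16-bit integer, every bit belonging to a lane is assumed to hold the same value, and the helper names packPredicate and expandPredicate are hypothetical.

```cpp
#include <cstdint>

// Model only: P0 is a plain 16-bit integer and each lane of a vNi1 owns
// 16 / N adjacent bits (4 for v4i1, 2 for v8i1, 1 for v16i1).
// The function names are hypothetical and do not appear in the patch.

// Pack: read the lowest bit of each lane's group and place it at bit
// position `Lane`, giving the compact form an external store expects.
constexpr uint16_t packPredicate(uint16_t P0, unsigned NumLanes) {
  unsigned LaneBits = 16 / NumLanes;
  uint16_t Packed = 0;
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (P0 & (1u << (Lane * LaneBits)))
      Packed |= uint16_t(1u << Lane);
  return Packed;
}

// Expand: replicate bit `Lane` of the compact form across all of the
// bits that the lane owns in the 16-bit predicate.
constexpr uint16_t expandPredicate(uint16_t Packed, unsigned NumLanes) {
  unsigned LaneBits = 16 / NumLanes;
  uint16_t LaneMask = uint16_t((1u << LaneBits) - 1);
  uint16_t P0 = 0;
  for (unsigned Lane = 0; Lane < NumLanes; ++Lane)
    if (Packed & (1u << Lane))
      P0 |= uint16_t(LaneMask << (Lane * LaneBits));
  return P0;
}
```

In this model a v4i1 with lanes 0 and 2 set is 0x0F0F in P0 but 0x5 in memory, while for v16i1 the two forms are identical, which matches v16i1 being able to keep the single-instruction store.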

The test changes here are in pred-bitcast (which is no longer incorrect), pred-ldst (which is obviously a lot larger, but which I don't believe will be generated a lot), and masked ld/st. We should be able to optimise the masked ld/st tests better with a few folds, and we should not be generating masked ld/st only to expand them like this.


https://reviews.llvm.org/D67085

Files:
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/lib/Target/ARM/ARMInstrMVE.td
  llvm/test/CodeGen/Thumb2/mve-masked-ldst.ll
  llvm/test/CodeGen/Thumb2/mve-masked-load.ll
  llvm/test/CodeGen/Thumb2/mve-masked-store.ll
  llvm/test/CodeGen/Thumb2/mve-pred-bitcast.ll
  llvm/test/CodeGen/Thumb2/mve-pred-loadstore.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D67085.218385.patch
Type: text/x-patch
Size: 155624 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190902/6a17344d/attachment.bin>

