[PATCH] D130397: [RISCV] Custom type legalize i32 loads by sign extending.

Craig Topper via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 9 16:56:47 PDT 2022


craig.topper added a comment.

In D130397#3700198 <https://reviews.llvm.org/D130397#3700198>, @asb wrote:

> This gives a better idea of the impact:
> output_rv64imafdc_lp64_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
> output_rv64imafdc_lp64_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
> output_rv64imafdc_lp64_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
> output_rv64imafdc_lp64_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
> output_rv64imafdc_lp64_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
> output_rv64imafdc_lp64_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
> output_rv64imafdc_lp64_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
> output_rv64imafdc_lp64_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
> output_rv64imafdc_lp64_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
> output_rv64imafdc_lp64_Os/pr53645.s: 278 lines added, 248 removed (+30 net)
> output_rv64imafdc_lp64d_O0/mode-dependent-address.s: 3 lines added, 2 removed (+1 net)
> output_rv64imafdc_lp64d_O0/pr53645.s: 112 lines added, 96 removed (+16 net)
> output_rv64imafdc_lp64d_O1/pr23135.s: 147 lines added, 132 removed (+15 net)
> output_rv64imafdc_lp64d_O1/pr53645.s: 282 lines added, 250 removed (+32 net)
> output_rv64imafdc_lp64d_O2/loop-5.s: 6 lines added, 5 removed (+1 net)
> output_rv64imafdc_lp64d_O2/pr53645.s: 278 lines added, 248 removed (+30 net)
> output_rv64imafdc_lp64d_O3/loop-5.s: 6 lines added, 5 removed (+1 net)
> output_rv64imafdc_lp64d_O3/memset-2.s: 1129 lines added, 1083 removed (+46 net)
> output_rv64imafdc_lp64d_O3/pr53645.s: 278 lines added, 248 removed (+30 net)
> output_rv64imafdc_lp64d_Os/pr53645.s: 278 lines added, 248 removed (+30 net)
>
> It's probably worth a quick check if there are obvious reasons for the additions, but the overall impact seems positive so if there's not an obvious deficiency I don't have an objection to declaring these cases are just noise due to taking a different codegen path.

I explored these differences. Some notes.

memset-2.s - Directly caused by the isTruncateFree change. We now share a 64-bit constant between i64 and i32 stores by truncating it. Somehow this causes repeated rematerialization of LUI instructions, even though on the surface the change reduces register pressure. The basic block is quite large, with many calls to memset.
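To illustrate why one 64-bit constant can serve both store widths, here is a minimal sketch. It assumes the shared constants are byte-splat fill patterns, which the notes above do not state; splat64, splat32 and trunc32 are hypothetical helper names, not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical fill patterns: a memset byte replicated across the store width.
// (Assumption for illustration; the actual constants in memset-2.s may differ.)
constexpr uint64_t splat64(uint8_t b) { return 0x0101010101010101ULL * b; }
constexpr uint32_t splat32(uint8_t b) { return 0x01010101U * b; }

// Truncation keeps only the low 32 bits. When the target reports truncation
// as free, the i32 store can reuse the register holding the i64 constant.
constexpr uint32_t trunc32(uint64_t v) { return static_cast<uint32_t>(v); }

// The truncated 64-bit pattern equals the 32-bit pattern, so no second
// constant needs to be materialized for the narrower store.
static_assert(trunc32(splat64(0xAB)) == splat32(0xAB),
              "truncated i64 splat matches the i32 splat");
```

This only shows the value equivalence; it says nothing about why the register allocator ends up rematerializing LUI more often.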

pr53645.s - The test passes a vector value from a load across basic blocks, and the vector gets scalarized. This increases the use count of the broken-down scalars, but the other basic block only wants 1 element. There's also a visitation order issue with urem-by-constant expansion interacting with SimplifyDemandedBits.

mode-dependent-address.s - We have an i32 load used by both a sext_inreg and an 'and' with 255. The 'and' was used to form a zextload, but that didn't remove the 'and', and it prevented the sext_inreg from forming a sextload. Seems related to isTruncateFree.

pr23135.s - DAGCombiner's ForwardStoreValueToDirectLoad needs to support sextload by creating sext_inreg.
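The identity behind that fix can be sketched as follows: a narrow store followed by a sign-extending load of the same location yields the stored value with a sext_inreg applied. This is a plain C++ model of the semantics, assuming an i32 store and an i64 result; sext_inreg32 and store_then_sextload are hypothetical names and do not reflect DAGCombiner's actual code.

```cpp
#include <cstdint>
#include <cstring>

// sext_inreg(v, i32): sign-extend the low 32 bits of v to 64 bits.
// (The unsigned-to-signed narrowing is modular on mainstream compilers
// and guaranteed since C++20.)
inline int64_t sext_inreg32(int64_t v) {
  return static_cast<int32_t>(static_cast<uint32_t>(v));
}

// Model of "truncating store of the low 32 bits, then sextload from the
// same address".
inline int64_t store_then_sextload(int64_t v) {
  unsigned char mem[4];
  uint32_t low = static_cast<uint32_t>(v);  // truncating store
  std::memcpy(mem, &low, sizeof low);
  int32_t loaded;                           // 32-bit load...
  std::memcpy(&loaded, mem, sizeof loaded);
  return loaded;                            // ...sign-extended to 64 bits
}
```

For any v, store_then_sextload(v) equals sext_inreg32(v), so the load can be replaced by a sext_inreg of the stored value without touching memory.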

loop-5.s - We need to both sext and zext a load. We used to use lwu and sext.w, now we use lw and slli+srli.
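A small C++ model of the two sequences, to show they compute the same pair of values and where the extra instruction comes from. The function names mirror the RV64 mnemonics but are hypothetical helpers operating on 64-bit register values, and the shift amount of 32 is my assumption for how slli+srli implements the zero extension.

```cpp
#include <cstdint>

// Models of the RV64 instructions named above.
constexpr uint64_t lwu(uint32_t mem) { return mem; }  // zero-extending load
constexpr uint64_t lw(uint32_t mem) {                 // sign-extending load
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(mem)));
}
constexpr uint64_t sext_w(uint64_t r) { return lw(static_cast<uint32_t>(r)); }
constexpr uint64_t slli(uint64_t r, unsigned sh) { return r << sh; }
constexpr uint64_t srli(uint64_t r, unsigned sh) { return r >> sh; }

struct Ext { uint64_t sext, zext; };

// Old sequence: lwu + sext.w (2 instructions).
constexpr Ext old_seq(uint32_t mem) { return {sext_w(lwu(mem)), lwu(mem)}; }

// New sequence: lw + slli + srli (3 instructions).
constexpr Ext new_seq(uint32_t mem) {
  return {lw(mem), srli(slli(lw(mem), 32), 32)};
}

static_assert(old_seq(0x80000001u).sext == new_seq(0x80000001u).sext,
              "both sequences agree on the sign-extended value");
static_assert(old_seq(0x80000001u).zext == new_seq(0x80000001u).zext,
              "both sequences agree on the zero-extended value");
```

Without a single-instruction zero extend from 32 bits, starting from the sign-extended load costs one more instruction than starting from the zero-extended one.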

I only have a good answer for how to fix pr23135.s.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130397/new/

https://reviews.llvm.org/D130397


