[PATCH] D35035: [InstCombine] Prevent memcpy generation for small data size

Mon Nov 26 10:41:26 PST 2018

spatel added a comment.

In D35035#1308250 <https://reviews.llvm.org/D35035#1308250>, @mehdi_amini wrote:

> In D35035#1253023 <https://reviews.llvm.org/D35035#1253023>, @spatel wrote:
>
> > That case (and a couple of similar tests that I tried) are handled by -sroa, so they probably never made it to instcombine in the 1st place. I don't know anything about SROA, but hopefully, it's making that transform using some principled logic. :)
>
>
> FYI: @chandlerc mentioned to me (in the context of https://bugs.llvm.org/show_bug.cgi?id=39780 ) that SROA expects instcombine to handle memcpy and promote them to registers.

Is it possible to improve SROA to understand llvm.memcpy instead? IIUC, the memcpy intrinsic is considered canonical form, and the consensus in this review was that we would not expand memcpy in instcombine if that did not cause regressions.

D54887 <https://reviews.llvm.org/D54887> is proposing a similar change as here, but it's more liberal. It would allow expansion to load/store of weird types (non-power-of-2) as long as they fit in a single integer register. That seems dangerous given that the backend isn't very good at handling those illegal types. Example:

  target datalayout = "e-p:64:64:64-p1:32:32:32-p2:16:16:16-n8:16:32:64"
  declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture, i32, i1) nounwind

  define void @copy_3_bytes(i8* %d, i8* %s) {
    call void @llvm.memcpy.p0i8.p0i8.i32(i8* %d, i8* %s, i32 3, i1 false)
    ret void
  }

  $ opt -instcombine -S memcpy.ll
  ...
  define void @copy_3_bytes(i8* %d, i8* %s) {
    %1 = bitcast i8* %s to i24*
    %2 = bitcast i8* %d to i24*
    %3 = load i24, i24* %1, align 1
    store i24 %3, i24* %2, align 1
    ret void
  }

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D35035/new/

https://reviews.llvm.org/D35035