[PATCH] D133668: [HLSL] Use _BitInt(16) for int16_t to avoid promote to int.

Chris Bieneman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sun Oct 9 13:24:09 PDT 2022


beanz added a comment.

Avoiding argument promotion is one part of what we need, but not all of it. For example, take this test shader:

  const RWBuffer<int16_t2> In;
  RWBuffer<int16_t> Out;
  
  [numthreads(1,1,1)]
  void main(uint GI : SV_GroupIndex) {
    Out[GI] = In[GI].x + In[GI].y;
  }

Following C's integer promotion rules, clang promotes the `short` arithmetic to `int` arithmetic, so the IR for `main` looks like:

  ; Function Attrs: noinline norecurse nounwind optnone
  define internal void @"?main@@YAXI@Z"(i32 noundef %GI) #2 {
  entry:
    %GI.addr = alloca i32, align 4
    store i32 %GI, ptr %GI.addr, align 4
    %0 = load i32, ptr %GI.addr, align 4
    %call = call noundef nonnull align 4 dereferenceable(4) ptr @"??A?$RWBuffer@T?$__vector@F$01@__clang@@@hlsl@@QBAAAT?$__vector@F$01@__clang@@I@Z"(ptr noundef nonnull align 4 dereferenceable(4) @In, i32 noundef %0)
    %1 = load <2 x i16>, ptr %call, align 4
    %2 = extractelement <2 x i16> %1, i32 0
    %conv = sext i16 %2 to i32
    %3 = load i32, ptr %GI.addr, align 4
    %call1 = call noundef nonnull align 4 dereferenceable(4) ptr @"??A?$RWBuffer@T?$__vector@F$01@__clang@@@hlsl@@QBAAAT?$__vector@F$01@__clang@@I@Z"(ptr noundef nonnull align 4 dereferenceable(4) @In, i32 noundef %3)
    %4 = load <2 x i16>, ptr %call1, align 4
    %5 = extractelement <2 x i16> %4, i32 1
    %conv2 = sext i16 %5 to i32
    %add = add nsw i32 %conv, %conv2
    %conv3 = trunc i32 %add to i16
    %6 = load i32, ptr %GI.addr, align 4
    %call4 = call noundef nonnull align 2 dereferenceable(2) ptr @"??A?$RWBuffer@F@hlsl@@QAAAAFI@Z"(ptr noundef nonnull align 4 dereferenceable(4) @"?Out@@3V?$RWBuffer@F@hlsl@@A", i32 noundef %6)
    store i16 %conv3, ptr %call4, align 2
    ret void
  }

Because of the implicit vector nature of HLSL, these promotions and truncations would be extremely expensive. Using `_BitInt` gives us a language-header-only solution that opts HLSL's `[u]int16_t` out of `Sema::UsualUnaryConversions`.
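
As a rough illustration of the difference (a sketch in plain C++, not the HLSL header from this patch), the snippet below checks the result type of an addition at compile time. The names `add_shorts` and `bitint16` are hypothetical and used only for this example. The `_BitInt` part is guarded because `_BitInt` in C++ is a clang extension, and the version check is an assumption about when that spelling became available.

  ```cpp
  #include <type_traits>
  #include <utility>

  // With ordinary `short`, both operands are promoted to `int` before the add,
  // so the expression `a + b` has type `int`.
  static_assert(std::is_same<decltype(std::declval<short>() + std::declval<short>()),
                             int>::value,
                "short + short is computed as int");

  // Hypothetical helper: the sum is computed in `int`, so 32767 + 1 does not wrap
  // until the result is explicitly narrowed back to 16 bits.
  int add_shorts(short a, short b) { return a + b; }

  #if defined(__clang__) && (__clang_major__ >= 14)
  // Hypothetical alias for illustration; assumes clang's `_BitInt` extension in C++.
  using bitint16 = _BitInt(16);

  // `_BitInt` operands are exempt from integer promotion, so the add stays 16 bits wide.
  static_assert(std::is_same<decltype(std::declval<bitint16>() + std::declval<bitint16>()),
                             bitint16>::value,
                "_BitInt(16) + _BitInt(16) stays _BitInt(16)");
  #endif
  ```

The first assertion corresponds to the `sext` / `add nsw i32` / `trunc` sequence in the IR above; the guarded one shows the behavior the `_BitInt`-based typedef relies on.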


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133668/new/

https://reviews.llvm.org/D133668


