[clang] [lld] [llvm] [WebAssembly] Enable nontrapping-fptoint and bulk-memory by default. (PR #112049)
Dan Gohman via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 16 08:08:10 PDT 2024
https://github.com/sunfishcode updated https://github.com/llvm/llvm-project/pull/112049
>From 7d55b35158ceb1a5d35ac62ecfe404f6a374e526 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Fri, 11 Oct 2024 13:31:13 -0700
Subject: [PATCH 1/8] [WebAssembly] Enable nontrapping-fptoint and bulk-memory
by default.
We were prepared to enable these features [back in February], but they
got pulled for what appear to be unrelated reasons. So let's have another
try at enabling them!
Another motivation here is that it'd be convenient for the
[Trail1 proposal] if "trail1" is a superset of "generic".
[back in February]: https://github.com/WebAssembly/tool-conventions/issues/158#issuecomment-1931119512
[Trail1 proposal]: https://github.com/llvm/llvm-project/pull/112035
---
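A minimal sketch of how downstream code might observe the new defaults. The two macro names are the ones checked in `clang/test/Preprocessor/wasm-target-features.c`; the constants and `describe_features` are hypothetical helpers introduced only for illustration.

```cpp
#include <string>

// `__wasm_bulk_memory__` and `__wasm_nontrapping_fptoint__` are the macros
// the updated preprocessor test expects for -mcpu=generic. The names below
// (kHasBulkMemory, kHasNontrappingFPToInt, describe_features) are
// hypothetical and not part of this patch.
#if defined(__wasm_bulk_memory__)
constexpr bool kHasBulkMemory = true;
#else
constexpr bool kHasBulkMemory = false;
#endif

#if defined(__wasm_nontrapping_fptoint__)
constexpr bool kHasNontrappingFPToInt = true;
#else
constexpr bool kHasNontrappingFPToInt = false;
#endif

// Reports which of the two features this translation unit was compiled with.
std::string describe_features() {
  std::string out = "bulk-memory=";
  out += kHasBulkMemory ? "on" : "off";
  out += " nontrapping-fptoint=";
  out += kHasNontrappingFPToInt ? "on" : "off";
  return out;
}
```

With this series applied, a wasm32 build using `-mcpu=generic` would be expected to report both as `on`. A non-WebAssembly host build reports `off` for both.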
clang/docs/ReleaseNotes.rst | 9 +++++++++
clang/lib/Basic/Targets/WebAssembly.cpp | 2 ++
llvm/docs/ReleaseNotes.md | 9 +++++++++
llvm/lib/Target/WebAssembly/WebAssembly.td | 3 ++-
4 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 36cf1d8a14db9f..60eeb7f72647cf 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -630,6 +630,15 @@ NetBSD Support
WebAssembly Support
^^^^^^^^^^^^^^^^^^^
+The default target CPU, "generic", now enables the `-mnontrapping-fptoint`
+and `-mbulk-memory` flags, which correspond to the [Bulk Memory Operations]
+and [Non-trapping float-to-int Conversions] language features, which are
+[widely implemented in engines].
+
+[Bulk Memory Operations]: https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md
+[Non-trapping float-to-int Conversions]: https://github.com/WebAssembly/spec/blob/master/proposals/nontrapping-float-to-int-conversion/Overview.md
+[widely implemented in engines]: https://webassembly.org/features/
+
AVR Support
^^^^^^^^^^^
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index 5ac9421663adea..1ac8f14f8cb175 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -151,8 +151,10 @@ bool WebAssemblyTargetInfo::initFeatureMap(
llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags, StringRef CPU,
const std::vector<std::string> &FeaturesVec) const {
auto addGenericFeatures = [&]() {
+ Features["bulk-memory"] = true;
Features["multivalue"] = true;
Features["mutable-globals"] = true;
+ Features["nontrapping-fptoint"] = true;
Features["reference-types"] = true;
Features["sign-ext"] = true;
};
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index 6e37852b4574ed..49218465cbc0dc 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -169,6 +169,15 @@ Changes to the RISC-V Backend
Changes to the WebAssembly Backend
----------------------------------
+The default target CPU, "generic", now enables the `-mnontrapping-fptoint`
+and `-mbulk-memory` flags, which correspond to the [Bulk Memory Operations]
+and [Non-trapping float-to-int Conversions] language features, which are
+[widely implemented in engines].
+
+[Bulk Memory Operations]: https://github.com/WebAssembly/bulk-memory-operations/blob/master/proposals/bulk-memory-operations/Overview.md
+[Non-trapping float-to-int Conversions]: https://github.com/WebAssembly/spec/blob/master/proposals/nontrapping-float-to-int-conversion/Overview.md
+[widely implemented in engines]: https://webassembly.org/features/
+
Changes to the Windows Target
-----------------------------
diff --git a/llvm/lib/Target/WebAssembly/WebAssembly.td b/llvm/lib/Target/WebAssembly/WebAssembly.td
index c632d4a74355d8..1e22707db23e91 100644
--- a/llvm/lib/Target/WebAssembly/WebAssembly.td
+++ b/llvm/lib/Target/WebAssembly/WebAssembly.td
@@ -110,7 +110,8 @@ def : ProcessorModel<"mvp", NoSchedModel, []>;
// consideration given to available support in relevant engines and tools, and
// the importance of the features.
def : ProcessorModel<"generic", NoSchedModel,
- [FeatureMultivalue, FeatureMutableGlobals,
+ [FeatureBulkMemory, FeatureMultivalue,
+ FeatureMutableGlobals, FeatureNontrappingFPToInt,
FeatureReferenceTypes, FeatureSignExt]>;
// Latest and greatest experimental version of WebAssembly. Bugs included!
>From 84dfea09fac4bedcd4d5e7780deb498e10190d76 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Fri, 11 Oct 2024 16:58:39 -0700
Subject: [PATCH 2/8] Update tests.
---
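For readers checking the new `target-features-cpus.ll` expectations: the `GENERIC-NEXT` lines spell out an entry count, then for each feature a `+` prefix byte (43), a name length, and the name. The sketch below reproduces that byte layout. `encode_target_features` is a hypothetical helper, and it assumes every count and length fits in a single byte, which holds for all values in this test.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper that mirrors the layout checked by the GENERIC-NEXT
// lines: <count> then, per feature, '+' (43), <name length>, <name bytes>.
// Assumption: counts and lengths are below 128, so one byte is enough here.
std::vector<std::uint8_t>
encode_target_features(const std::vector<std::string> &features) {
  assert(features.size() < 128);
  std::vector<std::uint8_t> out;
  out.push_back(static_cast<std::uint8_t>(features.size()));
  for (const std::string &name : features) {
    assert(name.size() < 128);
    out.push_back('+'); // ASCII 43, the "feature is used" prefix in the test
    out.push_back(static_cast<std::uint8_t>(name.size()));
    out.insert(out.end(), name.begin(), name.end());
  }
  return out;
}
```

Feeding it the six "generic" features in alphabetical order gives a leading count of 6 followed by `43, 11, "bulk-memory"`, which matches the updated checks.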
.../test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.ll | 10 +++++-----
llvm/test/CodeGen/WebAssembly/target-features-cpus.ll | 8 +++++++-
llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll | 4 ++--
llvm/test/MC/WebAssembly/libcall.ll | 2 +-
4 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.ll b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.ll
index cef92f459e4aa3..24a08267db6fbf 100644
--- a/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.ll
+++ b/llvm/test/CodeGen/WebAssembly/cfg-stackify-eh-legacy.ll
@@ -1,9 +1,9 @@
; REQUIRES: asserts
-; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling | FileCheck %s
-; RUN: llc < %s -disable-wasm-fallthrough-return-opt -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling
-; RUN: llc < %s -O0 -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -verify-machineinstrs -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling | FileCheck %s --check-prefix=NOOPT
-; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort -stats 2>&1 | FileCheck %s --check-prefix=NOSORT
-; RUN: llc < %s -disable-wasm-fallthrough-return-opt -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling -wasm-disable-ehpad-sort | FileCheck %s --check-prefix=NOSORT-LOCALS
+; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling,bulk-memory | FileCheck %s
+; RUN: llc < %s -disable-wasm-fallthrough-return-opt -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling,bulk-memory
+; RUN: llc < %s -O0 -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -verify-machineinstrs -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling,-bulk-memory | FileCheck %s --check-prefix=NOOPT
+; RUN: llc < %s -disable-wasm-fallthrough-return-opt -wasm-disable-explicit-locals -wasm-keep-registers -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling,-bulk-memory -wasm-disable-ehpad-sort -stats 2>&1 | FileCheck %s --check-prefix=NOSORT
+; RUN: llc < %s -disable-wasm-fallthrough-return-opt -disable-block-placement -verify-machineinstrs -fast-isel=false -machine-sink-split-probability-threshold=0 -cgp-freq-ratio-to-skip-merge=1000 -wasm-enable-eh -exception-model=wasm -mattr=+exception-handling,-bulk-memory -wasm-disable-ehpad-sort | FileCheck %s --check-prefix=NOSORT-LOCALS
target triple = "wasm32-unknown-unknown"
diff --git a/llvm/test/CodeGen/WebAssembly/target-features-cpus.ll b/llvm/test/CodeGen/WebAssembly/target-features-cpus.ll
index 77d1564409f78c..ba10dd94a9838d 100644
--- a/llvm/test/CodeGen/WebAssembly/target-features-cpus.ll
+++ b/llvm/test/CodeGen/WebAssembly/target-features-cpus.ll
@@ -13,7 +13,10 @@ target triple = "wasm32-unknown-unknown"
; generic: +multivalue, +mutable-globals, +reference-types, +sign-ext
; GENERIC-LABEL: .custom_section.target_features,"",@
-; GENERIC-NEXT: .int8 4
+; GENERIC-NEXT: .int8 6
+; GENERIC-NEXT: .int8 43
+; GENERIC-NEXT: .int8 11
+; GENERIC-NEXT: .ascii "bulk-memory"
; GENERIC-NEXT: .int8 43
; GENERIC-NEXT: .int8 10
; GENERIC-NEXT: .ascii "multivalue"
@@ -21,6 +24,9 @@ target triple = "wasm32-unknown-unknown"
; GENERIC-NEXT: .int8 15
; GENERIC-NEXT: .ascii "mutable-globals"
; GENERIC-NEXT: .int8 43
+; GENERIC-NEXT: .int8 19
+; GENERIC-NEXT: .ascii "nontrapping-fptoint"
+; GENERIC-NEXT: .int8 43
; GENERIC-NEXT: .int8 15
; GENERIC-NEXT: .ascii "reference-types"
; GENERIC-NEXT: .int8 43
diff --git a/llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll b/llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll
index 320b65356ba9f3..b321c0c82ad4d3 100644
--- a/llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll
+++ b/llvm/test/MC/WebAssembly/extern-functype-intrinsic.ll
@@ -1,5 +1,5 @@
-; RUN: llc %s -o - | FileCheck %s
-; RUN: llc %s -o - | llvm-mc -triple=wasm32-unknown-unknown | FileCheck %s
+; RUN: llc %s -mattr=-bulk-memory -o - | FileCheck %s
+; RUN: llc %s -mattr=-bulk-memory -o - | llvm-mc -triple=wasm32-unknown-unknown | FileCheck %s
; ModuleID = 'test.c'
source_filename = "test.c"
diff --git a/llvm/test/MC/WebAssembly/libcall.ll b/llvm/test/MC/WebAssembly/libcall.ll
index 8b81f150da892a..ffd32abe2345bc 100644
--- a/llvm/test/MC/WebAssembly/libcall.ll
+++ b/llvm/test/MC/WebAssembly/libcall.ll
@@ -1,4 +1,4 @@
-; RUN: llc -filetype=obj %s -o - | obj2yaml | FileCheck %s
+; RUN: llc -filetype=obj -mattr=-bulk-memory %s -o - | obj2yaml | FileCheck %s
target triple = "wasm32-unknown-unknown"
>From 26b8333e0dad6d8489a379d187483a8d1b86c3e2 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Mon, 14 Oct 2024 03:11:45 -0700
Subject: [PATCH 3/8] Update tests.
---
clang/test/Preprocessor/wasm-target-features.c | 4 ++--
lld/test/wasm/custom-section-name.ll | 2 +-
lld/test/wasm/data-segments.ll | 2 +-
lld/test/wasm/lto/libcall-archive.ll | 2 +-
lld/test/wasm/lto/stub-library-libcall.s | 2 +-
5 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/clang/test/Preprocessor/wasm-target-features.c b/clang/test/Preprocessor/wasm-target-features.c
index c64d3a0aa22825..f1b13c03757ee3 100644
--- a/clang/test/Preprocessor/wasm-target-features.c
+++ b/clang/test/Preprocessor/wasm-target-features.c
@@ -166,6 +166,8 @@
// GENERIC-INCLUDE-DAG: #define __wasm_mutable_globals__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_reference_types__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_sign_ext__ 1{{$}}
+// GENERIC-INCLUDE-DAG: #define __wasm_nontrapping_fptoint__ 1{{$}}
+// GENERIC-INCLUDE-DAG: #define __wasm_bulk_memory__ 1{{$}}
//
// RUN: %clang -E -dM %s -o - 2>&1 \
// RUN: -target wasm32-unknown-unknown -mcpu=generic \
@@ -175,12 +177,10 @@
// RUN: | FileCheck %s -check-prefix=GENERIC
//
// GENERIC-NOT: #define __wasm_atomics__ 1{{$}}
-// GENERIC-NOT: #define __wasm_bulk_memory__ 1{{$}}
// GENERIC-NOT: #define __wasm_exception_handling__ 1{{$}}
// GENERIC-NOT: #define __wasm_extended_const__ 1{{$}}
// GENERIC-NOT: #define __wasm__fp16__ 1{{$}}
// GENERIC-NOT: #define __wasm_multimemory__ 1{{$}}
-// GENERIC-NOT: #define __wasm_nontrapping_fptoint__ 1{{$}}
// GENERIC-NOT: #define __wasm_relaxed_simd__ 1{{$}}
// GENERIC-NOT: #define __wasm_simd128__ 1{{$}}
// GENERIC-NOT: #define __wasm_tail_call__ 1{{$}}
diff --git a/lld/test/wasm/custom-section-name.ll b/lld/test/wasm/custom-section-name.ll
index b860ef5a83e836..8799fbf36056d1 100644
--- a/lld/test/wasm/custom-section-name.ll
+++ b/lld/test/wasm/custom-section-name.ll
@@ -1,4 +1,4 @@
-; RUN: llc -filetype=obj %s -o %t.o
+; RUN: llc -filetype=obj -mattr=-bulk-memory %s -o %t.o
; RUN: wasm-ld -no-gc-sections --no-entry -o %t.wasm %t.o
; RUN: obj2yaml %t.wasm | FileCheck %s --check-prefixes=CHECK,NO-BSS
; RUN: wasm-ld -no-gc-sections --no-entry --import-memory -o %t.bss.wasm %t.o
diff --git a/lld/test/wasm/data-segments.ll b/lld/test/wasm/data-segments.ll
index 670ac3c1f373fa..41868a0b2b50b6 100644
--- a/lld/test/wasm/data-segments.ll
+++ b/lld/test/wasm/data-segments.ll
@@ -1,4 +1,4 @@
-; RUN: llc --mtriple=wasm32-unknown-unknown -filetype=obj %s -o %t.atomics.o -mattr=+atomics
+; RUN: llc --mtriple=wasm32-unknown-unknown -filetype=obj %s -o %t.atomics.o -mattr=+atomics,-bulk-memory
; RUN: llc --mtriple=wasm32-unknown-unknown -filetype=obj %s -o %t.bulk-mem.o -mattr=+bulk-memory
; RUN: llc --mtriple=wasm64-unknown-unknown -filetype=obj %s -o %t.bulk-mem64.o -mattr=+bulk-memory
; RUN: llc --mtriple=wasm32-unknown-unknown -filetype=obj %s -o %t.atomics.bulk-mem.o -mattr=+atomics,+bulk-memory
diff --git a/lld/test/wasm/lto/libcall-archive.ll b/lld/test/wasm/lto/libcall-archive.ll
index 365ce180f1441e..078faaa782dd93 100644
--- a/lld/test/wasm/lto/libcall-archive.ll
+++ b/lld/test/wasm/lto/libcall-archive.ll
@@ -2,7 +2,7 @@
; RUN: llvm-as -o %t.o %s
; RUN: llvm-as -o %t2.o %S/Inputs/libcall-archive.ll
; RUN: llvm-ar rcs %t.a %t2.o
-; RUN: wasm-ld -o %t %t.o %t.a
+; RUN: wasm-ld -mllvm -mattr=-bulk-memory -o %t %t.o %t.a
; RUN: obj2yaml %t | FileCheck %s
target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128"
diff --git a/lld/test/wasm/lto/stub-library-libcall.s b/lld/test/wasm/lto/stub-library-libcall.s
index ce88a32dd99dc7..3ae4b1ec0a5c2c 100644
--- a/lld/test/wasm/lto/stub-library-libcall.s
+++ b/lld/test/wasm/lto/stub-library-libcall.s
@@ -2,7 +2,7 @@
# RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown -o %t_main.o %t/main.s
# RUN: llvm-as %S/Inputs/foo.ll -o %t_foo.o
# RUN: llvm-as %S/Inputs/libcall.ll -o %t_libcall.o
-# RUN: wasm-ld %t_main.o %t_libcall.o %t_foo.o %p/Inputs/stub.so -o %t.wasm
+# RUN: wasm-ld -mllvm -mattr=-bulk-memory %t_main.o %t_libcall.o %t_foo.o %p/Inputs/stub.so -o %t.wasm
# RUN: obj2yaml %t.wasm | FileCheck %s
# The function `func_with_libcall` will generate an undefined reference to
>From 4283395bb5e3ec8ca83ee453beb4eba76c1dc269 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Mon, 14 Oct 2024 04:11:51 -0700
Subject: [PATCH 4/8] Update tests.
---
lld/test/wasm/lto/stub-library-libcall.s | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lld/test/wasm/lto/stub-library-libcall.s b/lld/test/wasm/lto/stub-library-libcall.s
index 3ae4b1ec0a5c2c..d65983c0cf5bf5 100644
--- a/lld/test/wasm/lto/stub-library-libcall.s
+++ b/lld/test/wasm/lto/stub-library-libcall.s
@@ -12,7 +12,7 @@
# If %t_foo.o is not included in the link we get an undefined symbol reported
# to the dependency of memcpy on the foo export:
-# RUN: not wasm-ld %t_main.o %t_libcall.o %p/Inputs/stub.so -o %t.wasm 2>&1 | FileCheck --check-prefix=MISSING %s
+# RUN: not wasm-ld -mllvm -mattr=-bulk-memory %t_main.o %t_libcall.o %p/Inputs/stub.so -o %t.wasm 2>&1 | FileCheck --check-prefix=MISSING %s
# MISSING: stub.so: undefined symbol: foo. Required by memcpy
#--- main.s
>From b4f8335ff4d58b46206247a231fd6caa73be8604 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Tue, 15 Oct 2024 16:48:12 -0700
Subject: [PATCH 5/8] Move `-bulk-memory` flags from wasm-ld flags to
function attributes.
---
lld/test/wasm/lto/Inputs/libcall-archive.ll | 4 +++-
lld/test/wasm/lto/libcall-archive.ll | 6 ++++--
2 files changed, 7 insertions(+), 3 deletions(-)
diff --git a/lld/test/wasm/lto/Inputs/libcall-archive.ll b/lld/test/wasm/lto/Inputs/libcall-archive.ll
index def1452bdf35cf..0317415d2faae4 100644
--- a/lld/test/wasm/lto/Inputs/libcall-archive.ll
+++ b/lld/test/wasm/lto/Inputs/libcall-archive.ll
@@ -1,6 +1,8 @@
target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"
-define void @memcpy() {
+define void @memcpy() #0 {
ret void
}
+
+attributes #0 = { "target-features"="-bulk-memory" }
diff --git a/lld/test/wasm/lto/libcall-archive.ll b/lld/test/wasm/lto/libcall-archive.ll
index 078faaa782dd93..4bfe7cf7471ef3 100644
--- a/lld/test/wasm/lto/libcall-archive.ll
+++ b/lld/test/wasm/lto/libcall-archive.ll
@@ -2,13 +2,13 @@
; RUN: llvm-as -o %t.o %s
; RUN: llvm-as -o %t2.o %S/Inputs/libcall-archive.ll
; RUN: llvm-ar rcs %t.a %t2.o
-; RUN: wasm-ld -mllvm -mattr=-bulk-memory -o %t %t.o %t.a
+; RUN: wasm-ld -o %t %t.o %t.a
; RUN: obj2yaml %t | FileCheck %s
target datalayout = "e-m:e-p:32:32-p10:8:8-p20:8:8-i64:64-n32:64-S128"
target triple = "wasm32-unknown-unknown"
-define void @_start(ptr %a, ptr %b) {
+define void @_start(ptr %a, ptr %b) #0 {
entry:
call void @llvm.memcpy.p0.p0.i64(ptr %a, ptr %b, i64 1024, i1 false)
ret void
@@ -16,6 +16,8 @@ entry:
declare void @llvm.memcpy.p0.p0.i64(ptr nocapture, ptr nocapture, i64, i1)
+attributes #0 = { "target-features"="-bulk-memory" }
+
; CHECK: - Type: CUSTOM
; CHECK-NEXT: Name: name
; CHECK-NEXT: FunctionNames:
>From 11b2240d5934800314ab80ce5d88c9133e23ec1f Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Tue, 15 Oct 2024 17:12:34 -0700
Subject: [PATCH 6/8] Delete redundant features.
---
clang/lib/Basic/Targets/WebAssembly.cpp | 2 --
1 file changed, 2 deletions(-)
diff --git a/clang/lib/Basic/Targets/WebAssembly.cpp b/clang/lib/Basic/Targets/WebAssembly.cpp
index 1ac8f14f8cb175..d5347079f3fece 100644
--- a/clang/lib/Basic/Targets/WebAssembly.cpp
+++ b/clang/lib/Basic/Targets/WebAssembly.cpp
@@ -161,12 +161,10 @@ bool WebAssemblyTargetInfo::initFeatureMap(
auto addBleedingEdgeFeatures = [&]() {
addGenericFeatures();
Features["atomics"] = true;
- Features["bulk-memory"] = true;
Features["exception-handling"] = true;
Features["extended-const"] = true;
Features["fp16"] = true;
Features["multimemory"] = true;
- Features["nontrapping-fptoint"] = true;
Features["tail-call"] = true;
setSIMDLevel(Features, RelaxedSIMD, true);
};
>From 881fd98dc11cecd1126acef71863a3505eb2c6d2 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Tue, 15 Oct 2024 17:13:56 -0700
Subject: [PATCH 7/8] Sort the feature macros alphabetically.
---
clang/test/Preprocessor/wasm-target-features.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/clang/test/Preprocessor/wasm-target-features.c b/clang/test/Preprocessor/wasm-target-features.c
index f1b13c03757ee3..73b16964184104 100644
--- a/clang/test/Preprocessor/wasm-target-features.c
+++ b/clang/test/Preprocessor/wasm-target-features.c
@@ -162,12 +162,12 @@
// RUN: -target wasm64-unknown-unknown -mcpu=generic \
// RUN: | FileCheck %s -check-prefix=GENERIC-INCLUDE
//
+// GENERIC-INCLUDE-DAG: #define __wasm_bulk_memory__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_multivalue__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_mutable_globals__ 1{{$}}
+// GENERIC-INCLUDE-DAG: #define __wasm_nontrapping_fptoint__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_reference_types__ 1{{$}}
// GENERIC-INCLUDE-DAG: #define __wasm_sign_ext__ 1{{$}}
-// GENERIC-INCLUDE-DAG: #define __wasm_nontrapping_fptoint__ 1{{$}}
-// GENERIC-INCLUDE-DAG: #define __wasm_bulk_memory__ 1{{$}}
//
// RUN: %clang -E -dM %s -o - 2>&1 \
// RUN: -target wasm32-unknown-unknown -mcpu=generic \
>From 23fec587e7ba2c6bb34aa0a469bc17f9d4cfbb87 Mon Sep 17 00:00:00 2001
From: Dan Gohman <dev at sunfishcode.online>
Date: Wed, 16 Oct 2024 08:07:38 -0700
Subject: [PATCH 8/8] Protect memory.fill and memory.copy from zero-length
ranges.
---
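A small host-side model of the behavior this patch guards against, written as a sketch rather than as the lowering itself. `LinearMemory`, `memory_copy`, and `guarded_memcpy` are hypothetical names. The model assumes `memory.copy` checks bounds before moving any byte, so an out-of-range pointer traps even when the length is zero, whereas LLVM IR `memcpy` with a zero length must be a no-op.

```cpp
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical model of a wasm linear memory; not part of the patch.
struct LinearMemory {
  std::vector<std::uint8_t> bytes;

  // Models `memory.copy`: the bounds check runs first, so an out-of-range
  // pointer "traps" (throws here) even when len == 0.
  void memory_copy(std::uint64_t dst, std::uint64_t src, std::uint64_t len) {
    const std::uint64_t size = bytes.size();
    if (dst > size || src > size || len > size - dst || len > size - src)
      throw std::out_of_range("trap: out of bounds memory access");
    if (len != 0)
      std::memmove(bytes.data() + dst, bytes.data() + src,
                   static_cast<std::size_t>(len));
  }

  // Models the lowered `MEMCPY` pseudo: test the length and branch around
  // the copy when it is zero (the eqz + br_if pair in the updated tests).
  void guarded_memcpy(std::uint64_t dst, std::uint64_t src, std::uint64_t len) {
    if (len == 0)
      return;
    memory_copy(dst, src, len);
  }
};
```

In this model, `memory_copy(100, 0, 0)` on a 16-byte memory throws, while `guarded_memcpy(100, 0, 0)` returns without touching memory. That is the difference the CFG triangle is meant to provide.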
.../lib/Target/WebAssembly/WebAssemblyISD.def | 4 +
.../WebAssembly/WebAssemblyISelLowering.cpp | 124 +++++++++++++++
.../WebAssembly/WebAssemblyInstrBulkMemory.td | 99 ++++++++++--
.../WebAssemblySelectionDAGInfo.cpp | 13 +-
llvm/test/CodeGen/WebAssembly/bulk-memory.ll | 102 ++++++++++---
.../test/CodeGen/WebAssembly/bulk-memory64.ll | 141 +++++++++++++-----
6 files changed, 415 insertions(+), 68 deletions(-)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
index b8954f4693f0a0..149f0cd70262bb 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISD.def
@@ -50,3 +50,7 @@ HANDLE_MEM_NODETYPE(GLOBAL_GET)
HANDLE_MEM_NODETYPE(GLOBAL_SET)
HANDLE_MEM_NODETYPE(TABLE_GET)
HANDLE_MEM_NODETYPE(TABLE_SET)
+
+// Bulk memory instructions that require branching to handle empty ranges.
+HANDLE_NODETYPE(MEMCPY)
+HANDLE_NODETYPE(MEMSET)
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
index 5f76d666823e28..e7e84ed6b54313 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp
@@ -561,6 +561,122 @@ static MachineBasicBlock *LowerFPToInt(MachineInstr &MI, DebugLoc DL,
return DoneMBB;
}
+// Lower a `MEMCPY` instruction into a CFG triangle around a `MEMORY_COPY`
+// instruction to handle the zero-length case.
+static MachineBasicBlock *LowerMemcpy(MachineInstr &MI, DebugLoc DL,
+ MachineBasicBlock *BB,
+ const TargetInstrInfo &TII,
+ bool Int64) {
+ MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+ MachineOperand DstMem = MI.getOperand(0);
+ MachineOperand SrcMem = MI.getOperand(1);
+ MachineOperand Dst = MI.getOperand(2);
+ MachineOperand Src = MI.getOperand(3);
+ MachineOperand Len = MI.getOperand(4);
+
+ MachineOperand NoKillLen = Len;
+ NoKillLen.setIsKill(false);
+
+ unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+
+ const BasicBlock *LLVMBB = BB->getBasicBlock();
+ MachineFunction *F = BB->getParent();
+ MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+ MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+ unsigned MemoryCopy = Int64 ? WebAssembly::MEMORY_COPY_A64 : WebAssembly::MEMORY_COPY_A32;
+
+ MachineFunction::iterator It = ++BB->getIterator();
+ F->insert(It, TrueMBB);
+ F->insert(It, DoneMBB);
+
+ // Transfer the remainder of BB and its successor edges to DoneMBB.
+ DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+ DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+ BB->addSuccessor(TrueMBB);
+ BB->addSuccessor(DoneMBB);
+ TrueMBB->addSuccessor(DoneMBB);
+
+ unsigned EqzReg;
+ EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+ MI.eraseFromParent();
+
+ BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+ BuildMI(TrueMBB, DL, TII.get(MemoryCopy))
+ .add(DstMem)
+ .add(SrcMem)
+ .add(Dst)
+ .add(Src)
+ .add(Len);
+
+ // Create the CFG triangle.
+ BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+ BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+ return DoneMBB;
+}
+
+// Lower a `MEMSET` instruction into a CFG triangle around a `MEMORY_FILL`
+// instruction to handle the zero-length case.
+static MachineBasicBlock *LowerMemset(MachineInstr &MI, DebugLoc DL,
+ MachineBasicBlock *BB,
+ const TargetInstrInfo &TII,
+ bool Int64) {
+ MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
+
+ MachineOperand Mem = MI.getOperand(0);
+ MachineOperand Dst = MI.getOperand(1);
+ MachineOperand Val = MI.getOperand(2);
+ MachineOperand Len = MI.getOperand(3);
+
+ MachineOperand NoKillLen = Len;
+ NoKillLen.setIsKill(false);
+
+ unsigned Eqz = Int64 ? WebAssembly::EQZ_I64 : WebAssembly::EQZ_I32;
+
+ const BasicBlock *LLVMBB = BB->getBasicBlock();
+ MachineFunction *F = BB->getParent();
+ MachineBasicBlock *TrueMBB = F->CreateMachineBasicBlock(LLVMBB);
+ MachineBasicBlock *DoneMBB = F->CreateMachineBasicBlock(LLVMBB);
+
+ unsigned MemoryFill = Int64 ? WebAssembly::MEMORY_FILL_A64 : WebAssembly::MEMORY_FILL_A32;
+
+ MachineFunction::iterator It = ++BB->getIterator();
+ F->insert(It, TrueMBB);
+ F->insert(It, DoneMBB);
+
+ // Transfer the remainder of BB and its successor edges to DoneMBB.
+ DoneMBB->splice(DoneMBB->begin(), BB, std::next(MI.getIterator()), BB->end());
+ DoneMBB->transferSuccessorsAndUpdatePHIs(BB);
+
+ BB->addSuccessor(TrueMBB);
+ BB->addSuccessor(DoneMBB);
+ TrueMBB->addSuccessor(DoneMBB);
+
+ unsigned EqzReg;
+ EqzReg = MRI.createVirtualRegister(&WebAssembly::I32RegClass);
+
+ MI.eraseFromParent();
+
+ BuildMI(BB, DL, TII.get(Eqz), EqzReg).add(NoKillLen);
+
+ BuildMI(TrueMBB, DL, TII.get(MemoryFill))
+ .add(Mem)
+ .add(Dst)
+ .add(Val)
+ .add(Len);
+
+ // Create the CFG triangle.
+ BuildMI(BB, DL, TII.get(WebAssembly::BR_IF)).addMBB(DoneMBB).addReg(EqzReg);
+ BuildMI(TrueMBB, DL, TII.get(WebAssembly::BR)).addMBB(DoneMBB);
+
+ return DoneMBB;
+}
+
static MachineBasicBlock *
LowerCallResults(MachineInstr &CallResults, DebugLoc DL, MachineBasicBlock *BB,
const WebAssemblySubtarget *Subtarget,
@@ -718,6 +834,14 @@ MachineBasicBlock *WebAssemblyTargetLowering::EmitInstrWithCustomInserter(
case WebAssembly::FP_TO_UINT_I64_F64:
return LowerFPToInt(MI, DL, BB, TII, true, true, true,
WebAssembly::I64_TRUNC_U_F64);
+ case WebAssembly::MEMCPY_A32:
+ return LowerMemcpy(MI, DL, BB, TII, false);
+ case WebAssembly::MEMCPY_A64:
+ return LowerMemcpy(MI, DL, BB, TII, true);
+ case WebAssembly::MEMSET_A32:
+ return LowerMemset(MI, DL, BB, TII, false);
+ case WebAssembly::MEMSET_A64:
+ return LowerMemset(MI, DL, BB, TII, true);
case WebAssembly::CALL_RESULTS:
case WebAssembly::RET_CALL_RESULTS:
return LowerCallResults(MI, DL, BB, Subtarget, TII);
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
index 7aeae54d95a8c9..de79f2d44cd328 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
+++ b/llvm/lib/Target/WebAssembly/WebAssemblyInstrBulkMemory.td
@@ -21,16 +21,33 @@ multiclass BULK_I<dag oops_r, dag iops_r, dag oops_s, dag iops_s,
}
// Bespoke types and nodes for bulk memory ops
+
+// memory.copy (may trap on empty ranges)
+def wasm_memory_copy_t : SDTypeProfile<0, 5,
+ [SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
+>;
+def wasm_memory_copy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memory_copy_t,
+ [SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
+
+// memory.copy with a branch to avoid trapping
def wasm_memcpy_t : SDTypeProfile<0, 5,
[SDTCisInt<0>, SDTCisInt<1>, SDTCisPtrTy<2>, SDTCisPtrTy<3>, SDTCisInt<4>]
>;
-def wasm_memcpy : SDNode<"WebAssemblyISD::MEMORY_COPY", wasm_memcpy_t,
+def wasm_memcpy : SDNode<"WebAssemblyISD::MEMCPY", wasm_memcpy_t,
[SDNPHasChain, SDNPMayLoad, SDNPMayStore]>;
+// memory.fill (may trap on empty ranges)
+def wasm_memory_fill_t : SDTypeProfile<0, 4,
+ [SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
+>;
+def wasm_memory_fill : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memory_fill_t,
+ [SDNPHasChain, SDNPMayStore]>;
+
+// memory.fill with a branch to avoid trapping
def wasm_memset_t : SDTypeProfile<0, 4,
[SDTCisInt<0>, SDTCisPtrTy<1>, SDTCisInt<2>, SDTCisInt<3>]
>;
-def wasm_memset : SDNode<"WebAssemblyISD::MEMORY_FILL", wasm_memset_t,
+def wasm_memset : SDNode<"WebAssemblyISD::MEMSET", wasm_memset_t,
[SDNPHasChain, SDNPMayStore]>;
multiclass BulkMemoryOps<WebAssemblyRegClass rc, string B> {
@@ -51,25 +68,83 @@ defm DATA_DROP :
[],
"data.drop\t$seg", "data.drop\t$seg", 0x09>;
+}
+
+defm : BulkMemoryOps<I32, "32">;
+defm : BulkMemoryOps<I64, "64">;
+
+// Define copy/fill manually instead of using the `BulkMemoryOps` multiclass
+// because when a multiclass defines opcodes, it gives them anonymous names
+// and we need opcodes with names so that we can handle them with custom code.
+
let mayLoad = 1, mayStore = 1 in
-defm MEMORY_COPY_A#B :
+defm MEMORY_COPY_A32 :
BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
- rc:$dst, rc:$src, rc:$len),
+ I32:$dst, I32:$src, I32:$len),
(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
- [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
- rc:$dst, rc:$src, rc:$len
+ [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I32:$dst, I32:$src, I32:$len
)],
"memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
"memory.copy\t$src_idx, $dst_idx", 0x0a>;
let mayStore = 1 in
-defm MEMORY_FILL_A#B :
- BULK_I<(outs), (ins i32imm_op:$idx, rc:$dst, I32:$value, rc:$size),
+defm MEMORY_FILL_A32 :
+ BULK_I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
(outs), (ins i32imm_op:$idx),
- [(wasm_memset (i32 imm:$idx), rc:$dst, I32:$value, rc:$size)],
+ [(wasm_memory_fill (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
"memory.fill\t$idx, $dst, $value, $size",
"memory.fill\t$idx", 0x0b>;
-}
-defm : BulkMemoryOps<I32, "32">;
-defm : BulkMemoryOps<I64, "64">;
+let mayLoad = 1, mayStore = 1 in
+defm MEMORY_COPY_A64 :
+ BULK_I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I64:$dst, I64:$src, I64:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memory_copy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I64:$dst, I64:$src, I64:$len
+ )],
+ "memory.copy\t$src_idx, $dst_idx, $dst, $src, $len",
+ "memory.copy\t$src_idx, $dst_idx", 0x0a>;
+
+let mayStore = 1 in
+defm MEMORY_FILL_A64 :
+ BULK_I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memory_fill (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+ "memory.fill\t$idx, $dst, $value, $size",
+ "memory.fill\t$idx", 0x0b>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A32 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I32:$dst, I32:$src, I32:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I32:$dst, I32:$src, I32:$len
+ )],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A32 : I<(outs), (ins i32imm_op:$idx, I32:$dst, I32:$value, I32:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memset (i32 imm:$idx), I32:$dst, I32:$value, I32:$size)],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayLoad = 1, mayStore = 1 in
+defm MEMCPY_A64 : I<(outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx,
+ I64:$dst, I64:$src, I64:$len),
+ (outs), (ins i32imm_op:$src_idx, i32imm_op:$dst_idx),
+ [(wasm_memcpy (i32 imm:$src_idx), (i32 imm:$dst_idx),
+ I64:$dst, I64:$src, I64:$len
+ )],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
+
+let usesCustomInserter = 1, isCodeGenOnly = 1, mayStore = 1 in
+defm MEMSET_A64 : I<(outs), (ins i32imm_op:$idx, I64:$dst, I32:$value, I64:$size),
+ (outs), (ins i32imm_op:$idx),
+ [(wasm_memset (i32 imm:$idx), I64:$dst, I32:$value, I64:$size)],
+ "", "", 0>,
+ Requires<[HasBulkMemory]>;
diff --git a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
index 74af4c8873f735..48119de1216576 100644
--- a/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
+++ b/llvm/lib/Target/WebAssembly/WebAssemblySelectionDAGInfo.cpp
@@ -28,7 +28,11 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemcpy(
SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
- return DAG.getNode(WebAssemblyISD::MEMORY_COPY, DL, MVT::Other,
+
+ // Use `MEMCPY` here instead of `MEMORY_COPY` because `memory.copy` traps
+ // if the pointers are invalid even if the length is zero. `MEMCPY` gets
+ // extra code to handle this in the way that LLVM IR expects.
+ return DAG.getNode(WebAssemblyISD::MEMCPY, DL, MVT::Other,
{Chain, MemIdx, MemIdx, Dst, Src,
DAG.getZExtOrTrunc(Size, DL, LenMVT)});
}
@@ -52,8 +56,13 @@ SDValue WebAssemblySelectionDAGInfo::EmitTargetCodeForMemset(
SDValue MemIdx = DAG.getConstant(0, DL, MVT::i32);
auto LenMVT = ST.hasAddr64() ? MVT::i64 : MVT::i32;
+
+ // Use `MEMSET` here instead of `MEMORY_FILL` because `memory.fill` traps
+ // if the pointers are invalid even if the length is zero. `MEMSET` gets
+ // extra code to handle this in the way that LLVM IR expects.
+ //
// Only low byte matters for val argument, so anyext the i8
- return DAG.getNode(WebAssemblyISD::MEMORY_FILL, DL, MVT::Other, Chain, MemIdx,
+ return DAG.getNode(WebAssemblyISD::MEMSET, DL, MVT::Other, Chain, MemIdx,
Dst, DAG.getAnyExtOrTrunc(Val, DL, MVT::i32),
DAG.getZExtOrTrunc(Size, DL, LenMVT));
}
diff --git a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
index dc29dc81c13ec2..ae170d757a305a 100644
--- a/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
+++ b/llvm/test/CodeGen/WebAssembly/bulk-memory.ll
@@ -17,7 +17,12 @@ declare void @llvm.memset.p0.i32(ptr, i8, i32, i1)
; CHECK-LABEL: memcpy_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memcpy.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -27,7 +32,12 @@ define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memmove_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memmove.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -37,7 +47,12 @@ define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memset_i8:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i8 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
call void @llvm.memset.p0.i8(ptr %dest, i8 %val, i8 %len, i1 0)
@@ -47,7 +62,12 @@ define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
; CHECK-LABEL: memcpy_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -57,7 +77,12 @@ define void @memcpy_i32(ptr %dest, ptr %src, i32 %len) {
; CHECK-LABEL: memmove_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 %len, i1 0)
@@ -67,7 +92,12 @@ define void @memmove_i32(ptr %dest, ptr %src, i32 %len) {
; CHECK-LABEL: memset_i32:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i32 (i32, i32, i32) -> ()
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i32(ptr %dest, i8 %val, i32 %len) {
call void @llvm.memset.p0.i32(ptr %dest, i8 %val, i32 %len, i1 0)
@@ -107,8 +137,14 @@ define void @memset_1(ptr %dest, i8 %val) {
; CHECK-LABEL: memcpy_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_1024(ptr %dest, ptr %src) {
call void @llvm.memcpy.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -118,8 +154,14 @@ define void @memcpy_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memmove_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_1024(ptr %dest, ptr %src) {
call void @llvm.memmove.p0.p0.i32(ptr %dest, ptr %src, i32 1024, i1 0)
@@ -129,8 +171,14 @@ define void @memmove_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memset_1024:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_1024 (i32, i32) -> ()
+; BULK-MEM-NEXT: block
; BULK-MEM-NEXT: i32.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: i32.eqz $push[[L1:[0-9]+]]=, $pop[[L0]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L1]]
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L2]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_1024(ptr %dest, i8 %val) {
call void @llvm.memset.p0.i32(ptr %dest, i8 %val, i32 1024, i1 0)
@@ -153,11 +201,17 @@ define void @memset_1024(ptr %dest, i8 %val) {
; BULK-MEM-NEXT: .functype memcpy_alloca_src (i32) -> ()
; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
; BULK-MEM-NEXT: i32.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i32.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i32.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i32.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i32.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $pop[[L4]], $pop[[L5]]
+; BULK-MEM-NEXT: i32.sub $[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.const $push[[L3:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i32.eqz $push[[L4:[0-9]+]]=, $pop[[L3]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L4]]
+; BULK-MEM-NEXT: i32.const $push[[L5:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i32.add $push[[L6:[0-9]+]]=, $[[L2]], $pop[[L5]]
+; BULK-MEM-NEXT: i32.const $push[[L7:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $pop[[L6]], $pop[[L7]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_alloca_src(ptr %dst) {
%a = alloca [100 x i8]
@@ -170,11 +224,17 @@ define void @memcpy_alloca_src(ptr %dst) {
; BULK-MEM-NEXT: .functype memcpy_alloca_dst (i32) -> ()
; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
; BULK-MEM-NEXT: i32.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i32.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i32.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i32.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i32.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.copy 0, 0, $pop[[L4]], $0, $pop[[L5]]
+; BULK-MEM-NEXT: i32.sub $[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.const $push[[L3:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i32.eqz $push[[L4:[0-9]+]]=, $pop[[L3]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L4]]
+; BULK-MEM-NEXT: i32.const $push[[L5:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i32.add $push[[L6:[0-9]+]]=, $[[L2]], $pop[[L5]]
+; BULK-MEM-NEXT: i32.const $push[[L7:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.copy 0, 0, $pop[[L6]], $0, $pop[[L7]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_alloca_dst(ptr %src) {
%a = alloca [100 x i8]
@@ -187,11 +247,17 @@ define void @memcpy_alloca_dst(ptr %src) {
; BULK-MEM-NEXT: .functype memset_alloca (i32) -> ()
; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
; BULK-MEM-NEXT: i32.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i32.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i32.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i32.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i32.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.fill 0, $pop[[L4]], $0, $pop[[L5]]
+; BULK-MEM-NEXT: i32.sub $1=, $pop[[L0]], $pop[[L1]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i32.const $push[[L2:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i32.eqz $push[[L3:[0-9]+]]=, $pop[[L2]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L3]]
+; BULK-MEM-NEXT: i32.const $push[[L4:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i32.add $push[[L5:[0-9]+]]=, $1, $pop[[L4]]
+; BULK-MEM-NEXT: i32.const $push[[L6:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.fill 0, $pop[[L5]], $0, $pop[[L6]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_alloca(i8 %val) {
%a = alloca [100 x i8]
diff --git a/llvm/test/CodeGen/WebAssembly/bulk-memory64.ll b/llvm/test/CodeGen/WebAssembly/bulk-memory64.ll
index 8ee5f6314381cd..0cf8493a995f96 100644
--- a/llvm/test/CodeGen/WebAssembly/bulk-memory64.ll
+++ b/llvm/test/CodeGen/WebAssembly/bulk-memory64.ll
@@ -17,8 +17,14 @@ declare void @llvm.memset.p0.i64(ptr, i8, i64, i1)
; CHECK-LABEL: memcpy_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i8 (i64, i64, i32) -> ()
-; BULK-MEM-NEXT: i64.extend_i32_u $push0=, $2
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop0
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.extend_i32_u $push[[L0:[0-9]+]]=, $2
+; BULK-MEM-NEXT: local.tee $push[[L1:[0-9]+]]=, $3=, $pop[[L0]]
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $3
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memcpy.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -28,8 +34,14 @@ define void @memcpy_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memmove_i8:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i8 (i64, i64, i32) -> ()
-; BULK-MEM-NEXT: i64.extend_i32_u $push0=, $2
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop0
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.extend_i32_u $push[[L0:[0-9]+]]=, $2
+; BULK-MEM-NEXT: local.tee $push[[L1:[0-9]+]]=, $3=, $pop[[L0]]
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $3
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
call void @llvm.memmove.p0.p0.i8(ptr %dest, ptr %src, i8 %len, i1 0)
@@ -39,8 +51,14 @@ define void @memmove_i8(ptr %dest, ptr %src, i8 zeroext %len) {
; CHECK-LABEL: memset_i8:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i8 (i64, i32, i32) -> ()
-; BULK-MEM-NEXT: i64.extend_i32_u $push0=, $2
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop0
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.extend_i32_u $push[[L0:[0-9]+]]=, $2
+; BULK-MEM-NEXT: local.tee $push[[L1:[0-9]+]]=, $3=, $pop[[L0]]
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.fill 0, $0, $1, $3
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
call void @llvm.memset.p0.i8(ptr %dest, i8 %val, i8 %len, i1 0)
@@ -50,7 +68,12 @@ define void @memset_i8(ptr %dest, i8 %val, i8 zeroext %len) {
; CHECK-LABEL: memcpy_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_i32 (i64, i64, i64) -> ()
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_i32(ptr %dest, ptr %src, i64 %len) {
call void @llvm.memcpy.p0.p0.i64(ptr %dest, ptr %src, i64 %len, i1 0)
@@ -60,7 +83,12 @@ define void @memcpy_i32(ptr %dest, ptr %src, i64 %len) {
; CHECK-LABEL: memmove_i32:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_i32 (i64, i64, i64) -> ()
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_i32(ptr %dest, ptr %src, i64 %len) {
call void @llvm.memmove.p0.p0.i64(ptr %dest, ptr %src, i64 %len, i1 0)
@@ -70,7 +98,12 @@ define void @memmove_i32(ptr %dest, ptr %src, i64 %len) {
; CHECK-LABEL: memset_i32:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_i32 (i64, i32, i64) -> ()
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.eqz $push0=, $2
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: memory.fill 0, $0, $1, $2
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_i32(ptr %dest, i8 %val, i64 %len) {
call void @llvm.memset.p0.i64(ptr %dest, i8 %val, i64 %len, i1 0)
@@ -110,8 +143,14 @@ define void @memset_1(ptr %dest, i8 %val) {
; CHECK-LABEL: memcpy_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_1024 (i64, i64) -> ()
-; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_1024(ptr %dest, ptr %src) {
call void @llvm.memcpy.p0.p0.i64(ptr %dest, ptr %src, i64 1024, i1 0)
@@ -121,8 +160,14 @@ define void @memcpy_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memmove_1024:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memmove_1024 (i64, i64) -> ()
-; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memmove_1024(ptr %dest, ptr %src) {
call void @llvm.memmove.p0.p0.i64(ptr %dest, ptr %src, i64 1024, i1 0)
@@ -132,8 +177,14 @@ define void @memmove_1024(ptr %dest, ptr %src) {
; CHECK-LABEL: memset_1024:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_1024 (i64, i32) -> ()
-; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
-; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: i64.eqz $push0=, $pop[[L1]]
+; BULK-MEM-NEXT: br_if 0, $pop0
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 1024
+; BULK-MEM-NEXT: memory.fill 0, $0, $1, $pop[[L0]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_1024(ptr %dest, i8 %val) {
call void @llvm.memset.p0.i64(ptr %dest, i8 %val, i64 1024, i1 0)
@@ -154,13 +205,19 @@ define void @memset_1024(ptr %dest, i8 %val) {
; CHECK-LABEL: memcpy_alloca_src:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_alloca_src (i64) -> ()
-; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
-; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i64.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i64.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i64.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i64.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.copy 0, 0, $0, $pop[[L4]], $pop[[L5]]
+; BULK-MEM-NEXT: global.get $push[[L1:[0-9]+]]=, __stack_pointer
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 112
+; BULK-MEM-NEXT: i64.sub $[[L2:[0-9]+]]=, $pop[[L1]], $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L3:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i64.eqz $push[[L4:[0-9]+]]=, $pop[[L3]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L4]]
+; BULK-MEM-NEXT: i64.const $push[[L5:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i64.add $push[[L6:[0-9]+]]=, $[[L2]], $pop[[L5]]
+; BULK-MEM-NEXT: i64.const $push[[L7:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.copy 0, 0, $0, $pop[[L6]], $pop[[L7]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_alloca_src(ptr %dst) {
%a = alloca [100 x i8]
@@ -171,13 +228,19 @@ define void @memcpy_alloca_src(ptr %dst) {
; CHECK-LABEL: memcpy_alloca_dst:
; NO-BULK-MEM-NOT: memory.copy
; BULK-MEM-NEXT: .functype memcpy_alloca_dst (i64) -> ()
-; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
-; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i64.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i64.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i64.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i64.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.copy 0, 0, $pop[[L4]], $0, $pop[[L5]]
+; BULK-MEM-NEXT: global.get $push[[L1:[0-9]+]]=, __stack_pointer
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 112
+; BULK-MEM-NEXT: i64.sub $[[L2:[0-9]+]]=, $pop[[L1]], $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L3:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i64.eqz $push[[L4:[0-9]+]]=, $pop[[L3]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L4]]
+; BULK-MEM-NEXT: i64.const $push[[L5:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i64.add $push[[L6:[0-9]+]]=, $[[L2]], $pop[[L5]]
+; BULK-MEM-NEXT: i64.const $push[[L7:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.copy 0, 0, $pop[[L6]], $0, $pop[[L7]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memcpy_alloca_dst(ptr %src) {
%a = alloca [100 x i8]
@@ -188,13 +251,19 @@ define void @memcpy_alloca_dst(ptr %src) {
; CHECK-LABEL: memset_alloca:
; NO-BULK-MEM-NOT: memory.fill
; BULK-MEM-NEXT: .functype memset_alloca (i32) -> ()
-; BULK-MEM-NEXT: global.get $push[[L0:[0-9]+]]=, __stack_pointer
-; BULK-MEM-NEXT: i64.const $push[[L1:[0-9]+]]=, 112
-; BULK-MEM-NEXT: i64.sub $push[[L2:[0-9]+]]=, $pop[[L0]], $pop[[L1]]
-; BULK-MEM-NEXT: i64.const $push[[L3:[0-9]+]]=, 12
-; BULK-MEM-NEXT: i64.add $push[[L4:[0-9]+]]=, $pop[[L2]], $pop[[L3]]
-; BULK-MEM-NEXT: i64.const $push[[L5:[0-9]+]]=, 100
-; BULK-MEM-NEXT: memory.fill 0, $pop[[L4]], $0, $pop[[L5]]
+; BULK-MEM-NEXT: global.get $push[[L1:[0-9]+]]=, __stack_pointer
+; BULK-MEM-NEXT: i64.const $push[[L0:[0-9]+]]=, 112
+; BULK-MEM-NEXT: i64.sub $1=, $pop[[L1]], $pop[[L0]]
+; BULK-MEM-NEXT: block
+; BULK-MEM-NEXT: i64.const $push[[L2:[0-9]+]]=, 100
+; BULK-MEM-NEXT: i64.eqz $push[[L3:[0-9]+]]=, $pop[[L2]]
+; BULK-MEM-NEXT: br_if 0, $pop[[L3]]
+; BULK-MEM-NEXT: i64.const $push[[L4:[0-9]+]]=, 12
+; BULK-MEM-NEXT: i64.add $push[[L5:[0-9]+]]=, $1, $pop[[L4]]
+; BULK-MEM-NEXT: i64.const $push[[L6:[0-9]+]]=, 100
+; BULK-MEM-NEXT: memory.fill 0, $pop[[L5]], $0, $pop[[L6]]
+; BULK-MEM-NEXT: .LBB{{.*}}:
+; BULK-MEM-NEXT: end_block
; BULK-MEM-NEXT: return
define void @memset_alloca(i8 %val) {
%a = alloca [100 x i8]
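
For readers skimming the test updates above: the new `block` / `eqz` / `br_if 0` / `memory.copy` / `end_block` sequence is a zero-length guard around the bulk-memory instruction. The following is a minimal sketch of that behavior in plain C++, not code from this patch. `LinearMemory`, `memory_copy`, and `lowered_memcpy` are hypothetical names chosen for illustration, and the bounds check is a simplified model of how `memory.copy` traps on out-of-bounds addresses even when the length is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical model of a Wasm linear memory; not an LLVM or engine API.
struct LinearMemory {
  std::vector<std::uint8_t> bytes;
};

// Models `memory.copy`: the bounds check runs even when len == 0, so an
// out-of-bounds address "traps" (here: throws) regardless of the length.
void memory_copy(LinearMemory &mem, std::uint64_t dst, std::uint64_t src,
                 std::uint64_t len) {
  const std::uint64_t size = mem.bytes.size();
  if (dst > size || src > size || len > size - dst || len > size - src)
    throw std::out_of_range("trap: out of bounds memory access");
  if (len != 0)
    std::memmove(mem.bytes.data() + dst, mem.bytes.data() + src,
                 static_cast<std::size_t>(len));
}

// Models the guarded sequence checked by the updated tests:
//   block; eqz len; br_if 0; memory.copy; end_block
void lowered_memcpy(LinearMemory &mem, std::uint64_t dst, std::uint64_t src,
                    std::uint64_t len) {
  if (len == 0)
    return; // br_if 0: skip the copy entirely, so no trap can occur
  assert(len != 0); // memory_copy is only reached with a nonzero length
  memory_copy(mem, dst, src, len);
}
```

Under this model, `memory_copy(mem, 100, 0, 0)` on a 16-byte memory throws, while `lowered_memcpy(mem, 100, 0, 0)` returns without touching memory. That is the behavior the comment in `EmitTargetCodeForMemcpy` says LLVM IR expects for a zero-length copy with invalid pointers.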