[lld] b2032f1 - [lld][WebAssembly] Relax limitations on multithreaded instantiation

Thomas Lively via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 13 15:03:58 PDT 2021


Author: Thomas Lively
Date: 2021-09-13T15:03:51-07:00
New Revision: b2032f18c9dec45a9cb4163136fa9dcbe256e772

URL: https://github.com/llvm/llvm-project/commit/b2032f18c9dec45a9cb4163136fa9dcbe256e772
DIFF: https://github.com/llvm/llvm-project/commit/b2032f18c9dec45a9cb4163136fa9dcbe256e772.diff

LOG: [lld][WebAssembly] Relax limitations on multithreaded instantiation

For multithreaded modules (i.e. modules with a shared memory), lld injects a
synthetic Wasm start function that is automatically called during instantiation
to initialize memory from passive data segments. Even though the module will be
instantiated separately on each thread, memory initialization should happen only
once. Furthermore, memory initialization should be finished by the time each
thread finishes instantiation. Since multiple threads may be instantiating their
modules at the same time, the synthetic function must synchronize them.

The current synchronization tries to atomically increment a flag from 0 to 1 in
memory then enters one of two cases. First, if the increment was successful, the
current thread is responsible for initializing memory. It does so, increments
the flag to 2 to signify that memory has been initialized, then notifies all
threads waiting on the flag. Otherwise, the thread atomically waits on the flag
with an expected value of 1 until memory has been initialized. Either the
initializer thread finishes initializing memory (i.e. sets the flag to 2) first
and the waiter threads do not end up blocking, or the waiter threads successfully
start waiting before memory is initialized so they will be woken by the
initializer thread once it has finished.
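
The pre-change scheme described above can be modelled roughly as follows. This is
a minimal sketch in portable C++, not the code lld emits: `SharedMemoryModel`,
`oldInitMemory` and `runOldScheme` are hypothetical names, the counters exist
only for illustration, and the Wasm atomic wait/notify pair is approximated by a
yield loop.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the state shared by all instances of the module.
struct SharedMemoryModel {
  std::atomic<uint32_t> initFlag{0}; // 0 = untouched, 1 = in progress, 2 = done
  std::atomic<int> initCount{0};     // times the "initialize memory" path ran
  std::atomic<int> waitCount{0};     // times the "wait" path ran
};

// Models the old synthetic start function: a two-way branch on the cmpxchg.
void oldInitMemory(SharedMemoryModel &m) {
  uint32_t expected = 0;
  if (m.initFlag.compare_exchange_strong(expected, 1)) {
    // Won the race: initialize memory, then publish "done".
    m.initCount.fetch_add(1);
    m.initFlag.store(2);
    // The real function would notify all waiters here.
  } else {
    // Lost the race: a wait is executed unconditionally, even when the flag
    // is already 2. Assumption: the blocking wait is modelled by yielding.
    m.waitCount.fetch_add(1);
    while (m.initFlag.load() == 1)
      std::this_thread::yield();
  }
}

// Instantiates the model on numThreads threads at once.
void runOldScheme(SharedMemoryModel &m, int numThreads) {
  std::vector<std::thread> threads;
  for (int i = 0; i < numThreads; ++i)
    threads.emplace_back([&m] { oldInitMemory(m); });
  for (std::thread &t : threads)
    t.join();
}
```

In this model exactly one thread takes the initialization path and every other
thread takes the wait path, regardless of whether it would actually block.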

One complication with this scheme is that there are various contexts on the Web,
most notably on the main browser thread, that cannot successfully execute a
wait. Executing a wait in these contexts causes a trap, and in this case would
cause instantiation to fail. The embedder must therefore ensure that these
contexts win the race and become responsible for initializing memory, since that
is the only code path that does not execute a wait.

Unfortunately, since only one thread can win the race and initialize memory,
this scheme makes it impossible to have multiple threads in contexts that cannot
wait. For example, it is not currently possible to instantiate the module both
on the main browser thread and in an AudioWorklet. To loosen this
restriction, this commit inserts an extra check so that the wait will not be
executed at all when memory has already been initialized, i.e. when the flag
value is 2. After this change, the module can be instantiated on threads in
non-waiting contexts as long as the embedder can guarantee either that the
thread will win the race and initialize memory (as before) or that memory has
already been initialized when instantiation begins. Threads in contexts that can
wait can continue racing to initialize memory.
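
The relaxed scheme amounts to a three-way dispatch on the flag value observed by
the compare-exchange. The following is a minimal sketch under stated
assumptions: `InitState`, `Path` and `newInitMemory` are hypothetical names, and
the atomic wait is again approximated by a yield loop rather than a real
blocking wait.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>

// Hypothetical model of the shared state.
struct InitState {
  std::atomic<uint32_t> flag{0}; // 0 = untouched, 1 = in progress, 2 = done
  std::atomic<int> initRuns{0};  // times memory initialization ran
};

enum class Path { Init, Wait, Skip };

// Models the new synthetic start function and reports which path was taken.
Path newInitMemory(InitState &s) {
  // After the call, `observed` holds the flag's previous value whether or not
  // the exchange succeeded (it stays 0 on success, is updated on failure).
  uint32_t observed = 0;
  s.flag.compare_exchange_strong(observed, 1);

  switch (observed) {
  case 0: // Won the race: initialize memory and publish "done".
    s.initRuns.fetch_add(1);
    s.flag.store(2);
    // The real function would notify all waiters here.
    return Path::Init;
  case 1: // Initialization in progress: wait (modelled by yielding).
    while (s.flag.load() == 1)
      std::this_thread::yield();
    return Path::Wait;
  default: // Flag is already 2: no wait is executed at all.
    return Path::Skip;
  }
}
```

A thread that instantiates after initialization has finished takes the `Skip`
path, which is what makes instantiation safe in contexts that cannot wait.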

Fixes (or at least improves) PR51702.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D109722

Added: 
    

Modified: 
    lld/test/wasm/data-segments.ll
    lld/wasm/Writer.cpp
    llvm/include/llvm/BinaryFormat/Wasm.h

Removed: 
    


################################################################################
diff  --git a/lld/test/wasm/data-segments.ll b/lld/test/wasm/data-segments.ll
index 61bf903a116cf..e47f38753d829 100644
--- a/lld/test/wasm/data-segments.ll
+++ b/lld/test/wasm/data-segments.ll
@@ -84,8 +84,8 @@
 ; PASSIVE-NEXT:        Body:            0B
 ; PASSIVE-NEXT:      - Index:           2
 ; PASSIVE-NEXT:        Locals:          []
-; PASSIVE32-NEXT:        Body:            41B4D60041004101FE480200044041B4D6004101427FFE0102001A054180084100410DFC08000041900841004114FC08010041B4D6004102FE17020041B4D600417FFE0002001A0BFC0900FC09010B
-; PASSIVE64-NEXT:        Body:            42B4D60041004101FE480200044042B4D6004101427FFE0102001A054280084100410DFC08000042900841004114FC08010042B4D6004102FE17020042B4D600417FFE0002001A0BFC0900FC09010B
+; PASSIVE32-NEXT:        Body:             02400240024041B4D60041004101FE4802000E020001020B4180084100410DFC08000041900841004114FC08010041B4D6004102FE17020041B4D600417FFE0002001A0C010B41B4D6004101427FFE0102001A0BFC0900FC09010B
+; PASSIVE64-NEXT:        Body:            02400240024042B4D60041004101FE4802000E020001020B4280084100410DFC08000042900841004114FC08010042B4D6004102FE17020042B4D600417FFE0002001A0C010B42B4D6004101427FFE0102001A0BFC0900FC09010B
 ; PASSIVE-NEXT:  - Type:            DATA
 ; PASSIVE-NEXT:    Segments:
 ; PASSIVE-NEXT:      - SectionOffset:   3
@@ -121,8 +121,8 @@
 ; PASSIVE32-PIC-NEXT:          - Type:            I32
 ; PASSIVE64-PIC-NEXT:          - Type:            I64
 ; PASSIVE-PIC-NEXT:            Count:           1
-; PASSIVE32-PIC-NEXT:        Body:            230141B4CE006A2100200041004101FE480200044020004101427FFE0102001A05410023016A4100410DFC080000411023016A41004114FC08010020004102FE1702002000417FFE0002001A0BFC0900FC09010B
-; PASSIVE64-PIC-NEXT:        Body:            230142B4CE007C2100200041004101FE480200044020004101427FFE0102001A05420023017C4100410DFC080000421023017C41004114FC08010020004102FE1702002000417FFE0002001A0BFC0900FC09010B
+; PASSIVE32-PIC-NEXT:          Body:            230141B4CE006A2100024002400240200041004101FE4802000E020001020B410023016A4100410DFC080000411023016A41004114FC08010020004102FE1702002000417FFE0002001A0C010B20004101427FFE0102001A0BFC0900FC09010B
+; PASSIVE64-PIC-NEXT:          Body:            230142B4CE007C2100024002400240200041004101FE4802000E020001020B420023017C4100410DFC080000421023017C41004114FC08010020004102FE1702002000417FFE0002001A0C010B20004101427FFE0102001A0BFC0900FC09010B
 ; PASSIVE-PIC-NEXT:      - Index:           3
 ; PASSIVE-PIC-NEXT:        Locals:          []
 ; PASSIVE-PIC-NEXT:        Body:            0B

diff  --git a/lld/wasm/Writer.cpp b/lld/wasm/Writer.cpp
index ac81f14e547ea..2a81fd31bf7e3 100644
--- a/lld/wasm/Writer.cpp
+++ b/lld/wasm/Writer.cpp
@@ -1022,22 +1022,17 @@ void Writer::createInitMemoryFunction() {
     // initialized. The generated code is as follows:
     //
     // (func $__wasm_init_memory
-    //  (if
-    //   (i32.atomic.rmw.cmpxchg align=2 offset=0
-    //    (i32.const $__init_memory_flag)
-    //    (i32.const 0)
-    //    (i32.const 1)
-    //   )
-    //   (then
-    //    (drop
-    //     (i32.atomic.wait align=2 offset=0
-    //      (i32.const $__init_memory_flag)
-    //      (i32.const 1)
-    //      (i32.const -1)
+    //  (block $drop
+    //   (block $wait
+    //    (block $init
+    //     (br_table $init $wait $drop
+    //      (i32.atomic.rmw.cmpxchg align=2 offset=0
+    //       (i32.const $__init_memory_flag)
+    //       (i32.const 0)
+    //       (i32.const 1)
+    //      )
     //     )
-    //    )
-    //   )
-    //   (else
+    //    ) ;; $init
     //    ( ... initialize data segments ... )
     //    (i32.atomic.store align=2 offset=0
     //     (i32.const $__init_memory_flag)
@@ -1049,8 +1044,16 @@ void Writer::createInitMemoryFunction() {
     //      (i32.const -1u)
     //     )
     //    )
+    //    (br $drop)
+    //   ) ;; $wait
+    //   (drop
+    //    (i32.atomic.wait align=2 offset=0
+    //     (i32.const $__init_memory_flag)
+    //     (i32.const 1)
+    //     (i32.const -1)
+    //    )
     //   )
-    //  )
+    //  ) ;; $drop
     //  ( ... drop data segments ... )
     // )
     //
@@ -1084,29 +1087,31 @@ void Writer::createInitMemoryFunction() {
       }
     };
 
-    // Atomically check whether this is the main thread.
+    // Set up destination blocks
+    writeU8(os, WASM_OPCODE_BLOCK, "block $drop");
+    writeU8(os, WASM_TYPE_NORESULT, "block type");
+    writeU8(os, WASM_OPCODE_BLOCK, "block $wait");
+    writeU8(os, WASM_TYPE_NORESULT, "block type");
+    writeU8(os, WASM_OPCODE_BLOCK, "block $init");
+    writeU8(os, WASM_TYPE_NORESULT, "block type");
+
+    // Atomically check whether we win the race.
     writeGetFlagAddress();
     writeI32Const(os, 0, "expected flag value");
-    writeI32Const(os, 1, "flag value");
+    writeI32Const(os, 1, "new flag value");
     writeU8(os, WASM_OPCODE_ATOMICS_PREFIX, "atomics prefix");
     writeUleb128(os, WASM_OPCODE_I32_RMW_CMPXCHG, "i32.atomic.rmw.cmpxchg");
     writeMemArg(os, 2, 0);
-    writeU8(os, WASM_OPCODE_IF, "IF");
-    writeU8(os, WASM_TYPE_NORESULT, "blocktype");
-
-    // Did not increment 0, so wait for main thread to initialize memory
-    writeGetFlagAddress();
-    writeI32Const(os, 1, "expected flag value");
-    writeI64Const(os, -1, "timeout");
-
-    writeU8(os, WASM_OPCODE_ATOMICS_PREFIX, "atomics prefix");
-    writeUleb128(os, WASM_OPCODE_I32_ATOMIC_WAIT, "i32.atomic.wait");
-    writeMemArg(os, 2, 0);
-    writeU8(os, WASM_OPCODE_DROP, "drop");
 
-    writeU8(os, WASM_OPCODE_ELSE, "ELSE");
+    // Based on the value, decide what to do next.
+    writeU8(os, WASM_OPCODE_BR_TABLE, "br_table");
+    writeUleb128(os, 2, "label vector length");
+    writeUleb128(os, 0, "label $init");
+    writeUleb128(os, 1, "label $wait");
+    writeUleb128(os, 2, "default label $drop");
 
-    // Did increment 0, so conditionally initialize passive data segments
+    // Initialize passive data segments
+    writeU8(os, WASM_OPCODE_END, "end $init");
     for (const OutputSegment *s : segments) {
       if (needsPassiveInitialization(s)) {
         // destination address
@@ -1145,9 +1150,23 @@ void Writer::createInitMemoryFunction() {
     writeMemArg(os, 2, 0);
     writeU8(os, WASM_OPCODE_DROP, "drop");
 
-    writeU8(os, WASM_OPCODE_END, "END");
+    // Branch to drop the segments
+    writeU8(os, WASM_OPCODE_BR, "br");
+    writeUleb128(os, 1, "label $drop");
+
+    // Wait for the winning thread to initialize memory
+    writeU8(os, WASM_OPCODE_END, "end $wait");
+    writeGetFlagAddress();
+    writeI32Const(os, 1, "expected flag value");
+    writeI64Const(os, -1, "timeout");
+
+    writeU8(os, WASM_OPCODE_ATOMICS_PREFIX, "atomics prefix");
+    writeUleb128(os, WASM_OPCODE_I32_ATOMIC_WAIT, "i32.atomic.wait");
+    writeMemArg(os, 2, 0);
+    writeU8(os, WASM_OPCODE_DROP, "drop");
 
     // Unconditionally drop passive data segments
+    writeU8(os, WASM_OPCODE_END, "end $drop");
     for (const OutputSegment *s : segments) {
       if (needsPassiveInitialization(s)) {
         // data.drop instruction
@@ -1156,6 +1175,8 @@ void Writer::createInitMemoryFunction() {
         writeUleb128(os, s->index, "segment index immediate");
       }
     }
+
+    // End the function
     writeU8(os, WASM_OPCODE_END, "END");
   }
 

diff  --git a/llvm/include/llvm/BinaryFormat/Wasm.h b/llvm/include/llvm/BinaryFormat/Wasm.h
index f49828ca87574..966dc1bbc886d 100644
--- a/llvm/include/llvm/BinaryFormat/Wasm.h
+++ b/llvm/include/llvm/BinaryFormat/Wasm.h
@@ -284,8 +284,10 @@ enum : unsigned {
 
 // Opcodes used in synthetic functions.
 enum : unsigned {
-  WASM_OPCODE_IF = 0x04,
-  WASM_OPCODE_ELSE = 0x05,
+  WASM_OPCODE_BLOCK = 0x02,
+  WASM_OPCODE_BR = 0x0c,
+  WASM_OPCODE_BR_TABLE = 0x0e,
+  WASM_OPCODE_RETURN = 0x0f,
   WASM_OPCODE_DROP = 0x1a,
   WASM_OPCODE_MISC_PREFIX = 0xfc,
   WASM_OPCODE_MEMORY_INIT = 0x08,

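As a cross-check of the updated PASSIVE32 test expectation, the prologue of the
new function body can be re-derived byte by byte. The sketch below is
illustrative only: the helper names are hypothetical and do not match lld's
writer functions, the opcode values are the standard WebAssembly encodings, and
the flag address 11060 is simply what the `B4D600` immediate in the test decodes
to.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers, not lld's writeU8/writeUleb128/writeI32Const.
static void emitUleb128(std::vector<uint8_t> &out, uint64_t v) {
  do {
    uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0)
      byte |= 0x80; // more bytes follow
    out.push_back(byte);
  } while (v != 0);
}

static void emitSleb128(std::vector<uint8_t> &out, int64_t v) {
  bool more = true;
  while (more) {
    uint8_t byte = v & 0x7f;
    v >>= 7; // assumes an arithmetic shift for negative values
    bool signBit = (byte & 0x40) != 0;
    if ((v == 0 && !signBit) || (v == -1 && signBit))
      more = false;
    else
      byte |= 0x80;
    out.push_back(byte);
  }
}

static void emitI32Const(std::vector<uint8_t> &out, int32_t v) {
  out.push_back(0x41); // i32.const
  emitSleb128(out, v);
}

// Encodes: three nested blocks, the cmpxchg on the flag, the br_table, and
// the `end` that closes $init. Returns the bytes as upper-case hex.
std::string encodeDispatchPrologue(int32_t flagAddress) {
  std::vector<uint8_t> out;
  for (int i = 0; i < 3; ++i) { // block $drop, block $wait, block $init
    out.push_back(0x02);        // block
    out.push_back(0x40);        // empty block type
  }
  emitI32Const(out, flagAddress); // flag address
  emitI32Const(out, 0);           // expected flag value
  emitI32Const(out, 1);           // new flag value
  out.push_back(0xfe);            // atomics prefix
  emitUleb128(out, 0x48);         // i32.atomic.rmw.cmpxchg
  emitUleb128(out, 2);            // memarg: alignment
  emitUleb128(out, 0);            // memarg: offset
  out.push_back(0x0e);            // br_table
  emitUleb128(out, 2);            // label vector length
  emitUleb128(out, 0);            // label $init
  emitUleb128(out, 1);            // label $wait
  emitUleb128(out, 2);            // default label $drop
  out.push_back(0x0b);            // end $init

  static const char digits[] = "0123456789ABCDEF";
  std::string hex;
  for (uint8_t b : out) {
    hex.push_back(digits[b >> 4]);
    hex.push_back(digits[b & 0x0f]);
  }
  return hex;
}
```

Calling `encodeDispatchPrologue(11060)` reproduces the leading bytes of the new
PASSIVE32 body in `data-segments.ll`.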

        

