[PATCH] D64537: [WebAssembly] Implement thread-local storage for non-PIC cases

Heejin Ahn via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 11 08:24:33 PDT 2019


aheejin added a comment.

Nice!
Then where should we call `__wasm_init_tls`, in case within a library?



================
Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:33
+// Thread-local storage
+TARGET_BUILTIN(__builtin_wasm_tls_size, "z", "nc", "bulk-memory")
+
----------------
Why is it `c`(const)? According to [[ https://github.com/llvm/llvm-project/blob/e6695821e592f95bffe1340b28be7bcfcce04284/clang/include/clang/Basic/Builtins.h#L104-L108 | this comment ]], this is true if this function has no side effects and doesn't read memory, i.e., the result should be only dependent on its arguments. Can't wasm globals be memory locations in machines?


================
Comment at: lld/test/wasm/tls.ll:57
+;   memory.init 0, 0
+;   end
+
----------------
Hmm, I think there might be a way to actually print disassembly results. There are '*.test' files in test directory, in which we have some examples. For example, [[ https://github.com/llvm/llvm-project/blob/master/lld/test/wasm/export-table.test | this test ]] has a sequence of commands you want to run, and you can put input files in a separate directory. That test uses `obj2yaml`, but can we possibly use `llvm-objdump` or something to disassemble?


================
Comment at: lld/wasm/Driver.cpp:543
+      "__wasm_init_tls", WASM_SYMBOL_VISIBILITY_HIDDEN,
+      make<SyntheticFunction>(I32ArgSignature, "__wasm_init_tls"));
+
----------------
Does this TLS thing work when `Config->Shared == true`, i.e., do we create TLS globals even when this is a library?


================
Comment at: lld/wasm/Writer.cpp:431
+    error("'bulk-memory' feature must be used in order to use thread-local "
+          "storage");
+
----------------
Should we check for "mutable-global" feature too?


================
Comment at: lld/wasm/Writer.cpp:514
       if (G->getGlobalType()->Mutable) {
         // Only the __stack_pointer should ever be create as mutable.
+        assert(G == WasmSym::StackPointer || G == WasmSym::TLSBase);
----------------
Now `__tls_base` is also mutable, I think we should fix the comment


================
Comment at: lld/wasm/Writer.cpp:769
+  std::string BodyContent;
+  {
+    raw_string_ostream OS(BodyContent);
----------------
Any reason for the new block scope?


================
Comment at: lld/wasm/Writer.cpp:777
+      break;
+    }
+
----------------
Is it guaranteed that there's only one TLS segment?


================
Comment at: llvm/include/llvm/IR/IntrinsicsWebAssembly.td:134
+            [],
+            [IntrNoMem, IntrSpeculatable]>;
+
----------------
- Why is it speculatable?
- I'm not sure if it's `InstNoMem`, because wasm globals can be memory locations after all. What do other people think?


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1095
+WebAssemblyTargetLowering::LowerGlobalTLSAddress(SDValue Op,
+                                                 SelectionDAG &DAG) const {
+  SDLoc DL(Op);
----------------
If you do the conversion in `WebAssemblyISelDAGToDAG.cpp`, you don't need to create `WebAssemblyISD` node,  `SDTypeProfile`, and `SDNode` for every single instruction you want to generate. This `WebAssemblyISelLowering.cpp` is a part of legalization so it cannot generate machine instructions directly, whereas `WebAssemblyISelDAGToDAG.cpp` is a part of instruction selection so you have direct access to machine instructions. I think moving routines there can be cleaner?


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyISelLowering.cpp:1109
+      WebAssemblyISD::CONST_I32, DL, VT,
+      DAG.getTargetGlobalAddress(GA->getGlobal(), DL, VT, GA->getOffset(), 0));
+
----------------
Does this offset work when there are non-thread-local globals too?


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td:86
+def SDT_WebAssemblyConstI32   : SDTypeProfile<1, 1, [SDTCisVT<0, i32>,
+                                                       SDTCisVT<1, i32>]>;
 
----------------
Nit: indentation


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td:124
+def WebAssemblyconsti32 : SDNode<"WebAssemblyISD::CONST_I32",
+                                    SDT_WebAssemblyConstI32>;
 
----------------
Nit: indentations look weird for both defs


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:192
       Stripped |= stripAtomics(M);
-      Stripped |= stripThreadLocals(M);
     }
----------------
Looks like this feature requires both bulk memory and mutable global. Shouldn't we strip thread locals if either is not enabled?


================
Comment at: llvm/test/CodeGen/WebAssembly/target-features-tls.ll:23
 ; ATOMICS-NEXT: .ascii "atomics"
 ; ATOMICS-NEXT: .tbss.foo,"",@
----------------
Is thread local feature still dependent on atomics? If not, do we still need to test this with both atomics on and off? This now looks rather dependent on bulk memory and mutable global. Should we test this feature with those options instead? (I can be mistaken, so please let me know if so)


================
Comment at: llvm/test/CodeGen/WebAssembly/tls.ll:16
+  ; O2-NEXT: i32.const tls
+  ; O2-NEXT: i32.add
+  ; O2-NEXT: return
----------------
If the only difference of O0 and O2 results is the order of `global.get` and `i32.const`, you can use [[ https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive | `CHECK-DAG` ]] directive, like
```
; CHECK-DAG: global.get __tls_base
; CHECK-DAG: i32.const tls
...
```

The same for other functions too. Maybe we don't need separate O0 and O2 testing in this file.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64537/new/

https://reviews.llvm.org/D64537





More information about the cfe-commits mailing list