[PATCH] D64537: [WebAssembly] Implement thread-local storage (local-exec model)

Guanzhong Chen via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 11 18:09:49 PDT 2019


quantum marked 28 inline comments as done.
quantum added inline comments.


================
Comment at: clang/include/clang/Basic/BuiltinsWebAssembly.def:33
+// Thread-local storage
+TARGET_BUILTIN(__builtin_wasm_tls_size, "z", "nc", "bulk-memory")
+
----------------
tlively wrote:
> quantum wrote:
> > quantum wrote:
> > > aheejin wrote:
> > > > Why is it `c`(const)? According to [[ https://github.com/llvm/llvm-project/blob/e6695821e592f95bffe1340b28be7bcfcce04284/clang/include/clang/Basic/Builtins.h#L104-L108 | this comment ]], this is true if this function has no side effects and doesn't read memory, i.e., the result should be only dependent on its arguments. Can't wasm globals be memory locations in machines?
> > > I was thinking that since the global is immutable, so the value is always a constant.
> > According to @tlively, there is no solid definition on whether a global (especially a constant one), counts as memory access. For now, I am going to change this to not constant. We can always change it back later.
> I think this requires more conversation.
We decided that this is really a constant, since this global is immutable.


================
Comment at: lld/test/wasm/data-layout.ll:43
+; CHECK-NEXT:       - Index:           3
+; CHECK-NEXT:         Type:            I32
 ; CHECK-NEXT:         Mutable:         false
----------------
tlively wrote:
> These globals don't have enough information to tell the reader what they even are, and they don't have anything to do with the data layout, so how about skipping these in the test with a comment saying what is being skipped?
Going to skip over `__tls_base` and `__tls_size`.


================
Comment at: lld/test/wasm/tls.ll:57
+;   memory.init 0, 0
+;   end
+
----------------
tlively wrote:
> quantum wrote:
> > aheejin wrote:
> > > Hmm, I think there might be a way to actually print disassembly results. There are '*.test' files in test directory, in which we have some examples. For example, [[ https://github.com/llvm/llvm-project/blob/master/lld/test/wasm/export-table.test | this test ]] has a sequence of commands you want to run, and you can put input files in a separate directory. That test uses `obj2yaml`, but can we possibly use `llvm-objdump` or something to disassemble?
> > We already know that we can do something like
> > 
> >     Run: obj2yaml %t.wasm | sed -n '/Body:/{s/^\s*Body:\s*//;s/../0x& /gp}'  | llvm-mc -disassemble -triple=wasm32
> > 
> > to compare the actual assembly. As for `llvm-objdump`, it seems to be unable to disassemble the WASM properly:
> > 
> > 
> > ```
> > .../tools/lld/test/wasm/Output/tls.ll.tmp.wasm:	file format WASM
> > 
> > 
> > Disassembly of section CODE:
> > 
> > 00000000 CODE:
> >         # 4 functions in section.
> >        1: 02 00                        	block   	invalid_type
> >        3: 0b                           	end
> >        4: 10 00                        	call	0
> >        6: 20 00                        	local.get	0
> >        8: 24 01                        	global.set	1
> >        a: 20 00                        	local.get	0
> >        c: 41 00                        	i32.const	0
> >        e: 41 08                        	i32.const	8
> >       10: fc 08 00 00                  	memory.init	0, 0
> >       14: 0b                           	end
> >       15: 0f                           	return
> >       16: 00                           	llvm-objdump: lib/Target/WebAssembly/WebAssemblyGenAsmWriter.inc:2032: void llvm::WebAssemblyInstPrinter::printInstruction(const llvm::MCInst *, llvm::raw_ostream &): Assertion `Bits != 0 && "Cannot print this instruction."' failed.
> > 
> > ```
> It might be worth filing an LLVM bug for this (or possibly fixing in a separate CL).
Going to fix this at some point in the future.


================
Comment at: lld/wasm/Symbols.cpp:208
+    if (Segment->OutputSeg->Name == ".tdata")
+      return Segment->OutputSegmentOffset + Offset;
     return Segment->OutputSeg->StartVA + Segment->OutputSegmentOffset + Offset;
----------------
tlively wrote:
> It would be great to have an explanatory comment here.
Adding a comment.


================
Comment at: lld/wasm/Writer.cpp:431
+    error("'bulk-memory' feature must be used in order to use thread-local "
+          "storage");
+
----------------
quantum wrote:
> aheejin wrote:
> > Should we check for "mutable-global" feature too?
> Do we need to? I thought it's always available since we use it for the stack pointer.
On second thought, the `mutable-global` feature is for import/export of mutable globals. TLS does not need to do this.


================
Comment at: lld/wasm/Writer.cpp:777
+      break;
+    }
+
----------------
tlively wrote:
> quantum wrote:
> > aheejin wrote:
> > > Is it guaranteed that there's only one TLS segment?
> > Yes, all the TLS input segments will be merged into `.tdata`.
> --no-merge-data-segments!
I am making `--no-merge-data-segments` merge TLS segments anyway.


================
Comment at: lld/wasm/Writer.cpp:250
+      TLSSize->Global->Global.InitExpr.Value.Int32 = Seg->Size;
+    }
   }
----------------
tlively wrote:
> What happens when there are multiple TLS sections and `--no-merge-data-segments` is used? I assume their sizes should be added together?
I'll merge the TLS segments anyways despite `--no-merge-data-segments`.


================
Comment at: lld/wasm/Writer.cpp:638
+  if (Name.startswith(".tbss."))
+    return ".tdata";
   return Name;
----------------
tlively wrote:
> Does this mean we can't control whether .tdata or .tbss comes first? Is that important for anything?
Yes, it does mean that. The only reason why .tbss is supposed to be separate is so that its memory can just be zeroed whereas .tdata has to have the bytes stored in the program image. Currently, we just explicitly store the zero bytes, so this won't be a problem.


================
Comment at: llvm/include/llvm/IR/IntrinsicsWebAssembly.td:134
+            [],
+            [IntrNoMem, IntrSpeculatable]>;
+
----------------
quantum wrote:
> quantum wrote:
> > aheejin wrote:
> > > - Why is it speculatable?
> > > - I'm not sure if it's `InstNoMem`, because wasm globals can be memory locations after all. What do other people think?
> > @tlively suggested that I do this. It should be speculatable because it has no side effects (since it returns a constant). As for `InstrNoMem`, I am not sure if globals are memory, and whether reading a constant counts as memory access.
> I think I am going to remove these attributes for now. We can add them back if we end up deciding that immutable globals are not memory accesses.
We decided that this is a constant after all, since the global is immutable. This also simplifies the implementation of the lowering code.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyInstrInfo.td:124
+def WebAssemblyconsti32 : SDNode<"WebAssemblyISD::CONST_I32",
+                                    SDT_WebAssemblyConstI32>;
 
----------------
tlively wrote:
> aheejin wrote:
> > Nit: indentations look weird for both defs
> I agree with @aheejin that it would be ideal if we didn't need to add these new WebAssemblyISD items. We've come this far without needing them, so it would be good to keep that going.
I'll try out an implementation with ISelDAGtoDAG later.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp:192
       Stripped |= stripAtomics(M);
-      Stripped |= stripThreadLocals(M);
     }
----------------
tlively wrote:
> quantum wrote:
> > aheejin wrote:
> > > Looks like this feature requires both bulk memory and mutable global. Shouldn't we strip thread locals if either is not enabled?
> > Do we want to do that? I think having your thread locals vanish if you didn't pass `-mbulk-memory` is surprising. It might make more sense to strip thread locals if `-pthread` is not passed.
> I have a patch up to make `-pthread` imply `-mbulk-memory`, so this shouldn't be a problem from a UI perspective. I agree with @aheejin that we should strip TLS if the necessary features aren't present to preserver feature-correctness.
> 
> All the errors for using TLS without bulk memory should also apply to mutable-globals, which I hadn't considered before. I will update my patch so -pthread implies -mmutable-globals as well.
Going to strip out TLS if `bulk-memory` is not enabled.


================
Comment at: llvm/test/CodeGen/WebAssembly/target-features-tls.ll:23
 ; ATOMICS-NEXT: .ascii "atomics"
 ; ATOMICS-NEXT: .tbss.foo,"",@
----------------
tlively wrote:
> quantum wrote:
> > aheejin wrote:
> > > Is thread local feature still dependent on atomics? If not, do we still need to test this with both atomics on and off? This now looks rather dependent on bulk memory and mutable global. Should we test this feature with those options instead? (I can be mistaken, so please let me know if so)
> > On second thought, this test is somewhat meaningless now, since the thread local is always created regardless of whether atomics are enabled. What does @tlively think?
> I agree with @aheejin that we should change the test to toggle bulk-memory and mutable-global and strip TLS if those are not both present.
Changed test to test `+/-bulk-memory`.


================
Comment at: llvm/test/CodeGen/WebAssembly/tls.ll:16
+  ; O2-NEXT: i32.const tls
+  ; O2-NEXT: i32.add
+  ; O2-NEXT: return
----------------
tlively wrote:
> quantum wrote:
> > aheejin wrote:
> > > If the only difference of O0 and O2 results is the order of `global.get` and `i32.const`, you can use [[ https://llvm.org/docs/CommandGuide/FileCheck.html#the-check-dag-directive | `CHECK-DAG` ]] directive, like
> > > ```
> > > ; CHECK-DAG: global.get __tls_base
> > > ; CHECK-DAG: i32.const tls
> > > ...
> > > ```
> > > 
> > > The same for other functions too. Maybe we don't need separate O0 and O2 testing in this file.
> > Will do. I didn't know `CHECK-DAG` existed.
> If the point of testing with -O0 and -O2 is to test both with and without FastIsel, it might be better to pass `-fast-isel` to explicitly enable it instead.
Changing to use `-fast-isel`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D64537/new/

https://reviews.llvm.org/D64537





More information about the cfe-commits mailing list