[lld] cd01430 - [lld][WebAssembly] Allow data symbols to extend past end of segment

Sam Clegg via llvm-commits llvm-commits at lists.llvm.org
Wed May 12 13:48:30 PDT 2021


Author: Sam Clegg
Date: 2021-05-12T13:43:37-07:00
New Revision: cd01430ff13b5441bc71e6dc3c4e688052f7be82

URL: https://github.com/llvm/llvm-project/commit/cd01430ff13b5441bc71e6dc3c4e688052f7be82
DIFF: https://github.com/llvm/llvm-project/commit/cd01430ff13b5441bc71e6dc3c4e688052f7be82.diff

LOG: [lld][WebAssembly] Allow data symbols to extend past end of segment

This fixes a bug with string merging with string symbols that contain
NULLs, as is the case in the `merge-string.s` test.

The bug only showed when we run with `--relocatable` and then try read
the resulting object back in.  In this case we would end up with string
symbols that extend past the end of the segment in which they live.

The problem comes from the fact that sections which are flagged as
string mergable assume that all strings are NULL terminated.  The
merging algorithm will drop trailing chars that follow a NULL since they
are essentially unreachable.  However, the "size" attribute (in the
symbol table) of such a truncated symbol is not updated resulting a
symbol size that can overlap the end of the segment.

I verified that this can happen in ELF too given the right conditions
and the its harmless enough.  In practice Strings that contain embedded
null should not be part of a mergable section.

Differential Revision: https://reviews.llvm.org/D102281

Added: 
    llvm/test/Object/wasm-bad-data-symbol.yaml

Modified: 
    lld/test/wasm/merge-string.s
    llvm/lib/Object/WasmObjectFile.cpp

Removed: 
    


################################################################################
diff  --git a/lld/test/wasm/merge-string.s b/lld/test/wasm/merge-string.s
index c422bc7740f87..a4b89abfc46d7 100644
--- a/lld/test/wasm/merge-string.s
+++ b/lld/test/wasm/merge-string.s
@@ -10,6 +10,10 @@
 // RUN: wasm-ld -O0 %t.o -o %t2.wasm --no-gc-sections --no-entry
 // RUN: obj2yaml %t2.wasm | FileCheck --check-prefixes=COMMON,NOMERGE %s
 
+// Check relocatable
+// RUN: wasm-ld -r %t.o -o %t2.o
+// RUN: obj2yaml %t2.o | FileCheck --check-prefixes=RELOC %s
+
         .section .rodata1,"S",@
         .asciz "abc"
 foo:
@@ -18,7 +22,7 @@ foo:
 bar:
         .asciz "bc"
         .asciz "bc"
-        .size bar, 4
+        .size bar, 6
 
         .section .rodata_relocs,"",@
 negative_addend:
@@ -74,3 +78,24 @@ negative_addend:
 //  COMMON-NEXT:          Value:           1024
 //   MERGE-NEXT:          Content:         '61626300'
 // NOMERGE-NEXT:          Content:         '6162630061626300626300'
+
+
+//      RELOC:  - Type:            DATA
+// RELOC-NEXT:    Relocations:
+// RELOC-NEXT:      - Type:            R_WASM_MEMORY_ADDR_I32
+// RELOC-NEXT:        Index:           0
+// RELOC-NEXT:        Offset:          0xF
+// RELOC-NEXT:        Addend:          -10
+// RELOC-NEXT:    Segments:
+// RELOC-NEXT:      - SectionOffset:   6
+// RELOC-NEXT:        InitFlags:       0
+// RELOC-NEXT:        Offset:
+// RELOC-NEXT:          Opcode:          I32_CONST
+// RELOC-NEXT:          Value:           0
+// RELOC-NEXT:        Content:         '61626300'
+// RELOC-NEXT:      - SectionOffset:   15
+// RELOC-NEXT:        InitFlags:       0
+// RELOC-NEXT:        Offset:
+// RELOC-NEXT:          Opcode:          I32_CONST
+// RELOC-NEXT:          Value:           4
+// RELOC-NEXT:        Content:         F6FFFFFF

diff  --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index df64a46dd8377..8004c5eb25a9b 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -637,9 +637,12 @@ Error WasmObjectFile::parseLinkingSectionSymtab(ReadContext &Ctx) {
                                                 object_error::parse_failed);
         auto Offset = readVaruint64(Ctx);
         auto Size = readVaruint64(Ctx);
-        if (Offset + Size > DataSegments[Index].Data.Content.size())
-          return make_error<GenericBinaryError>("invalid data symbol offset",
-                                                object_error::parse_failed);
+        size_t SegmentSize = DataSegments[Index].Data.Content.size();
+        if (Offset > SegmentSize)
+          return make_error<GenericBinaryError>(
+              "invalid data symbol offset: `" + Info.Name + "` (offset: " +
+                  Twine(Offset) + " segment size: " + Twine(SegmentSize) + ")",
+              object_error::parse_failed);
         Info.DataRef = wasm::WasmDataReference{Index, Offset, Size};
       }
       break;

diff  --git a/llvm/test/Object/wasm-bad-data-symbol.yaml b/llvm/test/Object/wasm-bad-data-symbol.yaml
new file mode 100644
index 0000000000000..a30b2fe04ac5c
--- /dev/null
+++ b/llvm/test/Object/wasm-bad-data-symbol.yaml
@@ -0,0 +1,31 @@
+# RUN: yaml2obj %s | not llvm-objdump -s - 2>&1 | FileCheck %s
+
+# Check that data symbols must have and offset that is within the
+# bounds of the containing segment
+
+# CHECK: invalid data symbol offset: `foo` (offset: 42 segment size: 5)
+
+--- !WASM
+FileHeader:
+  Version:         0x00000001
+Sections:
+  - Type:            DATA
+    Segments:
+      - SectionOffset:   0
+        InitFlags:       0
+        Offset:
+          Opcode:          I32_CONST
+          Value:           0
+        Content:         '6401020304'
+  - Type:            CUSTOM
+    Name:            linking
+    Version:         2
+    SymbolTable:
+      - Index:           0
+        Kind:            DATA
+        Name:            foo
+        Flags:           [ ]
+        Segment:         0
+        Offset:          42
+        Size:            1
+...


        


More information about the llvm-commits mailing list