[llvm-bugs] [Bug 43133] New: [WebAssembly] Pointer arithmetic on function pointers

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Aug 27 16:58:23 PDT 2019


https://bugs.llvm.org/show_bug.cgi?id=43133

            Bug ID: 43133
           Summary: [WebAssembly] Pointer arithmetic on function pointers
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: WebAssembly
          Assignee: unassignedbugs at nondot.org
          Reporter: dschuff at google.com
                CC: llvm-bugs at lists.llvm.org

Suppose there is some dodgy C code which casts a function pointer to a char
pointer, adds a value, and lets it escape:

void a();
void i(void *);

int main() {
  i((char *)a + 8);
  return 0;
}


The resulting IR is something like the following: 

define hidden i32 @main() local_unnamed_addr #0 {
entry:
  tail call void @i(i8* getelementptr inbounds (i8, i8* bitcast (void ()* @a to i8*), i32 8))
  ret i32 0
}

When selected with fast-isel, the result is of the form:


        i32.const       a
        i32.const       8
        i32.add 
        call    i

This "works", with the resulting argument to @i getting the value of a's table
index plus 8. The result isn't actually usable in C, of course; also I'm not
sure if LLVM lets you guarantee anything about the order of functions in the
table such that you could otherwise use "the next function in the table after
a", but at least the result makes sense.

However, when selected with DAG ISel, the result is:

        i32.const       a + 8
        call    i

i.e. the constant gets folded into the offset of the symbol operand of the
const. Lowering of that offset is not allowed for function symbol operands
(https://github.com/llvm/llvm-project/blob/93a26ec98d345ccbad5e57e72e213d29cf8efaf1/llvm/lib/Target/WebAssembly/WebAssemblyMCInstLower.cpp#L152).
If it were allowed, the natural result would be an R_WASM_TABLE_INDEX_SLEB
relocation with an addend, which would work just like memory address
relocations with addends.

Unfortunately R_WASM_TABLE_INDEX_SLEB relocations do not have addends.
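
To make the difference concrete, here is a minimal C model of how a linker
might resolve the two kinds of relocation. It is only a sketch: the enum,
struct and function names are hypothetical and are not lld or LLVM data
structures. The only thing taken from the report is that memory address
relocations carry an addend and R_WASM_TABLE_INDEX_SLEB does not.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of two relocation kinds; not lld's data structures. */
enum reloc_kind { MEMORY_ADDR, TABLE_INDEX };

struct reloc {
    enum reloc_kind kind;
    uint32_t symbol_value; /* resolved data address or table index */
    int32_t addend;        /* only representable for MEMORY_ADDR */
};

/* Value the linker would write into the i32.const immediate. */
uint32_t resolve(const struct reloc *r) {
    if (r->kind == MEMORY_ADDR)
        return r->symbol_value + (uint32_t)r->addend;
    /* A table index relocation has no addend field, so a folded
       "a + 8" operand has nowhere to store the 8. */
    assert(r->addend == 0);
    return r->symbol_value;
}
```

In this model a data symbol resolved to 1024 with addend 8 yields 1032, while
a function symbol can only ever yield its bare table index.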

Option 1 would be to say "we don't care because this kind of pointer arithmetic
makes no sense on wasm because Harvard Architecture" (assuming it's actually
true that you can't control the layout of the indirect function table). That
seems kind of bad because it creates an inconsistency between the two ways to
codegen that IR. Of course we could also try harder to reject this kind of code
entirely, but I think it might actually be nice if we could allow some kind of
control over table layout, in which case we'd actually want to keep this
working. Also, there is real-world code that does this (whence this bug in the
first place), and as long as you don't actually try to *use* the result on
wasm, maybe we shouldn't unnecessarily break your build (especially in
hard-to-debug ways).

Option 2 would be to modify DAG codegen to not produce function operands with
offsets. I haven't looked into that yet but it might be OK if it doesn't create
too much complexity and we can make the limitation only apply to function
symbols (we do want it for data symbols).
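
As a rough illustration of what Option 2 would mean for the output, the toy
function below prints a "symbol + offset" operand as text. It keeps the offset
folded for data symbols but emits an explicit add for function symbols,
matching the fast-isel sequence shown earlier. It is a sketch only:
lower_symbol_operand and its parameters are made up for this example and do not
correspond to anything in the LLVM DAG ISel code.

```c
#include <stdio.h>

/* Toy lowering of "symbol + offset" to text; hypothetical, not LLVM code.
   Returns the number of instructions written into out. */
int lower_symbol_operand(const char *sym, int is_function, int offset,
                         char *out, size_t out_size) {
    if (offset == 0) {
        snprintf(out, out_size, "i32.const %s\n", sym);
        return 1;
    }
    if (is_function) {
        /* Keep the offset out of the symbol operand: explicit add instead. */
        snprintf(out, out_size, "i32.const %s\ni32.const %d\ni32.add\n",
                 sym, offset);
        return 3;
    }
    /* Data symbols may still fold the offset into the operand. */
    snprintf(out, out_size, "i32.const %s + %d\n", sym, offset);
    return 1;
}
```

Calling it with ("a", 1, 8) produces the three-instruction const/const/add
form, while a data symbol with the same offset stays a single i32.const with a
folded offset.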

Option 3 would be to add another relocation type, equivalent to
R_WASM_TABLE_INDEX_SLEB but with an addend. That would be quite straightforward
to implement but it seems slightly silly to spec yet another reloc type for
such a niche use.
