[PATCH] D120266: [clang][CodeGen] Avoid emitting ifuncs with undefined resolvers

Itay Bookstein via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 21 11:24:25 PST 2022


ibookstein created this revision.
ibookstein added reviewers: erichkeane, rsmith, MaskRay.
ibookstein requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

The purpose of this change is to fix the following codegen bug:

// main.c
__attribute__((cpu_specific(generic)))
int *foo(void) { static int z; return &z;}
int main() { return *foo() = 5; }

// other.c
__attribute__((cpu_dispatch(generic))) int *foo(void);

// run:
clang main.c other.c -o main; ./main

Before this change the resulting binary segfaults; after it, the
program returns the expected exit code 5.

The underlying cause is that when a translation unit contains
a cpu_specific function without the corresponding cpu_dispatch
(which is a valid use-case, they can be put into different
translation units), the generated code binds the reference to
foo() against a GlobalIFunc whose resolver is undefined. This
is invalid (the resolver must be defined in the same translation
unit as the ifunc), but historically the LLVM bitcode verifier
did not check that. The generated code then binds against the
resolver rather than the ifunc, so it ends up calling the
resolver rather than the resolvee. In the example above, main()
treats the resolver's return value (the address of the selected
implementation) as an int * and stores 5 through it, thereby
trying to write to program text.

The root issue at the representation level is that GlobalIFunc,
like GlobalAlias, does not support a "declaration" state. The
object which provides the correct semantics in these cases
is a Function declaration, but unlike Functions, changing a
declaration to a definition in the GlobalIFunc case constitutes
a change of the object type (as opposed to simply emitting code
into the function).

I think this limitation is unlikely to change, so I implemented
the fix by rewriting the generated IR to use a function
declaration instead of an ifunc if the resolver ends up undefined.
This uses takeName + replaceAllUsesWith, in a similar vein to
other places where the correct IR object type cannot be known
locally up front, like in CodeGenModule::EmitAliasDefinition.
In this case, we don't know whether the translation unit
will contain the cpu_dispatch when generating code for a reference
bound against a cpu_specific symbol.

It is also possible to generate the reference as a Function
declaration first, and 'upgrade' it to a GlobalIFunc once a
cpu_dispatch is encountered, which is somewhat more 'natural'.
That would involve a larger code change, though, so I wanted to
get feedback on the viability of the approach first.

Signed-off-by: Itay Bookstein <ibookstein at gmail.com>


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D120266

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/test/CodeGen/attr-cpuspecific.c

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D120266.410345.patch
Type: text/x-patch
Size: 5372 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20220221/ab48cc97/attachment.bin>