[llvm-dev] Possible bug with struct types and linking

Tim Armstrong via llvm-dev llvm-dev at lists.llvm.org
Thu Mar 24 09:18:46 PDT 2016


Hi All,

  I ran into what I think is a linker bug around structural uniquing of
types when trying to make a long-overdue upgrade from LLVM 3.3 to LLVM
3.8.0. Verification can fail after linking together two well-formed modules
due to mismatched pointer types.

What happens is that both of the types in the source module get mapped to
the same type in the destination module. The old behaviour was to map each
type to the type in the destination with matching name and structure. This
means that the type signatures of functions called from the Dst module can
change, causing verification failures, e.g. if a call instruction
originally from Dst calls a function in Src with the wrong typed pointer.

Dst Module
========
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }

<forward declarations of IdentityTiny and IdentityBool>

<functions that call IdentityTiny and IdentityBool>

Src Module
========
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }

define i16 @IdentityTiny(%"struct.impala_udf::TinyIntVal"* nocapture
readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

define i16 @IdentityBool(%"struct.impala_udf::BooleanVal"* nocapture
readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::BooleanVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

Combined Module
==============
%"struct.impala_udf::TinyIntVal" = type { %"struct.impala_udf::AnyVal", i8 }
%"struct.impala_udf::AnyVal" = type { i8 }
%"struct.impala_udf::BooleanVal" = type { %"struct.impala_udf::AnyVal", i8 }

<functions that call IdentityTiny and IdentityBool with TinyIntVal* and
BooleanVal* args>

define i16 @IdentityTiny(%"struct.impala_udf::TinyIntVal"* nocapture
readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

; Function signature changed =>
define i16 @IdentityBool(%"struct.impala_udf::TinyIntVal"* nocapture
readonly dereferenceable(2) %arg) #0 {
  %1 = bitcast %"struct.impala_udf::TinyIntVal"* %arg to i16*
  %2 = load i16, i16* %1, align 1
  ret i16 %2
}

This seems like a bug to me, I wouldn't expect this scenario to result in
malformed modules but it's not really clear what the expected behaviour of
the linker is in situations like this.

It seems like this hasn't been hit by the LLVM linker since it starts with
an empty module and links in all non-empty modules, so all non-opaque
struct types in the destination module are already structurally unique. In
our case we are starting with a main module then adding in IRBuilder code
and code from other modules.

I have a fix and I'm happy to put together a patch with regression tests,
but I wanted to check that this would be considered useful, given that it
doesn't seem to be a common use-case for LLVM and the planned work on
typeless pointers will make it irrelevant.

- Tim
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160324/c91e020c/attachment.html>


More information about the llvm-dev mailing list