[cfe-dev] Function pointer type becomes empty struct
David Blaikie via cfe-dev
cfe-dev at lists.llvm.org
Sat Nov 19 09:40:02 PST 2016
On Sat, Nov 19, 2016 at 2:20 AM Christian Dehnert via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> Hi David,
>
> On 17 Nov 2016, at 23:55, David Wiberg <dwiberg at gmail.com> wrote:
>
> Hi Chris,
>
> 2016-11-13 12:49 GMT+01:00 Christian Dehnert via cfe-dev
> <cfe-dev at lists.llvm.org>:
>
> […]
>
>
> TL;DR - Clang tries to determine LLVM IR types related to the first
> function but bails out due to the recursive nature of the type
> hierarchy.
>
> I thought this was an interesting question so I spent some time trying
> to understand what happens. I can't say whether the behavior is correct,
> though. Note that I haven't looked at this code before, so please
> correct me if I've misunderstood something.
>
> - Clang handles top level declarations first. In your example the
> first one is the function "h". To be able to emit LLVM IR, information
> about the function is collected. One thing done is to iterate over the
> function arguments (to determine IR types?). The first interesting
> argument is the pointer to struct A.
> - To compute the layout of the struct, there's an iteration over the
> struct fields which eventually triggers an attempt to get the LLVM
> type for each field.
> - The first field is a function pointer and an attempt is made to
> gather information regarding the function. But since this function is
> the same as the one we are currently processing, this attempt is
> aborted (to avoid recursion?) and an empty struct is created as type
> instead. This is the unexpected type you see in your output.
> - The second field is also a function pointer but the main
> difference is that this function type isn't currently being processed
> which means that it can be handled. (The handling actually leads into
> the same call stack as for the previous function. But in this case,
> once the pointer to struct A argument is found, it is determined that
> the struct is currently being processed and further handling is
> deferred.) Since it was possible to handle the function, the expected
> type is returned.
>
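As a rough illustration of the steps in the list above, here is a toy Python model of the "currently being processed" bookkeeping. It is not Clang's code: the names (`ToyConverter`, `struct_a_body`), the two in-progress sets, and the C input shown in the comment are all assumptions made for this sketch, since the original example is elided in this thread. It only mimics the described behavior: a function type that is already being converted is replaced by the empty struct `{}`, while a struct that is already being laid out is referred to by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:
    """Toy C function type: ret (params...)."""
    ret: object
    params: tuple


@dataclass(frozen=True)
class Ptr:
    """Toy pointer to another toy type."""
    pointee: object


@dataclass(frozen=True)
class StructRef:
    """Toy reference to a named struct such as `struct A`."""
    name: str


class ToyConverter:
    """Hypothetical model of the recursion guard described above."""

    def __init__(self, struct_fields):
        self.struct_fields = struct_fields  # struct name -> tuple of field types
        self.funcs_in_progress = set()      # function types being converted
        self.structs_in_progress = set()    # structs being laid out
        self.struct_bodies = {}             # struct name -> rendered body

    def convert(self, ty):
        if isinstance(ty, str):             # scalar such as "i32" or "void"
            return ty
        if isinstance(ty, Ptr):
            return self.convert(ty.pointee) + "*"
        if isinstance(ty, StructRef):
            self.layout_struct(ty.name)
            return "%struct." + ty.name
        if isinstance(ty, Func):
            if ty in self.funcs_in_progress:
                return "{}"                 # bail out: empty-struct placeholder
            self.funcs_in_progress.add(ty)
            params = ", ".join(self.convert(p) for p in ty.params)
            ret = self.convert(ty.ret)
            self.funcs_in_progress.discard(ty)
            return f"{ret} ({params})"
        raise TypeError(f"unknown toy type: {ty!r}")

    def layout_struct(self, name):
        # Already laid out, or layout in progress: just refer to it by name.
        if name in self.struct_bodies or name in self.structs_in_progress:
            return
        self.structs_in_progress.add(name)
        fields = [self.convert(f) for f in self.struct_fields[name]]
        self.structs_in_progress.discard(name)
        self.struct_bodies[name] = "{ " + ", ".join(fields) + " }"


# Hypothetical input (an assumption, not the example from the thread):
#   struct A { void (*first)(struct A *); int (*second)(struct A *); };
#   void h(struct A *a);
A_PTR = Ptr(StructRef("A"))
H_TYPE = Func("void", (A_PTR,))
G_TYPE = Func("i32", (A_PTR,))
FIELDS = {"A": (Ptr(H_TYPE), Ptr(G_TYPE))}


def struct_a_body(first_function):
    """Body of struct A after converting `first_function` first."""
    conv = ToyConverter(FIELDS)
    conv.convert(first_function)
    return conv.struct_bodies["A"]


if __name__ == "__main__":
    print(struct_a_body(H_TYPE))
    print(struct_a_body(Func("void", (A_PTR, "i32"))))
```

With `h` converted first, the toy model yields `{ {}*, i32 (%struct.A*)* }` for the struct body. Starting instead from a function that takes `struct A *` but whose own type is not one of the field types yields `{ void (%struct.A*)*, i32 (%struct.A*)* }`, which mirrors the ordering dependence discussed in this thread.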
>
> Thanks for the in-depth analysis of what is happening here! To me it is a
> bit strange that what I get depends on the ordering of functions in the
> input file even though this ordering is irrelevant at the C level, but I
> can roughly follow what’s happening internally here.
>
> […]
>
>
> If you insert a function first which doesn't use struct A there's no
> need to determine the types for the struct and the behavior stays the
> same.
>
> […]
>
>
> If you insert a function which uses struct A but doesn't cause the
> recursive behavior when looking up the types of struct A you get the
> wanted behavior.
>
> One possible way of fixing this (if it is considered an issue) might
> be to defer the handling of structs if they contain fields with
> function pointers. However, I'm not sure whether this is correct or
> whether it has unwanted side effects.
>
>
> We’re just using LLVM-IR emitted by clang in a stand-alone academic tool
> and needed to understand what our parser did not like about the IR in this
> case, so I guess the thing we need to fix is probably our parser (and our
> current view on LLVM-IR).
>
> Putting aside the (at least to me) surprising behavior of the IR
> generation, I was told in the LLVM IRC channel that the IR is currently
> undergoing a transition that removes pointee types from pointers where
> they are not needed, in favor of opaque pointers:
>
> https://www.youtube.com/watch?v=OWgWDx_gB1I
>
> Therefore, if I understand correctly, we can no longer expect the pointee
> types to appear in the struct; the type information is instead carried by
> the instructions that use the pointers.
>
Right - in the long run you'll just have a struct with two opaque pointers.
If you're relying on the type a pointer points to then, as you've seen,
you're already in a bit of trouble today - and that is the issue. Pointee
types aren't meaningful in the IR: they don't guarantee anything, pointers
are cast back and forth to other types all the time, and optimizations
should ignore a pointer's pointee type so that they aren't pessimized when
it isn't the type they'd like. So by removing pointee types we ensure these
sorts of bugs can no longer be written, because the type won't be there to
act as an unstable/incomplete crutch.
I know I need to get back to this work (and/or recruit some help) to finish
off the transition & save people a bunch of confusion :)
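
To make the "struct with two opaque pointers" point concrete, here is a small Python sketch that erases pointee types from a struct body written as text. It is a plain string rewrite, not an LLVM API: the helper names (`split_fields`, `to_opaque`) and the two typed struct bodies are hypothetical, and `ptr` is the spelling that current LLVM releases use for an opaque pointer.

```python
def split_fields(body):
    """Split a toy struct body like '{ a, b }' on its top-level commas."""
    inner = body.strip()[1:-1]          # drop the outer braces
    fields, depth, current = [], 0, ""
    for ch in inner:
        if ch in "({":
            depth += 1
        elif ch in ")}":
            depth -= 1
        if ch == "," and depth == 0:    # a comma outside any nested type
            fields.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        fields.append(current.strip())
    return fields


def to_opaque(body):
    """Replace every pointer-typed field with the opaque `ptr` type."""
    fields = ["ptr" if f.endswith("*") else f for f in split_fields(body)]
    return "{ " + ", ".join(fields) + " }"


if __name__ == "__main__":
    # Two hypothetical typed-pointer renderings of the same C struct.
    print(to_opaque("{ {}*, i32 (%struct.A*)* }"))
    print(to_opaque("{ void (%struct.A*)*, i32 (%struct.A*)* }"))
```

Both hypothetical typed bodies collapse to `{ ptr, ptr }`, so a consumer of the IR has no pointee type left to depend on, whichever form the front end would otherwise have emitted.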
- Dave
>
> Thanks again for your effort here!
>
> Best wishes,
> Chris
>