[cfe-dev] Cannot parse the Linux kernel

Fri Jun 12 07:20:41 PDT 2009

Eli Friedman wrote:
> On Thu, Jun 11, 2009 at 1:49 AM, Roberto Bagnara<bagnara at cs.unipr.it> wrote:
>> Hi there,
>>
>> there are heavily used versions of the Linux kernel that
>> cannot be parsed by clang due to the following bugs:
>>
>> http://llvm.org/bugs/show_bug.cgi?id=4236
>> http://llvm.org/bugs/show_bug.cgi?id=3429
>>
>> Are there plans to fix them?  Note that changing the
>> kernel code is, unfortunately, not an option.
> 
> We definitely intend to implement __label__ support relatively soon;
> it's just a matter of someone writing the patch.  (It's non-trivial,
> but it shouldn't be particularly hard.)

Great!
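
For readers who have not met the extension: __label__ is the gcc
"local label" declaration.  It makes a label private to the enclosing
block, so a macro containing a goto can be expanded more than once in
the same function.  A minimal sketch follows.  The names FIND_INDEX and
both_present are hypothetical, made up for illustration; they are not
taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* Evaluates to the index of `target` in `arr`, or -1 if it is absent.
   `done` is declared with __label__, so it is local to the statement
   expression and every expansion of the macro gets its own label.
   Both __label__ and the ({ ... }) statement expression are gcc
   extensions, not ISO C. */
#define FIND_INDEX(arr, len, target)                 \
    ({                                               \
        __label__ done;                              \
        int result_ = -1;                            \
        for (size_t i_ = 0; i_ < (len); ++i_) {      \
            if ((arr)[i_] == (target)) {             \
                result_ = (int)i_;                   \
                goto done;                           \
            }                                        \
        }                                            \
    done:;                                           \
        result_;                                     \
    })

/* Hypothetical helper: returns 1 if both `a` and `b` occur in `arr`.
   The macro is expanded twice in one function on purpose. */
int both_present(const int *arr, size_t len, int a, int b)
{
    assert(arr != NULL);
    int index_a = FIND_INDEX(arr, len, a);
    int index_b = FIND_INDEX(arr, len, b);
    return index_a >= 0 && index_b >= 0;
}
```

Without the __label__ declaration, the second expansion would define
`done` again in the same function, which is an error because ordinary
labels have function scope.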

> As for PR4236, I'm not really sure about the best way to go about
> fixing it, or whether we even want to.  It'd require some
> platform-specific code to figure out how to translate situations like
> that into LLVM inline asm in a gcc-compatible way, and we don't really
> want to add weird hacks for rare edge cases.  If you don't care about
> CodeGen, though, you can just comment out the relevant error-checking
> code in SemaStmt.cpp; the resulting AST will still be well-formed.

Thanks a lot Eli: this gives us a very useful workaround.

> On a side note, what are you trying to do that requiring a one-line
> change somehow breaks it?  If you don't want to touch your kernel
> sources, you could always write a wrapper around clang that patches
> the file in question, then pipes it into clang.

We are writing a program analyzer that should be able to analyze
existing code in widespread use as it is, without having to patch
it, no matter how insignificant the patch is.

This should be compatible with the design goals of clang. Of course, if
specific gcc constructs that are not implementable in clang are
encountered, clang will have to give up.  But whenever a fatal failure
is avoidable, it should be avoided.  This is, IMHO, a key point for the
success of clang and for the success of every project that uses it.

In this specific case, we think that simply converting the fatal failure
into a warning and matching the semantics of gcc is the best option.

Of course we can maintain a patch for clang doing that, but we think
that the best thing for clang itself is to follow the motto "never
die for source code quality nitpicking if gcc can parse it and that
source is in widespread use."  There is no point in risking that
potential users give up on clang (and on our analyzer) because "out of
the box it doesn't compile my kernel and who knows how many other
sources need to be patched in obscure ways."

All the best,

    Roberto

-- 
Prof. Roberto Bagnara
Computer Science Group
Department of Mathematics, University of Parma, Italy
http://www.cs.unipr.it/~bagnara/
mailto:bagnara at cs.unipr.it