[LLVMdev] LLVM Language Reference Strictness

Wed Oct 19 20:58:42 PDT 2011

On Wed, Oct 19, 2011 at 8:20 PM, Shea Levy <shea at shealevy.com> wrote:
> Hello,
>
> I'd like to write a program that performs static analysis of code at the
> LLVM assembly/bitcode level, and to do so I plan on extensively
> referencing the language reference. As I hope to eventually use this
> tool as part of a security analysis of untrusted code, I need to be
> rather strict in my interpretation of the document. As such, I have some
> questions about how the implementers interpret the document (each
> question assumes we're considering a single fixed release version):
>
> 1. Is http://www.llvm.org/releases/<version>/docs/LangRef.html the most
> authoritative reference for a given version aside from the source code
> itself?

Yes.

> 2. Are target-specific behaviors documented for each supported target?

When anything has target-specific behavior, that fact should be
documented.  Beyond that, if you have a question about what some
construct is supposed to do, please ask.

> 3. Does undefined behavior semantically invalidate the entire program or
> is its unpredictable effect limited in scope somehow?

There is no limit to the scope of undefined behavior.

> 4. Are any behaviors undefined by virtue of not being specified in the
> reference, or are all scenarios that lead to undefined behavior
> explicitly identified as such?

We really want to explicitly identify them all in the reference; if
you have a question about some specific case, please ask.

> 5. Are there any language features with non-performance related semantic
> import (e.g. annotations, instructions, intrinsic functions, types, etc.)
> that are not specified by the reference but are nevertheless implemented
> in the build system?

You should be able to analyze the semantics of IR accurately based
purely on information encoded into the IR.  Every instruction, type,
attribute etc. should be documented in LangRef.  Platform-specific
intrinsics are not documented, but can generally be treated like a
call to an external function.
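
As a rough illustration of that last point, here is a minimal Python
sketch (not part of LLVM) of how an analyzer might sort callees in
textual IR.  The names classify_callee, classify_call and
TARGET_PREFIXES are hypothetical, the prefix list is an illustrative
and incomplete assumption, and the regular expression only handles
simple direct calls.

```python
import re

# Assumption: an illustrative, incomplete set of target name prefixes.
# A real tool would derive this from the LLVM release it analyzes.
TARGET_PREFIXES = ("x86", "arm", "ppc", "mips")

# Matches a direct call and captures the callee name after '@'.
# Simplification: quoted names and complex callee types are not handled.
CALL_RE = re.compile(r'\bcall\b[^@]*@([\w.$-]+)\s*\(')

def classify_callee(name):
    """Classify a callee by its name (without the leading '@')."""
    if not name.startswith("llvm."):
        return "function"
    parts = name.split(".")
    if len(parts) > 1 and parts[1] in TARGET_PREFIXES:
        # Platform-specific intrinsic: model conservatively, like a call
        # to an external function with unknown effects.
        return "opaque-external"
    # Otherwise assume a target-independent intrinsic covered by LangRef.
    return "langref-intrinsic"

def classify_call(line):
    """Return the classification of a direct call on this line, or None."""
    m = CALL_RE.search(line)
    if m is None:
        return None  # not a call, or an indirect call through a pointer
    return classify_callee(m.group(1))
```

For example, classify_call("call void @llvm.x86.sse2.pause()") falls in
the "opaque-external" bucket, while a call to @llvm.ctpop.i32 is treated
as a LangRef-documented intrinsic.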

> 6. Are all deviations from the reference, no matter how minor,
> considered bugs (either in the implementation or the spec)? If not, what
> deviations are considered acceptable?

If the reference doesn't describe the implementation accurately, we
consider it a bug.  Granted, some bugs are relatively low-priority.

> If so, is it expected that all
> such discovered and possibly corrected deviations will have associated
> bug reports, or might some be corrected in the development repository
> without documentation of the issue outside of a commit message? In other
> words, if I'm working with, say, llvm 2.9 and want to find all
> deviations known to upstream, can I just browse bug reports or will I
> have to go through commit logs as well?

LLVM Bugzilla doesn't contain an entry for every bug; to find every
fix, you'll have to go through commit logs.  Not sure what you're
trying to do here, though.

> These are the questions I have for now, but I may have more as I go
> along. Is this the appropriate place to ask this kind of thing?

Yes.

-Eli