r220493 - Add a "signature" to AST files to verify that they haven't changed

Chandler Carruth chandlerc at google.com
Mon Mar 23 22:40:49 PDT 2015


On Mon, Mar 23, 2015 at 10:37 PM, Ben Langmuir <blangmuir at apple.com> wrote:

> On Mar 23, 2015, at 10:21 PM, Chandler Carruth <chandlerc at google.com>
> wrote:
>
>
> On Mon, Mar 23, 2015 at 9:54 PM, Ben Langmuir <blangmuir at apple.com> wrote:
>
>> On Mar 23, 2015, at 5:38 PM, Chandler Carruth <chandlerc at google.com>
>> wrote:
>>
>> Digging this thread back up, I'm starting to work on deterministic output
>> of modules. This is *really* important. Unfortunately, it's almost
>> impossible to make progress as long as this code is actually firing because
>> it kind of makes things definitively non-deterministic. ;]
>>
>>
>> Ben, do you have any concerns about only emitting the signature record
>> for implicit modules? This would omit it for explicit modules (where I'm
>> working), PCH files, etc. Based on your commit message and explanation, it
>> doesn't really make sense outside of implicit modules AFAICT, and we're
>> hoping it will go away completely.
>>
>>
>> SGTM although I would prefer we continue to check the /expected/
>> signature when an AST file imports an implicit module (and in particular
>> when a PCH imports an implicit module).  Thanks for working on this!
>>
>
> I'm not planning to change the checking code path at all. All the tests
> pass if I just restrict the emission to be when generating implicit modules
>
>
> Perfect, that’s what I was hoping for.  Sorry I didn’t phrase it clearly.
>
> Perhaps the confusion is that we have two totally different
> 'implicit/explicit' things in Modules land. There are implicit vs. explicit
> submodules. That's not what I'm talking about. I'm talking about whether
> -fno-implicit-modules is passed. When the build system is produces modules
> files for us, it doesn't seem like we ever need to take responsibility for
> the signature and can just omit it.
>
> At least, all the regression tests pass when I make the above change, so
> even if it isn't the perfect state, I think it is a strict improvement. It
> at least allows me to test the removal of some other bit of non-determinism.
>
>
>>
>>
>>
>> Also, I have one question here which may just be my ignorance:
>>
>> On Thu, Oct 23, 2014 at 11:05 AM, Ben Langmuir <blangmuir at apple.com>
>> wrote:
>>
>>> +                            ASTFileSignature ExpectedSignature,
>>> +
>>> std::function<ASTFileSignature(llvm::BitstreamReader &)>
>>> +                                ReadSignature,
>>>
>>
>> Wow, a std::function? Is this really necessary? Seems super heavy weight.
>>
>>
>> Oops, totally unnecessary.  Strength-reduced to a function pointer
>> in r233050.
>>
>>
>>
>>>
>>> +static ASTFileSignature readASTFileSignature(llvm::BitstreamReader
>>> &StreamFile){
>>> +  BitstreamCursor Stream(StreamFile);
>>> +  if (Stream.Read(8) != 'C' ||
>>> +      Stream.Read(8) != 'P' ||
>>> +      Stream.Read(8) != 'C' ||
>>> +      Stream.Read(8) != 'H') {
>>> +    return 0;
>>> +  }
>>>
>>
>> This is truly magical. I assume this is checking for some magic string?
>> Comments or something would be really nice here. Sharing the code would be
>> even more nice.
>>
>>
>> Yeah, this is the AST/PCH file magic number.  Added
>> “startsWithASTFileMagic” to factor this out of the three places we check
>> this, also in r233050.
>>
>>
>>
>>> +
>>> +  // Scan for the CONTROL_BLOCK_ID block.
>>> +  if (SkipCursorToBlock(Stream, CONTROL_BLOCK_ID))
>>> +    return 0;
>>>
>>
>> Why would it ever be reasonable to call this for a module file without
>> such a block?
>>
>>
>> It’s not, but that’s exactly what returning 0 means - I added a doc
>> comment to that effect.  Or was there something else here?
>>
>
> If this is a programming invariant, it should be assert()-ed rather than
> returning something.
>
>
> It’s an error condition not an invariant, since we’re reading a file
> provided by the user.  The caller of this function produces a fatal error
> if it returns 0.  Admittedly we aren’t very paranoid about malformed AST
> files in general, but an error seems better when we can trivially provide
> one.
>

If we want to diagnose malformed AST files, we should be using a
diagnostic, not a return of zero.

The AST files are essentially programmatic interfaces between Clang and
itself. I'm not aware of any realistic handling of malformed inputs like
this. If we want to do this, we should do it correctly. If we're not
actually doing it, it seems quite misleading to return a bizarre error
number.

What prevents a random number of '0' from actually *being* the signature?

None of this seems really principled in its design, and as something that
is in trunk for many months now, I think its principles need to be shored
up.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20150323/aadfbeeb/attachment.html>


More information about the cfe-commits mailing list