r195877 - strip UTF-8 BOM in -frewrite-includes (PR#15664)
alp at nuanti.com
Wed Nov 27 13:51:45 PST 2013
On 27/11/2013 21:14, Lubos Lunak wrote:
> Author: llunak
> Date: Wed Nov 27 15:14:43 2013
> New Revision: 195877
> URL: http://llvm.org/viewvc/llvm-project?rev=195877&view=rev
> strip UTF-8 BOM in -frewrite-includes (PR#15664)
> Modified: cfe/trunk/lib/Rewrite/Frontend/InclusionRewriter.cpp
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Rewrite/Frontend/InclusionRewriter.cpp?rev=195877&r1=195876&r2=195877&view=diff
> --- cfe/trunk/lib/Rewrite/Frontend/InclusionRewriter.cpp (original)
> +++ cfe/trunk/lib/Rewrite/Frontend/InclusionRewriter.cpp Wed Nov 27 15:14:43 2013
> @@ -367,6 +367,11 @@ bool InclusionRewriter::Process(FileID F
> unsigned NextToWrite = 0;
> int Line = 1; // The current input file line number.
> + // Ignore UTF-8 BOM, otherwise it'd end up somewhere else than the start
> + // of the resulting file.
> + if (FromFile.getBuffer().startswith("\xEF\xBB\xBF"))
> + NextToWrite = 3;
> Token RawToken;
> Added: cfe/trunk/test/Frontend/Inputs/rewrite-includes-bom.h
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Frontend/Inputs/rewrite-includes-bom.h?rev=195877&view=auto
> --- cfe/trunk/test/Frontend/Inputs/rewrite-includes-bom.h (added)
> +++ cfe/trunk/test/Frontend/Inputs/rewrite-includes-bom.h Wed Nov 27 15:14:43 2013
> @@ -0,0 +1 @@
> +ï»¿// This file starts with UTF-8 BOM marker.
> Added: cfe/trunk/test/Frontend/rewrite-includes-bom.c
> URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Frontend/rewrite-includes-bom.c?rev=195877&view=auto
> --- cfe/trunk/test/Frontend/rewrite-includes-bom.c (added)
> +++ cfe/trunk/test/Frontend/rewrite-includes-bom.c Wed Nov 27 15:14:43 2013
> @@ -0,0 +1,4 @@
> +// RUN: %clang -E -frewrite-includes -I %S/Inputs %s -o - | %clang -fsyntax-only -Xclang -verify -x c -
> +// expected-no-diagnostics
> +#include "rewrite-includes-bom.h"
I don't think it's reasonable to check in this test in its current
state, given that it passes with or without the BOM. If the invisible
BOM happens to get removed by automatic encoding transformations in VCS
or accidental code cleanup, it would give a false sense that the feature
is working when it could in fact have regressed.
Leaving the feature untested until you get round to writing a better
test, perhaps checking output for presence of BOM with something like
grep, is preferable for that reason.
> cfe-commits mailing list
> cfe-commits at cs.uiuc.edu
the browser experts
More information about the cfe-commits