<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Sorry to spam, but I sent this out a few days ago and got no response, so I'm wondering whether it was filtered out because of the attachment.</div><div><br></div>This is my first contribution to this project, so please let me know if there's anything I can do to get this patch into better shape before it is committed. The aim of this patch is to remove an obstacle to using clang as a drop-in replacement for Microsoft's C++ compiler. The behavior is summarized below and is also described in bug 11789: <a href="http://llvm.org/bugs/show_bug.cgi?id=11789">http://llvm.org/bugs/show_bug.cgi?id=11789</a><div><br></div><div>This patch adds a feature, enabled only when "-fms-extensions" is set, that fixes errors when parsing Microsoft's standard library implementation. In Microsoft's &lt;locale&gt; header, some debugging code takes the __FUNCTION__ predefined expression and turns it into a wide string literal. It does something like this:</div><div><br></div><div>#define _STR2WSTR(s) L##s</div><div>#define STR2WSTR(s) _STR2WSTR(s)</div><div>#define __FUNCTIONW__ STR2WSTR(__FUNCTION__)</div><div><br></div><div>This doesn't work in clang because __FUNCTION__ isn't a macro; it's a predefined expression, so the token-paste operator just produces the identifier L__FUNCTION__, as the standard says it should. Microsoft's compiler has an undocumented extension that makes this work, so that __FUNCTION__ appears to be a macro even though it is not: </div><div>1. The preprocessor special-cases pasting the token 'L' with __FUNCTION__ when __FUNCTION__ came from a macro expansion, producing __LPREFIX( __FUNCTION__). This can be seen in the output of VC's preprocessor. 
By "came from a macro expansion" I mean that a direct _STR2WSTR(__FUNCTION__) gets no special treatment, because __FUNCTION__ is then pasted without being expanded first. In other words, the preprocessor pretends that __FUNCTION__ is a macro that expands to a string, even though it is not.</div><div>2. The frontend parses __LPREFIX specially: its argument must be a string literal or a predefined expression, encoded as UTF-8. The result of the __LPREFIX expression is the same string as a wide string literal.</div><div><br></div><div>This patch implements both of these phases and allows Microsoft's &lt;locale&gt; header to be parsed successfully. Please let me know if there's anything I can do to package it better. I developed it against the 3.1 release tag, but it also applies cleanly to top-of-tree clang.</div><div><br></div><div>Best regards,</div><div>Aaron</div><div><br></div><div><div>Index: test/Sema/ms_wide_predefined_expr.cpp</div><div>===================================================================</div><div>--- test/Sema/ms_wide_predefined_expr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>+++ test/Sema/ms_wide_predefined_expr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>@@ -0,0 +1,10 @@</div><div>+// RUN: %clang_cc1 %s -fsyntax-only -Wno-unused-value -Wmicrosoft -verify -fms-extensions</div><div>+</div><div>+// Wide character predefined identifiers</div><div>+#define _STR2WSTR(str) L##str</div><div>+#define STR2WSTR(str) _STR2WSTR(str)</div><div>+void abcdefghi12(void) {</div><div>+ const wchar_t (*ss)[12] = &STR2WSTR(__FUNCTION__);</div><div>+ static int arr[sizeof(STR2WSTR(__FUNCTION__))==12*sizeof(wchar_t) ? 
1 : -1];</div><div>+}</div><div>+</div><div>Index: test/Sema/ms_lprefix.cpp</div><div>===================================================================</div><div>--- test/Sema/ms_lprefix.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>+++ test/Sema/ms_lprefix.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>@@ -0,0 +1,7 @@</div><div>+// RUN: %clang_cc1 %s -fsyntax-only -Wno-unused-value -Wmicrosoft -verify -fms-extensions</div><div>+</div><div>+void a(void) {</div><div>+ const wchar_t (*ss)[6] = &__LPREFIX("hello");</div><div>+ static int arr[sizeof(__LPREFIX("hello"))==6*sizeof(wchar_t) ? 1 : -1];</div><div>+}</div><div>+</div><div>Index: test/Preprocessor/macro_paste_msextensions.cpp</div><div>===================================================================</div><div>--- test/Preprocessor/macro_paste_msextensions.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>+++ test/Preprocessor/macro_paste_msextensions.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>@@ -0,0 +1,29 @@</div><div>+// RUN: %clang_cc1 -P -E -fms-extensions %s | FileCheck -strict-whitespace %s</div><div>+</div><div>+#define _STR2WSTR(str) L##str</div><div>+#define STR2WSTR(str) _STR2WSTR(str)</div><div>+</div><div>+#define _ENDTEST(str1, str2) L##str1##str2</div><div>+#define ENDTEST(str1, str2) _ENDTEST(str1, str2)</div><div>+</div><div>+void fun() {</div><div>+// Special token pasting for __FUNCTION__ to make it seem like it's a macro</div><div>+// rather than a predefined expr</div><div>+STR2WSTR(__FUNCTION__)</div><div>+// CHECK: __LPREFIX( __FUNCTION__)</div><div>+</div><div>+// However, this rule is only applied if __FUNCTION__ would have been expanded</div><div>+// if it were a true macro.</div><div>+_STR2WSTR(__FUNCTION__)</div><div>+// CHECK: L__FUNCTION__</div><div>+</div><div>+// Make sure it's not applied for regular token 
pasting</div><div>+#define NOTFUNCTION NOREALLY</div><div>+STR2WSTR(NOTFUNCTION)</div><div>+// CHECK: LNOREALLY</div><div>+</div><div>+// Make sure token pasting with three arguments works with the special</div><div>+// __FUNCTION__ token pasting rule.</div><div>+ENDTEST(__FUNCTION__, JUNCTION)</div><div>+// CHECK: __LPREFIX( __FUNCTION__) JUNCTION</div><div>+}</div><div>Index: test/CodeGenCXX/ms_wide_predefined_expr.cpp</div><div>===================================================================</div><div>--- test/CodeGenCXX/ms_wide_predefined_expr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>+++ test/CodeGenCXX/ms_wide_predefined_expr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 0)</div><div>@@ -0,0 +1,20 @@</div><div>+// RUN: %clang_cc1 %s -fms-extensions -emit-llvm -o - | FileCheck %s</div><div>+</div><div>+// CHECK: @__FUNCTION__._Z4funcv.WChar = private constant [5 x i32] [i32 102, i32 117, i32 110, i32 99, i32 0], align 4</div><div>+</div><div>+void wprint(const wchar_t*);</div><div>+</div><div>+#define __STR2WSTR(str) L##str</div><div>+#define _STR2WSTR(str) __STR2WSTR(str)</div><div>+#define STR2WSTR(str) _STR2WSTR(str)</div><div>+</div><div>+void func() {</div><div>+ wprint(STR2WSTR(__FUNCTION__));</div><div>+}</div><div>+</div><div>+int main() {</div><div>+ func();</div><div>+</div><div>+ return 0;</div><div>+}</div><div>+</div><div>Index: include/clang/Basic/TokenKinds.def</div><div>===================================================================</div><div>--- include/clang/Basic/TokenKinds.def<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Basic/TokenKinds.def<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -421,6 +421,7 @@</div><div> KEYWORD(__thiscall , KEYALL)</div><div> KEYWORD(__forceinline , KEYALL)</div><div> KEYWORD(__unaligned , KEYMS)</div><div>+KEYWORD(__LPREFIX , 
KEYMS)</div><div> </div><div> // OpenCL-specific keywords</div><div> KEYWORD(__kernel , KEYOPENCL)</div><div>Index: include/clang/Basic/DiagnosticParseKinds.td</div><div>===================================================================</div><div>--- include/clang/Basic/DiagnosticParseKinds.td<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Basic/DiagnosticParseKinds.td<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -705,6 +705,12 @@</div><div> </div><div> def err_seh___finally_block : Error<</div><div> "%0 only allowed in __finally block">;</div><div>+</div><div>+def err_ms_extensions_expected_for_lprefix : Error<</div><div>+ "Expected MS extensions to be enabled for '__LPREFIX'">;</div><div>+</div><div>+def err_lprefix_expected_string_literal : Error<</div><div>+ "Expected string literal as argument to '__LPREFIX'">;</div><div> </div><div> } // end of Parse Issue category.</div><div> </div><div>Index: include/clang/Sema/Sema.h</div><div>===================================================================</div><div>--- include/clang/Sema/Sema.h<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Sema/Sema.h<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -2620,7 +2620,8 @@</div><div> SourceLocation LitEndLoc,</div><div> TemplateArgumentListInfo *ExplicitTemplateArgs = 0);</div><div> </div><div>- ExprResult ActOnPredefinedExpr(SourceLocation Loc, tok::TokenKind Kind);</div><div>+ ExprResult ActOnPredefinedExpr(SourceLocation Loc, tok::TokenKind Kind,</div><div>+ CanQualType *ForceTy = 0);</div><div> ExprResult ActOnIntegerConstant(SourceLocation Loc, uint64_t Val);</div><div> ExprResult ActOnNumericConstant(const Token &Tok, Scope *UDLScope = 0);</div><div> ExprResult ActOnCharacterConstant(const Token &Tok, Scope *UDLScope = 0);</div><div>Index: 
include/clang/Lex/TokenLexer.h</div><div>===================================================================</div><div>--- include/clang/Lex/TokenLexer.h<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Lex/TokenLexer.h<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -91,6 +91,12 @@</div><div> /// should not be subject to further macro expansion.</div><div> bool DisableMacroExpansion : 1;</div><div> </div><div>+ /// MSLPrefix - This is true when we are outputting __LPREFIX( __FUNCTION__)</div><div>+ /// for Microsoft compatibility mode</div><div>+ bool MSLPrefix : 1;</div><div>+ unsigned MSLPrefixState;</div><div>+ SourceRange MSLPrefixRange;</div><div>+</div><div> TokenLexer(const TokenLexer&); // DO NOT IMPLEMENT</div><div> void operator=(const TokenLexer&); // DO NOT IMPLEMENT</div><div> public:</div><div>@@ -168,6 +174,12 @@</div><div> /// first token on the next line.</div><div> void HandleMicrosoftCommentPaste(Token &Tok);</div><div> </div><div>+ /// HandleMicrosoftLPrefix - In Microsoft compatibility mode, L##__FUNCTION__</div><div>+ /// pastes to __LPREFIX( __FUNCTION__). This means it turns into multiple</div><div>+ /// tokens. When MSLPrefix is true, we output this stream of tokens. 
If</div><div>+ /// this returns true, the caller should immediately return the token.</div><div>+ bool HandleMicrosoftLPrefix(Token &Tok);</div><div>+</div><div> /// \brief If \arg loc is a FileID and points inside the current macro</div><div> /// definition, returns the appropriate source location pointing at the</div><div> /// macro expansion source location entry.</div><div>Index: include/clang/Lex/Token.h</div><div>===================================================================</div><div>--- include/clang/Lex/Token.h<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Lex/Token.h<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -76,7 +76,8 @@</div><div> DisableExpand = 0x04, // This identifier may never be macro expanded.</div><div> NeedsCleaning = 0x08, // Contained an escaped newline or trigraph.</div><div> LeadingEmptyMacro = 0x10, // Empty macro exists before this token.</div><div>- HasUDSuffix = 0x20 // This string or character literal has a ud-suffix.</div><div>+ HasUDSuffix = 0x20, // This string or character literal has a ud-suffix.</div><div>+ ExpandedFakeMacro = 0x40 // This is a fake macro that has been expanded</div><div> };</div><div> </div><div> tok::TokenKind getKind() const { return (tok::TokenKind)Kind; }</div><div>@@ -267,6 +268,13 @@</div><div> /// \brief Return true if this token is a string or character literal which</div><div> /// has a ud-suffix.</div><div> bool hasUDSuffix() const { return (Flags & HasUDSuffix) ? true : false; }</div><div>+</div><div>+ /// \brief Returns true if this token is an identifier representing</div><div>+ /// a fake macro like __FUNCTION__, that would have been expanded if</div><div>+ /// it were a real macro like __FILE__</div><div>+ bool isExpandedFakeMacro() const {</div><div>+ return (Flags & ExpandedFakeMacro) ? 
true : false;</div><div>+ }</div><div> };</div><div> </div><div> /// PPConditionalInfo - Information about the conditional stack (#if directives)</div><div>Index: include/clang/Parse/Parser.h</div><div>===================================================================</div><div>--- include/clang/Parse/Parser.h<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ include/clang/Parse/Parser.h<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -1376,6 +1376,8 @@</div><div> </div><div> ExprResult ParseObjCBoolLiteral();</div><div> </div><div>+ ExprResult ParseLPrefixExpression();</div><div>+</div><div> //===--------------------------------------------------------------------===//</div><div> // C++ Expressions</div><div> ExprResult ParseCXXIdExpression(bool isAddressOfOperand = false);</div><div>Index: lib/Sema/SemaExpr.cpp</div><div>===================================================================</div><div>--- lib/Sema/SemaExpr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ lib/Sema/SemaExpr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -2374,7 +2374,8 @@</div><div> }</div><div> }</div><div> </div><div>-ExprResult Sema::ActOnPredefinedExpr(SourceLocation Loc, tok::TokenKind Kind) {</div><div>+ExprResult Sema::ActOnPredefinedExpr(SourceLocation Loc, tok::TokenKind Kind,</div><div>+ CanQualType *ForceTy) {</div><div> PredefinedExpr::IdentType IT;</div><div> </div><div> switch (Kind) {</div><div>@@ -2402,7 +2403,8 @@</div><div> unsigned Length = PredefinedExpr::ComputeName(IT, currentDecl).length();</div><div> </div><div> llvm::APInt LengthI(32, Length + 1);</div><div>- ResTy = Context.CharTy.withConst();</div><div>+ if (ForceTy) ResTy = ForceTy->withConst();</div><div>+ else ResTy = Context.CharTy.withConst();</div><div> ResTy = Context.getConstantArrayType(ResTy, LengthI, ArrayType::Normal, 
0);</div><div> }</div><div> return Owned(new (Context) PredefinedExpr(Loc, ResTy, IT));</div><div>Index: lib/Lex/TokenLexer.cpp</div><div>===================================================================</div><div>--- lib/Lex/TokenLexer.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ lib/Lex/TokenLexer.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -32,6 +32,9 @@</div><div> ActualArgs = Actuals;</div><div> CurToken = 0;</div><div> </div><div>+ MSLPrefix = false;</div><div>+ MSLPrefixState = 0;</div><div>+</div><div> ExpandLocStart = Tok.getLocation();</div><div> ExpandLocEnd = ELEnd;</div><div> AtStartOfLine = Tok.isAtStartOfLine();</div><div>@@ -91,6 +94,8 @@</div><div> DisableMacroExpansion = disableMacroExpansion;</div><div> NumTokens = NumToks;</div><div> CurToken = 0;</div><div>+ MSLPrefix = false;</div><div>+ MSLPrefixState = 0;</div><div> ExpandLocStart = ExpandLocEnd = SourceLocation();</div><div> AtStartOfLine = false;</div><div> HasLeadingSpace = false;</div><div>@@ -354,9 +359,32 @@</div><div> }</div><div> }</div><div> </div><div>+namespace {</div><div>+ /// Is the given token one that is treated as a "fake" macro, like</div><div>+ /// __FUNCTION__. A fake macro isn't really a macro; it's an identifier</div><div>+ /// that is expanded into a string literal during compilation, where</div><div>+ /// special rules are obeyed in the preprocessor to make it act a little</div><div>+ /// more like a macro. 
This is used to implement Microsoft's __LPREFIX</div><div>+ /// extension for __FUNCTION__.</div><div>+ bool IsTokenFakeMacro(const Token &Tok, bool MicrosoftExt) {</div><div>+ if (!MicrosoftExt) return false;</div><div>+ switch (Tok.getKind()) {</div><div>+ case tok::kw___FUNCTION__:</div><div>+ return true;</div><div>+ default:</div><div>+ return false;</div><div>+ }</div><div>+ }</div><div>+}</div><div>+</div><div> /// Lex - Lex and return a token from this macro stream.</div><div> ///</div><div> void TokenLexer::Lex(Token &Tok) {</div><div>+ // Handle outputting the Microsoft __LPREFIX extension, if the</div><div>+ // MSLPrefix flag is set.</div><div>+ if (MSLPrefix && HandleMicrosoftLPrefix(Tok))</div><div>+ return;</div><div>+</div><div> // Lexing off the end of the macro, pop this macro off the expansion stack.</div><div> if (isAtEnd()) {</div><div> // If this is a macro (not a token stream), mark the macro enabled now</div><div>@@ -383,6 +411,11 @@</div><div> // Get the next token to return.</div><div> Tok = Tokens[CurToken++];</div><div> </div><div>+ // Check for fake macros. If this is a fake macro, mark it expanded.</div><div>+ if (IsTokenFakeMacro(Tok, PP.getLangOpts().MicrosoftExt)) {</div><div>+ Tok.setFlagValue(Token::ExpandedFakeMacro, true);</div><div>+ }</div><div>+</div><div> bool TokenIsFromPaste = false;</div><div> </div><div> // If this token is followed by a token paste (##) operator, paste the tokens!</div><div>@@ -483,6 +516,34 @@</div><div> if (BufPtr != &Buffer[LHSLen]) // Really, we want the chars in Buffer!</div><div> memcpy(&Buffer[LHSLen], BufPtr, RHSLen);</div><div> </div><div>+ // If Microsoft extensions are enabled, special-case token pasting</div><div>+ // __FUNCTION__. 
L##__FUNCTION__ turns into __LPREFIX( __FUNCTION__)</div><div>+ if (PP.getLangOpts().MicrosoftExt) {</div><div>+ const bool isRHSFunction = RHS.isExpandedFakeMacro()</div><div>+ && RHS.getKind() == tok::kw___FUNCTION__;</div><div>+ if (isRHSFunction && LHSLen == 1 && Buffer[0] == 'L') {</div><div>+ // __LPREFIX( __FUNCTION__)</div><div>+ // For this expansion, we don't return a single token; it actually</div><div>+ // turns into four tokens. Paste the __LPREFIX token here,</div><div>+ // and set MSLPrefix to true. In Lex(), when this flag is set, we'll</div><div>+ // continue outputting this series of tokens rather than</div><div>+ // continuing to lex.</div><div>+ MSLPrefix = true;</div><div>+ MSLPrefixState = 0;</div><div>+ MSLPrefixRange.setBegin(Tok.getLocation());</div><div>+ MSLPrefixRange.setEnd(RHS.getLocation());</div><div>+</div><div>+ Tok.startToken();</div><div>+ Tok.setKind(tok::kw___LPREFIX);</div><div>+ Tok.setLength(9);</div><div>+ PP.CreateString("__LPREFIX", 9, Tok,</div><div>+ MSLPrefixRange.getBegin(),</div><div>+ MSLPrefixRange.getEnd());</div><div>+ ++CurToken;</div><div>+ return false;</div><div>+ }</div><div>+ }</div><div>+</div><div> // Trim excess space.</div><div> Buffer.resize(LHSLen+RHSLen);</div><div> </div><div>@@ -646,6 +707,52 @@</div><div> PP.HandleMicrosoftCommentPaste(Tok);</div><div> }</div><div> </div><div>+/// HandleMicrosoftLPrefix - In Microsoft compatibility mode, L##__FUNCTION__</div><div>+/// pastes to __LPREFIX( __FUNCTION__). This means it turns into multiple</div><div>+/// tokens. When MSLPrefix is true, we output this stream of tokens. 
If</div><div>+/// this returns true, the caller should immediately return the token.</div><div>+bool TokenLexer::HandleMicrosoftLPrefix(Token &Tok) {</div><div>+ assert(MSLPrefix && "Expected MSLPrefix to be set");</div><div>+</div><div>+ SourceLocation StartLoc = MSLPrefixRange.getBegin(),</div><div>+ EndLoc = MSLPrefixRange.getEnd();</div><div>+ switch (MSLPrefixState++) {</div><div>+ case 0:</div><div>+ Tok.startToken();</div><div>+ Tok.setKind(tok::l_paren);</div><div>+ Tok.setLength(1);</div><div>+ PP.CreateString("(", 1, Tok, StartLoc, EndLoc);</div><div>+ return true;</div><div>+ case 1:</div><div>+ // We put a space before __FUNCTION__ to get __LPREFIX( __FUNCTION__)</div><div>+ // as Microsoft's compiler does</div><div>+ Tok.startToken();</div><div>+ Tok.setKind(tok::kw___FUNCTION__);</div><div>+ Tok.setLength(13);</div><div>+ PP.CreateString(" __FUNCTION__", 13, Tok, StartLoc, EndLoc);</div><div>+ return true;</div><div>+ case 2:</div><div>+ Tok.startToken();</div><div>+ Tok.setKind(tok::r_paren);</div><div>+ Tok.setLength(1);</div><div>+ PP.CreateString(")", 1, Tok, StartLoc, EndLoc);</div><div>+ return true;</div><div>+ case 3:</div><div>+ // Once we've pasted __LPREFIX( __FUNCTION__), look for a token</div><div>+ // paste (##) operator. If we find one, skip it. 
For example:</div><div>+ // #define __ENDTEST(str1, str2) L##str1##str2</div><div>+ // #define _ENDTEST(str1, str2) __ENDTEST(str1, str2)</div><div>+ // #define ENDTEST _ENDTEST(__FUNCTION__, JUNCTION)</div><div>+ // ENDTEST should expand to __LPREFIX( __FUNCTION__)JUNCTION</div><div>+ if (!isAtEnd() && Tokens[CurToken].is(tok::hashhash)) ++CurToken;</div><div>+ MSLPrefix = false;</div><div>+ return false;</div><div>+ default:</div><div>+ assert(false);</div><div>+ return false;</div><div>+ }</div><div>+}</div><div>+</div><div> /// \brief If \arg loc is a file ID and points inside the current macro</div><div> /// definition, returns the appropriate source location pointing at the</div><div> /// macro expansion source location entry, otherwise it returns an invalid</div><div>Index: lib/Lex/LiteralSupport.cpp</div><div>===================================================================</div><div>--- lib/Lex/LiteralSupport.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ lib/Lex/LiteralSupport.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -1090,6 +1090,15 @@</div><div> </div><div> // TODO: K&R warning: "traditional C rejects string constant concatenation"</div><div> </div><div>+ // If the first token is __LPREFIX, parse the string literal as if it</div><div>+ // started with 'L', and skip the token. 
When the parser encounters</div><div>+ // __LPREFIX("string"), it passes us __LPREFIX "string" as two tokens.</div><div>+ if (Kind == tok::kw___LPREFIX) {</div><div>+ Kind = tok::wide_string_literal;</div><div>+ ++StringToks;</div><div>+ --NumStringToks;</div><div>+ }</div><div>+</div><div> // Get the width in bytes of char/wchar_t/char16_t/char32_t</div><div> CharByteWidth = getCharWidth(Kind, Target);</div><div> assert((CharByteWidth & 7) == 0 && "Assumes character size is byte multiple");</div><div>Index: lib/CodeGen/CGExpr.cpp</div><div>===================================================================</div><div>--- lib/CodeGen/CGExpr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(revision 157363)</div><div>+++ lib/CodeGen/CGExpr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -21,6 +21,7 @@</div><div> #include "TargetInfo.h"</div><div> #include "clang/AST/ASTContext.h"</div><div> #include "clang/AST/DeclObjC.h"</div><div>+#include "clang/Basic/ConvertUTF.h"</div><div> #include "clang/Frontend/CodeGenOptions.h"</div><div> #include "llvm/Intrinsics.h"</div><div> #include "llvm/LLVMContext.h"</div><div>@@ -1683,7 +1684,76 @@</div><div> E->getType());</div><div> }</div><div> </div><div>+namespace {</div><div>+ llvm::Constant*</div><div>+ GetAddrOfConstantWideString(StringRef Str,</div><div>+ const char *GlobalName,</div><div>+ ASTContext &Context,</div><div>+ QualType Ty, SourceLocation Loc,</div><div>+ CodeGenModule &CGM) {</div><div> </div><div>+ StringLiteral *SL = StringLiteral::Create(Context,</div><div>+ Str,</div><div>+ StringLiteral::Wide,</div><div>+ /*Pascal = */false,</div><div>+ Ty, Loc);</div><div>+ llvm::Constant *C = CGM.GetConstantArrayFromStringLiteral(SL);</div><div>+ llvm::GlobalVariable *GV =</div><div>+ new llvm::GlobalVariable(CGM.getModule(), C->getType(),</div><div>+ !CGM.getLangOpts().WritableStrings,</div><div>+ llvm::GlobalValue::PrivateLinkage,</div><div>+ C, 
GlobalName);</div><div>+ const unsigned WideAlignment =</div><div>+ Context.getTypeAlignInChars(Ty).getQuantity();</div><div>+ GV->setAlignment(WideAlignment);</div><div>+ return GV;</div><div>+ }</div><div>+</div><div>+ // FIXME: Mostly copied from StringLiteralParser::CopyStringFragment</div><div>+ void ConvertUTF8ToWideString(unsigned CharByteWidth, StringRef Source,</div><div>+ SmallString<32>& Target) {</div><div>+ Target.resize(CharByteWidth * (Source.size() + 1));</div><div>+ char* ResultPtr = &Target[0];</div><div>+</div><div>+ assert(CharByteWidth==1 || CharByteWidth==2 || CharByteWidth==4);</div><div>+ ConversionResult result = conversionOK;</div><div>+ // Copy the character span over.</div><div>+ if (CharByteWidth == 1) {</div><div>+ if (!isLegalUTF8String(reinterpret_cast<const UTF8*>(&*Source.begin()),</div><div>+ reinterpret_cast<const UTF8*>(&*Source.end())))</div><div>+ result = sourceIllegal;</div><div>+ memcpy(ResultPtr, Source.data(), Source.size());</div><div>+ ResultPtr += Source.size();</div><div>+ } else if (CharByteWidth == 2) {</div><div>+ UTF8 const *sourceStart = (UTF8 const *)Source.data();</div><div>+ // FIXME: Make the type of the result buffer correct instead of</div><div>+ // using reinterpret_cast.</div><div>+ UTF16 *targetStart = reinterpret_cast<UTF16*>(ResultPtr);</div><div>+ ConversionFlags flags = strictConversion;</div><div>+ result = ConvertUTF8toUTF16(</div><div>+ &sourceStart,sourceStart + Source.size(),</div><div>+ &targetStart,targetStart + 2*Source.size(),flags);</div><div>+ if (result==conversionOK)</div><div>+ ResultPtr = reinterpret_cast<char*>(targetStart);</div><div>+ } else if (CharByteWidth == 4) {</div><div>+ UTF8 const *sourceStart = (UTF8 const *)Source.data();</div><div>+ // FIXME: Make the type of the result buffer correct instead of</div><div>+ // using reinterpret_cast.</div><div>+ UTF32 *targetStart = reinterpret_cast<UTF32*>(ResultPtr);</div><div>+ ConversionFlags flags = strictConversion;</div><div>+ 
result = ConvertUTF8toUTF32(</div><div>+ &sourceStart,sourceStart + Source.size(),</div><div>+ &targetStart,targetStart + 4*Source.size(),flags);</div><div>+ if (result==conversionOK)</div><div>+ ResultPtr = reinterpret_cast<char*>(targetStart);</div><div>+ }</div><div>+ assert((result != targetExhausted)</div><div>+ && "ConvertUTF8toUTFXX exhausted target buffer");</div><div>+ assert(result == conversionOK);</div><div>+ Target.resize(ResultPtr - &Target[0]);</div><div>+ }</div><div>+}</div><div>+</div><div> LValue CodeGenFunction::EmitPredefinedLValue(const PredefinedExpr *E) {</div><div> switch (E->getIdentType()) {</div><div> default:</div><div>@@ -1722,8 +1792,27 @@</div><div> ? FnName.str()</div><div> : PredefinedExpr::ComputeName((PredefinedExpr::IdentType)Type, CurDecl));</div><div> </div><div>- llvm::Constant *C =</div><div>- CGM.GetAddrOfConstantCString(FunctionName, GlobalVarName.c_str());</div><div>+ const ConstantArrayType *CAT =</div><div>+ getContext().getAsConstantArrayType(E->getType());</div><div>+ QualType ElemType = CAT->getElementType();</div><div>+ llvm::Constant *C;</div><div>+ if (ElemType == getContext().WCharTy.withConst()) {</div><div>+ GlobalVarName += ".WChar";</div><div>+ SmallString<32> RawChars;</div><div>+ ConvertUTF8ToWideString(</div><div>+ getContext().getTypeSizeInChars(ElemType).getQuantity(),</div><div>+ FunctionName, RawChars);</div><div>+ C = GetAddrOfConstantWideString(RawChars,</div><div>+ GlobalVarName.c_str(),</div><div>+ getContext(),</div><div>+ E->getType(),</div><div>+ E->getLocation(),</div><div>+ CGM);</div><div>+ } else {</div><div>+ C = CGM.GetAddrOfConstantCString(FunctionName,</div><div>+ GlobalVarName.c_str(),</div><div>+ 1);</div><div>+ }</div><div> return MakeAddrLValue(C, E->getType());</div><div> }</div><div> }</div><div>Index: lib/Parse/ParseExpr.cpp</div><div>===================================================================</div><div>--- lib/Parse/ParseExpr.cpp<span class="Apple-tab-span" 
style="white-space:pre"> </span>(revision 157363)</div><div>+++ lib/Parse/ParseExpr.cpp<span class="Apple-tab-span" style="white-space:pre"> </span>(working copy)</div><div>@@ -857,6 +857,9 @@</div><div> case tok::utf32_string_literal:</div><div> Res = ParseStringLiteralExpression(true);</div><div> break;</div><div>+ case tok::kw___LPREFIX:</div><div>+ Res = ParseLPrefixExpression();</div><div>+ break;</div><div> case tok::kw__Generic: // primary-expression: generic-selection [C11 6.5.1]</div><div> Res = ParseGenericSelectionExpression();</div><div> break;</div><div>@@ -2429,3 +2432,63 @@</div><div> tok::TokenKind Kind = Tok.getKind();</div><div> return Actions.ActOnObjCBoolLiteral(ConsumeToken(), Kind);</div><div> }</div><div>+</div><div>+/// ParseLPrefixExpression - This is for a Microsoft expression.</div><div>+/// '__LPREFIX' '(' string-literal ')'</div><div>+/// '__LPREFIX' '(' __FUNCTION__ ')'</div><div>+ExprResult Parser::ParseLPrefixExpression() {</div><div>+ assert(Tok.getKind() == tok::kw___LPREFIX);</div><div>+</div><div>+ // MS extensions should be enabled or we should give an error.</div><div>+ // Still parse regularly, though.</div><div>+ if (!PP.getLangOpts().MicrosoftExt) {</div><div>+ Diag(Tok, diag::err_ms_extensions_expected_for_lprefix);</div><div>+ }</div><div>+</div><div>+ // Eat the __LPREFIX</div><div>+ Token LPrefixTok = Tok;</div><div>+ ConsumeToken();</div><div>+</div><div>+ // Expect a lparen</div><div>+ if (ExpectAndConsume(tok::l_paren, diag::err_expected_lparen,</div><div>+ "", tok::r_paren))</div><div>+ return ExprError();</div><div>+</div><div>+</div><div>+ ExprResult Res;</div><div>+ switch (Tok.getKind()) {</div><div>+ case tok::kw___FUNCTION__:</div><div>+ Res = Actions.ActOnPredefinedExpr(Tok.getLocation(), Tok.getKind(),</div><div>+ &Actions.Context.WCharTy);</div><div>+ ConsumeToken();</div><div>+ break;</div><div>+ case tok::string_literal: {</div><div>+ // Pass the string literal parser __LPREFIX "string" without 
the</div><div>+ // parentheses. The string literal parser will treat __LPREFIX</div><div>+ // like 'L'.</div><div>+ SmallVector<Token, 4> StringToks;</div><div>+ StringToks.push_back(LPrefixTok);</div><div>+</div><div>+ do {</div><div>+ StringToks.push_back(Tok);</div><div>+ ConsumeStringToken();</div><div>+ } while (isTokenStringLiteral());</div><div>+</div><div>+ // Pass the set of string tokens, ready for concatenation, to the actions.</div><div>+ Res = Actions.ActOnStringLiteral(&StringToks[0], StringToks.size(), 0);</div><div>+ break;</div><div>+ }</div><div>+ default:</div><div>+ Diag(Tok, diag::err_lprefix_expected_string_literal);</div><div>+ SkipUntil(tok::r_paren);</div><div>+ Res = ExprError();</div><div>+ break;</div><div>+ }</div><div>+ </div><div>+ // Expect a rparen</div><div>+ if (ExpectAndConsume(tok::r_paren, diag::err_expected_rparen, ""))</div><div>+ return ExprError();</div><div>+</div><div>+ return Res;</div><div>+}</div><div>+</div></div><div><br></div></body></html>