<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">2015-06-19 14:38 GMT-07:00 Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>
> On 2015-Jun-19, at 13:35, Alex Lorenz <<a href="mailto:arphaman@gmail.com">arphaman@gmail.com</a>> wrote:<br>
><br>
> Hi dexonsmith, bob.wilson, bogner,<br>
><br>
> This patch is based on a previous serialization patch that serializes physical register operands (<a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D10525&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=8S5QxziHuQ-R6n88HJae4w2zTVR_AJaLQ3UT3cGzFyI&s=znaHt6sOR-PbyxbXUyfzLjEnTU0ftsSgeAYd9uCZtVM&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/D10525</a>).<br>
><br>
> This patch serialized immediate machine operands by using integer literals.<br>
><br>
> REPOSITORY<br>
> rL LLVM<br>
><br>
> <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_D10573&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=8S5QxziHuQ-R6n88HJae4w2zTVR_AJaLQ3UT3cGzFyI&s=Uyp43ewl4u4tldK-TpjZcrhFStrkUJjNyXYJH-77QRw&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/D10573</a><br>
><br>
> Files:<br>
> lib/CodeGen/MIRParser/MILexer.cpp<br>
> lib/CodeGen/MIRParser/MILexer.h<br>
> lib/CodeGen/MIRParser/MIParser.cpp<br>
> lib/CodeGen/MIRPrinter.cpp<br>
> test/CodeGen/MIR/X86/immediate-operands.mir<br>
><br>
> EMAIL PREFERENCES<br>
> <a href="https://urldefense.proofpoint.com/v2/url?u=http-3A__reviews.llvm.org_settings_panel_emailpreferences_&d=AwMFaQ&c=8hUWFZcy2Z-Za5rBPlktOQ&r=mQ4LZ2PUj9hpadE3cDHZnIdEwhEBrbAstXeMaFoB9tg&m=8S5QxziHuQ-R6n88HJae4w2zTVR_AJaLQ3UT3cGzFyI&s=wTHl3h01myP3lDR8rj_b61-fVUkgxq02ukuniDi2OR4&e=" rel="noreferrer" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
</div></div>> <D10573.28043.patch><br>
<br>
> Index: lib/CodeGen/MIRParser/MILexer.cpp<br>
> ===================================================================<br>
> --- lib/CodeGen/MIRParser/MILexer.cpp<br>
> +++ lib/CodeGen/MIRParser/MILexer.cpp<br>
> @@ -35,6 +35,8 @@<br>
><br>
> char peek() const { return isEOF() ? 0 : *Ptr; }<br>
><br>
> + char peekNext() const { return (Ptr + 1) >= End ? 0 : *(Ptr + 1); }<br>
<br>
This looks like undefined behaviour. You can only compare two pointers<br>
if they're within the same allocation. It looks like `Ptr + 1` might go<br>
past the end of the allocation here (already undefined behaviour IIRC,<br>
even if you don't compare it to anything).<br>
<br>
Unless you can guarantee that it's valid to dereference `End` (and even<br>
then, really, for clarity), rearrange the terms:<br>
<br>
return End - Ptr <= 1 ? 0 : Ptr[1];<br>
<br>
(Also note the `Ptr[1]` -- IMO, brackets are easier to read than<br>
dereference-and-parentheses. Up to you though.)<br>
<br>
I wonder if this can just be merged with `peek()`?<br>
<br>
char peek(unsigned I = 0) const { return End - Ptr <= I ? 0 : Ptr[I]; }<br>
<br>
Then instead of calling `peekNext()`, call `peek(1)`. I don't really<br>
have an opinion here, it's just an idea; feel free to keep separate<br>
functions.<br>
<br>
> +<br>
> void advance() { ++Ptr; }<br>
><br>
> StringRef remaining() const { return StringRef(Ptr, End - Ptr); }<br>
> @@ -77,6 +79,32 @@<br>
> return C;<br>
> }<br>
><br>
> +static APSInt toInteger(StringRef Str) {<br>
> + assert(!Str.empty() && "Integer literal string must not be empty");<br>
> + unsigned NumBits = ((Str.size() * 64) / 19) + 2;<br>
> + APInt Tmp(NumBits, Str, /*Radix=*/10);<br>
> + if (Str[0] == '-') {<br>
> + unsigned MinBits = Tmp.getMinSignedBits();<br>
> + if (MinBits > 0 && MinBits < NumBits)<br>
> + Tmp = Tmp.trunc(MinBits);<br>
> + return APSInt(Tmp, /*IsUnsigned=*/false);<br>
> + }<br>
> + unsigned ActiveBits = Tmp.getActiveBits();<br>
> + if (ActiveBits > 0 && ActiveBits < NumBits)<br>
> + Tmp = Tmp.trunc(ActiveBits);<br>
> + return APSInt(Tmp, /*IsUnsigned=*/true);<br>
> +}<br>
<br>
Is this copied directly from `LLLexer`? If so, is there some way of<br>
sharing the logic? (Should it go in `APSInt.h`?)<br></blockquote><div><br></div><div>Yeah, it's a copy. It would make sense to share this code indeed, I will create a patch.</div><div><br></div><div>Alex.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
If not, what's different?<br>
<br>
> +<br>
> +static Cursor lexIntegerLiteral(Cursor C, MIToken &Token) {<br>
> + auto Range = C;<br>
> + C.advance();<br>
> + while (isdigit(C.peek()))<br>
> + C.advance();<br>
> + StringRef StrVal = Range.upto(C);<br>
> + Token = MIToken(MIToken::IntegerLiteral, StrVal, toInteger(StrVal));<br>
> + return C;<br>
> +}<br>
> +<br>
> static MIToken::TokenKind symbolToken(char C) {<br>
> switch (C) {<br>
> case ',':<br>
<br>
</blockquote></div><br></div></div>