[PATCH] Tolerate unmangled names in sample profiles.

Diego Novillo dnovillo at google.com
Tue Mar 18 05:08:45 PDT 2014

On Mon, Mar 17, 2014 at 4:19 PM, Chandler Carruth <chandlerc at gmail.com> wrote:

> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:260-265
> @@ -259,1 +259,8 @@
> +  /// \brief Report a parse warning message.
> +  void reportParseWarning(int64_t LineNumber, Twine Msg) const {
> +    DiagnosticInfoSampleProfile Diag(Filename.data(), LineNumber, Msg,
> +                                     DS_Warning);
> +    M.getContext().diagnose(Diag);
> +  }
> +
> ----------------
> Now dead code?


> ================
> Comment at: lib/Transforms/Scalar/SampleProfile.cpp:472
> @@ -464,3 +471,3 @@
>    // accumulate samples as we parse them.
> -  Regex HeadRE("^([^:]+):([0-9]+):([0-9]+)$");
> +  Regex HeadRE("^([^0-9].*):([0-9]+):([0-9]+)$");
>    Regex LineSample("^([0-9]+)\\.?([0-9]+)?: ([0-9]+)(.*)$");
> ----------------
> I would at least comment about the heuristic used to discard invalid samples. It is really hard to mentally verify that this switches us to consume names containing a ':'. I had to ask others to become confident. I think we've crossed the threshold at which I like REs for parsing. ;]

Yeah. Backtracking is what makes this work. The RE will greedily match
everything and then go back when it finds the colon separated numbers
at the end. I've added some comments.

> It is also possible in theory to use an asm label or other hack to create a symbol starting with a number. Not sure how or why this would ever be
> relevant for us though, so I'm happy with this approach, but again, the documentation will be essential if we ever get a bug filed on this.

Huh. Really? I've never seen that, but it's easy enough to relax this
pattern to accept anything.


More information about the llvm-commits mailing list