<div dir="ltr">Yeah, I've seen these weird inputs. <div>But blindly adding ' ' after every token might be a bad idea,</div><div>because we would lose the fun ways tokens can get combined (e.g. '1' and '2' combine into '12'). </div><div><br></div><div>Note that the -tokens feature is still a toy (although it does work great for clang-fuzzer). </div><div><br></div><div>While we are at it, I want to remind everyone interested that </div><div><a href="http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer">http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer</a><br></div><div>shows a large variety of assertion failures and memory bugs in clang and clang-format, </div><div>and you are welcome to fix those! </div><div><br></div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 20, 2015 at 11:42 PM, Justin Bogner <span dir="ltr">&lt;<a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hey Kostya,<br>
<br>
I was playing with clang-fuzzer and the -tokens= flag, and I noticed most<br>
of the generated inputs contain things like "elsedeletecontinue1union",<br>
where a bunch of tokens are concatenated together with no spaces, such<br>
that we just get long ASCII identifiers rather than distinct tokens. It<br>
seems like we're more likely to get interesting input out of the tokens<br>
if we space-delimit them.<br>
<br>
WDYT?<br>
<br>
</blockquote></div><br></div>
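<div>To make the trade-off discussed above concrete, here is a minimal sketch of the two joining strategies. It is not the fuzzer's actual implementation: the token list and the helper names (<code>TOKENS</code>, <code>join_tokens</code>, <code>random_input</code>) are hypothetical and exist only for illustration.</div>

```python
import random

# Hypothetical token list for illustration; not the file actually passed to -tokens=.
TOKENS = ["else", "delete", "continue", "1", "2", "union"]


def join_tokens(tokens, separator):
    """Join the chosen tokens, placing `separator` between adjacent tokens."""
    return separator.join(tokens)


def random_input(rng, length, separator):
    """Pick `length` tokens at random from TOKENS and join them."""
    return join_tokens([rng.choice(TOKENS) for _ in range(length)], separator)


picked = ["else", "delete", "continue", "1", "union"]

# No separator: a C-family lexer reads the result as one long identifier.
print(join_tokens(picked, ""))

# Space separator: every keyword and literal stays a distinct token.
print(join_tokens(picked, " "))

# The flip side: without a separator, '1' and '2' fuse into the new literal '12',
# a combination that space-delimiting every token would never produce.
print(join_tokens(["1", "2"], ""))
print(join_tokens(["1", "2"], " "))
```

<div>The first call reproduces the "elsedeletecontinue1union" input from the quoted message, and the last two show the '1' and '2' case from the reply.</div>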