<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Nov 4, 2017, at 12:09 PM, Tim Rakowski via cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class=""><div style="color:rgb(33,33,33);font-size:13px" class="">Hi,</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">TLDR: Is there any work going on to drastically improve constexpr evaluation</div><div style="color:rgb(33,33,33);font-size:13px" class=""> speed?</div></div></div></blockquote><div><br class=""></div>I am not aware of any, no. If you're willing to share some of your intermediate programs</div><div>with us, I suspect there are a number of things we could do to speed up constexpr evaluation</div><div>short of actually taking on the massive complexity cost of using a JIT. In particular, our value</div><div>representation is not particularly tuned, and the constant evaluator probably does a lot of</div><div>deep copies.</div><div><br class=""></div><div>John.</div><div><br class=""><blockquote type="cite" class=""><div class=""><div dir="ltr" class=""><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">after watching "constexpr ALL the things!"</div><div style="color:rgb(33,33,33);font-size:13px" class="">(<a href="https://www.youtube.com/watch?v=HMB9oXFobJc" target="_blank" class="">https://www.youtube.com/watch?v=HMB9oXFobJc</a>) I started making my hobby</div><div style="color:rgb(33,33,33);font-size:13px" class="">lexer/parser generator library constexpr as well, which turned out to be just as</div><div style="color:rgb(33,33,33);font-size:13px" class="">easy as advertised. There where two problems I could deal with relativly easily:</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">a) gcc exhausts any available memory. So switch to clang.</div><div style="color:rgb(33,33,33);font-size:13px" class="">b) clang 5.0.0 has the constexpr constructor bug </div><div style="color:rgb(33,33,33);font-size:13px" class=""> (<a href="https://bugs.llvm.org/show_bug.cgi?id=19741" target="_blank" class="">https://bugs.llvm.org/show_bug.cgi?id=19741</a>), so build clang from source.</div><div style="color:rgb(33,33,33);font-size:13px" class="">c) -fconstexpr-steps=-1</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">The only obstacle left (except for missing placement new and having to</div><div style="color:rgb(33,33,33);font-size:13px" class="">initialize all memory) are the slow, slow slow compile times. I'm talking about</div><div style="color:rgb(33,33,33);font-size:13px" class="">"I don't even know if it will ever end", before I started optimizing for </div><div style="color:rgb(33,33,33);font-size:13px" class="">constexpr evaluation. I have been able to decrease the compile time to 20</div><div style="color:rgb(33,33,33);font-size:13px" class="">minutes, then 15 minutes, then 10 minutes, ... etc. There was no "Oh, I</div><div style="color:rgb(33,33,33);font-size:13px" class="">accidentally used a bad algorithm." or "Oh, the compiler doesn't handle X very</div><div style="color:rgb(33,33,33);font-size:13px" class="">well.". Just incrementally optimizing the code for compile time evaluation.</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">By now I am down to 3 minutes for my "test case" of constructing a minimal DFA </div><div style="color:rgb(33,33,33);font-size:13px" class="">for a json Lexer including</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">a) parsing the Regexes for json terminals and constructing an NFA,</div><div style="color:rgb(33,33,33);font-size:13px" class="">b) constructing the corresponding DFA based on the NFA,</div><div style="color:rgb(33,33,33);font-size:13px" class="">c) minimizing the DFA (Hopcroft's algorithm).</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">I'm basically using algorithms described in the Dragon Book.</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">With -O0 and disabling constexpr, this runs in 0.275 seconds. I have</div><div style="color:rgb(33,33,33);font-size:13px" class="">continuously improved the compilation time by profiling and optimizing the -O0 </div><div style="color:rgb(33,33,33);font-size:13px" class="">build first with perf tools and then with callgrind. Here is what I learned:</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">a) Just reduce the number of operations necessary to compute the result.</div><div style="color:rgb(33,33,33);font-size:13px" class="">b) Ignore cache locality.</div><div style="color:rgb(33,33,33);font-size:13px" class="">c) Abstraction is too expensive. Inline everything manually.</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">While this is a fascinating experience that goes against everything I learned</div><div style="color:rgb(33,33,33);font-size:13px" class="">about optimization in the past few years, it obviously hinders the idea of being</div><div style="color:rgb(33,33,33);font-size:13px" class="">able to use constexpr functions both at compile time AND at run time.</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">So of course I would prefer it if compile time evaluation speed would be</div><div style="color:rgb(33,33,33);font-size:13px" class="">comparable to, say, an interpreted language. Maybe one using just-in-time</div><div style="color:rgb(33,33,33);font-size:13px" class="">compilation like LLVM (hint, hint). Of course you can't just compile and run the </div><div style="color:rgb(33,33,33);font-size:13px" class="">constexpr code because undefined behavior must be "trapped". But running a</div><div style="color:rgb(33,33,33);font-size:13px" class="">no-undefined-behavior version of C++ would be cool, I think. Even if it</div><div style="color:rgb(33,33,33);font-size:13px" class="">certainly would be a lot of work ... </div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">Is there any work going on in that direction? Am I missing something?</div><div style="color:rgb(33,33,33);font-size:13px" class=""><br class=""></div><div style="color:rgb(33,33,33);font-size:13px" class="">Regards,</div><div style="color:rgb(33,33,33);font-size:13px" class="">Tim Rakowski</div></div>
_______________________________________________<br class="">cfe-dev mailing list<br class=""><a href="mailto:cfe-dev@lists.llvm.org" class="">cfe-dev@lists.llvm.org</a><br class="">http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev<br class=""></div></blockquote></div><br class=""></body></html>