[LLVMdev] PEG parsers? (was Re: Regular Expressions)

Samuel Crow samuraileumas at yahoo.com
Thu Jun 11 11:06:46 PDT 2009


Hello everybody,

I don't quite understand how the proposed regex library works but I know that PEG parser generators use a super-set of regex functionality to define their parsers.  There's also a nice one on Google code called YARDparser that uses templates based on PEGs to generate efficient recursive-decent parsers.

Furthermore, my partner and I am working on an interpreter for PEG syntax based parsers and LLVM Assembly code for all of the code-generation stages so that we can debug the code and the parser all in one go.  It's part of the Mattathias BASIC project on SourceForge.net and don't let the LGPL license scare you.  We can change the license for the parser generator.  It's only the extensible BASIC compiler and interpreter we are building with it that is required to be LGPL due to the runtime libraries.

Does the prospect of a PEG based syntax interest the team for the TableGen?  They are wonderfully simple to maintain since they don't allow ambiguity of any type.  Their downfall is that since they use a greedy algorithm they sometimes degrade performance to O(n) where a nongreedy algorithm would result in O(1).  Since the YARDparser is template-based, it should be possible to slip a stringmap into the keyword lookup to bring the performance back up to a more resonable specification.

Let me know what you think.

--Sam Crow



----- Original Message ----
> From: Chris Lattner <clattner at apple.com>
> To: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu>
> Sent: Thursday, June 11, 2009 12:28:45 PM
> Subject: Re: [LLVMdev] Regular Expressions
> 
> 
> On Jun 9, 2009, at 12:39 PM, David Greene wrote:
> 
> > On Tuesday 09 June 2009 14:34, Dan Gohman wrote:
> >> Can you describe what problem you're trying to solve here?  Does it
> >> really need Regular Expressions?
> >
> > Yes.  I want TableGen to be able to infer lots of stuff  
> > programmatically.
> > This helps tremendously when specifying things like, oh, AVX.  :)
> 
> I don't see how this relates to regex's, and really don't want to suck  
> in an external regex library.  Can you give an example of how this  
> would help?
> 
> > BTW, is there a process for dropping compiler support?  Eventually  
> > most
> > compilers will have TR1 except for very old versions.  It's not  
> > feasible
> > to support all of those crufty compilers forever.
> 
> 
> There is no formal process.  What do you need out of TR1?  Is it  
> possible to make it conditional?  Many old compilers are already  
> broken anyway, but all it takes is one "popular" one to make it a no-go.
> 
> -Chris
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



      



More information about the llvm-dev mailing list