Irony
Sometimes it's helpful to be able to create a parser. It might be for a simple language or an expression evaluator - any time it's necessary to validate user entry that is more complex than RegEx can handle, or so that keywords in a grammar can be replaced intelligently. So far the two choices have been to use tools such as Lex/Yacc/Antlr to generate some C/C++ from a grammar, or to create a tokenizer/parser by hand. Using the tools, you can get a performant parser but at the cost of using another tool set. Using a hand-crafted parser, you get the benefit of being able to use one environment (managed code in my case) at the cost of having to write a parser long-hand.
So along comes Irony (there's a primer on Code Project). It allows the grammar of an expression or language to be described using BNF-style definitions. The definitions are used by a shift-reduce engine that generates nodes for each terminal and non-terminal found in the source expression. For example, suppose you want to create a (very) simple grammar that allows two numbers to be combined, optionally enclosed in parentheses. It might be defined like this:
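// Terminal: a numeric literal, using the standard number terminal.
Terminal N = new NumberTerminal("Number");
// Non-terminals: the expression itself and the arithmetic operators.
NonTerminal E = new NonTerminal("Expr");
NonTerminal O = new NonTerminal("Operators");
// An expression is a number, two expressions joined by an operator,
// or an expression in parentheses. (A bare "E" alternative would be circular.)
E.Rule = N | E + O + E | "(" + E + ")";
O.Rule = Symbol("+") | "-" | "*" | "/" | "**";
// Precedence levels: "+" and "-" lowest, "**" highest.
RegisterOperators(1, "+", "-");
RegisterOperators(2, "*", "/");
RegisterOperators(3, "**");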
How easy is that? And you know it's going to report errors or create a node tree for you to use. So what's it doing? First of all, terminals and non-terminals are defined. In this case the terminals are numbers, and there's a standard terminal type for these. There are other standard terminal types for identifiers, strings and so on, but you can also create your own. There are two non-terminals declared, one for the expression itself and one for the standard arithmetic operators. You could also define other symbols as operators or include logical operators. The parser learns the precedence of the operators from the RegisterOperators() method, which takes a precedence value and the list of operators sharing that precedence level.
The heavy lifting is done by the grammar definitions in the middle. The "O" operator rule says that an O can be any one of the supported symbols. Irony overloads the pipe and plus operators so that they can be used to build expressions: the pipe separates alternatives and the plus joins elements in sequence. The "E" expression rule then allows a valid expression to be an "N" number, two operands combined by an "O" operator, or itself, an expression, enclosed in parentheses. And that's it!
All that's left to do is to parse some input and use the node tree returned:
AstNode rootNode = new LanguageCompiler(this).Parse("1+2");
(This assumes "this" is a descendant of the abstract class Grammar.)
OK, this is a really trivial example. But the code for arbitrarily complex grammars is essentially the same. And the real benefit is that the grammar is written declaratively and describes the intention of the parser, not the mechanism for parsing the input.
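To make the effect of those precedence levels concrete, here is a rough hand-written sketch of an evaluator for the same tiny language. It does not use Irony at all: it is plain precedence climbing, and MiniCalc and every method in it are names I made up for illustration. It reuses the levels 1, 2 and 3 from the RegisterOperators() calls and assumes that every operator, including "**", is left-associative.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hand-written stand-in for illustration only; none of these names come from Irony.
public static class MiniCalc
{
    // Same levels as the RegisterOperators() calls: a higher number binds tighter.
    static readonly Dictionary<string, int> Precedence = new Dictionary<string, int>
    {
        { "+", 1 }, { "-", 1 }, { "*", 2 }, { "/", 2 }, { "**", 3 }
    };

    public static double Evaluate(string text)
    {
        List<string> tokens = Tokenize(text);
        int pos = 0;
        double value = ParseExpr(tokens, ref pos, 1);
        if (pos != tokens.Count)
            throw new FormatException("Unexpected token: " + tokens[pos]);
        return value;
    }

    // Splits the input into numbers, operators and parentheses.
    static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            char c = text[i];
            if (char.IsWhiteSpace(c)) { i++; continue; }
            if (char.IsDigit(c) || c == '.')
            {
                int start = i;
                while (i < text.Length && (char.IsDigit(text[i]) || text[i] == '.')) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else if (c == '*' && i + 1 < text.Length && text[i + 1] == '*')
            {
                tokens.Add("**");
                i += 2;
            }
            else if ("+-*/()".IndexOf(c) >= 0)
            {
                tokens.Add(c.ToString());
                i++;
            }
            else throw new FormatException("Unexpected character: " + c);
        }
        return tokens;
    }

    // Precedence climbing: keep consuming operators whose level is at least minLevel.
    static double ParseExpr(List<string> tokens, ref int pos, int minLevel)
    {
        double left = ParseOperand(tokens, ref pos);
        while (pos < tokens.Count
               && Precedence.TryGetValue(tokens[pos], out int level)
               && level >= minLevel)
        {
            string op = tokens[pos++];
            // Recursing with level + 1 makes every operator left-associative (an assumption).
            double right = ParseExpr(tokens, ref pos, level + 1);
            left = Apply(op, left, right);
        }
        return left;
    }

    // An operand is a number or an expression in parentheses.
    static double ParseOperand(List<string> tokens, ref int pos)
    {
        if (pos >= tokens.Count) throw new FormatException("Unexpected end of input");
        string token = tokens[pos++];
        if (token == "(")
        {
            double inner = ParseExpr(tokens, ref pos, 1);
            if (pos >= tokens.Count || tokens[pos] != ")")
                throw new FormatException("Missing ')'");
            pos++;
            return inner;
        }
        return double.Parse(token, CultureInfo.InvariantCulture);
    }

    static double Apply(string op, double a, double b)
    {
        switch (op)
        {
            case "+": return a + b;
            case "-": return a - b;
            case "*": return a * b;
            case "/": return a / b;
            default: return Math.Pow(a, b); // "**"
        }
    }
}
```

With this in place, MiniCalc.Evaluate("1+2*3") gives 7 rather than 9, because "*" sits at level 2 and so binds tighter than "+" at level 1. That table lookup, the token splitting and the recursion are the long-hand mechanics that the declarative grammar lets you leave to the engine.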



I’d like to see a follow-up which uses this parser to interpret user input!
(simple math expressions only, of course)