Dart Documentationpetitparser

petitparser library

This package contains the core library of PetitParser, a dynamic parser combinator framework.

Writing a Simple Grammar

Writing grammars with PetitParser is simple as writing Dart code. For example, to write a grammar that can parse identifiers that start with a letter followed by zero or more letter or digits is defined as follows:

Parser id = letter().seq(letter().or(digit()).star());

If you look at the object id in the debugger, you'll notice that the code above buils a tree of parser objects:

  • Sequence: This parser accepts a sequence of parsers.
    • Predicate: This parser accepts a single letter.
    • Repeater: This parser accepts zero or more times another parser.
      • Choice: This parser accepts a single word character.
        • Predicate: This parser accepts a single letter.
        • Predicate: This parser accepts a single digit.

Parsing Some Input

To actually parse a String (or List) we can use the method Parser.parse:

Result id1 = id.parse('yeah');
Result id2 = id.parse('f12');

The method Parser.parse returns a parse Result, which is either an instance of Success or Failure. In both examples above we are successful and can retrieve the parse result using Success.value:

print(id1.value);                   // ['y', ['e', 'a', 'h']]
print(id2.value);                   // ['f', ['1', '2']]

While it seems odd to get these nested arrays with characters as a return value, this is the default decomposition of the input into a parse tree. We'll see in a while how that can be customized.

If we try to parse something invalid we get an instance of Failure as an answer and we can retrieve a descriptive error message using Failure.message:

Result id3 = id.parse('123');
print(id3.message);                 // 'letter expected'
print(id3.position);                // 0

Trying to retrieve the parse result by calling Failure.value would throw the exception UnsupportedError. Context.isSuccess and Context.isFailure can be used to decide if the parse was successful.

If you are only interested if a given string matches or not you can use the helper method Parser.accept:

print(id.accept('foo'));            // true
print(id.accept('123'));            // false

Different Kinds of Parsers

PetitParser provide a large set of ready-made parser that you can compose to consume and transform arbitrarily complex languages. The terminal parsers are the most simple ones. We've already seen a few of those:

  • char('a') parses the character a.
  • string('abc') parses the string abc.
  • any() parses any character.
  • digit() parses any digit from 0 to 9.
  • letter() parses any letter from a to z and A to Z.
  • word() parses any letter or digit.

So instead of using the letter and digit predicate, we could have written our identifier parser like this:

var id = letter().seq(word().star());

The next set of parsers are used to combine other parsers together:

  • p1.seq(p2) and p1 & p2 parse p1 followed by p2 (sequence).
  • p1.or(p2) and p1 | p2 parse p1, if that doesn't work parses p2 (ordered choice).
  • p.star() parses p zero or more times.
  • p.plus() parses p one or more times.
  • p.optional() parses p, if possible.
  • p.and() parses p, but does not consume its input.
  • p.not() parses p and succeed when p fails, but does not consume its input.
  • p.end() parses p and succeed at the end of the input.

To attach an action or transformation to a parser we can use the following methods:

  • p.map((value) => ...) performs the transformation given the function.
  • p.pick(n) returns the n-th element of the list p returns.
  • p.flatten() creates a string from the result of p.
  • p.token() creates a token from the result of p.
  • p.trim() trims whitespaces before and after p.

To return a string of the parsed identifier, we can modify our parser like this:

var id = letter().seq(word().star()).flatten();

To conveniently find all matches in a given input string you can use Parser.matchesSkipping:

var matches = id.matchesSkipping('foo 123 bar4');
print(matches);                     // ['foo', 'bar4']

These are the basic elements to build parsers. There are a few more well documented and tested factory methods in the Parser class. If you want browse their documentation and tests.

Writing a More Complicated Grammar

Now we are able to write a more complicated grammar for evaluating simple arithmetic expressions. Within a file we start with the grammar for a number (actually an integer):

var number = digit().plus().flatten().trim().map(int.parse);

Then we define the productions for addition and multiplication in order of precedence. Note that we instantiate the productions with undefined parsers upfront, because they recursively refer to each other. Later on we can resolve this recursion by setting their reference:

var term = undefined();
var prod = undefined();
var prim = undefined();

term.set(prod.seq(char('+').trim()).seq(term).map((values) {
  return values[0] + values[2];
}).or(prod));
prod.set(prim.seq(char('*').trim()).seq(prod).map((values) {
  return values[0] * values[2];
}).or(prim));
prim.set(char('(').trim().seq(term).seq(char(')'.trim())).map((values) {
  return values[1];
}).or(number));

To make sure that our parser consumes all input we wrap it with the end() parser into the start production:

var start = term.end();

That's it, now we can test our parser and evaluator:

print(start.parse('1 + 2 * 3').value);        // 7
print(start.parse('(1 + 2) * 3').value);      // 9

As an exercise we could extend the parser to also accept negative numbers and floating point numbers, not only integers. Furthermore it would be useful to support subtraction and division as well. All these features can be added with a few lines of PetitParser code.

Functions

Parser debug(Parser root) #

Adds debug handlers to each parser reachable from root.

Parser debug(Parser root) {
 var level = 0;
 return transformParser(root, (parser) {
   return new _ContinuationParser(parser, (context, continuation) {
     print('${_debugIndent(level)}${parser}');
     level++;
     var result = continuation(context);
     level--;
     print('${_debugIndent(level)}${result}');
     return result;
    });
 });
}

Parser removeDuplicates(Parser root) #

Removes duplicated parsers reachable from root in-place.

Parser removeDuplicates(Parser root) {
 var uniques = new Set();
 allParser(root).forEach((parent) {
   parent.children.forEach((source) {
     var target = uniques.firstWhere((each) {
       return source != each && source.match(each);
     }, orElse: () => null);
     if (target == null) {
       uniques.add(source);
     } else {
       parent.replace(source, target);
     }
   });
 });
 return root;
}

Parser removeSetables(Parser root) #

Removes all setable parsers reachable from root in-place.

Parser removeSetables(Parser root) {
 allParser(root).forEach((parent) {
   parent.children.forEach((source) {
     var target = _removeSetable(source);
     if (source != target) {
       parent.replace(source, target);
     }
   });
 });
 return _removeSetable(root);
}

Parser transformParser(Parser root, Parser function(Parser parser)) #

Transforms all parsers reachable from root with the given function. The identity function returns a copy of the the incoming parser.

The implementation first creates a copy of each parser reachable in the input grammar; then the resulting grammar is iteratively transfered and all old parsers are replaced with the transformed ones until we end up with a completely new grammar.

Parser transformParser(Parser root, Parser function(Parser parser)) {
 var mapping = new Map();
 allParser(root).forEach((parser) {
   mapping[parser] = function(parser.copy());
 });
 while (true) {
   var changed = false;
   allParser(mapping[root]).forEach((parser) {
     parser.children.forEach((source) {
       if (mapping.containsKey(source)) {
         parser.replace(source, mapping[source]);
         changed = true;
       }
     });
   });
   if (!changed) {
     return mapping[root];
   }
 }
}

Iterable<Parser> allParser(Parser root) #

Returns a lazy iterable over all parsers reachable from a root. Do not modify the grammar while iterating over it, otherwise you might get unexpected results.

Iterable<Parser> allParser(Parser root) {
 return new _ParserIterable(root);
}

Parser stringIgnoreCase(String element, [String message]) #

Returns a parser that accepts the string element ignoring the case.

For example, stringIgnoreCase('foo') succeeds and consumes the input string 'Foo' or 'FOO'. Fails for any other input.

Parser stringIgnoreCase(String element, [String message]) {
 final lowerElement = element.toLowerCase();
 return new _PredicateParser(element.length,
   (String each) => lowerElement == each.toLowerCase(),
   message != null ? message : '$element expected');
}

Parser string(String element, [String message]) #

Returns a parser that accepts the string element.

For example, string('foo') succeeds and consumes the input string 'foo'. Fails for any other input.

Parser string(String element, [String message]) {
 return new _PredicateParser(element.length,
   (String each) => element == each,
   message != null ? message : '$element expected');
}

Parser anyIn(elements, [String message]) #

Returns a parser that accepts any of the elements.

For example, anyIn('ab') succeeds and consumes either the letter 'a' or the letter 'b'. For any other input the parser fails.

Parser anyIn(dynamic elements, [String message]) {
 return new _PredicateParser(1,
   (each) => elements.indexOf(each) >= 0,
   message != null ? message : 'any of $elements expected');
}

Parser any([String message = 'input expected']) #

Returns a parser that accepts any input element.

For example, any() succeeds and consumes any given letter. It only fails for an empty input.

Parser any([String message = 'input expected']) {
 return new _AnyParser(message);
}

SetableParser undefined([String message = 'undefined parser']) #

Returns a parser that is not defined, but that can be set at a later point in time.

For example, the following code sets up a parser that points to itself and that accepts a sequence of a's ended with the letter b.

var p = undefined();
p.set(char('a').seq(p).or(char('b')));
SetableParser undefined([String message = 'undefined parser']) {
 return failure(message).setable();
}

Parser failure([String message = 'unable to parse']) #

Returns a parser that consumes nothing and fails.

For example, failure() always fails, no matter what input it is given.

Parser failure([String message = 'unable to parse']) {
 return new _FailureParser(message);
}

Parser epsilon([result]) #

Returns a parser that consumes nothing and succeeds.

For example, char('a').or(epsilon()) is equivalent to char('a').optional().

Parser epsilon([dynamic result]) => new _EpsilonParser(result);

Parser word([String message]) #

Returns a parser that accepts any word character.

Parser word([String message]) {
 return new _CharacterParser(
     _wordCharMatcher,
     message != null ? message : 'letter or digit expected');
}

Parser whitespace([String message]) #

Returns a parser that accepts any whitespace character.

Parser whitespace([String message]) {
 return new _CharacterParser(
     _whitespaceCharMatcher,
     message != null ? message : 'whitespace expected');
}

Parser uppercase([String message]) #

Returns a parser that accepts any uppercase character.

Parser uppercase([String message]) {
 return new _CharacterParser(
     _uppercaseCharMatcher,
     message != null ? message : 'uppercase letter expected');
}

Parser range(start, stop, [String message]) #

Returns a parser that accepts any character in the range between start and stop.

Parser range(dynamic start, dynamic stop, [String message]) {
 return new _CharacterParser(
     new _RangeCharMatcher(_toCharCode(start), _toCharCode(stop)),
     message != null ? message : '$start..$stop expected');
}

Parser pattern(String element, [String message]) #

Returns a parser that accepts the given character class pattern.

Parser pattern(String element, [String message]) {
 if (_pattern == null) {
   var single = any().map((each) {
     return new _SingleCharMatcher(_toCharCode(each));
   });
   var multiple = any().seq(char('-')).seq(any()).map((each) {
     return new _RangeCharMatcher(_toCharCode(each[0]), _toCharCode(each[2]));
   });
   var positive = multiple.or(single).plus().map((each) {
     return each.length == 1 ? each[0] : new _AltCharMatcher(each);
   });
   _pattern = char('^').optional().seq(positive).map((each) {
     return each[0] == null ? each[1] : new _NotCharMatcher(each[1]);
   });
 }
 return new _CharacterParser(
     _pattern.parse(element).value,
     message != null ? message : '[$element] expected');
}

Parser lowercase([String message]) #

Returns a parser that accepts any lowercase character.

Parser lowercase([String message]) {
 return new _CharacterParser(
     _lowercaseCharMatcher,
     message != null ? message : 'lowercase letter expected');
}

Parser letter([String message]) #

Returns a parser that accepts any letter character.

Parser letter([String message]) {
 return new _CharacterParser(
     _letterCharMatcher,
     message != null ? message : 'letter expected');
}

Parser digit([String message]) #

Returns a parser that accepts any digit character.

Parser digit([String message]) {
 return new _CharacterParser(
     _digitCharMatcher,
     message != null ? message : 'digit expected');
}

Parser char(element, [String message]) #

Returns a parser that accepts a specific character only.

Parser char(dynamic element, [String message]) {
 return new _CharacterParser(
     new _SingleCharMatcher(_toCharCode(element)),
     message != null ? message : '$element expected');
}

Abstract Classes

Classes

Exceptions