---
title: What are parser combinators?
---

Before you start, make sure you've had a look at the installation instructions.

## Parsers

```php
<?php
// A smiley: a colon followed by one or more punctuation characters.
$parser = char(':')
    ->append(atLeastOne(punctuationChar()))
    ->label('smiley');
$result = $parser->tryString(':*{)');
echo $result->output() . " is a valid smiley!";
```

A parser is a function that takes some unstructured input (like a string) and turns it into structured output that's easier to work with. This output could be as simple as a slightly better structured string, or an array, an object, up to a complete abstract syntax tree. You can then use this data structure for subsequent processing.

You're probably using parsers all the time, such as `json_decode()`. Even just casting a string to a float ([footnote 1](#floatval)) is really parsing.

Parsica helps you build your own parsers in a concise, declarative way. Behind the scenes it takes care of things like error handling, so you can focus on the parser itself.

## Building a parser

There are many ways to build a parser for your own use case, ranging from formal grammars that get compiled into a parser, to regular expressions, to writing a parser entirely from scratch. They all have their own tradeoffs and limitations. One of the great benefits of the parser combinator style is that, once you get the hang of it, parsers written this way are generally easier to write, understand, and maintain.

You start from building blocks, such as `digitChar()`, which returns a function that parses a single digit.

```php
<?php
$parser = digitChar();
$input = "1";
$result = $parser->tryString($input);
$output = $result->output();
// The output is the digit as a string, not an integer.
assertSame("1", $output);
assertIsString($output);
```

## Parser Combinators

Parser Combinators are functions (or methods) that combine parsers into new parsers. Instead of writing one big parser, we can write smaller parsers and compose them into larger ones.

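To make "functions that combine parsers into new parsers" concrete, here is a minimal library-free sketch in plain PHP. It is not how Parsica is implemented: `toyChar()`, `toyAppend()` and `toyEither()` are hypothetical names invented for this illustration, and a "parser" here is assumed to be a closure that returns `[output, remainder]` on success or `null` on failure.

```php
<?php
// Toy model for illustration only (hypothetical names, not Parsica's API).
// A "parser" is a closure: string -> [output, remainder] | null.

/** Building block: matches exactly one given character. */
function toyChar(string $c): Closure
{
    return fn(string $input): ?array =>
        ($input !== '' && $input[0] === $c)
            ? [$c, substr($input, 1)]
            : null;
}

/** Combinator: run $first, then $second on what is left, and join both outputs. */
function toyAppend(Closure $first, Closure $second): Closure
{
    return function (string $input) use ($first, $second): ?array {
        $r1 = $first($input);
        if ($r1 === null) {
            return null; // the first parser failed
        }
        $r2 = $second($r1[1]); // feed the remainder to the second parser
        if ($r2 === null) {
            return null; // the second parser failed
        }
        return [$r1[0] . $r2[0], $r2[1]];
    };
}

/** Combinator: try $left; if it fails, try $right on the same input. */
function toyEither(Closure $left, Closure $right): Closure
{
    return fn(string $input): ?array => $left($input) ?? $right($input);
}

// Small parsers composed into larger ones.
$ab = toyAppend(toyChar('a'), toyChar('b'));
$abOrC = toyEither($ab, toyChar('c'));

var_dump($ab('abc'));    // ['ab', 'c']
var_dump($ab('axc'));    // null
var_dump($abOrC('cab')); // ['c', 'ab']
```

The combinators never inspect the input themselves. They only decide which parser runs next and on which part of the input, and that is the same division of labour the examples in this section rely on.
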
```php
<?php
// Parse an 'a' followed by a 'b' and concatenate both outputs.
$parser = char('a')->append(char('b'));
$result = $parser->tryString("abc");
$output = $result->output();
assertEquals("ab", $output);
```

```php
<?php
// Collect the outputs of both parsers into an array, dropping the punctuation.
$parser = collect(
    string("Hello")->thenIgnore(char(",")),
    string("world")->thenIgnore(char("!")),
);
$result = $parser->tryString("Hello,world!");
$output = $result->output();
assertEquals(["Hello", "world"], $output);
```

To make this work, we need a small change in our original definition of a parser.

> A parser is a function that takes some unstructured input (such as a string), and returns a more structured output, as well as the remaining unparsed part of the input.

This way, each parser function can parse a chunk of the input and leave the remainder to another parser. The combinators take care of the heavy lifting: they pass the input to the parser functions, pass the remainder to the next one, decide what to do with errors (e.g. fail, backtrack, or try another parser), and so on.

We can inspect the remainder:

```php
<?php
$parser = sequence(char('a'), char('b'));
$result = $parser->tryString("abc");
assertEquals("b", $result->output());
assertEquals("c", $result->remainder());
```

When we run our parser using `$parser->tryString($input)`, the `sequence()` combinator first runs `char('a')` on the input `"abc"`. If that succeeds, it takes the remainder `"bc"`, runs `char('b')` on it, and returns the result. That result consists of the output of the last parser, `"b"`, and the remainder, `"c"`.

In imperative code, it would look something like this:

```php
<?php
class MyParser
{
    public function try(string $input): array
    {
        // First step: expect 'a' at the start of the input.
        if (substr($input, 0, 1) !== 'a') {
            throw new Exception("Expected 'a'");
        }
        $remainder1 = substr($input, 1);

        // Second step: expect 'b' at the start of the first remainder.
        if (substr($remainder1, 0, 1) !== 'b') {
            throw new Exception("Expected 'b'");
        }
        $output2 = 'b';
        $remainder2 = substr($remainder1, 1);

        return ['output' => $output2, 'remainder' => $remainder2];
    }
}

$parser = new MyParser();
$result = $parser->try("abc");
assertEquals('b', $result['output']);
assertEquals('c', $result['remainder']);
```

If you've been working in PHP long enough and have never used parser combinators, the code above may look more familiar for now. But imagine scaling that up to parse anything from simple formats like credit card numbers, to recursive structures like JSON or XML, or even entire programming languages like PHP.
And that doesn't even include the code you'd need for performance, testing and debugging tooling, code reuse, and reporting on bad input.

If you'd rather write `sequence(char('a'), char('b'))`, stick around.

### Footnotes

#### Note 1

```php
<?php
// Parse the textual form of a float, then convert it to a PHP float.
$parser = float()->map(fn($v) => floatval($v));
try {
    // works:
    $result = $parser->tryString("1.23");
    assertSame(1.23, $result->output());
    // throws a ParserHasFailed exception with message "Expected: float, got abc"
    $result = $parser->tryString("abc");
} catch (ParserHasFailed $e) {}
```