refactor: Cleanup git state - commit all staged changes

Major refactoring cleanup:
- Add new controller architecture (class-controller-*.php)
- Add new settings-v2 UI (views/settings-v2/)
- Add new CSS architecture (agentic-sidebar.css, tokens)
- Add esbuild build pipeline (scripts/build.js, package.json)
- Add composer dependencies (vendor/)
- Add frontend src directory (assets/js/src/index.jsx)
- Add documentation files
- Remove old/obsolete files (class-settings.php, old CSS)

This commits all pending changes from previous refactoring efforts.
Dwindi Ramadhana
2026-06-17 05:27:58 +07:00
parent d3f142222c
commit 690991c526
7963 changed files with 941566 additions and 67372 deletions


@@ -0,0 +1,15 @@
; This file is for unifying the coding style for different editors and IDEs.
; More information at https://editorconfig.org
root = true
[*]
charset = utf-8
indent_size = 4
indent_style = space
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
[*.md]
trim_trailing_whitespace = false


@@ -0,0 +1 @@
github: [turanct]


@@ -0,0 +1,36 @@
name: Tests
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Validate composer.json and composer.lock
        run: composer validate
      - name: Cache Composer packages
        id: composer-cache
        uses: actions/cache@v4
        with:
          path: vendor
          key: ${{ runner.os }}-composer-${{ hashFiles('**/composer.lock') }}
          # The fallback prefix must match the key prefix above ("-composer-", not "-node-").
          restore-keys: |
            ${{ runner.os }}-composer-
      - name: Install dependencies
        if: steps.composer-cache.outputs.cache-hit != 'true'
        run: composer install --prefer-dist --no-progress --no-suggest
      - name: PHPUnit test suite
        run: vendor/bin/phpunit
      - name: UpToDocs Documentation tester
        run: docs/testdocs github
      - name: Psalm Static type checker
        run: vendor/bin/psalm --output-format=github --threads=4

14
vendor/parsica-php/parsica/.gitignore vendored Normal file

@@ -0,0 +1,14 @@
/vendor/
.idea/
.private/
.php_cs.cache
.phpunit.result.cache
Gherkin
infection.log
per-mutator.md
composer.lock
/bin/clear-psalm-cache.sh
/_storage/
/.phpdoc/
/.phive/
/tools/


@@ -0,0 +1,3 @@
# CHANGELOG
Please see the [release notes on GitHub](https://github.com/parsica-php/parsica/releases).


@@ -0,0 +1,130 @@
---
title: "Contributor Covenant Code of Conduct"
---
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
mathias@verraes.net
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series
of actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or
permanent ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within
the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see the FAQ at
https://www.contributor-covenant.org/faq. Translations are available at
https://www.contributor-covenant.org/translations.

21
vendor/parsica-php/parsica/LICENSE vendored Normal file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2020 Mathias Verraes
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

46
vendor/parsica-php/parsica/README.md vendored Normal file

@@ -0,0 +1,46 @@
# Parsica
[![Tests](https://github.com/parsica-php/parsica/actions/workflows/tests.yml/badge.svg)](https://github.com/parsica-php/parsica/actions/workflows/tests.yml)
The easiest way to build robust parsers in PHP.
```bash
composer require parsica-php/parsica
```
Documentation & API: [parsica-php.github.io](https://parsica-php.github.io/)
```php
<?php
$parser = between(char('{'), char('}'), atLeastOne(alphaChar()));
$result = $parser->tryString("{Hello}");
echo $result->output(); // Hello
```
## Quality
The code is entirely built with Test-Driven Development, and type-checked with [Psalm](https://github.com/vimeo/psalm). It is likely bug-free or very close to it. It is suitable for complex parsing requirements, and could even be used to build a programming language.
However, it might not be performant enough if you use it at a high scale.
## Project Maintenance & Support
Regrettably, the maintainer of this library (@turanct) passed away in December 2021 due to cancer. The original author, @mathiasverraes, is now the maintainer again and makes occasional minor updates. If you'd like to contribute to this library, or if you wish to use it for a project and need consulting, contact Mathias via mathias at verraes net. PRs and issues may not be monitored.
## Development
After running `composer install`, run these to validate if everything is in working order:
```
composer run phpunit
composer run psalm
composer run uptodocs
# or all of them:
composer run test
```
As this library uses pure functional programming, it may be hard to wrap your head around if you're used to object-oriented or imperative styles. Our recommendation is to familiarize yourself with the basics of functional programming, for example by reading an intro to Haskell.
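To give a rough feel for the functional style mentioned above, here is a minimal, self-contained sketch of the parser-combinator idea. The names `charP`, `manyP`, and `mapP` are hypothetical and exist only for this illustration; they are not Parsica's API or implementation. The sketch assumes a parser is a pure function that takes the input string and returns either `[output, remainder]` or `null` on failure.

```php
<?php
declare(strict_types=1);

// Illustration only: hypothetical mini-combinators, NOT Parsica's implementation.
// A parser is a pure function: string $input -> [mixed $output, string $remainder],
// or null when it fails.

/** Parser that succeeds if the input starts with the character $c. */
function charP(string $c): callable
{
    return fn(string $input): ?array =>
        ($input !== '' && $input[0] === $c) ? [$c, substr($input, 1)] : null;
}

/** Run $p zero or more times and collect the outputs into a list. Never fails. */
function manyP(callable $p): callable
{
    return function (string $input) use ($p): array {
        $outputs = [];
        while (($r = $p($input)) !== null) {
            [$out, $input] = $r;
            $outputs[] = $out;
        }
        return [$outputs, $input];
    };
}

/** Transform the output of $p with $f; the remainder is passed through unchanged. */
function mapP(callable $p, callable $f): callable
{
    return function (string $input) use ($p, $f): ?array {
        $r = $p($input);
        return $r === null ? null : [$f($r[0]), $r[1]];
    };
}

// Build a bigger parser out of smaller ones: count the leading 'a' characters.
$countAs = mapP(manyP(charP('a')), fn(array $xs): int => count($xs));
[$count, $rest] = $countAs('aaab');
echo $count, ' ', $rest, PHP_EOL; // prints "3 b"
```

The point of the sketch is that no parser mutates shared state: each one returns a new value, and larger parsers are built purely by composing smaller ones.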

8
vendor/parsica-php/parsica/TODO.md vendored Normal file

@@ -0,0 +1,8 @@
- experiment with http://docs.php.net/manual/en/book.ds.php to improve performance. It has a polyfill
- iterators in Stream
- generate regex
- look at https://github.com/jubianchi/ppc
- separate StringStream and MBStringStream
- benchmark stream methods individually
- more generally, benchmarking at the lowest levels first
- use try for backtracking?


@@ -0,0 +1,865 @@
<?php
/**
*
* @Name : JsonParser
* @Version : 2.2.1
* @Programmer : Max
* @Date : 2018-06-26, 2018-06-27, 2019-03-23, 2019-03-24, 2019-03-26, 2019-03-27, 2019-04-04
* @Released under : https://github.com/BaseMax/JsonParser/blob/master/LICENSE
* @Repository : https://github.com/BaseMax/JsonParser
*
**/
namespace JPOPHP;
abstract class JsonType
{
// []
const JsonArray = 0;
// {}
const JsonObject = 1;
}
abstract class TokenType
{
// end of file, end of command, end of input string
const TokenEOF = -1;
// [
const TokenArrayOpen = 0;
// ]
const TokenArrayClose = 1;
// {
const TokenObjectOpen = 2;
// }
const TokenObjectClose = 3;
// "...", '...'
const TokenString = 4;
// <int>, <float>, -<int>, -<float>, ...
const TokenNumber = 5;
// ,
const TokenSplit = 6;
// :
const TokenPair = 7;
// null
const TokenNull = 8;
// true, false
const TokenBool = 9;
}
class Json
{
/**
*
* Public variable for whole of the class
*
*/
// token type of current state in decode()
public $token = null;
// length of the input string; updated by decode()
public int $length = 0;
// input string, it updates using decode()
public string $input = "";
// current state and index of pointer at the input string
public int $index = 0;
/**
* @function typeToken($token)
* argument: $token
* return : @string
*/
function typeToken($token)
{
switch ($token[0]) {
// EOF
case TokenType::TokenEOF:
return "EOF";
break;
// [
case TokenType::TokenArrayOpen:
return "ArrayOpen";
break;
// ]
case TokenType::TokenArrayClose:
return "ArrayClose";
break;
// {
case TokenType::TokenObjectOpen:
return "ObjectOpen";
break;
// }
case TokenType::TokenObjectClose:
return "ObjectClose";
break;
// "...", '...'
case TokenType::TokenString:
return "String";
break;
// number
case TokenType::TokenNumber:
return "Number";
break;
// ,
case TokenType::TokenSplit:
return "Split";
break;
// :
case TokenType::TokenPair:
return "Pair";
break;
// unknown, other!
default:
return "None";
break;
}
}
/**
* @function isAssociative($array)
* argument: array $array
* return : @bool
*/
/*
* Arrays are :
* associative or sequential
*/
function isAssociative(array $array)
{
// if(array() === $array)
// return false;
return array_keys($array) !== range(0, count($array) - 1);
}
// function encodeValue($value) {
// if($value === true || $value === false) {//bool
// return $value;
// }
// else if($value === null) {//null
// return $value;
// }
// else if(is_numeric($value) === true) {//number
// return $value;
// }
// else if(is_string($value) === true) {//string
// return "\"".$value."\"";
// }
// else if(is_array($value) === true) {//array
// return encode($value);
// }
// else {
// return false;
// }
// }
/**
* @function encode($array)
* argument: array $array
* return : @string
*/
function encode($array)
{
$response = "";
if ([] !== $array) {
$array_type = $this->isAssociative($array) ? "associative" : "sequential";
if ($array_type == "associative") {//object
$response .= "{";
} else {
$response .= "[";
}
$count = count($array);
$index = 0;
foreach ($array as $key => $value) {
if ($array_type == "associative") {//object
$response .= "\"";
$response .= $key;
$response .= "\"";
$response .= ":";
}
if ($value === true || $value === false) {//bool
$response .= $value ? "true" : "false"; // a plain cast would emit "1" or ""
} elseif ($value === null) {//null
$response .= "null"; // a plain cast would emit ""
} elseif (is_numeric($value) === true) {//number
$response .= $value;
} elseif (is_string($value) === true) {//string
$response .= "\"" . $value . "\"";
} elseif (is_array($value) === true) {//array
$response .= $this->encode($value);
} else {
print "Error: Unknown type!\n";
break;
}
// if(is_array($value)) {
// $response.=this->encode($value);
// }
// else {
// $response.=encodeValue($value);
// }
if (++$index !== $count) {
$response .= ",";
}
}
if ($array_type == "associative") {//object
$response .= "}";
} else {
$response .= "]";
}
}
// $response.="\n";
return $response;
}
// function nextsIfWithSkips($characterIf,$token,$tok)
// function nextsIfWithSkip($characterIf,$token,$tok)
// function nextIfWithSkips($characterIf,$token,$tok)
// function nextIfWithSkip($characterIf,$token,$tok)
/**
* @function nextsIf($characterIf)
* argument: char $characterIf
* return : @void
*/
function nextsIf($characterIf)
{
$character = $this->input[$this->index];
if (is_array($characterIf)) {
while (in_array($character, $characterIf)) {
if ($this->index + 1 === $this->length) {
break;
}
$this->index++;
$character = $this->input[$this->index];
}
} else {
while ($character == $characterIf) {
if ($this->index + 1 === $this->length) {
break;
}
$this->index++;
$character = $this->input[$this->index];
}
}
}
/**
* @function nextIf($characterIf)
* argument: char $characterIf
* return : @void
*/
function nextIf($characterIf)
{
$character = $this->input[$this->index];
if (is_array($characterIf)) {
if (in_array($character, $characterIf)) {
if ($this->index + 1 === $this->length) {
return;
// break;
}
$this->index++;
// $character=$this->input[$this->index];
}
} else {
if ($character == $characterIf) {
if ($this->index + 1 === $this->length) {
return;
// break;
}
$this->index++;
// $character=$this->input[$this->index];
}
}
}
/**
* @function skip($token,tok)
* argument: token $token, tokentype $tok
* return : @token
*/
function skip($token, $tok)
{
// print "---start\n";
if ($token[0] === $tok) {
$token = $this->nextToken();
// print "---next\n";
}
// print "---finish\n";
return $token;
}
/**
* @function skips($token,tok)
* argument: token $token, tokentype $tok
* return : @token
*/
function skips($token, $tok)
{
// print "---start\n";
while ($token[0] === $tok) {
$token = $this->nextToken();
// print "---next\n";
}
// print "---finish\n";
return $token;
}
/**
* @function nextToken()
* argument: void
* return : @token
*/
function nextToken()
{
if ($this->index + 1 > $this->length) {
return [TokenType::TokenEOF, null];
}
$character = $this->input[$this->index];
while ($character == ' ' || $character == "\t" || $character == "\n") {
if ($this->index + 1 === $this->length) {
break;
}
$this->index++;
$character = $this->input[$this->index];
}
if ($character == '{') {
$this->index++;
return [TokenType::TokenObjectOpen, null];
} elseif ($character == '}') {
$this->index++;
return [TokenType::TokenObjectClose, null];
} elseif ($character == '[') {
$this->index++;
return [TokenType::TokenArrayOpen, null];
} elseif ($character == ']') {
$this->index++;
return [TokenType::TokenArrayClose, null];
} elseif ($character == ',') {
$this->index++;
return [TokenType::TokenSplit, null];
} elseif ($character == ':') {
$this->index++;
return [TokenType::TokenPair, null];
} // n, N
elseif ($character === 'n' || $character === 'N') {
$i = 0;
$i++;
$character = $this->input[$this->index + $i];
// u, U
if ($character === 'u' || $character === 'U') {
$i++;
$character = $this->input[$this->index + $i];
// l, L
if ($character === 'l' || $character === 'L') {
$i++;
$character = $this->input[$this->index + $i];
// l, L
if ($character === 'l' || $character === 'L') {
$this->index++;
$this->index++;
$this->index++;
$this->index++;
return [TokenType::TokenNull, null];
}
}
}
} // t, T
elseif ($character === 't' || $character === 'T') {
$i = 0;
$i++;
$character = $this->input[$this->index + $i];
// r, R
if ($character === 'r' || $character === 'R') {
$i++;
$character = $this->input[$this->index + $i];
// u, U
if ($character === 'u' || $character === 'U') {
$i++;
$character = $this->input[$this->index + $i];
// e, E
if ($character === 'e' || $character === 'E') {
$this->index++;
$this->index++;
$this->index++;
$this->index++;
return [TokenType::TokenBool, true];
}
}
}
} // f, F
elseif ($character === 'f' || $character === 'F') {
$i = 0;
$i++;
$character = $this->input[$this->index + $i];
// a, A
if ($character === 'a' || $character === 'A') {
$i++;
$character = $this->input[$this->index + $i];
// l, L
if ($character === 'l' || $character === 'L') {
$i++;
$character = $this->input[$this->index + $i];
// s, S
if ($character === 's' || $character === 'S') {
$i++;
$character = $this->input[$this->index + $i];
// e, E
if ($character === 'e' || $character === 'E') {
$this->index++;
$this->index++;
$this->index++;
$this->index++;
$this->index++;
return [TokenType::TokenBool, false];
}
}
}
}
} elseif ($character === '"' || $character === '\'') {
$stype = null;
if ($character === '"') {
$stype = 1;
} elseif ($character === '\'') {
$stype = 2;
}
$result = "";
$characterPrev = "";
$this->index++;
$character = $this->input[$this->index];
$characterNext = null;
while (
($stype === 1 && $characterNext !== '"') ||
($stype === 2 && $characterNext !== '\'')
) {
if ($this->index == $this->length) {
break;
}
$character = $this->input[$this->index];
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
// It added by me, not in the standard JSON!
if ($character === '\\' && $characterNext === '\'') {
$this->index++;
// $this->index++;
$character = $characterNext;
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} /**
*
* @Name : Unicode Support
* @Description : This feature is requested by Frederick Behrends.
* @Url, @Issue : https://github.com/BaseMax/JsonParser/issues/1
*/
elseif ($character === '\\' && $characterNext === 'u') {
$unicode = '';
$this->index++;
$this->index++;
$perform = true;
$i = 1; // We require it after the loop!
for (; $i <= 4; $i++) {
// print "...\n";
// if($perform === false) {
// break;
// }
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index]; // As temp variable
$perform = true;
if (
$characterNext >= '0' && $characterNext <= '9' ||
$characterNext >= 'A' && $characterNext <= 'F'
) {
$character = $characterNext;// It will use when loop break! ($perform=false)
$unicode .= $character;
$this->index++;
} else { // May be " character!
// print "A stage\n";
// print $unicode."\n";
// print $character."\n";
$perform = false;
break;
}
} else {
// print "B stage\n";
$perform = false;
break;
}
}
if ($perform === true) {
$this->index--; // Required...
// print "C Stage\n";
// print $unicode."\n";
$unicode = "%u" . $unicode;
# $unicode = preg_replace('/%u([0-9A-F]){4}/', '&#x$1;', $unicode);
$unicode = preg_replace('/%u([0-9A-F]+)/', '&#x$1;', $unicode);
// ENT_COMPAT : Will convert double-quotes and leave single-quotes alone.
// https://www.php.net/manual/en/function.htmlentities.php
// print $unicode."\n";
$character = html_entity_decode($unicode, ENT_COMPAT, 'UTF-8');
// print $character."\n";
} else {
// Last index is $i
// We ($i-1) time rub the $index++
// print $i."\n";
// for($ii=1;$ii<$i-1;$ii++) {
// $this->index--;
// }
$this->index--;
$character = "\\u" . $unicode;
}
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === 'n') {
$this->index++;
// $this->index++;
$character = "\n";
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === '\\') {
$this->index++;
// $this->index++;
$character = "\\";
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === '/') {
$this->index++;
// $this->index++;
$character = "/";
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === 't') {
$this->index++;
// $this->index++;
$character = "\t";
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === 'r') {
$this->index++;
// $this->index++;
$character = "\r";
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === 'b') {
$this->index++;
// $this->index++;
$character = "\x08"; // backspace; PHP has no "\b" escape, so "\b" would be a literal backslash plus "b"
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
} elseif ($character === '\\' && $characterNext === '"') {
$this->index++;
// $this->index++;
$character = $characterNext;
// Fix: "hi\"!"
if ($this->index + 1 < $this->length) {
$characterNext = $this->input[$this->index + 1];
} else {
$characterNext = null;
}
}
// else {
// $this->index++;
// }
// else if($character == '"') {
// // $this->index--;
// // $this->index--;
// // $character='"';
// break;
// continue;
// }
$result .= $character;
$this->index++;
}
$this->index++;
return [TokenType::TokenString, $result];
}
// <int>(0 .. 9), -, .
// Allow : .9, .04
// Allow : -5
// Allow : -5.048
elseif (($character >= '0' && $character <= '9') || $character == '-' || $character == '.') {
$result = 0;
$bitflag = false;
$bitfloat = false;
$bitfloatindex = 0;
// while($character >='0' && $character <='9') {
while (
($character >= '0' && $character <= '9') ||
$character == '-' ||
$character == '.'
) {
if ($this->index == $this->length) {
break;
}
// $result*=10+(int)$character;
if ($bitflag === false && $character === '-') {
$bitflag = true;
} elseif ($bitflag === true && $character === '-') {
// Error
exit("Expression already has a minus sign!\n");
} else {
if ($bitfloat === false && $character == '.') {
$bitfloat = true;
// $bitfloatindex=0;
} elseif ($bitfloat === true && $character == '.') {
// Error: a second decimal point (the check must use $bitfloat, not $bitflag)
exit("Expression already has a decimal point!\n");
} // else if($bitflag === true && ($character == 'e' || $character == 'E')) {
elseif ($character == 'e' || $character == 'E') {
//soon
exit("Exponent notation (e.g. E+5) is not supported yet!\n");
} elseif ($bitfloat === true) {
$bitfloatindex++;
$floatcurrent = pow(10, $bitfloatindex);
$result = $result + ((int)$character / $floatcurrent);
} // else if($bitfloat === false) {
else {
$result = $result * 10;
$result = $result + (int)$character;
}
}
$this->index++;
if ($this->index + 1 < $this->length) {
$character = $this->input[$this->index];
} else {
$character = null;
}
}
if ($bitflag === true) {
$result *= -1;
}
// $this->index++;
return [TokenType::TokenNumber, $result];
} else {
$this->index++;
}
return [TokenType::TokenEOF, null];
}
/**
* @function isValue(token)
* argument: token $token
* return : array[bool $status, string $result]
*/
function isValue($token)
{
/**
* Values:
* <int>, <float>, - <int>, - <float>, -<int>(e|E)(+|-)<int>, <int>(e|E)(+|-)<int>, -<float>(e|E)(+|-)<int>, <float>(e|E)(+|-)<int>
*
* <bool> (true, false)
*
* <null> (null)
*
* <string> ("...", '...')
*
* <object> {...}
*
* <array> [...]
*/
if ($token[0] === TokenType::TokenNumber) {
return [true, $token[1]];
// return [true,null];
// return true;
} elseif ($token[0] === TokenType::TokenString) {
return [true, $token[1]];
// return [true,null];
// return true;
} elseif ($token[0] === TokenType::TokenBool) {
return [true, $token[1]];
} elseif ($token[0] === TokenType::TokenNull) {
return [true, null];
} elseif ($token[0] === TokenType::TokenObjectOpen) {
// $tree=0;
// $result=[];
$this->index--;
// print "...\n";
$result = $this->decode(null, false);
// while($token[0] != TokenType::TokenObjectClose) {
// }
// print_r($result);
return [true, $result];
// return true;
} elseif ($token[0] === TokenType::TokenArrayOpen) {
// $tree=0;
// $result=[];
$this->index--;
// print "...\n";
// print $this->input."\n";
// print $this->index."\n";
// print $this->input[$this->index]."\n";
$result = $this->decode(null, false);
// while($token[0] != TokenType::TokenObjectClose) {
//
// }
return [true, $result];
// return true;
}
return [false, null];
// return false;
}
/**
* @function decode(input,init=true)
* argument: string $input, bool init
* return : array[...]
*/
function decode($input, $init = true)
{
if ($init === true) {
// $this->tree=[];
// $this->trees=[];
$this->index = 0;
$this->input = $input;
$this->length = strlen($input); // byte length, because the input is indexed byte by byte
// $this->tree=null;
} else {
// $this->tree=0;
}
$result = [];
$this->token = $this->nextToken();
// print_r($token);
// $arrayOpen=false;
// $objectOpen=false;
// // skip spaces
// $this->nextsIf([" ","\n"," "]);
if (
$this->token[0] === TokenType::TokenArrayOpen ||
$this->token[0] === TokenType::TokenObjectOpen
) {
$type = null;
if ($this->token[0] === TokenType::TokenArrayOpen) {
$type = JsonType::JsonArray;
} elseif ($this->token[0] === TokenType::TokenObjectOpen) {
$type = JsonType::JsonObject;
}
// $this->tree[]=$this->token[0];
$this->token = $this->nextToken();
// // skip spaces
// $this->nextsIf([" ","\n"," "]);
// // skip split
// $this->skips($token,TokenType::TokenSplit);
// // skip spaces
// $this->nextsIf([" ","\n"," "]);
// // skip space(s) and split(s)
// $this->nextsIfWithSkips([" ","\n"," "],$token,TokenType::TokenSplit);
// skip split
// print_r($token);
$this->token = $this->skips($this->token, TokenType::TokenSplit);
// print_r($token);
// exit;
// parse until arrayClose
while (
($type === JsonType::JsonArray && $this->token[0] !== TokenType::TokenArrayClose) ||
($type === JsonType::JsonObject && $this->token[0] !== TokenType::TokenObjectClose)
) {
// print "==>".$this->typeToken($this->token) ."\n";
if ($this->token[0] === TokenType::TokenEOF) {
exit("Command is finish, but arrayClose not found!\n");
}
$first = null;
$second = null;
$first = $this->isValue($this->token);
if ($first[0] === true) {
// print "----yes\n";
// $first=$token;
// $first=$this->token;
// next may be was pair or split or arrayClose or EOF!
$this->token = $this->nextToken();
// print "\t==>".$this->typeToken($token) ."\n";
if ($type === JsonType::JsonObject) {
if ($this->token[0] === TokenType::TokenPair) {
if (is_string($first[1]) === true) {
$this->token = $this->nextToken();
// $second=$this->token;
$second = $this->isValue($this->token);
if ($second[0] === true) {
$this->token = $this->nextToken();
} else {
// Error!
exit("Unknown token, the pair value is not a value!\n");
}
} else {
// Error!
exit("Unknown token, the key of a pair is not a string!\n");
}
} else {
// Error!
exit("Unknown token, every item of an object should be a key/value pair!\n");
}
}
// its a array JSON
if ($second === null) {
/**
* result[index]
* =
* <value> (first)
*/
$result[] = $first[1];
} // its a object JSON
else {
/**
* result[
* <value> (first)
* ]
* =
* <value> (second)
*/
$result[$first[1]] = $second[1];
}
// print_r($result);
$this->token = $this->skips($this->token, TokenType::TokenSplit);
// $this->skips($token,TokenType::TokenSplit);
} else {
// print "----no\n";
// print_r($token);
// print $this->input."\n";
// print $this->index."\n";
// print $this->input[$this->index]."\n";
$this->token = $this->nextToken();
// print_r($token);
}
// print "\t==>".$this->typeToken($token) ."\n";
// print_r($token);
}
}
// else if($this->token[0] === TokenType::TokenObjectOpen) {
// // $this->tree[]=$this->token[0];
// }
elseif ($this->token[0] === TokenType::TokenEOF) {
// ;
// return;
// exit;
} else {
// Error!
exit("Unknown token at the beginning of the input!\n");
}
// print_r($result);
return $result;
}
}


@@ -0,0 +1 @@
This JSON parser was copied from https://github.com/BaseMax/JPOPHP for comparison against Parsica's JSON parser.


@@ -0,0 +1,85 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
use Parsica\Parsica\JSON\JSON as ParsicaJSON;
use Json as BaseMaxJson;
class JSONBench
{
private string $data;
function __construct()
{
$this->data = <<<JSON
{
"name": "mathiasverraes/parsica",
"type": "library",
"alotoftext": [
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet",
"Lorem Ipsum dolor sit amet"
],
"alotmoretext": "Fuga iusto dolores ipsam. Qui excepturi veniam iste autem ducimus porro et voluptas. Veniam veniam ducimus cumque facere repudiandae corrupti sint quas. Cupiditate asperiores iure omnis dolores nihil asperiores qui quo. Assumenda quia iure deserunt deserunt. Perspiciatis velit quia et.\n\nExplicabo non dolores aut facere. Perferendis in est voluptate. Et laboriosam et autem voluptatum rem nam et aut. Voluptatem praesentium et earum fugit accusamus tempore consectetur natus. Beatae sunt nisi rerum blanditiis consequatur rerum ut.\n\nIure ipsa sit assumenda. Vitae nisi qui vero. Eveniet cum aliquam molestiae molestias. Nisi aut ea alias quo ea voluptatem. Minus ea mollitia quis.",
"description": "The easiest way to build robust parsers in PHP.",
"keywords": [
"parser",
"parser-combinator",
"parser combinator",
"parsing"
]
}
JSON;
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_json_decode()
{
json_decode($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_Parsica_JSON()
{
$result = ParsicaJSON::json()->tryString($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_basemax_jpophp()
{
require_once(__DIR__.'/JPOPHP/JsonParser.php');
(new JPOPHP\Json())->decode($this->data);
}
}

View File

@@ -0,0 +1,96 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
use Parsica\Parsica\Parser;
use function Parsica\Parsica\any;
use function Parsica\Parsica\char;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\many;
use function Parsica\Parsica\map;
use function Parsica\Parsica\pure;
use function Parsica\Parsica\recursive;
use function Parsica\Parsica\satisfy;
use function Parsica\Parsica\takeWhile;
class ManyBench
{
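private string $data;
// Declare the parsers assigned in the constructor; dynamic properties are deprecated since PHP 8.2.
private Parser $takeWhile;
private Parser $manySatisfy;
private Parser $manyChar;
private Parser $oldManySatisfy;
private Parser $oldManyChar;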
function __construct()
{
$this->data = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
$this->takeWhile = takeWhile(fn(string $c): bool => $c === 'a');
$this->manySatisfy = many(satisfy(fn(string $c): bool => $c === 'a'));
$this->manyChar = many(char('a'));
$this->oldManySatisfy = static::oldMany(satisfy(fn(string $c): bool => $c === 'a'));
$this->oldManyChar = static::oldMany(char('a'));
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_takeWhile()
{
$result = $this->takeWhile->tryString($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_manySatisfy()
{
$result = $this->manySatisfy->tryString($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_manyChar()
{
$result = $this->manyChar->tryString($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_oldManySatisfy()
{
$result = $this->oldManySatisfy->tryString($this->data);
}
/**
* @Revs(10)
* @Iterations(10)
*/
public function bench_oldManyChar()
{
$result = $this->oldManyChar->tryString($this->data);
}
public static function oldMany(Parser $parser)
{
$rec = recursive();
$rec->recurse(
any(
map(
collect($parser, $rec),
fn(array $o): array => array_merge([$o[0]], $o[1])
),
pure([]),
)
);
return $rec;
}
}

View File

@@ -0,0 +1,68 @@
{
"name": "parsica-php/parsica",
"type": "library",
"description": "The easiest way to build robust parsers in PHP.",
"keywords": [
"parser",
"parser-combinator",
"parser combinator",
"parsing"
],
"homepage": "https://parsica-php.github.io/",
"license": "MIT",
"authors": [
{
"name": "Mathias Verraes",
"email": "mathias@verraes.net",
"homepage": "https://verraes.net"
},
{
"name": "Toon Daelman",
"email": "spinnewebber_toon@hotmail.com",
"homepage": "https://github.com/turanct"
}
],
"require": {
"php": "^7.4 || ^8.0",
"ext-mbstring": "*"
},
"require-dev": {
"ext-json": "*",
"mathiasverraes/uptodocs": "dev-main",
"phpunit/phpunit": "^9.0",
"phpbench/phpbench": "^1.0.1",
"psr/event-dispatcher": "^1.0",
"vimeo/psalm": "^4.30"
},
"autoload": {
"psr-4": {
"Parsica\\Parsica\\": "src/"
},
"files": [
"src/characters.php",
"src/combinators.php",
"src/numeric.php",
"src/predicates.php",
"src/primitives.php",
"src/recursion.php",
"src/sideEffects.php",
"src/space.php",
"src/strings.php",
"src/Expression/expression.php",
"src/Internal/FP.php",
"src/Curry/functions.php"
]
},
"autoload-dev": {
"psr-4": {
"Tests\\Parsica\\Parsica\\": "tests/"
}
},
"scripts": {
"test": ["@phpunit", "@psalm", "@uptodocs"],
"phpunit": "vendor/bin/phpunit",
"psalm": "vendor/bin/psalm",
"uptodocs": "docs/testdocs",
"benchmark": "phpbench run benchmarks --report=aggregate"
}
}

View File

@@ -0,0 +1,65 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
/** @noinspection ALL */
namespace Docs;
// This code is executed by UpToDocs before each code block
require_once __DIR__ . '/../vendor/autoload.php';
use Parsica\Parsica\Parser;
use Parsica\Parsica\ParserHasFailed;
use Parsica\Parsica\StringStream;
use function Parsica\Parsica\{alphaChar,
alphaNumChar,
atLeastOne,
between,
char,
charI,
choice,
collect,
digitChar,
either,
eof,
Expression\binaryOperator,
Expression\expression,
Expression\leftAssoc,
Expression\nonAssoc,
Expression\postfix,
Expression\prefix,
Expression\rightAssoc,
Expression\unaryOperator,
float,
isDigit,
isEqual,
isWhitespace,
keepFirst,
many,
notFollowedBy,
optional,
orPred,
punctuationChar,
recursive,
repeat,
satisfy,
sepBy,
sepBy1,
sequence,
skipHSpace,
skipSpace1,
some,
string,
stringI,
takeWhile,
upperChar,
whitespace,
zeroOrMore};
use function PHPUnit\Framework\{assertEquals, assertFalse, assertInstanceOf, assertIsString, assertTrue, assertSame};

View File

@@ -0,0 +1,14 @@
---
title: Design Goals
sidebar_label: Design Goals
---
Parsica aims to be the mainstream choice for anyone to create parsers. We want to support all use cases. When parsing a short string, Parsica should be worth picking over regular expressions; when parsing an entire language, it should be worth picking over a handwritten imperative parser. The API should be self-evident: it should be easy to get it right and hard to get it wrong.
Developers should not have to learn anything other than this library itself: no need to learn FP, category theory, parser theory, or even the internals of this library. Under the hood, we use theoretical concepts. However, when adhering to these concepts would require exposing them to the developers, we will prefer a tradeoff that hides them.
The same goes for performance: Parsica should be performant enough to be a viable choice, but for most use cases, developers should not have to worry about learning how to achieve greater performance.
Parsica puts great focus on composability. To achieve this, we use immutability and referential transparency — not for the sake of perfection, but because these help to achieve effortless composition.
Finally, it should be easy for third party library authors to publish their own parsers as Composer packages, which in turn can be composed by other users.

View File

@@ -0,0 +1,36 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
/** @noinspection ALL */
namespace Docs;
// This code is executed by UpToDocs before each code block
require_once __DIR__ . '/../vendor/autoload.php';
use Parsica\Parsica\Parser;
use function Parsica\Parsica\Expression\{binaryOperator,
expression,
leftAssoc,
postfix,
prefix,
rightAssoc,
unaryOperator};
use function Parsica\Parsica\{atLeastOne, between, char, choice, digitChar, keepFirst, recursive, skipHSpace, string, alphaChar};
use function PHPUnit\Framework\{assertSame, assertEquals};
assert_options(ASSERT_ACTIVE, 1);
$token = fn(Parser $parser) => keepFirst($parser, skipHSpace());
$term = fn(): Parser => $token(atLeastOne(digitChar()))->map('intval');
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));

View File

@@ -0,0 +1,42 @@
---
title: Parsica
sidebar_label: About Parsica
---
The easiest way to build robust parsers in PHP.
**Note:** Parsica is very early stage, expect things to break.
* [Releases](releases)
* [Installation & Requirements](installation)
* [API Reference](api/index)
## Donate
Donate via my [GitHub Sponsor Page](https://github.com/sponsors/turanct).
A lot of research & development went into this project. We think it can become the mainstream choice for building reliable parsers in PHP, and serve as a foundation for many advancements. Your support will help us to reach that goal.
## Contribute
* [Design Goals](contribute/design_goals)
* Contribute by submitting code, documentation, examples, ... through pull requests.
* [Code of Conduct](CODE_OF_CONDUCT)
## Support
### Commercial training & support
E-mail us at [contact@value-object.com](mailto:contact@value-object.com) to discuss options.
### Community support
Submit questions as GitHub issues. Help us help you by submitting short bits of code that demonstrate the problem, and that can easily be copied and run.
## Links
* Official Site: [parsica-php.github.io](https://parsica-php.github.io)
* Twitter: [@parsica_php](https://twitter.com/parsica_php)
* GitHub: [parsica-php/parsica](https://github.com/parsica-php/parsica)
* Packagist: [parsica-php/parsica](https://packagist.org/packages/parsica-php/parsica)
* License: [MIT](LICENSE)

View File

@@ -0,0 +1,67 @@
---
title: Installation & Requirements
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
<Tabs
defaultValue="cli"
values={[
{ label: 'Command line', value: 'cli', },
{ label: 'composer.json', value: 'composer', },
]
}>
<TabItem value="cli">
```bash
composer require parsica-php/parsica
```
</TabItem>
<TabItem value="composer">
```json
"require": {
"parsica-php/parsica": "dev-main"
}
```
</TabItem>
</Tabs>
## Requirements
- PHP 7.4 or higher
- [The multibyte string extension for PHP (aka mbstring)](https://www.php.net/manual/en/book.mbstring.php)
(@TODO: add polyfill for mbstring).
## Usage
In a .php file, make sure the Composer autoloader is included:
`require_once __DIR__.'/../vendor/autoload.php';`
Import parsers and combinators:
`use function Parsica\Parsica\char;`
You can combine multiple imports in one statement:
`use function Parsica\Parsica\{between, char, atLeastOne, alphaChar};`
Finally, add some code:
```php
<?php
$parser = between(char('{'), char('}'), atLeastOne(alphaChar()));
$result = $parser->tryString("{Hello}");
echo $result->output();
// outputs "Hello"
```

View File

@@ -0,0 +1,43 @@
---
title: Development Status
sidebar_label: Development Status
---
Parsica is early stage, so expect things to break all the time.
This is a rough wishlist of features to do before 1.0:
### Done
- [x] API Documentation
- [x] All essential parsers
- [x] Basic error messages
- [x] PHPUnit tooling
- [x] Recursive parsers
- [x] Versioned documentation
- [x] Essential combinators
- [x] JSON parser
- [x] Parser position in error messages
- [x] Expression parser helpers
- [x] Tutorial
### TODO
- [ ] Streaming input
- [ ] Change the behaviour of or, add try and lookAhead
- [ ] Better parser assertions
- [ ] Better exceptions
- [ ] Character categories
- [ ] Comparison tests for canonical and performant implementations
- [ ] Debug trees
- [ ] Inliner
- [ ] Lexer
- [ ] Monoidal parser types
- [ ] More [monad combinators](https://hackage.haskell.org/package/base-4.14.0.0/docs/Control-Monad.html#v:-62--61--62-)
- [ ] Other popular test frameworks
- [ ] [Permutation phrases](https://www.cs.ox.ac.uk/jeremy.gibbons/wg21/meeting56/loeh-paper.pdf)
- [ ] Parser state
- [ ] Profiling & performance
- [ ] Publish documentation in e-reader and pdf formats

View File

@@ -0,0 +1,83 @@
---
title: Performance
sidebar_label: Performance
---
At the time of writing, no effort has been made to measure the performance of Parsica. That doesn't mean it's slow; it means that we don't know yet. If you're going to use this on large amounts of data, do some profiling yourself first. Compute == carbon, and we'd like to keep this planet a little longer. You can help by contributing your profiling and optimisations.
We have some ideas that will allow us to make it very efficient, and we intend to do that before getting to a 1.0 release.
## XDebug
Turn off XDebug, as it will make things much slower. If you do turn on XDebug, you may get `Maximum function nesting level of '256' reached, aborting!`. Increase the nesting level until the error goes away, either in code or in `php.ini`:
```php
<?php
ini_set('xdebug.max_nesting_level', '1024');
```
```ini
xdebug.max_nesting_level=1024
```
## Recursion
If you encounter a "Maximum function nesting level" error, the more likely problem is that you're building a recursive parser incorrectly. Have a look at the documentation page about recursion to learn more.
## Performance tips
Below we'll list some approaches to improve performance.
The actual difference in performance depends on many factors, so measure your parsers' performance to know if it is actually faster.
### Reusing parsers is faster than rebuilding them
Storing parsers in a variable or property is often faster than rebuilding them. Compare these two equivalent parsers:
```php
<?php
$slow = between(
choice(char('"'), char("'")),
choice(char('"'), char("'")),
atLeastOne(alphaNumChar())
);
$quote = choice(char('"'), char("'"));
$fast = between(
$quote,
$quote,
atLeastOne(alphaNumChar())
);
```
### Use predicates over higher level combinators
Often, a combinator may be replaced with lower level combinators to get the same result faster. For example, the following parsers are equivalent, but the second one is a lot faster:
```php
<?php
$somePredicate = isDigit();
$slow = zeroOrMore(satisfy($somePredicate));
$fast = takeWhile($somePredicate);
```
The reason is that `$slow` reads one token at a time, and then appends it to the previous tokens. `$fast`, on the other hand, reads all the tokens until `$somePredicate` fails, and then returns them all at once.
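To make the difference concrete, here is a rough plain-PHP model of the two strategies. It is only a sketch: the closures `$oneAtATime` and `$allAtOnce` are hypothetical names, they work on ASCII input only, and they are not how Parsica implements `zeroOrMore` or `takeWhile` internally.

```php
<?php
// Simplified model, not Parsica's actual implementation.
$isDigit = fn(string $c): bool => strlen($c) === 1 && strpos('0123456789', $c) !== false;

// Like zeroOrMore(satisfy(...)): take one character, append it, repeat.
$oneAtATime = function (string $input) use ($isDigit): array {
    $output = '';
    while ($input !== '' && $isDigit($input[0])) {
        $output .= $input[0];          // append a single token
        $input = substr($input, 1);    // copy the rest of the input on every step
    }
    return [$output, $input];
};

// Like takeWhile(...): find where the predicate stops, then slice once.
$allAtOnce = function (string $input) use ($isDigit): array {
    $length = strlen($input);
    $i = 0;
    while ($i < $length && $isDigit($input[$i])) {
        $i++;
    }
    return [substr($input, 0, $i), substr($input, $i)];
};

$a = $oneAtATime("2024-01");
$b = $allAtOnce("2024-01");
// Both return ["2024", "-01"].
```

Both closures produce the same output and remainder; the second one simply avoids building intermediate strings for every character.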
### Backtracking is slower
If your parser parses a long input, only to need to backtrack the whole thing when it fails, it's going to be slow. A better alternative is to organise your usage of choice in a way that only small chunks of the input need to be backtracked.
```php
<?php
$parser = choice(
atLeastOne(alphaChar())->thenEof(),
atLeastOne(alphaNumChar())->thenEof()
);
$result = $parser->tryString("abc123");
```
In this example, the choice parser parses "abc", fails on "1", backtracks, and then parses all of "abc123". If we switch the two parsers inside the choice parser, we are more likely to reach the end of the input without doing any backtracking.

View File

@@ -0,0 +1,38 @@
---
title: Naming Conventions
sidebar_label: Naming Conventions
---
## String and Character
PHP doesn't have a separate type for strings and characters, as opposed to some languages where string is defined as a list of characters. Still, as a convention in Parsica and its documentation, we generally use `'a'`, `'1'` (single quoted) to indicate a single character, and `"a"`, `"abc123"` (double quoted) to indicate a string.
We also use single quotes to indicate constant strings or symbols, such as `'STATUS_SUCCESS'`.
## Predicates
Predicates are either prefixed with `is` or suffixed with `Pred`.
```php
<?php
$predicate = orPred(isEqual('5'), isEqual('6'));
assertTrue($predicate('6'));
```
## Character Parsers
A parser for a single character is always suffixed with `Char`, as in `digitChar()`. These always output a string.
## Case
Some parsers have case-insensitive versions. These are suffixed with `I`.
```php
<?php
$parser = stringI('hello world');
$result = $parser->tryString("hElLO WoRlD");
assertEquals("hElLO WoRlD", $result->output());
```

27
vendor/parsica-php/parsica/docs/testdocs vendored Executable file
View File

@@ -0,0 +1,27 @@
#!/usr/bin/env bash
# Execute the docs to make sure all code examples are in sync with the Parsica code.
#
set -e
OUTPUTFORMAT=${1:-console}
vendor/bin/uptodocs run README.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/contribute/design_goals.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/resources/01_development_status.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/resources/02_performance.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/resources/03_naming_conventions.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/01_introduction.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/02_building_blocks.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/03_combinators.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/04_running_parsers.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/05_mapping_to_objects.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/06_order_matters.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/07_recursion.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/08_look_ahead.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/09_errors_and_labels.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/10_side_effects.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/11_dealing_with_space.md --before=docs/before.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/20_expressions.md --before=docs/expressions.php --output-format=$OUTPUTFORMAT
vendor/bin/uptodocs run docs/tutorial/90_functional_paradigms.md --before=docs/before.php --output-format=$OUTPUTFORMAT

View File

@@ -0,0 +1,149 @@
---
title: What are parser combinators?
---
Before you start, make sure you've had a look at the installation instructions.
## Parsers
```php
<?php
$parser = char(':')
->append(atLeastOne(punctuationChar()))
->label('smiley');
$result = $parser->tryString(':*{)');
echo $result->output() . " is a valid smiley!";
```
A parser is a function that takes some unstructured input (like a string) and turns it into structured output, that's easier to work with. This output could be as simple as a slightly better structured string, or an array, an object, up to a complete abstract syntax tree. You can then use this data structure for subsequent processing.
You're probably using parsers all the time, such as `json_decode()`. And even just casting a string to a float <sup>[footnote 1](#floatval)</sup> really is parsing.
Parsica helps you build your own parsers, in a concise, declarative way. Behind the scenes it takes care of things like error handling, so you can focus on the parser itself.
## Building a parser
There are many ways to build a parser for your own use case, ranging from formal grammars that get compiled into a parser, to regular expressions, to writing a parser entirely from scratch. They all have their own tradeoffs and limitations.
One of the great benefits of the parser combinator style is that, once you get the hang of it, they're generally easier to write, understand, and maintain. You start from building blocks, such as `digitChar()`, which returns a function that parses a single digit.
```php
<?php
$parser = digitChar();
$input = "1. Write Docs";
$result = $parser->tryString($input);
$output = $result->output();
assertSame("1", $output);
assertIsString($output);
```
## Parser Combinators
Parser Combinators are functions (or methods) that combine parsers into new parsers. Instead of writing one big parser, we can now write smaller parsers and cleverly compose them into larger parsers.
```php
<?php
$parser = char('a')->append(char('b'));
$result = $parser->tryString("abc");
$output = $result->output();
assertEquals("ab", $output);
```
```php
<?php
$parser =
collect(
string("Hello")->thenIgnore(char(",")),
string("world")->thenIgnore(char("!")),
);
$result = $parser->tryString("Hello,world!");
$output = $result->output();
assertEquals(["Hello", "world"], $output);
```
To make this work, we need a small change in our original definition of a parser.
> A parser is a function that takes some unstructured input (such as a string), and returns a more structured output, as well as the remaining unparsed part of the input.
This way, each parser function can parse a chunk of the input, and leave the remainder to another parser. The combinators take care of the heavy lifting: pass the input to the parser functions, pass the remainder to the next one, decide what to do with errors (e.g. fail, backtrack, or try another parser), ...
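As a rough illustration of that definition, a parser can be modelled in plain PHP as a closure that returns the output together with the remainder. This is a sketch under simplifying assumptions: `$char` and `$sequence` are hypothetical closures written for this example, failure is modelled as `null`, and none of it is Parsica's real implementation.

```php
<?php
// Simplified model, not Parsica's actual implementation.
// A "parser" here is a closure: string -> [output, remainder], or null on failure.
$char = fn(string $expected): \Closure =>
    fn(string $input): ?array => ($input !== '' && $input[0] === $expected)
        ? [$expected, substr($input, 1)]
        : null;

// Run $first, hand its remainder to $second, keep only $second's output.
$sequence = fn(\Closure $first, \Closure $second): \Closure =>
    function (string $input) use ($first, $second): ?array {
        $result = $first($input);
        if ($result === null) {
            return null;              // the first parser failed: stop here
        }
        [, $remainder] = $result;
        return $second($remainder);   // the second parser only sees what is left
    };

$ab = $sequence($char('a'), $char('b'));
$ok = $ab("abc");    // ["b", "c"]
$fail = $ab("xbc");  // null
```

The combinator never inspects the input itself; it only threads the remainder from one parser to the next.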
We can inspect the remainder:
```php
<?php
$parser = sequence(char('a'), char('b'));
$result = $parser->tryString("abc");
assertEquals("b", $result->output());
assertEquals("c", $result->remainder());
```
So when we run our parser using `$parser->tryString($input)`, the `sequence()` combinator first tries to run `char('a')` on the input `"abc"`. If it succeeds, it takes the remainder `"bc"` and successfully runs `char('b')` on it and returns the result. That result consists of the output from the last parser `"b"`, and the remainder `"c"`.
In imperative code, it would look something like this:
```php
<?php
final class MyParser
{
public function try(string $input) : array
{
$output1 = substr($input, 0, 1); // "a"
if ($output1 == 'a') {
$remainder1 = substr($input, 1); // "bc"
$output2 = substr($remainder1, 0, 1); // "b"
if ($output2 == 'b') {
$remainder2 = substr($remainder1, 1); // "c"
} else {
throw new Exception("Parser failed");
}
} else {
throw new Exception("Parser failed");
}
return ['output' => $output2, 'remainder' => $remainder2];
}
}
$parser = new MyParser();
$result = $parser->try("abc");
assertEquals('b', $result['output']);
assertEquals('c', $result['remainder']);
```
If you've been working in PHP long enough and have never used parser combinators, the code above may look more familiar for now. But imagine scaling that to parse anything from simple formats like credit card numbers, recursive structures like JSON or XML, or even entire programming languages like PHP. And that doesn't even include the code you'd need for performance, testing and debugging tooling, code reuse, and reporting on bad input. If you'd rather write `sequence(char('a'), char('b'))`, stick around.
### Footnotes
#### <a name="floatval">Note 1</a>
```php
<?php
$v = floatval("1.23");
assertSame(1.23, $v);
```
The above looks fine at first sight, but `floatval()` really isn't a very good parser.
```php
<?php
assertSame(0.0, floatval("abc"));
```
`floatval()` claims that the float of `"abc"` is `0`, which really should be an error. So you can only use `floatval` when you already know that the string doesn't contain anything non-float. Parsica can help you do that:
```php
<?php
$parser = float()->map(fn($v) => floatval($v));
try {
// works:
$result = $parser->tryString("1.23");
assertSame(1.23, $result->output());
// throws a ParserHasFailed exception with message "Expected: float, got abc"
$result = $parser->tryString("abc");
} catch (ParserHasFailed $e) {}
```

View File

@@ -0,0 +1,94 @@
---
title: Building Blocks
---
## Predicates
The simplest building block is a parser that only considers the first character of an input. If the character satisfies some condition, we consume it from the input. We could write that with some `if` statements and `substr` calls, but Parsica provides abstractions for that.
```php
<?php
$parser = satisfy(isEqual('a'));
$input = "abc";
$result = $parser->tryString($input);
assertEquals("a", $result->output());
assertEquals("bc", $result->remainder());
```
`isEqual('a')` is a predicate. If you call it with another argument, you get a boolean: `isEqual('a')('b') == false`.
`satisfy($predicate)` is a function that returns a `Parser` object. You can think of it as a parser constructor. This object will do the heavy lifting of taking the first character of `$input`, and testing it with the predicate.
Parsica comes with some useful predicates, including boolean and/or/not combinators:
```php
<?php
$parser = satisfy(orPred(isDigit(), isWhitespace()));
```
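To show how predicates and `satisfy` fit together, here is a plain-PHP model. The closures below reuse the names `isEqual`, `isDigit`, `orPred`, and `satisfy` only for readability; they are hypothetical stand-ins written for this sketch, handle ASCII only, and model failure as `null` rather than using Parsica's result objects.

```php
<?php
// Simplified model with stand-in closures, not Parsica's actual implementation.
// A predicate is a closure: string -> bool.
$isEqual = fn(string $x): \Closure => fn(string $y): bool => $x === $y;
$isDigit = fn(): \Closure => fn(string $c): bool => strlen($c) === 1 && strpos('0123456789', $c) !== false;
$orPred  = fn(\Closure $p, \Closure $q): \Closure => fn(string $c): bool => $p($c) || $q($c);

// satisfy() turns a predicate into a parser for exactly one character.
$satisfy = fn(\Closure $predicate): \Closure =>
    function (string $input) use ($predicate): ?array {
        if ($input === '' || !$predicate($input[0])) {
            return null;                        // nothing is consumed on failure
        }
        return [$input[0], substr($input, 1)];  // [output, remainder]
    };

$parser = $satisfy($orPred($isDigit(), $isEqual('a')));
$r1 = $parser("abc");  // ["a", "bc"]
$r2 = $parser("7up");  // ["7", "up"]
$r3 = $parser("xyz");  // null
```

Because predicates are ordinary functions from a character to a boolean, combining them with and/or/not needs no knowledge of parsing at all.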
## Character parsers
In practice, you may not need to use predicates and `satisfy` very often. The characters API provides commonly used parsers for single characters instead:
```php
<?php
$parser = char('a');
```
`char($x)` is defined as `satisfy(isEqual($x))` so the code above is equivalent to the first example. `charI()` is the case-insensitive version of `char()`. It preserves the case as is:
```php
<?php
$parser = charI('a');
$result = $parser->tryString("ABC");
assertEquals("A", $result->output());
$result = $parser->tryString("abc");
assertEquals("a", $result->output());
```
Parsica provides various parsers for groups of characters, like `alphaNumChar`, `upperChar`, `punctuationChar`, `newline`, and `digitChar`. You can find them all listed in the API Reference.
```php
<?php
$parser = digitChar();
$result = $parser->tryString('123');
assertEquals('1', $result->output());
```
Note that even though we parsed a `digitChar`, the output is a string, not an int. That's because at this point, we're parsing characters. We'll talk about outputting other types than string later.
## Strings
For longer sequences of characters, you can use `string` and `stringI`. Keep in mind that `stringI` is not just case-insensitive, but also case-preserving.
```php
<?php
$parser = stringI("parsica");
$result = $parser->tryString("PARSICA");
assertEquals("PARSICA", $result->output());
$result = $parser->tryString("pArSiCa");
assertEquals("pArSiCa", $result->output());
```
If you want the output to be consistent, you can use `map` to convert it.
```php
<?php
$parser = stringI("parsica")
->map(fn($output) => strtolower($output));
$result = $parser->tryString("pArSiCa");
assertEquals("parsica", $result->output());
```
## Other parsers
Parsica comes with a growing library of other useful parsers, such as parsers for numeric types and spaces. Always check the API documentation to know what the type of a parser is (that is, the type of the output that the parser will produce). For example, parsers like `space`, `tab`, and `newline` all output strings containing the characters they matched. On the other hand, `skipSpace` will output `null`, whether or not it consumed any spaces. This makes sense because the point is to ignore them, not use them.
`skipSpace` consumes all kinds of space, whereas `skipHSpace` will stop consuming at newlines and carriage returns. They also come with two friends, `skipSpace1` and `skipHSpace1`, which expect at least one space to be present.

View File

@@ -0,0 +1,53 @@
---
title: Using Combinators
---
## Fluent interface
Many combinators come both as a standalone function and as a method on the `Parser` object. They behave the same, and exist as a convenience for writing more readable code. Choosing one or the other will mostly depend on your use case.
The general rule is that `combinator($parserA, $parserB) ≡ $parserA->combinator($parserB)`, in other words, they are equivalent.
In the example below, the `sequence` and `optional` combinators are used as functions and as methods, and both parsers are fully equivalent.
```php
<?php
$parser1 = sequence(
optional(char('a')),
char('b')
);
$parser2 = char('a')->optional()
->sequence(char('b'));
```
Sometimes combinators have different names for the same behaviour: `$parserA->or($parserB) ≡ either($parserA, $parserB)`. In this case, the reason is partially because `or` is a reserved keyword in PHP, and partially because `either` reads better in this case. Some combinators have aliases, such as `Parser#sequence()` and `Parser#followedBy()`, again these exist purely for convenience.
## Sequences
`sequence` is one of the most basic combinators you'll find. `sequence($parser1, $parser2)` means *"Try the first parser. If it fails, return the failure. If it succeeds, take the remaining input that was not consumed by `$parser1`, and try `$parser2`. Return the result of `$parser2`."*
It's important to understand that this drops whatever output `$parser1` produced. That's useful when you're only interested in what comes after `$parser1`. This example extracts a value that is prefixed by a string.
```php
<?php
$parser = sequence(string('My name is '), atLeastOne(alphaChar()));
$result = $parser->tryString("My name is Parsica");
assertEquals("Parsica", $result->output());
```
## Alternatives
@TODO
## Appending
@TODO
## Folding combinators
There are also combinators that extend the behaviour of others. For example, `choice` is a left fold over the `either` combinator, effectively turning it from a combinator that takes two arguments into one that takes n arguments. `choice($parser1, $parser2, $parser3, ...) ≡ $parser1->or($parser2)->or($parser3)->or...`
The same happens with the `assemble` combinator, which appends all its arguments. `assemble($parser1, $parser2, $parser3, ...) ≡ $parser1->and($parser2)->and($parser3)->...`
In general, you should use the simplest form available, so if you only have two choices, favour `or` over `choice`.
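The "left fold" idea can be sketched in plain PHP with `array_reduce`. The closures `$string`, `$either`, and `$choice` below are hypothetical stand-ins that model a parser as `string -> [output, remainder]` (or `null` on failure); they illustrate the folding pattern only and are not Parsica's own code.

```php
<?php
// Simplified model with stand-in closures, not Parsica's actual implementation.
// A "parser" is a closure: string -> [output, remainder], or null on failure.
$string = fn(string $expected): \Closure =>
    fn(string $input): ?array => strncmp($input, $expected, strlen($expected)) === 0
        ? [$expected, substr($input, strlen($expected))]
        : null;

// Two-argument alternative: try $a, and only if it fails try $b on the same input.
$either = fn(\Closure $a, \Closure $b): \Closure =>
    fn(string $input): ?array => $a($input) ?? $b($input);

// n-argument version: fold $either over the list, left to right.
$choice = fn(\Closure $first, \Closure ...$rest): \Closure =>
    array_reduce($rest, $either, $first);

$parser = $choice($string('GET'), $string('POST'), $string('PUT'));
$r1 = $parser("POST /users");  // ["POST", " /users"]
$r2 = $parser("DELETE /");     // null
```

Here `$choice($a, $b, $c)` builds `$either($either($a, $b), $c)`, which mirrors the `->or(...)->or(...)` chain described above.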

View File

@@ -0,0 +1,79 @@
---
title: Running Parsers
---
There are different ways of running your parser on an input.
## try() and tryString()
Most of the time, you'll want to use `try`. It will run the parser on an input `Stream`, return a `Succeed` (which implements `ParseResult`) on success, and throw a `ParserHasFailed` exception if the input can't successfully be parsed.
The `Stream` type generalises over different ways of providing input. The simplest implementation is `StringStream`. This is really a wrapper around a PHP multibyte string.
(In v0.6.0, `StringStream` is also the _only_ implementation of `Stream`, but this will change.)
`ParseResult` has an `output()` method, which has the type `T` for a `Parser<T>` (see [Mapping to Objects](mapping_to_objects)). It also has a `remainder()` method, which gives you the part of the input that wasn't consumed by the parser.
`ParserHasFailed` has the usual `Exception` methods. It also gives you access to the `Fail implements ParseResult` object. This contains all the relevant information about the failure, such as `expected()`, `got()`, and `position()`.
```php
<?php
$parser = string('hello');
$result = $parser->try(new StringStream("hello world"));
// Or, use tryString(string), which is an alias of try(StringStream):
$result = $parser->tryString("hello world");
echo $result->output(); // "hello"
echo $result->remainder(); // StringStream(" world")
// Now let's make it fail
try {
$result = $parser->tryString("hi world");
} catch(ParserHasFailed $e) {
$result = $e->parseResult();
echo $result->expected(); // "string(hello)"
echo $result->got(); // StringStream("hi world")
$position = $result->position(); // A Position object containing the line number,
// column, and filename where the failure happened
}
```
## run()
`run` is mostly intended for internal use.
The main difference between `run` and `try` is that `run` doesn't throw exceptions when parsing an input fails. (It might throw exceptions if your parser itself is incorrectly defined.) Instead, you'll always get a `ParseResult`, and you can inspect it with the same methods as above. You'll also get `isSuccess` and `isFail`, so you know what you're dealing with.
```php
<?php
$parser = string("hello");
$result = $parser->run(new StringStream("some input"));
if($result->isSuccess()) {
echo $result->output();
echo $result->remainder();
} elseif ($result->isFail()) {
echo $result->expected();
echo $result->got();
}
```
## Continue with a result
Using `run` instead of `try` is a good choice when you want to do something with the result, such as:
- Building your own combinators
- Interacting with `ParseResult` while in the middle of a parse flow
To do that, `ParseResult` lets you continue parsing:
```php
<?php
$parser1 = string("hello");
$result1 = $parser1->run(new StringStream("hello world"));
$parser2 = string("world");
$result2 = $result1->continueWith($parser2);
```
`continueWith` takes another parser, and uses it to parse the remainder of the result. You may have noticed we didn't check for `isSuccess`. That's because we don't need to. `continueWith` is smart; if `$parser1` fails, trying to continue parsing on the result will not have any effect. In fact, the example above will fail, because `$parser1` doesn't take into account the space between "hello" and "world".

@@ -0,0 +1,132 @@
---
title: Mapping to Objects
---
## Parser types
Most of the parsers that come with Parsica return strings as outputs.
```php
<?php
$parser = digitChar();
assertInstanceOf('Parsica\Parsica\Parser', $parser);
$result = $parser->tryString('1');
assertIsString($result->output());
assertEquals('1', $result->output());
```
In PHP 7.x, the type of `$parser` is `Parser`, but you can think of it as having the type `Parser<string>`. PHP doesn't support generics, so it doesn't enforce that. However, working with Parsica is easier if you always think of parsers as having an inner type.
> `Parser<T>` means that if we successfully run the parser on an input, it will output a value of type `T`.
Here's an example of a parser of type `Parser<array<string>>`:
```php
<?php
$parser = sepBy(char(','), atLeastOne(digitChar()));
$result = $parser->tryString('123,9,55');
assertEquals(["123", "9", "55"], $result->output());
```
## The map combinator
The point of parsing is to turn strings into more useful data structures. The combinator `map` can help you with that. It works much like PHP's `array_map` function. You combine a parser and a `callable`, and you get a new parser. This new parser will apply the callable to the output of the original parser.
We can use it for manipulating the output. Here's a simple example:
```php
<?php
$parser = atLeastOne(alphaChar())
->map(fn(string $val) => strtolower($val));
$result = $parser->tryString('PaRsIcA');
assertEquals("parsica", $result->output());
```
If the parser fails, the callable is not applied to the output (because there is no output). So you don't need to worry about error handling.
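To make that point concrete, here is a minimal standalone sketch in plain PHP. It does not use Parsica's classes: `Ok` and `Err` are hypothetical stand-ins for a successful and a failed result, and only illustrate why a mapped callable never has to deal with failures.

```php
<?php
// Hypothetical stand-ins for a successful and a failed parse result.
// These are NOT Parsica's classes; they only illustrate how map() behaves.
final class Ok
{
    private $output;
    function __construct($output) { $this->output = $output; }
    // On success, apply the callable to the output and wrap the new value.
    function map(callable $f): Ok { return new Ok($f($this->output)); }
    function output() { return $this->output; }
}

final class Err
{
    private string $expected;
    function __construct(string $expected) { $this->expected = $expected; }
    // On failure there is no output, so the callable is never called.
    function map(callable $f): Err { return $this; }
    function expected(): string { return $this->expected; }
}

// Count how often the callable actually runs.
$calls = 0;
$lower = function (string $s) use (&$calls): string {
    $calls++;
    return strtolower($s);
};

$ok = (new Ok("PaRsIcA"))->map($lower);
$err = (new Err("at least one A-Z or a-z"))->map($lower);

echo $ok->output(), "\n";    // parsica
echo $err->expected(), "\n"; // at least one A-Z or a-z
echo $calls, "\n";           // 1
```

The callable ran once, for the successful result only. The failed result passed through `map` unchanged, so its error information is still intact.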
## Casting to scalars
We can now use this to cast the parser's output to scalars:
```php
<?php
$parser = atLeastOne(digitChar())
->map(fn(string $val) => intval($val));
$result = $parser->tryString("123"); // input is still a string
assertSame(123, $result->output()); // output is an int
```
It also works inside nested parsers. We can use this on the `sepBy` example from above:
```php
<?php
$parser = sepBy(
char(','),
atLeastOne(digitChar())
->map(fn($val) => intval($val))
);
$result = $parser->tryString('123,9,55');
assertSame([123, 9, 55], $result->output()); // array of ints
```
The type of this last parser is now `Parser<array<int>>` instead of the original `Parser<array<string>>`.
## Casting to objects
We'll want to cast to much more interesting data structures than scalars and arrays. Let's parse some monetary values into a nested value object structure. `Money` is composed of a numeric amount (a float in this example) and a `Currency` value object:
```php
final class Currency
{
private string $currency;
function __construct(string $currency)
{
$this->currency = $currency;
}
}
// Side warning: don't actually use floats to do computations with money.
final class Money
{
private float $amount;
private Currency $currency;
function __construct(float $amount, Currency $currency)
{
$this->amount = $amount;
$this->currency = $currency;
}
}
// $currency is a parser of type Parser<Currency>
$currency = repeat(3, upperChar())
->map(fn(string $c) => new Currency($c));
// $amount has type Parser<float>
$amount = float()
->map(fn(string $val) => floatval($val));
// $money has type Parser<[Currency, float]> because collect() has type Parser<[T]>
$money = collect($currency, skipHSpace()->followedBy($amount));
// Let's change $money to type Parser<Money>
$money = $money->map(fn(array $a) => new Money($a[1], $a[0]));
$result = $money->tryString('EUR 12.34');
assertEquals(new Money(12.34, new Currency('EUR')), $result->output());
// We can now compose our Parser<Money> in larger parsers
// $priceList has type Parser<array<Money>>
$priceList = collect(
string("exVAT ")->followedBy($money)->thenIgnore(whitespace()),
string("incVAT ")->followedBy($money)
);
$result = $priceList->tryString('exVAT EUR 100.00 incVAT EUR 121.00');
```

@@ -0,0 +1,27 @@
---
title: Order matters
sidebar_label: Order matters
---
The order of clauses in an `or()` matters. With the following parser definition, the parser will consume "http" even if the string starts with "https", leaving "s://..." as the remainder.
```php
<?php
$parser = string('http')->or(string('https'));
$input = "https://parsica.verraes.net";
$result = $parser->tryString($input);
assertEquals("http", $result->output());
assertEquals("s://parsica.verraes.net", $result->remainder());
```
The solution is to put the longer alternative first, so that it is tried before its own prefix:
```php
<?php
$parser = string('https')->or(string('http'));
$input = "https://parsica.verraes.net";
$result = $parser->tryString($input);
assertEquals("https", $result->output());
assertEquals("://parsica.verraes.net", $result->remainder());
```

@@ -0,0 +1,108 @@
---
title: Recursion
---
Often we want to parse arbitrarily nested structures, such as arrays, JSON, or XML. To do that, we need to be able to pass the parser to itself. Because of a limitation in PHP, we cannot pass a value around before it is created. The solution is to split this into two steps: create a placeholder for a recursive parser, and then define the parser in terms of itself.
## Example
We need to parse nested pairs such as `[1,[2,[3,4]]]`. The structure repeats itself: every item in the pair can be either a digit or another pair.
We cannot write this:
```
<?php
$pair = collect(
    ignore(char('[')),
    digitChar()->or($pair),
    ignore(char(',')),
    digitChar()->or($pair),
    ignore(char(']')),
);
```
The above results in "Undefined variable: pair" because we're trying to use `$pair` before it's defined. Instead, we need to mark the parser as `recursive` in a first step, and then define how the parser should `recurse`:
```php
<?php
// Create a recursive parser first
$pair = recursive();
// Then define the parser
$pair->recurse(
    between(
        char('['),
        char(']'),
        collect(
            digitChar()->or($pair)
                ->thenIgnore(char(',')),
            digitChar()->or($pair)
        )
    ),
);
$result = $pair->tryString("[1,[2,[3,4]]]");
assertSame(['1', ['2', ['3', '4']]], $result->output());
```
It's possible to nest multiple recursive parsers. Simply initialise them all first using `recursive()` and then define them in terms of each other:
```php
<?php
$curlyPair = recursive();
$squarePair = recursive();
$anyPair = $curlyPair->or($squarePair);
$inner = collect(
digitChar()->or($anyPair)
->thenIgnore(char(',')),
digitChar()->or($anyPair)
);
$curlyPair->recurse(
between(char('{'), char('}'), $inner),
);
$squarePair->recurse(
between(char('['), char(']'), $inner),
);
$mixed = "{1,[2,{3,4}]}";
$result = $anyPair->tryString($mixed);
assertSame(['1', ['2', ['3', '4']]], $result->output());
```
Note that a parser initialized with `recursive()` is mutable, and the `recurse()` method mutates it. This is the only exception to the rule that all parsers are immutable. After calling `recurse()`, the parser is immutable again and behaves just like any other parser.
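The "create a placeholder first, define it once later" idea can be sketched in plain PHP without any parsing at all. The `LazyFn` class below is hypothetical and is not how Parsica implements `recursive()`. It only shows how an object that exists before its definition lets a definition refer to itself, and how the mutation can be restricted to a single call.

```php
<?php
// Hypothetical sketch of a "define later" placeholder; not Parsica's source.
final class LazyFn
{
    /** @var callable|null */
    private $inner = null;

    // Fill in the placeholder. Allowed exactly once.
    function define(callable $inner): void
    {
        if ($this->inner !== null) {
            throw new LogicException("Already defined.");
        }
        $this->inner = $inner;
    }

    // Using the placeholder before it is defined is an error.
    function __invoke(...$args)
    {
        if ($this->inner === null) {
            throw new LogicException("Not defined yet.");
        }
        return ($this->inner)(...$args);
    }
}

// Step 1: create the placeholder, so that the variable exists.
$depth = new LazyFn();
// Step 2: define it in terms of itself. It computes the nesting depth of a pair.
$depth->define(fn($x) => is_array($x) ? 1 + max(array_map($depth, $x)) : 0);

echo $depth(['1', ['2', ['3', '4']]]), "\n"; // 3
```

After `define()` has been called, the object can no longer be changed, which mirrors the behaviour described above for `recurse()`.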
## Using recursion to avoid loops
Let's say we want to parse the character `'a'` at least one time, so that `"aaab"` outputs `"aaa"`, but `"bbb"` fails. Imperatively, you could solve this by running the `char('a')` parser in a while loop, and stop on the first failure. We can express it more concisely with recursion though:
1. Start by parsing `char('a')`.
2. Append another `char('a')`, but this second one is `optional()`.
3. Append another `optional(char('a'))`.
4. Notice the similarity between the last two steps. This suggests an opportunity for recursion.
5. Wrap our `char('a')->append(optional(char('a')))` in a `recurse()` parser.
6. Replace the second `char('a')` by the recursive parser.
The end result looks like this:
```php
<?php
$rec = recursive();
$rec->recurse(char('a')->append(optional($rec)));
$result = $rec->tryString("aaab");
assertEquals("aaa", $result->output());
```
In fact the code above is how the `atLeastOne()` combinator works, so you can simplify that code by writing this:
```php
<?php
$parser = atLeastOne(char('a'));
$result = $parser->tryString("aaab");
assertEquals("aaa", $result->output());
```

@@ -0,0 +1,58 @@
---
title: Looking ahead
---
## notFollowedBy
Say you want to match the `print` keyword in a programming language. You can express that with the `string("print")` parser, but it will match more than you'd like:
```php
<?php
$print = string("print");
$result = $print->tryString("print('Hello World');");
assertEquals("print", $result->output());
$result = $print->tryString("printXYZ('Hello World');");
assertEquals("print", $result->output()); // oops!
```
As you can see, "printXYZ" also results in "print", but it wasn't our intention, because "printXYZ" is not a valid keyword.
We can solve it by using the `notFollowedBy` combinator.
```php
<?php
$print = keepFirst(string("print"), notFollowedBy(alphaNumChar()));
$result = $print->run(new StringStream("printXYZ('Hello World');"));
assertTrue($result->isFail());
```
There's a fluent interface as well:
```php
<?php
$print = string("print")->notFollowedBy(alphaNumChar());
$result = $print->run(new StringStream("printXYZ('Hello World');"));
assertTrue($result->isFail());
```
In practice, we'll have a lot more keywords than just the one. A good habit is to first generalize this to all the keywords in our language. Then, using our new `$keyword` parser constructor, we can match the exact variations we like:
```php
<?php
$keyword = fn(string $name) => keepFirst(string($name), notFollowedBy(alphaNumChar()));
$parser = choice(
$keyword('printf'),
$keyword('print'),
$keyword('sprintf')
);
$result = $parser->tryString("print('Hello World');");
assertEquals("print", $result->output());
$result = $parser->tryString("printf('Hello %s', 'world');");
assertEquals("printf", $result->output());
```

@@ -0,0 +1,75 @@
---
title: Errors and labels
---
Error messages in Parsica give you information about what the parser expected, where the problem happened, and what it got instead.
```php
<?php
$parser = sepBy1(char(','), atLeastOne(alphaChar()));
$input = "Ÿellow,Red,Green";
//$parser->tryString($input);
```
_(Note: We're using [UpToDocs](https://github.com/mathiasverraes/uptodocs) to automatically test all the code samples in this documentation. The downside is that throwing an exception in the docs causes the build to fail! That's why some of the code above is commented out.)_
If you uncomment and run the above code, you'll get an exception like this:
```
<input>:1:1
|
1 | Ÿellow,Red,Green
| ^— column 1
Unexpected 'Ÿ'
Expecting at least one A-Z or a-z, separated by ','
```
It shows the filename, line number and column position, as well as an autogenerated expectation.
Often you'll want something a bit more meaningful than "at least one A-Z or a-z". You can do that by attaching your own labels to some of your parsers. For example, we can label the colours:
```php
<?php
$parser = sepBy1(
string(','),
atLeastOne(alphaChar())->label("colour")
);
```
That will yield:
```
...
Expecting colour, separated by ','
```
Or you can attach the label to the entire parser:
```php
<?php
$parser = sepBy1(
string(','),
atLeastOne(alphaChar())
)->label("a list of colours");
```
```
...
Expecting a list of colours
```
The best approach will of course depend on your specific use case. A good habit is to keep in mind that ultimately, the errors are there for the end user who will see them. This could be a user who enters some values in a form, a programmer using your API, someone building other parsers on top of your parser... Feed your parser with some wrong inputs that you are likely to get in the real world, and get a feel for what makes a helpful error message.
## Doing your own error reporting
The information in the error message is also available from `ParseResult`. You can use this to make your own error messages, if you want to render them as HTML or send them to an API. For example, these are the line and column number where the parser ended up:
```php
<?php
$result = string('Hello')->run(new StringStream("Hello, World"));
assertSame(1, $result->position()->line());
assertSame(6, $result->position()->column());
```
Have a look at the `ParseResult` API to see what else it can do.

@@ -0,0 +1,78 @@
---
title: Side Effects and Events
---
Sometimes you may want to perform actions when your parser encounters something you're interested in. Parsica provides a combinator called `emit()`. It allows you to inject side effects at any point. It's intentionally very barebones: it's really just a callback function that gets called only when the parser succeeds.
```php
<?php
// Define a function that takes the output and performs some side effect:
$print = fn(string $output) => print($output);
// Define a parser:
$parser = many(either(
char('a'),
// Combine the 'b' parser with emit:
char('b')->emit($print)
));
// Running the parser calls print() whenever a 'b' is encountered:
$parser->tryString('aababba'); // Prints "bbb"
```
Using closures and mutable objects, you can embed mutability into a parsing process.
```php
<?php
final class Counter
{
private int $count = 0;
function incr(): void { $this->count++; }
function count(): int{ return $this->count; }
}
// Make a mutable object:
$counter = new Counter();
// Use it inside a closure:
$incr = fn(string $output) => $counter->incr();
$parser = many(either(
char('a'),
// Increment counter when we hit 'b'
char('b')->emit($incr)
));
$parser->tryString('aababba');
assertSame(3, $counter->count());
```
For most use cases, we suggest using `emit()` with an adapter for your application's event dispatching mechanism. The following shows how to adapt `emit()` to any [PSR-14](https://www.php-fig.org/psr/psr-14/) compatible event dispatcher.
```php
<?php
// Your (or your framework's) event dispatcher:
final class YourDispatcher implements \Psr\EventDispatcher\EventDispatcherInterface
{
public function dispatch(object $event) { /* ... */ }
}
$yourDispatcher = new YourDispatcher();
// An adapter that turns a value into an event and sends it to your dispatcher:
$yourAdapter = function (Colour $colour) use ($yourDispatcher) : void {
$timestamp = new DateTimeImmutable("now");
$event = new ColourWasEncountered($timestamp, $colour);
$yourDispatcher->dispatch($event);
};
$parser = many(
either(
string('red'),
string('green'),
string('blue'),
)
// The parser outputs string, the map() combinator turns those into domain objects:
->map(fn(string $output) : Colour => new Colour($output))
// Emit the Colour object to the adapter:
->emit($yourAdapter)
);
```
This way, you can neatly separate the occurrence of a parsing event, from the actual side effect. If the dispatcher is asynchronous, the parsing process can keep continuing, without being interrupted by blocking side effects, such as writing to a database. Or when parsing a large input file or continuous input stream, you can start processing the results before the parsing has finished.

@@ -0,0 +1,107 @@
---
title: Dealing with Space
---
Parsica comes with a number of useful parsers for dealing with different types of whitespace and newlines, as well as with required or optional whitespace. We recommend browsing `src/space.php` to see what is available, so you don't need to build your own parsers for that.
## Space consumers
When building a parser for, say, a language or a file format, you often have specific rules about space. Whitespace can be required or optional, and expressions can be valid or invalid if they contain newlines. All of these could be valid or invalid depending on your case:
```
// with 0, 1, or more spaces
1+1
1 + 1
1   +   1
// multiline
1 +
2
// tabs
1
+ 2
```
There's too much variation for Parsica to provide a single solution. However, you don't want to litter your code with space parsers everywhere:
```php
<?php
$term = digitChar();
$operator = char('+');
$parser = collect(
$term,
skipSpace1(),
$operator,
skipSpace1(),
$term,
skipSpace1(),
)->map(fn($o) => $o[0] + $o[4]);
$result = $parser->tryString("1 +\n 2\t");
assertSame(3, $result->output());
```
This is noisy. And if you want to change the rules about whitespace or build more complex parsers, you have to deal with this problem all the time, making it unmaintainable (or at least annoying).
The idea is to build a space consumer that you can reuse everywhere. The space consumer is a parser combinator that you wrap around another parser, and that returns the output of the inner parser, ignoring whitespace. A typical approach is to consistently ignore space after the thing you're interested in.
```php
<?php
// $token behaves just like $parser, but requires the parsed
// value to be followed by at least 1 space
$token = fn(Parser $parser) => keepFirst($parser, skipSpace1());
// Now we wrap our parsers
$term = $token(digitChar());
$operator = $token(char('+'));
// Our main parser now has the same "shape" as the expression we're trying to parse:
$parser = collect(
$term,
$operator,
$term,
)->map(fn($o) => $o[0] + $o[2]);
$result = $parser->tryString("1 +\n 2\t");
assertSame(3, $result->output());
```
Now, all the logic for skipping space is nicely contained in `$token`. If we wanted to disallow multiline expressions, we only need to replace `skipSpace1()` with `skipHSpace1()` in one place.
As an example, here's an excerpt from the JSON parser, using the ws (whitespace) as defined in the JSON spec:
```php
final class MyJSON
{
public static function ws(): Parser
{
return zeroOrMore(satisfy(isCharCode([0x20, 0x0A, 0x0D, 0x09])))->voidLeft(null)
->label('whitespace');
}
public static function token(Parser $parser): Parser
{
return keepFirst($parser, JSON::ws());
}
public static function object(): Parser
{
return map(
between(
JSON::token(char('{')),
JSON::token(char('}')),
sepBy(
JSON::token(char(',')),
JSON::member()
)
),
fn(array $members):object => (object)array_merge(...$members));
}
// see src/JSON/JSON.php for the full code
}
```
If you have multiple ways of handling space in one parser, you can of course define multiple space consumers and give them relevant names.

@@ -0,0 +1,246 @@
---
title: Parsing Expression Languages
---
Can Parsica parse expressions? Why yes, I'm glad you asked!
An expression, roughly, is anything that can be evaluated to a value, such as
- arithmetic expressions `(1 + 2) * 3`,
- boolean expressions `x and (y or z)`,
- code inside a template language `{{ user.loggedIn ? 'Hello ' ~ user.name : 'Log in' }}`,
- spreadsheet formulas `=SUM(A1:A10) * B1`,
- rules in a rule engine,
- logic inside a configuration language,
- and anything else you can think of!
The tricky thing about parsing expressions is that you often have to deal with recursion, associativity, and operator precedence, which can make it hard to build a parser by hand. Parsica provides the `expression()` function, which offers a simple way to create a parser for your custom expression language.
## Arithmetic
Let's build a simple calculator, that can evaluate expressions like `1 + 2 * (2 - 3)` to `-1`.
Let's handle whitespace first. (See the chapter on "Dealing with Space" for details.)
```php
<?php
$token = fn(Parser $parser) => keepFirst($parser, skipHSpace());
```
Next, we define a parser for our terms. For this example, let's keep it simple and support only natural numbers:
```php
<?php
$term = fn(): Parser => $token(atLeastOne(digitChar()))->map('intval');
```
Let's do parentheses next. Parsica's `between()` combinator will do the job nicely, but let's wrap it in our combinator for clarity and reusability:
```php
<?php
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));
```
Now let's define our first expression, using `expression()`. In our language, an expression can be:
1. A naked term like `12`
2. A term between parentheses `(12)`
3. An operator and its arguments `1 + 2`
4. The arguments are expressions themselves, as in `1 + (2 + 3)`
An expression is defined using expressions, so this calls for recursion. (See the chapter on Recursion.) Let's ignore operators for now, and do the simplest recursive expression parser:
```php
<?php
$expr = recursive();
$primary = $parens($expr)->or($term());
$expr->recurse(
expression($primary, [])
);
$result = $expr->tryString("(((12)))");
assertSame(12, $result->output());
```
We're saying here that `$primary` is either an expression wrapped in parens, or a term. `$expr` is an expression that uses `$primary` as its primary parser.
Now let's add the plus operator. We need a parser for the symbol itself, in this case a simple `char('+')` will do, but it could be anything. For example, PHP has two 'not equal' operators, which we could parse in one go `either(string('!='), string('<>'))`.
We also need to decide what to do with the terms that we parse, using a transformation. This is a function that will take the left and the right operands from our `+`. As we're building a calculator, we're simply going to add up the two terms, using `fn($left, $right) => $left + $right`. (Later we will use this to create abstract syntax trees.)
Finally, we need to tell the expression parser that `+` is a binary operator, and that we want it to be left associative. Let's put it all together:
```php
<?php
$expr = recursive();
$primary = $parens($expr)->or($term());
$expr->recurse(
expression(
$primary,
[
leftAssoc(
binaryOperator($token(char('+')), fn($l, $r) => $l + $r)
)
]
)
);
$result = $expr->tryString("1 + 2 + 3");
assertSame(6, $result->output());
$result = $expr->tryString("(1 + (2 + 3) + 4)");
assertSame(10, $result->output());
```
The second argument to `expression()` is an array of operators. The order is important: it determines the precedence. `+` and `-` have the same precedence, whereas `*` and `/` have the same precedence as each other, but higher precedence than `+` and `-`. We can solve this easily by grouping each precedence level, and putting the highest precedence levels first.
```php
<?php
$expr = recursive();
$primary = $parens($expr)->or($term());
$expr->recurse(
expression(
$primary,
[
leftAssoc(
binaryOperator($token(char('*')), fn($l, $r) => $l * $r),
binaryOperator($token(char('/')), fn($l, $r) => $l / $r),
),
leftAssoc(
binaryOperator($token(char('+')), fn($l, $r) => $l + $r),
binaryOperator($token(char('-')), fn($l, $r) => $l - $r),
),
]
)
);
$result = $expr->tryString("1 + 2 * 3");
assertSame(7, $result->output());
$result = $expr->tryString("(1 + 2) * 3");
assertSame(9, $result->output());
$result = $expr->tryString("1 - 2 - 3"); // interpreted as ((1 - 2) - 3)
assertSame(-4, $result->output());
```
You can play around with the precedence and the associativity to see how it impacts the result. As an exercise, make a parser that solves `1 - 2 - 3 = (1 - (2 - 3)) = (1 - (-1)) = 2`.
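The difference between the two groupings can be checked with plain arithmetic, independent of any parser. The following sketch folds the terms `1, 2, 3` with subtraction, once from the left and once from the right. The variable names are illustrative and nothing here is part of Parsica.

```php
<?php
$sub = fn(int $l, int $r): int => $l - $r;
$terms = [1, 2, 3];

// Left-associative grouping: ((1 - 2) - 3)
$left = array_reduce(array_slice($terms, 1), $sub, $terms[0]);

// Right-associative grouping: (1 - (2 - 3))
// Fold from the right by walking the reversed list and swapping the operands.
$reversed = array_reverse($terms);
$right = array_reduce(
    array_slice($reversed, 1),
    fn(int $acc, int $term): int => $sub($term, $acc),
    $reversed[0]
);

echo $left, "\n";  // -4
echo $right, "\n"; // 2
```

The same operator and the same operands give two different results, which is why the expression parser needs to be told which associativity to use.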
## Non-associative operators
Non-associative means that an expression like `1 + 2 + 3` cannot be resolved, because there is no way to decide whether it associates left `(1 + 2) + 3` or right `1 + (2 + 3)`. The parser will simply fail. Of course, for addition, non-associativity wouldn't make sense, but for other languages or operators it might.
## Unary operators
You can add unary operators, such as the negation prefix operator `-`, and the increment and decrement postfix operators `++` and `--`.
```php
// ...
[
prefix(
unaryOperator(char('-'), fn($v) => -$v)
),
postfix(
unaryOperator(string('++'), fn($v) => $v + 1),
unaryOperator(string('--'), fn($v) => $v - 1),
),
// ...
];
```
## Parsing to an AST
Building calculators isn't that interesting of course. Typically you'll want your parser to output a datastructure that represents your expression, called an Abstract Syntax Tree or AST. This structure can then be used for whatever the next step in your program is, ranging from evaluation to compilation, static analysis, typechecking, optimisation, rendering and formatting...
Let's build a simple Boolean expression language, starting with the types for the AST. Everything else will be pretty similar to the calculator example above, but instead of evaluating the expressions on the fly, we use the transform functions to create the datastructure.
```php
<?php
// every term or expression in our language is a Boolean:
interface Boolean {}
// Literals
class True_ implements Boolean {}
class False_ implements Boolean {}
// A variable will be replaced with a value at evaluation stage
class Variable implements Boolean {
private string $name;
function __construct(string $name){$this->name = $name;}
}
// Our operators are Booleans that are composed of other Booleans
class Not_ implements Boolean {
private Boolean $boolean;
function __construct(Boolean $boolean){$this->boolean = $boolean;}
}
class And_ implements Boolean {
private Boolean $l, $r;
function __construct(Boolean $l, Boolean $r){
$this->l = $l;
$this->r = $r;
}
}
class Or_ implements Boolean {
private Boolean $l, $r;
function __construct(Boolean $l, Boolean $r){
$this->l = $l;
$this->r = $r;
}
}
// Now let's write the parser
$token = fn(Parser $parser) : Parser => keepFirst($parser, skipHSpace());
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));
// A term is a literal TRUE/FALSE or a variable
$term = fn(): Parser => $token(choice(
char('$')->followedBy(atLeastOne(alphaChar()))->map(fn($name) => new Variable($name)),
string("TRUE")->map(fn($v) => new True_),
string("FALSE")->map(fn($v) => new False_),
));
$expr = recursive();
// When the parser encounters NOT, AND, or OR, it returns a Not_, And_, or Or_ object.
// The $v, $l and $r arguments can be Boolean objects themselves, creating the tree.
$expr->recurse(expression(
$parens($expr)->or($term()),
[
prefix(
unaryOperator($token(string("NOT")), fn($v) => new Not_($v))
),
leftAssoc(
binaryOperator($token(string("AND")), fn($l, $r) => new And_($l, $r))
),
leftAssoc(
binaryOperator($token(string("OR")), fn($l, $r) => new Or_($l, $r))
),
]
));
$parser = $expr->thenEof(); // check if we reached the end of the input
$result = $parser->tryString('$isBlue AND NOT ($isEdible OR $isDrinkable)');
assertEquals(
new And_(
new Variable('isBlue'),
new Not_(
new Or_(
new Variable('isEdible'),
new Variable('isDrinkable'),
)
)
),
$result->output()
);
```
Now the AST can be used for whatever purposes you need. In our Boolean example above, as an exercise you can
- add a `render()` method to write the expression back to a pretty formatted string,
- add a `reduce()` method that simplifies the AST (eg turning `TRUE AND TRUE` into `TRUE`),
- add an `evaluate(['isBlue' => true, 'isEdible' => false, ...])` method that calculates the final result
- ...

@@ -0,0 +1,142 @@
---
title: Functional Paradigms
sidebar_label: Functional Paradigms
---
Internally, Parsica is designed using paradigms from functional programming. We list them here for anybody who's interested in FP, but you don't need to know them to work with Parsica.
Throughout this document, `$parser1 ≡ $parser2` means that you can swap `$parser1` with `$parser2` and vice-versa, and it will not affect the outcome of your program.
## Purity
Almost all the code is pure and referentially transparent. [A notable exception](recursion) is the combo of `recursive()` and `Parser::recurse()`. The latter mutates a `Parser`. We constrained this so that you can't use the parser when it's not set up yet, and after calling `recurse()`, you can't call it again. So not strictly pure, but close enough not to matter much in practice.
The combinators are all pure. Some combinators are implemented as instance methods on `Parser`, but these are also pure. You can think of them as functions that take `$this` as the first argument.
```
$parser1->combinator($parser2)
≡ combinator($parser1, $parser2)
```
In fact, very often there are both a function and an instance method for the same combinator, where one is an alias for the other.
## Types
There are no generics in PHP 7.4, but we use the Psalm static typechecker to simulate them in part. The two main types are really `Parser<T>` and `ParseResult<T>`, where `T` is the type of the resulting output in the case of a successful parse.
## Either
`ParseResult<T>` is approximately an `Either<ParseFailure, ParseSuccess<T>>` type.
## Functors
`ParseResult` and `Parser` are functors, using the `map` method.
For `ParseResult`, the function is only applied to the output if `ParseResult::isSuccess()` is true, and ignored in other cases.
Similarly, mapping over `Parser` is really mapping over the future `ParseResult`.
## Monoids
`ParseResult<T>` is a monoid under the `ParseResult::append()` operation, when `T` is a monoid as well. `discard()` is the zero value.
`Parser<T>` is a monoid under `Parser::append()`, when `T` is a monoid as well. `nothing()` is the zero value.
### Laws
#### Identity
```
$parser->append(nothing()) ≡ $parser
```
```
nothing()->append($parser) ≡ $parser
```
#### Associativity
```
$p1->append($p2)->append($p3)
≡ $p1->append($p2->append($p3))
```
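For intuition, the same two laws can be checked on the simplest monoid that parsers typically output: strings under concatenation, with the empty string as the zero value. This is a plain PHP sketch; `$append` and `$zero` are illustrative names, not Parsica functions.

```php
<?php
// Strings form a monoid: concatenation is the operation, "" is the zero value.
$append = fn(string $a, string $b): string => $a . $b;
$zero = "";

// Identity: appending the zero value on either side changes nothing.
$rightIdentity = $append("abc", $zero) === "abc";
$leftIdentity = $append($zero, "abc") === "abc";

// Associativity: the grouping of appends does not matter.
$associative = $append($append("a", "b"), "c") === $append("a", $append("b", "c"));

var_dump($rightIdentity, $leftIdentity, $associative); // three times bool(true)
```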
## Applicative Functors
`Parser<T>` is an applicative functor.
- `pure()` is a parser that will always output its argument, no matter what the input was. Type: `T -> Parser<T>`.
- `apply()` is sequential application, aka `<*>`. `pure($callable)->apply($parser)` is a parser that applies `$callable` to the output of `$parser`. It works for callables with multiple arguments, if the callable is curried: `pure(curry($callable))->apply($p1)->apply($p2)`. We used [matteosister/php-curry](https://github.com/matteosister/php-curry) to test this, but any method for currying functions should work.
- `keepFirst()` and `keepSecond()` are `<*` and `*>` respectively. Both parsers need to succeed but only the result from one of them is returned.
### Laws
#### Identity
```
pure(identity())->apply($parser) ≡ $parser
```
#### Homomorphism
```
pure($f)->apply(pure($x)) ≡ pure($f($x))
```
#### Interchange
```
$p->apply(pure($x))
≡ pure(fn($f) => $f($x))->apply($p)
```
#### Composition
```
// Assuming that
$compose = fn($f, $g) => fn($x) => $f($g($x))
pure($compose)->apply($p1)->apply($p2)->apply($p3)
≡ $p1->apply($p2->apply($p3))
```
#### Map
```
pure($f)->apply($parser) ≡ $parser->map($f)
```
## Monads
`Parser<T>` is a monad.
- `pure()`: see above.
- `sequence()` runs two parsers in sequence, dropping the result of the first one. Both parsers consume input. You may know this as `>>`. The type of sequence is `Parser<T> -> Parser<T2> -> Parser<T2>`.
- `bind()` sequentially composes a parser and a parser-constructing function, passing the output produced by the first parser as an argument to the second. Both parsers consume input. You may know this as `>>=` or `flatmap`. Type: `Parser<T> -> (T -> Parser<T2>) -> Parser<T2>`.
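As a rough illustration of `pure()` and `bind()`, here is a toy model in plain PHP where a parser is just a function from an input string to an `[output, remainder]` pair, or `null` on failure. The helpers `$pure`, `$char`, and `$bind` are hypothetical and much simpler than Parsica's real `Parser` class; they only mirror the types given above.

```php
<?php
// A toy parser is a function: string -> [output, remainder], or null on failure.
// Hypothetical helpers for illustration only; Parsica's real Parser is a class.

// pure($x) always succeeds with $x and consumes nothing.
$pure = fn($x) => fn(string $in) => [$x, $in];

// char($c) succeeds if the input starts with the character $c.
$char = fn(string $c) => fn(string $in) =>
    ($in !== '' && $in[0] === $c) ? [$c, substr($in, 1)] : null;

// bind($p, $f) runs $p, passes its output to $f to construct a second parser,
// and runs that second parser on the remainder.
$bind = fn(callable $p, callable $f) => function (string $in) use ($p, $f) {
    $r = $p($in);
    if ($r === null) {
        return null; // the first parser failed, so $f is never called
    }
    [$out, $rest] = $r;
    return $f($out)($rest);
};

// Parse an 'a', then require the same character again, and output both.
$twice = $bind(
    $char('a'),
    fn($first) => $bind($char($first), fn($second) => $pure($first . $second))
);

echo json_encode($twice("aab")), "\n"; // ["aa","b"]
echo json_encode($twice("abb")), "\n"; // null

// Left identity on one sample input: bind(pure($a), $f) behaves like $f($a).
$f = fn(string $c) => $char($c);
var_dump($bind($pure('a'), $f)("abc") === $f('a')("abc")); // bool(true)
```

The second parser in `$twice` depends on the output of the first one. That dependency is what `bind()` adds over `sequence()`, which always runs a fixed second parser.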
### Laws
Left identity:
```
bind(pure($a), $f)
≡ pure($a)->bind($f)
≡ $f($a)
```
Right identity:
```
bind($parser, 'pure')
≡ $parser->bind('pure')
≡ $parser
```
Associativity:
```
$parser->bind($f)->bind($g)
≡ $parser->bind(fn($x) => $f($x)->bind($g))
```

@@ -0,0 +1,19 @@
{
"source": {
"directories": [
"src"
],
"excludes": [
"src/PHPUnit/ParserAssertions.php"
]
},
"timeout": 5,
"logs": {
"text": "infection.log",
"perMutator": "per-mutator.md"
},
"mutators": {
"@default": true
},
"testFramework":"phpunit"
}

@@ -0,0 +1,3 @@
{
"runner.bootstrap": "vendor/autoload.php"
}

vendor/parsica-php/parsica/phpunit.xml vendored Normal file

@@ -0,0 +1,27 @@
<?xml version="1.0" encoding="UTF-8"?>
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         bootstrap="vendor/autoload.php"
         backupGlobals="false"
         backupStaticAttributes="false"
         colors="true"
         convertErrorsToExceptions="true"
         convertNoticesToExceptions="true"
         convertWarningsToExceptions="true"
         processIsolation="false"
         stopOnFailure="false"
         xsi:noNamespaceSchemaLocation="https://schema.phpunit.de/9.3/phpunit.xsd">
    <coverage>
        <include>
            <directory suffix=".php">src/</directory>
        </include>
    </coverage>
    <testsuites>
        <testsuite name="Parsica Unit Tests">
            <directory>tests</directory>
        </testsuite>
    </testsuites>
    <php>
        <env name="APP_ENV" value="testing"/>
        <ini name="xdebug.max_nesting_level" value="512"/>
    </php>
</phpunit>

21
vendor/parsica-php/parsica/psalm.xml vendored Normal file

@@ -0,0 +1,21 @@
<?xml version="1.0"?>
<psalm
    errorLevel="1"
    resolveFromConfigFile="true"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="https://getpsalm.org/schema/config"
    xsi:schemaLocation="https://getpsalm.org/schema/config vendor/vimeo/psalm/config.xsd"
    findUnusedPsalmSuppress="true"
>
    <projectFiles>
        <directory name="src"/>
        <ignoreFiles>
            <directory name="vendor"/>
        </ignoreFiles>
    </projectFiles>
    <issueHandlers>
        <DeprecatedFunction errorLevel="suppress"/>
        <DeprecatedMethod errorLevel="suppress"/>
    </issueHandlers>
</psalm>


@@ -0,0 +1,21 @@
<?php declare(strict_types=1);
/**
* This code is forked from https://github.com/matteosister/php-curry, which is abandoned. It could be integrated into
* the rest of Parsica.
*/
namespace Parsica\Parsica\Curry;
/**
* This class exists simply to define a special type
* for the placeholder, because a constant, even
* a random one, could collide with other values.
* @psalm-immutable
*/
final class Placeholder
{
public function __toString() : string
{
return '__';
}
}


@@ -0,0 +1,87 @@
# php-curry
An implementation of currying in PHP.

Currying a function lets you pass a subset of its arguments and receive back another function that accepts the rest. As soon as the last argument is passed, you get back the final result.

Like this:
``` php
use Parsica\Parsica\Curry as C; // namespace alias used by the examples in this README

$adder = function ($a, $b, $c, $d) {
    return $a + $b + $c + $d;
};

$firstTwo = C\curry($adder, 1, 2);
echo $firstTwo(3, 4); // output 10

$firstThree = $firstTwo(3);
echo $firstThree(14); // output 20
```
Currying is a powerful (yet simple) concept, very popular in other, more purely functional languages. In Haskell, for example, currying is the default behavior for every function.

In PHP we still need to rely on a wrapper to simulate this behavior.
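
As a rough illustration of what such a wrapper involves, here is a minimal sketch. `simpleCurry` is a hypothetical helper written for this example, not the library's `curry()`. It assumes that the required-parameter count reported by reflection is the number of arguments to wait for, and it ignores placeholders and right-to-left currying.

```php
<?php declare(strict_types=1);

/**
 * Minimal currying sketch (hypothetical helper, not this library's implementation):
 * collects arguments until the callable's required parameter count is reached.
 */
function simpleCurry(callable $fn, ...$args): Closure
{
    $required = (new ReflectionFunction(Closure::fromCallable($fn)))
        ->getNumberOfRequiredParameters();

    return function (...$more) use ($fn, $args, $required) {
        $all = array_merge($args, $more);
        // Enough arguments collected: call through. Otherwise keep waiting for more.
        return count($all) >= $required ? $fn(...$all) : simpleCurry($fn, ...$all);
    };
}

$adder = fn($a, $b, $c, $d) => $a + $b + $c + $d;

$firstTwo = simpleCurry($adder, 1, 2);
$total = $firstTwo(3, 4);    // all four arguments are now known
$firstThree = $firstTwo(3);  // still one argument short, so a closure comes back
$later = $firstThree(14);
```

Counting only required parameters is what makes optional parameters awkward for any wrapper of this kind: the wrapper cannot tell whether the caller intends to supply them.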
### Right to left
It's possible to curry a function from the left (the default) or from the right.
``` php
$divider = function ($a, $b) {
    return $a / $b;
};

$divide10By = C\curry($divider, 10);
$divideBy10 = C\curry_right($divider, 10);

echo $divide10By(10); // output 1
echo $divideBy10(100); // output 10
```
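
One way to picture right-to-left currying is to collect arguments in call order and reverse them just before the call. The sketch below does that. `curryRightSketch` and its explicit `$arity` parameter are hypothetical and exist only for this illustration.

```php
<?php declare(strict_types=1);

/**
 * Right-to-left currying sketch (hypothetical helper, not the library's code):
 * arguments are collected in call order and reversed just before the call.
 */
function curryRightSketch(callable $fn, int $arity, ...$args): Closure
{
    return function (...$more) use ($fn, $arity, $args) {
        $all = array_merge($args, $more);
        return count($all) >= $arity
            ? $fn(...array_reverse($all))
            : curryRightSketch($fn, $arity, ...$all);
    };
}

$divider = fn($a, $b) => $a / $b;

$divideBy10 = curryRightSketch($divider, 2, 10);
$result = $divideBy10(100); // ends up calling $divider(100, 10)
```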
### Optional parameters
Optional parameters and currying do not play very nicely together. This library excludes optional parameters by default.
``` php
$haystack = "haystack";
$searches = ['h', 'a', 'z'];
$strpos = C\curry('strpos', $haystack); // You can pass a function as a string too!
var_dump(array_map($strpos, $searches)); // output [0, 1, false]
```
However, `strpos` also has an optional `$offset` parameter, which is not taken into account by default.

If you want to take this optional `$offset` parameter into account, you should "fix" the curry to a given length.
``` php
$haystack = "haystack";
$searches = ['h', 'a', 'z'];
$strpos = C\curry_fixed(3, 'strpos', $haystack);
$finders = array_map($strpos, $searches);
var_dump(array_map(function ($finder) {
    return $finder(2);
}, $finders)); // output [false, 5, false]
```
*curry_right* has its own fixed version, named *curry_right_fixed*.
### Placeholders
The function `__()` gets a special placeholder value used to specify "gaps" within curried functions, allowing partial application of any combination of arguments, regardless of their positions.
```php
$add = function($x, $y)
{
    return $x + $y;
};
$reduce = C\curry('array_reduce');
$sum = $reduce(C\__(), $add);
echo $sum([1, 2, 3, 4], 0); // output 10
```
**Notes**:
- Placeholders should be used only for required arguments.
- When used, optional arguments must be at the end of the arguments list.
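
To make the idea of "gaps" concrete, here is a simplified model of placeholder filling. It is an assumption-laden sketch, not necessarily how this library resolves placeholders internally: the `Gap` class and `fillGaps()` are hypothetical, and each gap is simply filled, left to right, with the next late-arriving argument.

```php
<?php declare(strict_types=1);

/** Hypothetical marker type standing in for the library's placeholder value. */
final class Gap
{
}

/**
 * Fills each Gap in $bound, left to right, with the next value from $late.
 * Any late values left over are appended at the end.
 */
function fillGaps(array $bound, array $late): array
{
    foreach ($bound as $i => $value) {
        if ($value instanceof Gap && $late !== []) {
            $bound[$i] = array_shift($late);
        }
    }
    return array_merge($bound, $late);
}

$add = fn($x, $y) => $x + $y;

// array_reduce(array, callback, initial): the array slot is left open at first.
$args = fillGaps([new Gap(), $add], [[1, 2, 3, 4], 0]);
$sum = array_reduce(...$args);
```

In this model the late call supplies the array for the gap first, and the remaining value becomes the optional `$initial` argument, which matches the note that optional arguments go at the end.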


@@ -0,0 +1,214 @@
<?php declare(strict_types=1);
/**
* This code is forked from https://github.com/matteosister/php-curry, which is abandoned. It could be integrated into
* the rest of Parsica.
*/
namespace Parsica\Parsica\Curry;
use Closure;
use Exception;
use ReflectionClass;
use ReflectionFunction;
/**
* @psalm-param pure-callable $callable
*
* @psalm-return pure-callable
* @throws Exception
* @psalm-pure
*/
function curry(callable $callable) : callable
{
return _number_of_required_params($callable) === 0
? _make_function($callable)
: _curry_array_args($callable, _rest(func_get_args()));
}
/**
* @psalm-param pure-callable $callable
*
* @psalm-return pure-callable
* @psalm-pure
*/
function curry_right(callable $callable) : callable
{
return _number_of_required_params($callable) < 2
? _make_function($callable)
: _curry_array_args($callable, _rest(func_get_args()), false);
}
/**
* @psalm-param pure-callable $callable
* @psalm-param array $args
* @psalm-param bool $left
*
* @psalm-return pure-callable
* @psalm-pure
*/
function _curry_array_args(callable $callable, array $args, bool $left = true) : callable
{
return function () use ($callable, $args, $left) {
if (_is_fullfilled($callable, $args)) {
return _execute($callable, $args, $left);
}
$newArgs = array_merge($args, func_get_args());
if (_is_fullfilled($callable, $newArgs)) {
return _execute($callable, $newArgs, $left);
}
return _curry_array_args($callable, $newArgs, $left);
};
}
/**
* @psalm-param pure-callable $callable
* @param array<mixed> $args
* @param bool $left
*
* @return mixed
* @internal
* @psalm-pure
*/
function _execute(callable $callable, array $args, bool $left = true)
{
if (!$left) {
$args = array_reverse($args);
}
$placeholderPositions = _placeholder_positions($args);
if (0 < count($placeholderPositions)) {
$reqdParams = _number_of_required_params($callable);
if ($reqdParams <= _last($placeholderPositions)) {
// This means that we have more placeholderPositions than needed
// I know that throwing exceptions is not really the
// functional way, but this case should not happen.
throw new Exception("Argument Placeholder found on unexpected position!");
}
foreach ($placeholderPositions as $placeholderPosition) {
/** @psalm-suppress MixedAssignment */
$args[$placeholderPosition] = $args[$reqdParams];
array_splice($args, $reqdParams, 1);
}
}
return call_user_func_array($callable, $args);
}
/**
* @param array $args
*
* @return array
* @internal
* @psalm-pure
*/
function _rest(array $args) : array
{
return array_slice($args, 1);
}
/**
* @psalm-param pure-callable $callable
* @param array $args
*
* @return bool
* @throws Exception
* @internal
* @psalm-pure
*/
function _is_fullfilled(callable $callable, array $args) : bool
{
$nonPlaceholderArgs = array_filter(
$args,
fn($arg) => !($arg instanceof Placeholder)
);
return count($nonPlaceholderArgs) >= _number_of_required_params($callable);
}
/**
* @psalm-param pure-callable $callable
* @internal
* @psalm-pure
*/
function _number_of_required_params(callable $callable) : int
{
if (is_array($callable)) {
/** @psalm-suppress ImpureMethodCall */
$refl = new ReflectionClass($callable[0]);
/** @psalm-suppress ImpureMethodCall */
$method = $refl->getMethod($callable[1]);
/** @psalm-suppress ImpureMethodCall */
return $method->getNumberOfRequiredParameters();
} elseif (is_string($callable) || $callable instanceof Closure) {
/** @psalm-suppress ImpureMethodCall */
$refl = new ReflectionFunction($callable);
/** @psalm-suppress ImpureMethodCall */
return $refl->getNumberOfRequiredParameters();
}
throw new Exception("Unexpected other type of callable");
}
/**
* If the callback is an array(instance, method),
* this returns an equivalent function for PHP 5.3 compatibility.
*
* @psalm-param pure-callable $callable
*
* @psalm-return pure-callable
* @internal
* @psalm-pure
*/
function _make_function(callable $callable) : callable
{
if (is_array($callable)) {
return /** @return mixed */ fn() => call_user_func_array($callable, func_get_args());
}
return $callable;
}
/**
* Gets an array of placeholders positions in the given arguments.
*
* @param array $args
*
* @return list<int|string>
* @internal
* @psalm-pure
*/
function _placeholder_positions(array $args) : array
{
return array_keys(
array_filter(
$args,
fn($arg) : bool => $arg instanceof Placeholder
)
);
}
/**
* Get the last element in an array.
*
* @psalm-param array<T> $array
*
* @psalm-return null|T
* @template T
* @internal
* @psalm-pure
*/
function _last(array $array)
{
$lastKey = array_key_last($array);
return is_null($lastKey) ? null : $array[$lastKey];
}
/**
* Gets a special placeholder value used to specify "gaps" within curried
* functions, allowing partial application of any combination of arguments,
* regardless of their positions. Should be used only for required arguments.
* When used, optional arguments must be at the end of the argument list.
* @psalm-pure
*/
function __() : Placeholder
{
return new Placeholder;
}


@@ -0,0 +1,75 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class BinaryOperator
{
/**
* @psalm-var Parser<TSymbol>
*/
private Parser $symbol;
/**
* @psalm-var pure-callable(TExpressionAST, TExpressionAST):TExpressionAST
*/
private $transform;
private string $label;
/**
* @psalm-param Parser<TSymbol> $symbol
* @psalm-param pure-callable(TExpressionAST, TExpressionAST):TExpressionAST $transform
* @psalm-param string $label
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(Parser $symbol, callable $transform, string $label = "")
{
$this->symbol = $symbol;
$this->transform = $transform;
$this->label = $label ?: $symbol->getLabel() . " operator";
}
/**
* @psalm-return Parser<TSymbol>
* @psalm-mutation-free
*/
function symbol(): Parser
{
return $this->symbol;
}
/**
* @psalm-return pure-callable(TExpressionAST, TExpressionAST):TExpressionAST
* @psalm-mutation-free
*/
function transform(): callable
{
return $this->transform;
}
/**
* @psalm-mutation-free
*/
function label(): string
{
return $this->label;
}
}


@@ -0,0 +1,27 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
/**
* @internal
* @template TExpressionAST
* @psalm-immutable
*/
interface ExpressionType
{
/**
* @psalm-param Parser<TExpressionAST> $previousPrecedenceLevel
* @psalm-return Parser<TExpressionAST>
*/
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser;
}


@@ -0,0 +1,90 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\Curry\curry;
use function Parsica\Parsica\choice;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\Internal\FP\flip;
use function Parsica\Parsica\Internal\FP\foldl;
use function Parsica\Parsica\many;
use function Parsica\Parsica\map;
use function Parsica\Parsica\pure;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class LeftAssoc implements ExpressionType
{
/** @psalm-var non-empty-list<BinaryOperator<TSymbol, TExpressionAST>> */
private array $operators;
/**
* @internal
* @psalm-param non-empty-list<BinaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(array $operators)
{
$this->operators = $operators;
}
/**
* @psalm-param Parser<TExpressionAST> $previousPrecedenceLevel
* @psalm-return Parser<TExpressionAST>
* @psalm-mutation-free
*/
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser
{
/**
* @psalm-var list<Parser<pure-callable(Parser<TExpressionAST>):Parser<TExpressionAST>>> $operatorParsers
*/
$operatorParsers = [];
// @todo use folds?
foreach ($this->operators as $operator) {
$operatorParsers[] =
pure(curry(flip($operator->transform())))
->apply($operator->symbol()->followedBy($previousPrecedenceLevel))
->label($operator->label());
}
return map(
collect(
$previousPrecedenceLevel,
many(choice(...$operatorParsers))
),
/**
* @psalm-param array{0: TExpressionAST, 1: list<pure-callable(TExpressionAST):TExpressionAST>} $o
* @psalm-return TExpressionAST
* @psalm-pure
*/
fn(array $o) => foldl(
$o[1],
/**
* @psalm-param TExpressionAST $acc
* @psalm-param pure-callable(TExpressionAST):TExpressionAST $appl
* @psalm-return TExpressionAST
* @psalm-pure
*/
fn($acc, callable $appl) => $appl($acc),
$o[0]
)
);
}
}


@@ -0,0 +1,65 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\choice;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\map;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class NonAssoc implements ExpressionType
{
/**
* @psalm-var BinaryOperator<TSymbol, TExpressionAST>
*/
private BinaryOperator $operator;
/**
* @psalm-param BinaryOperator<TSymbol, TExpressionAST> $operator
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(BinaryOperator $operator)
{
$this->operator = $operator;
}
/**
* @psalm-param Parser<TExpressionAST> $previousPrecedenceLevel
* @psalm-return Parser<TExpressionAST>
*/
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser
{
return choice(
map(
collect(
$previousPrecedenceLevel,
$this->operator->symbol(),
$previousPrecedenceLevel
),
/**
* @psalm-param array{0: TExpressionAST, 1: TSymbol, 2: TExpressionAST} $o
* @psalm-return TExpressionAST
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
fn(array $o) => $this->operator->transform()($o[0], $o[2])),
$previousPrecedenceLevel
);
}
}


@@ -0,0 +1,51 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\choice;
use function Parsica\Parsica\keepFirst;
use function Parsica\Parsica\pure;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class Postfix implements ExpressionType
{
/** @psalm-var non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> */
private array $operators;
/**
* @psalm-param non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(array $operators)
{
$this->operators = $operators;
}
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser
{
$operatorParsers = [];
foreach ($this->operators as $operator) {
$operatorParsers[] =
pure($operator->transform())
->apply(keepFirst($previousPrecedenceLevel, $operator->symbol()))
->label($operator->label());
}
return choice(...$operatorParsers)->or($previousPrecedenceLevel);
}
}


@@ -0,0 +1,51 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\choice;
use function Parsica\Parsica\pure;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class Prefix implements ExpressionType
{
/** @psalm-var non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> */
private array $operators;
/**
* @psalm-param non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(array $operators)
{
$this->operators = $operators;
}
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser
{
$operatorParsers = [];
foreach ($this->operators as $operator) {
$operatorParsers[] =
pure($operator->transform())
->apply($operator->symbol()->followedBy($previousPrecedenceLevel))
->label($operator->label());
}
return choice(...$operatorParsers)->or($previousPrecedenceLevel);
}
}


@@ -0,0 +1,86 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\Curry\curry;
use function Parsica\Parsica\choice;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\Internal\FP\foldr;
use function Parsica\Parsica\keepFirst;
use function Parsica\Parsica\many;
use function Parsica\Parsica\map;
use function Parsica\Parsica\pure;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class RightAssoc implements ExpressionType
{
/** @var non-empty-list<BinaryOperator<TSymbol, TExpressionAST>> */
private array $operators;
/**
* @internal
* @psalm-param non-empty-list<BinaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(array $operators)
{
$this->operators = $operators;
}
/**
* @psalm-param Parser<TExpressionAST> $previousPrecedenceLevel
* @psalm-return Parser<TExpressionAST>
*/
public function buildPrecedenceLevel(Parser $previousPrecedenceLevel): Parser
{
/**
* @psalm-var list<Parser<pure-callable(Parser<TExpressionAST>):Parser<TExpressionAST>>> $operatorParsers
*/
$operatorParsers = [];
foreach ($this->operators as $operator) {
$operatorParsers[] =
pure(curry($operator->transform()))
->apply(keepFirst($previousPrecedenceLevel, $operator->symbol()))
->label($operator->label());
}
return map(
collect(
many(choice(...$operatorParsers)),
$previousPrecedenceLevel
),
/**
* @psalm-param array{0: list<pure-callable(TExpressionAST):TExpressionAST>, 1: TExpressionAST} $o
* @psalm-return TExpressionAST
*/
fn(array $o) => foldr(
$o[0],
/**
* @psalm-param pure-callable(TExpressionAST):TExpressionAST $appl
* @psalm-param TExpressionAST $acc
* @psalm-return TExpressionAST
*/
fn(callable $appl, $acc) => $appl($acc),
$o[1]
)
);
}
}


@@ -0,0 +1,69 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Parser;
/**
* @internal
* @template TSymbol
* @template TExpressionAST
* @psalm-immutable
*/
final class UnaryOperator
{
/**
* @psalm-var Parser<TSymbol>
*/
private Parser $symbol;
/**
* @psalm-var callable(TExpressionAST):TExpressionAST
*/
private $transform;
private string $label;
/**
* @psalm-param Parser<TSymbol> $symbol
* @psalm-param callable(TExpressionAST):TExpressionAST $transform
* @psalm-param string $label
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
function __construct(Parser $symbol, callable $transform, string $label = "")
{
$this->symbol = $symbol;
$this->transform = $transform;
$this->label = $label ?: $symbol->getLabel() . " operator";
}
/**
* @psalm-return Parser<TSymbol>
*/
function symbol(): Parser
{
return $this->symbol;
}
/**
* @psalm-return callable(TExpressionAST):TExpressionAST
*/
function transform(): callable
{
return $this->transform;
}
function label(): string
{
return $this->label;
}
}


@@ -0,0 +1,159 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Expression;
use Parsica\Parsica\Internal\Assert;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\Internal\FP\foldl;
/**
* Build an expression parser from a term parser and an expression table.
*
* @api
*
* @template TTerm
* @template TExpressionAST
*
* @psalm-param Parser<TTerm> $term
* @psalm-param list<ExpressionType> $expressionTable
*
* @psalm-return Parser<TExpressionAST>
* @psalm-pure
*/
function expression(Parser $term, array $expressionTable): Parser
{
/**
* @psalm-var Parser<TExpressionAST> $parser
*/
$parser = foldl(
$expressionTable,
fn(Parser $previous, ExpressionType $next) => $next->buildPrecedenceLevel($previous),
$term
);
return $parser;
}
/**
* A binary operator in an expression. The operands of the expression will be passed into $transform to produce the
* output of the expression parser.
*
* @api
*
* @template TSymbol
* @template TExpressionAST
* @psalm-param Parser<TSymbol> $symbol
* @psalm-param pure-callable(TExpressionAST, TExpressionAST):TExpressionAST $transform
* @psalm-param string $label
*
* @psalm-return BinaryOperator<TSymbol, TExpressionAST>
* @psalm-pure
*/
function binaryOperator(Parser $symbol, callable $transform, string $label = ""): BinaryOperator
{
return new BinaryOperator($symbol, $transform, $label);
}
/**
* A unary operator in an expression. The operand of the expression will be passed into $transform to produce the
* output of the expression parser.
*
* @api
*
* @template TSymbol
* @template TExpressionAST
* @psalm-param Parser<TSymbol> $symbol
* @psalm-param callable(TExpressionAST):TExpressionAST $transform
* @psalm-param string $label
*
* @return UnaryOperator<TSymbol, TExpressionAST>
* @psalm-pure
*/
function unaryOperator(Parser $symbol, callable $transform, string $label = ""): UnaryOperator
{
return new UnaryOperator($symbol, $transform, $label);
}
/**
* @api
* @template TSymbol
* @template TExpressionAST
* @psalm-param non-empty-list<BinaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-return LeftAssoc<TSymbol, TExpressionAST>
* @psalm-pure
*/
function leftAssoc(BinaryOperator ...$operators): LeftAssoc
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmptyList($operators, "LeftAssoc expects at least one Operator");
return new LeftAssoc($operators);
}
/**
* @api
* @template TSymbol
* @template TExpressionAST
* @psalm-param non-empty-list<BinaryOperator<TSymbol,TExpressionAST>> $operators
* @psalm-return RightAssoc<TSymbol, TExpressionAST>
* @psalm-pure
*/
function rightAssoc(BinaryOperator ...$operators): RightAssoc
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmptyList($operators, "RightAssoc expects at least one Operator");
return new RightAssoc($operators);
}
/**
* @api
*
* @template TSymbol
* @template TExpressionAST
* @psalm-param BinaryOperator<TSymbol, TExpressionAST> $operator
* @psalm-return NonAssoc<TSymbol, TExpressionAST>
* @psalm-pure
*/
function nonAssoc(BinaryOperator $operator): NonAssoc
{
return new NonAssoc($operator);
}
/**
* @api
*
* @template TSymbol
* @template TExpressionAST
*
* @psalm-param non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-return Prefix<TSymbol, TExpressionAST>
* @psalm-pure
*/
function prefix(UnaryOperator ...$operators): Prefix
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmptyList($operators, "Prefix expects at least one Operator");
return new Prefix($operators);
}
/**
* @api
*
* @template TSymbol
* @template TExpressionAST
* @psalm-param non-empty-list<UnaryOperator<TSymbol, TExpressionAST>> $operators
* @psalm-return Postfix<TSymbol, TExpressionAST>
* @psalm-pure
*/
function postfix(UnaryOperator ...$operators): Postfix
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmptyList($operators, "Postfix expects at least one Operator");
return new Postfix($operators);
}


@@ -0,0 +1,125 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
/**
* @internal
* @psalm-immutable
*/
final class Ascii
{
private function __construct()
{
}
/**
* @psalm-pure
*/
public static function printable(string $char): string
{
switch (mb_ord($char)) {
case 0:
return "<null>";
case 1:
return "<start of header>";
case 2:
return "<start of text>";
case 3:
return "<end of text>";
case 4:
return "<end of transmission>";
case 5:
return "<enquiry>";
case 6:
return "<acknowledge>";
case 7:
return "<bell>";
case 8:
return "<backspace>";
case 9:
return "<horizontal tab>";
case 10:
return "<line feed>";
case 11:
return "<vertical tab>";
case 12:
return "<form feed>";
case 13:
return "<carriage return>";
case 14:
return "<shift out>";
case 15:
return "<shift in>";
case 16:
return "<data link escape>";
case 17:
return "<device control 1>";
case 18:
return "<device control 2>";
case 19:
return "<device control 3>";
case 20:
return "<device control 4>";
case 21:
return "<negative acknowledge>";
case 22:
return "<synchronize>";
case 23:
return "<end of transmission block>";
case 24:
return "<cancel>";
case 25:
return "<end of medium>";
case 26:
return "<substitute>";
case 27:
return "<escape>";
case 28:
return "<file separator>";
case 29:
return "<group separator>";
case 30:
return "<record separator>";
case 31:
return "<unit separator>";
case 32:
return "<space>";
case 34:
return "<double quote>";
case 39:
return "<single quote>";
case 47:
return "<slash>";
case 92:
return "<backslash>";
case 96:
return "<accent>";
case 127:
return "<delete>";
case 130:
return "<single low-9 quotation mark>";
case 132:
return "<double low-9 quotation mark>";
case 145:
return "<left single quotation mark>";
case 146:
return "<right single quotation mark>";
case 147:
return "<left double quotation mark>";
case 148:
return "<right double quotation mark>";
case 160:
return "<non-breaking space>";
default:
return "'$char'";
}
}
}


@@ -0,0 +1,111 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
use InvalidArgumentException;
/**
* @internal
* @psalm-immutable
*/
final class Assert
{
private function __construct()
{
}
/**
* @throws InvalidArgumentException
* @internal
*/
public static function nonEmpty(string $str): void
{
Assert::minLength($str, 1, "The string must not be empty.");
}
/**
* @psalm-assert list $l
* @psalm-assert !empty $l
* @throws InvalidArgumentException
*/
public static function nonEmptyList(array $l, string $message): void
{
if (empty($l)) {
throw new InvalidArgumentException($message);
}
}
/**
* @throws InvalidArgumentException
* @internal
*/
public static function minLength(string $value, int $length, string $message): void
{
if (mb_strlen($value) < $length) {
throw new InvalidArgumentException($message);
}
}
/**
* @psalm-param list<string> $chars
*
* @throws InvalidArgumentException
* @internal
*/
public static function singleChars(array $chars): void
{
foreach ($chars as $char) {
Assert::singleChar($char);
}
}
/**
* @throws InvalidArgumentException
* @internal
*/
public static function singleChar(string $char): void
{
Assert::length($char, 1, "The argument must be a single character");
}
/**
* @throws InvalidArgumentException
* @internal
*/
public static function length(string $value, int $length, string $message): void
{
if ($length !== mb_strlen($value)) {
throw new InvalidArgumentException($message);
}
}
/**
* @psalm-param mixed $f
* @internal
* @param callable|mixed $f
*/
public static function isCallable($f, string $message) : void
{
if (!is_callable($f)) {
throw new InvalidArgumentException($message);
}
}
/**
* @throws InvalidArgumentException
* @internal
*/
public static function atLeastOneArg(array $args, string $source): void
{
if (0 == count($args)) {
throw new InvalidArgumentException("$source expects at least one Parser");
}
}
}


@@ -0,0 +1,16 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
final class EndOfStream extends \Exception
{
}


@@ -0,0 +1,70 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal\FP;
/**
* Swaps the arguments of the callable, returning a callable.
*
* @internal
* @template Ta
* @template Tb
* @template Tc
* @psalm-param pure-callable(Ta, Tb):Tc $f
* @psalm-return pure-callable(Tb, Ta):Tc
* @psalm-pure
*/
function flip(callable $f): callable
{
/**
* @psalm-param Ta $x
* @psalm-param Tb $y
* @psalm-return Tc
*/
return fn($x, $y) => $f($y, $x);
}
/**
* @template TA
* @template TB
*
* @psalm-param list<TA> $input
* @psalm-param callable(TB, TA):TB $function
* @psalm-param TB $initial
* @psalm-return TB
*
* @internal
* @psalm-pure
*/
function foldl(array $input, callable $function, $initial) {
/** @psalm-suppress ImpureFunctionCall */
return array_reduce($input, $function, $initial);
}
/**
* @template TA
* @template TB
*
* @psalm-param list<TA> $input
* @psalm-param pure-callable(TA, TB):TB $function
* @psalm-param TB $initial
* @psalm-return TB
*
* @internal
* @psalm-pure
*/
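function foldr(array $input, callable $function, $initial) {
// Walk the list from right to left. Iterating over a reversed copy, rather than testing the
// result of array_pop(), keeps falsy elements such as 0 or "" from ending the fold early.
foreach (array_reverse($input) as $head)
{
$initial = $function($head, $initial);
}
return $initial;
}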


@@ -0,0 +1,162 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
use BadMethodCallException;
use Parsica\Parsica\Parser;
use Parsica\Parsica\ParseResult;
use Parsica\Parsica\ParserHasFailed;
use Parsica\Parsica\Stream;
use function Parsica\Parsica\isEqual;
use function Parsica\Parsica\notPred;
/**
* The return value of a failed parser.
*
* @template T
* @internal
* @psalm-immutable
*/
final class Fail implements ParseResult
{
private string $expected;
private Stream $got;
/**
* @internal
*/
public function __construct(string $expected, Stream $got)
{
$this->expected = $expected;
$this->got = $got;
}
/**
* @api
*/
public function errorMessage(): string
{
try {
$firstChar = $this->got->take1()->chunk();
$unexpected = Ascii::printable($firstChar);
$body = $this->got()->takeWhile(notPred(isEqual("\n")))->chunk();
} catch (EndOfStream $e) {
$unexpected = $body = "<EOF>";
}
$lineNumber = $this->got->position()->line();
$spaceLength = str_repeat(" ", strlen((string)$lineNumber));
$expecting = $this->expected;
$position = $this->got->position()->pretty();
$columnNumber = $this->got->position()->column();
$leftDots = $columnNumber == 1 ? "" : "...";
$leftSpace = $columnNumber == 1 ? "" : " ";
$bodyLine = "$lineNumber | $leftDots$body";
$bodyLine = strlen($bodyLine) > 80 ? (substr($bodyLine, 0, 77) . "...") : $bodyLine;
return
"$position\n"
. "$spaceLength |\n"
. "$bodyLine\n"
. "$spaceLength | $leftSpace^— column $columnNumber\n"
. "Unexpected $unexpected\n"
. "Expecting $expecting";
}
public function got(): Stream
{
return $this->got;
}
public function expected(): string
{
return $this->expected;
}
public function isSuccess(): bool
{
return false;
}
public function isFail(): bool
{
return !$this->isSuccess();
}
/**
* @psalm-return T
*/
public function output()
{
throw new BadMethodCallException("Can't read the output of a failed ParseResult.");
}
/**
* @psalm-param ParseResult<T> $other
*
* @psalm-return ParseResult<T>
*/
public function append(ParseResult $other): ParseResult
{
return $this;
}
/**
* Map a function over the output
*
* @template T2
*
* @psalm-param callable(T) : T2 $transform
*
* @psalm-return ParseResult<T2>
*/
public function map(callable $transform): ParseResult
{
return $this;
}
/**
* @template T2
*
* @psalm-param Parser<T2> $parser
*
* @psalm-return ParseResult<T2>
*/
public function continueWith(Parser $parser): ParseResult
{
return $this;
}
/**
* @inheritDoc
*/
public function remainder(): Stream
{
throw new BadMethodCallException("Can't read the remainder of a failed ParseResult.");
}
/**
* @inheritDoc
*/
public function position(): Position
{
return $this->got->position();
}
/**
* @inheritDoc
*/
public function throw() : void
{
throw new ParserHasFailed($this);
}
}

View File

@@ -0,0 +1,88 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
/**
* File, line, and column position of the parser.
*
* @psalm-immutable
* @psalm-external-mutation-free
*/
final class Position
{
/** @psalm-readonly */
private string $filename;
/** @psalm-readonly */
private int $line;
/** @psalm-readonly */
private int $column;
function __construct(string $filename, int $line, int $column)
{
$this->filename = $filename;
$this->line = $line;
$this->column = $column;
}
/**
* Initial position (line 1, column 1). The optional filename is the source of the input, and is really just a label
* to make more useful error messages.
*/
public static function initial(string $filename = "<input>"): Position
{
return new Position($filename, 1, 1);
}
/**
* Pretty print as "filename:line:column"
*/
public function pretty(): string
{
return $this->filename . ":" . $this->line . ":" . $this->column;
}
public function advance(string $parsed): Position
{
$column = $this->column;
$line = $this->line;
foreach (mb_str_split($parsed, 1) as $char) {
switch ($char) {
case "\n":
case "\r":
$line++;
$column = 1;
break;
case "\t":
$column = $column + 4 - (($column - 1) % 4);
break;
default:
$column++;
}
}
return new Position($this->filename, $line, $column);
}
public function filename(): string
{
return $this->filename;
}
public function line(): int
{
return $this->line;
}
public function column(): int
{
return $this->column;
}
}

View File

@@ -0,0 +1,192 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
use BadMethodCallException;
use Exception;
use Parsica\Parsica\Parser;
use Parsica\Parsica\ParseResult;
use Parsica\Parsica\Stream;
/**
* @internal
*
* @template T
* @psalm-immutable
*/
final class Succeed implements ParseResult
{
/**
* @psalm-var T
*/
private $output;
private Stream $remainder;
/**
* @psalm-param T $output
*
* @internal
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
public function __construct($output, Stream $remainder)
{
$this->output = $output;
$this->remainder = $remainder;
}
/**
* @psalm-return T
* @psalm-mutation-free
*/
public function output()
{
return $this->output;
}
/**
* @psalm-mutation-free
*/
public function remainder(): Stream
{
return $this->remainder;
}
/**
* @psalm-mutation-free
*/
public function isSuccess(): bool
{
return true;
}
/**
* @psalm-mutation-free
*/
public function isFail(): bool
{
return !$this->isSuccess();
}
/**
* @psalm-mutation-free
*/
public function expected(): string
{
throw new BadMethodCallException("Can't read the expectation of a succeeded ParseResult.");
}
/**
* @psalm-mutation-free
*/
public function got(): Stream
{
throw new BadMethodCallException("Can't read the unexpected input of a succeeded ParseResult.");
}
/**
* @inheritDoc
*
* @psalm-param ParseResult<T> $other
* @psalm-return ParseResult<T>
* @psalm-mutation-free
*/
public function append(ParseResult $other): ParseResult
{
if ($other->isFail()) {
return $other;
} else {
/** @psalm-suppress ArgumentTypeCoercion */
return $this->appendSuccess($other);
}
}
/**
* @TODO This is hardcoded to only deal with certain types. We need an interface with an append() for arbitrary types.
*/
private function appendSuccess(Succeed $other): ParseResult
{
$type1isNull = is_null($this->output);
$type2isNull = is_null($other->output);
// Ignore nulls
if ($type1isNull && $type2isNull) {
return new Succeed(null, $other->remainder);
} elseif(!$type1isNull && $type2isNull) {
return new Succeed($this->output, $other->remainder);
} elseif($type1isNull) {
return new Succeed($other->output, $other->remainder);
}
if (is_string($this->output) && is_string($other->output)) {
return new Succeed($this->output . $other->output, $other->remainder);
} elseif (is_array($this->output) && is_array($other->output)) {
return new Succeed(
array_merge($this->output, $other->output),
$other->remainder
);
}
$type1 = gettype($this->output);
$type2 = gettype($other->output);
throw new Exception("Append only works for ParseResult<T> instances with the same type T, got ParseResult<$type1> and ParseResult<$type2>.");
}
/**
* Map a function over the output
*
* @template T2
*
* @psalm-param pure-callable(T):T2 $transform
*
* @psalm-return ParseResult<T2>
* @psalm-mutation-free
*/
public function map(callable $transform): ParseResult
{
return new Succeed($transform($this->output), $this->remainder);
}
/**
* @template T2
*
* @psalm-param Parser<T2> $parser
*
* @psalm-return ParseResult<T2>
*/
public function continueWith(Parser $parser): ParseResult
{
return $parser->run($this->remainder);
}
public function errorMessage(): string
{
throw new BadMethodCallException("A succeeded ParseResult has no error message.");
}
/**
* @inheritDoc
*/
public function position(): Position
{
return $this->remainder->position();
}
/**
* @inheritDoc
*/
public function throw() : void
{
throw new BadMethodCallException("You can't throw a successful ParseResult.");
}
}

View File

@@ -0,0 +1,41 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\Internal;
use Parsica\Parsica\Stream;
/**
* The result of Stream::take*() functions
*
* @internal
* @psalm-immutable
*/
final class TakeResult
{
private string $chunk;
private Stream $stream;
function __construct(string $chunk, Stream $stream)
{
$this->chunk = $chunk;
$this->stream = $stream;
}
function chunk(): string
{
return $this->chunk;
}
function stream(): Stream
{
return $this->stream;
}
}

View File

@@ -0,0 +1,228 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\JSON;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\{any,
between,
char,
choice,
collect,
float,
hexDigitChar,
isCharCode,
keepFirst,
map,
recursive,
repeat,
satisfy,
sepBy,
string,
takeWhile,
zeroOrMore};
/**
* JSON parser and utility parsers
*
* @TODO fix psalm annotations
* @psalm-immutable
*/
final class JSON
{
private function __construct()
{
}
/**
* Fully compliant JSON parser, built entirely in Parsica. The output is compatible with PHP's native json_decode().
*
* It was built to illustrate the usage of Parsica on a real world format, and to benchmark Parsica against
* json_decode(). It will probably never reach the same performance as a C extension, so it shouldn't be used for
* typical production JSON parsing.
*
* It could however be useful as a basis to expand into a custom JSON parser, for example to expand JSON with custom
* notations or comments, or to return a custom AST instead of json_decode()'s plain PHP objects & arrays.
*
* To understand the terminology and the structure, have a peek at {@see https://www.json.org/json-en.html}
*
* @api
* @psalm-return Parser<mixed>
*/
public static function json(): Parser
{
return JSON::ws()->sequence(JSON::element());
}
/**
* @template T
* @psalm-return Parser<mixed>
* @psalm-suppress DocblockTypeContradiction
*/
public static function element(): Parser
{
// Memoize $element so we can keep reusing it for recursion.
/** @psalm-var Parser<mixed> $element */
static $element;
if (!isset($element)) {
$element = recursive();
$element->recurse(
any(
JSON::object(),
JSON::array(),
JSON::stringLiteral(),
JSON::number(),
JSON::true(),
JSON::false(),
JSON::null(),
)
);
}
return $element;
}
/**
* @psalm-return Parser<object>
*/
public static function object(): Parser
{
return map(
between(
JSON::token(char('{')),
JSON::token(char('}')),
sepBy(
JSON::token(char(',')),
JSON::member()
)
),
/**
* @psalm-param list<array{string:mixed}> $members
* @psalm-return object
*/
fn(array $members):object => (object)array_merge(...$members));
}
/**
* @psalm-return Parser<list<mixed>>
*/
public static function array(): Parser
{
return between(
JSON::token(char('[')),
JSON::token(char(']')),
sepBy(
JSON::token(char(',')), JSON::element()
)
);
}
/**
* @psalm-return Parser<bool>
*/
public static function true(): Parser
{
return JSON::token(string('true'))->map(fn($_) => true)->label('true');
}
/**
* @psalm-return Parser<bool>
*/
public static function false(): Parser
{
return JSON::token(string('false'))->map(fn($_) => false)->label('false');
}
/**
* @psalm-return Parser<null>
*/
public static function null(): Parser
{
return JSON::token(string('null'))->map(fn($_) => null)->label('null');
}
/**
* Whitespace
*
* @psalm-return Parser<null>
*/
public static function ws(): Parser
{
return takeWhile(isCharCode([0x20, 0x0A, 0x0D, 0x09]))->voidLeft(null)
->label('whitespace');
}
/**
* Apply $parser and consume all the following whitespace.
*
* @template T
* @psalm-param Parser<T> $parser
* @psalm-return Parser<T>
*/
public static function token(Parser $parser): Parser
{
return keepFirst($parser, JSON::ws());
}
public static function number(): Parser
{
return JSON::token(float())->map('floatval')->label("number");
}
/**
* @psalm-return Parser<string>
*/
public static function stringLiteral(): Parser
{
return JSON::token(
between(
char('"'),
char('"'),
zeroOrMore(
choice(
satisfy(fn(string $char): bool => !in_array($char, ['"', '\\'])),
char("\\")->followedBy(
choice(
char("\"")->map(fn($_) => '"'),
char("\\")->map(fn($_) => '\\'),
char("/")->map(fn($_) => '/'),
char("b")->map(fn($_) => mb_chr(8)),
char("f")->map(fn($_) => mb_chr(12)),
char("n")->map(fn($_) => "\n"),
char("r")->map(fn($_) => "\r"),
char("t")->map(fn($_) => "\t"),
char("u")->sequence(repeat(4, hexDigitChar()))->map(fn($o) => mb_chr(hexdec($o))),
)
)
)
)
)->map(fn($o): string => (string)$o) // because the empty json string returns null
)->label("string literal");
}
/**
* @return Parser<array{string:mixed}>
*/
public static function member(): Parser
{
return map(
collect(
JSON::stringLiteral(),
JSON::token(char(':')),
JSON::token(JSON::element())
),
/**
* @psalm-param array{0:string, 1:string, 2:mixed} $o
* @psalm-return array{string:mixed}
* @psalm-suppress MoreSpecificReturnType
* @psalm-suppress LessSpecificReturnStatement
*/
fn(array $o): array => [$o[0] => $o[2]]);
}
}

View File

@@ -0,0 +1,167 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica\PHPUnit;
use Exception;
use Parsica\Parsica\Parser;
use Parsica\Parsica\StringStream;
/**
* Convenience assertion methods. When writing tests for your own parsers, use this trait in your PHPUnit test cases.
*
* @TODO move to standalone package
* @api
*/
trait ParserAssertions
{
/**
* @psalm-param mixed $expectedOutput
*
* @api
*/
protected function assertParses(string $input, Parser $parser, $expectedOutput, string $message = ""): void
{
$input = new StringStream($input);
$actualResult = $parser->run($input);
if ($actualResult->isSuccess()) {
$this->assertStrictlyEquals(
$expectedOutput,
$actualResult->output(),
$message . "\n" . "The parser succeeded but the output doesn't match your expected output."
);
} else {
$this->fail(
$message . "\n"
."The parser failed with the following error message:\n"
.$actualResult->errorMessage()."\n"
);
}
}
/**
* Behaves like assertSame for primitives, behaves like assertEquals for objects of the same type, and fails
* for everything else.
*
* @psalm-param mixed $expected
* @psalm-param mixed $actual
* @psalm-param string $message
*
* @throws Exception
* @api
*
* @psalm-suppress MixedArgument
* @psalm-suppress MixedAssignment
* @psalm-suppress MixedArrayAccess
*/
protected function assertStrictlyEquals($expected, $actual, string $message = ''): void
{
if (is_null($expected) || is_scalar($expected)) {
$this->assertSame($expected, $actual, $message);
} elseif (is_object($expected)) {
$this->assertEquals(get_class($expected), get_class($actual),
"Expected type didn't match actual type");
$this->assertEquals($expected, $actual, $message);
} elseif (is_array($expected)) {
foreach ($expected as $k => $v) {
$this->assertStrictlyEquals($expected[$k], $actual[$k], "Item $k from the actual array differs from item $k in the expected array");
}
$this->assertSame(count($expected), count($actual), "The length of the actual array differs from the length of the expected array.");
} else {
throw new Exception("@todo Not implemented");
}
}
abstract public static function assertSame($expected, $actual, string $message = ''): void;
abstract public static function assertEquals($expected, $actual, string $message = ''): void;
abstract public static function fail(string $message = ''): void;
/**
* @param string $input
* @param Parser $parser
* @param string $expectedRemaining
* @param string $message
*
* @api
*/
protected function assertRemainder(string $input, Parser $parser, string $expectedRemaining, string $message = ""): void
{
$input = new StringStream($input);
$actualResult = $parser->run($input);
if ($actualResult->isSuccess()) {
$this->assertEquals(
$expectedRemaining,
$actualResult->remainder(),
$message . "\n" . "The parser succeeded but the expected remaining input doesn't match."
);
} else {
$this->fail(
$message . "\n"
. "The parser failed with the following error message:\n"
.$actualResult->errorMessage()."\n"
);
}
}
/**
* @param string $input
* @param Parser $parser
* @param string|null $expectedFailure
* @param string $message
*
* @api
*/
protected function assertParseFails(string $input, Parser $parser, ?string $expectedFailure = null, string $message = ""): void
{
$input = new StringStream($input);
$actualResult = $parser->run($input);
$this->assertTrue(
$actualResult->isFail(),
$message . "\n" . "The parser succeeded but expected a failure."
);
if (isset($expectedFailure)) {
$this->assertEquals(
$expectedFailure,
$actualResult->expected(),
$message . "\n" . "The expected failure message is not the same as the actual one."
);
}
}
abstract public static function assertTrue($condition, string $message = ''): void;
/**
* @api
*/
protected function assertFailOnEOF(Parser $parser, string $message = ""): void
{
$actualResult = $parser->run(new StringStream(""));
$this->assertTrue(
$actualResult->isFail(),
$message . "\n" . "Expected the parser to fail on EOF."
);
}
/**
* @api
*/
protected function assertSucceedOnEOF(Parser $parser, string $message = ""): void
{
$actualResult = $parser->run(new StringStream(""));
$this->assertTrue(
$actualResult->isSuccess(),
$message . "\n" . "Expected the parser to succeed on EOF."
);
$this->assertSame("", $actualResult->output());
}
}

View File

@@ -0,0 +1,136 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use BadMethodCallException;
use Parsica\Parsica\Internal\Position;
/**
* @template T
* @psalm-immutable
*/
interface ParseResult
{
/**
* True if the parser was successful.
*
* @api
* @psalm-mutation-free
*/
public function isSuccess(): bool;
/**
* True if the parser has failed.
*
* @api
* @psalm-mutation-free
*/
public function isFail(): bool;
/**
* The output of the parser.
*
* @psalm-return T
* @api
* @psalm-mutation-free
*/
public function output();
/**
* The part of the input that did not get parsed.
*
* @api
* @psalm-mutation-free
*/
public function remainder(): Stream;
/**
* A message that indicates what the failed parser expected to find at its position in the input. It contains the
* label that was attached to the parser.
*
* @see Parser::label()
*
* @api
* @psalm-mutation-free
*/
public function expected(): string;
/**
* A message indicating the input that the failed parser got at the point where it failed. It's only informational,
* so don't use this for processing. A future version might change this behaviour.
*
* @api
* @psalm-mutation-free
*/
public function got(): Stream;
/**
* Append the output of two successful ParseResults. If one or both have failed, it returns the first failed
* ParseResult.
*
* @psalm-param ParseResult<T> $other
*
* @psalm-return ParseResult<T>
*
* @api
* @psalm-mutation-free
*/
public function append(ParseResult $other): ParseResult;
/**
* Map a function over the output
*
* @template T2
*
* @psalm-param pure-callable(T):T2 $transform
*
* @psalm-return ParseResult<T2>
*
* @api
* @psalm-mutation-free
*/
public function map(callable $transform): ParseResult;
/**
* Use the remainder of this ParseResult as the input for a parser.
*
* @template T2
*
* @psalm-param Parser<T2> $parser
*
* @psalm-return ParseResult<T2>
*
* @api
* @psalm-mutation-free
*/
public function continueWith(Parser $parser): ParseResult;
/**
* @psalm-mutation-free
*/
public function errorMessage() : string;
/**
* Get the last position of where the parser ended up when producing this result.
* @psalm-mutation-free
*/
public function position(): Position;
/**
* Throw a ParserHasFailed exception if the Parser failed, or complain if you're trying to throw a successful
* ParseResult.
*
* @api
* @throws ParserHasFailed
* @throws BadMethodCallException
*/
public function throw() : void;
}

View File

@@ -0,0 +1,469 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Exception;
use Parsica\Parsica\Internal\Fail;
/**
* A parser is any function that takes a {@see Stream} as input and returns a {@see ParseResult}. The Parser class is a wrapper
* around such functions. The {@see Parser::make()} static constructor takes a callable that does the actual parsing.
* Usually you don't need to instantiate this class directly. Instead, build your parser from existing parsers and
* combinators.
*
* At the moment, there is no Parser interface, and no Parser abstract class to extend from. This is intentional, but
* will be changed if we find use cases where those would be the best solutions.
*
* The type is Parser<T>, where T is the type of the output that the parser will produce after completing successfully.
*
* @template T
* @api
*/
final class Parser
{
/**
* @psalm-var pure-callable(Stream) : ParseResult<T> $parserFunction
*/
private $parserFunction;
/** @psalm-var 'non-recursive'|'awaiting-recurse'|'recursion-was-setup' */
private string $recursionStatus;
private string $label;
/**
* @psalm-param pure-callable(Stream) : ParseResult<T> $parserFunction
* @psalm-param 'non-recursive'|'awaiting-recurse'|'recursion-was-setup' $recursionStatus
* @psalm-pure
* @psalm-suppress ImpureVariable
*/
private function __construct(callable $parserFunction, string $recursionStatus, string $label)
{
$this->parserFunction = $parserFunction;
$this->recursionStatus = $recursionStatus;
$this->label = $label;
}
/**
* Create a recursive parser. Used in combination with recurse(Parser).
*
* @see recursive()
*
* @psalm-return Parser<mixed>
* @api
* @psalm-pure
*/
public static function recursive(): Parser
{
return new Parser(
// Make a placeholder parser that will throw when you try to run it.
static function (Stream $_): ParseResult {
throw new Exception(
"Can't run a recursive parser that hasn't been setup properly yet. "
. "A parser created by recursive(), must then be called with ->recurse(Parser) "
. "before it can be used."
);
}, 'awaiting-recurse', "<recursive>");
}
/**
* Make a new parser.
*
* @internal
*
* @template T2
*
* @psalm-param pure-callable(Stream):ParseResult<T2> $parserFunction
*
* @psalm-return Parser<T2>
* @psalm-pure
*/
public static function make(string $label, callable $parserFunction): Parser
{
return new Parser($parserFunction, 'non-recursive', $label);
}
/**
* Recurse on a parser. Used in combination with {@see recursive()}. After calling this method, this parser behaves
* like a regular parser.
*
* @psalm-param Parser<mixed> $parser
*
* @api
*/
public function recurse(Parser $parser): void
{
switch ($this->recursionStatus) {
case 'non-recursive':
throw new Exception(
"You can't recurse on a non-recursive parser. Create a recursive parser first using recursive(), "
. "then call ->recurse() on it."
);
case 'recursion-was-setup':
throw new Exception("You can only call recurse() once on a recursive parser.");
case 'awaiting-recurse':
// Replace the placeholder parser from recursive() with a call to the inner parser. This must be dynamic,
// because it's possible that the inner parser is also a recursive parser that has not been set up yet.
$this->parserFunction = fn(Stream $input): ParseResult => $parser->run($input);
$this->recursionStatus = 'recursion-was-setup';
$this->label = $parser->getLabel();
break;
default:
throw new Exception("Unexpected recursionStatus value");
}
}
/**
* Run the parser on an input
*
* @psalm-return ParseResult<T>
* @api
* @psalm-mutation-free
*/
public function run(Stream $input): ParseResult
{
return ($this->parserFunction)($input);
}
/**
* Optionally parse something, but still succeed if the thing is not there.
*
*
* @psalm-return Parser<T|null>
* @see optional()
* @api
* @psalm-mutation-free
*/
public function optional(): Parser
{
return optional($this);
}
/**
* Try the first parser, and failing that, try the second parser. Returns the first succeeding result, or the first
* failing result.
*
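* Caveat: The order matters! In string('http')->or(string('https')), the second parser is never tried on input
* starting with "https", because string('http') already succeeds there. Put the longer alternative first.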
*
* @psalm-param Parser<T> $other
*
* @psalm-return Parser<T>
* @api
* @psalm-mutation-free
*/
public function or(Parser $other): Parser
{
return either($this, $other);
}
/**
* Parse something, then follow by something else. Ignore the result of the first parser and return the result of
* the second parser.
*
* @template T2
* @psalm-param Parser<T2> $second
* @psalm-return Parser<T2>
* @api
* @see sequence()
* @psalm-mutation-free
*/
public function followedBy(Parser $second): Parser
{
return sequence($this, $second);
}
/**
* Parse something, then follow by something else. Ignore the result of the first parser and return the result of
* the second parser.
*
* @template T2
* @psalm-param Parser<T2> $second
* @psalm-return Parser<T2>
* @api
* @see sequence()
* @psalm-mutation-free
*/
public function sequence(Parser $second): Parser
{
return sequence($this, $second);
}
/**
* Parse something, then follow by something else. Ignore the result of the first parser and return the result of
* the second parser. Alias for sequence().
*
* @template T2
* @psalm-param Parser<T2> $second
* @psalm-return Parser<T2>
* @api
* @see sequence()
* @psalm-mutation-free
*/
public function then(Parser $second): Parser
{
return sequence($this, $second);
}
/**
* Create a parser that takes the output from the first parser (if successful) and feeds it to the callable. The
* callable must return another parser. If the first parser fails, its failure is returned.
*
* @template T2
*
* @psalm-param pure-callable(T) : Parser<T2> $f
*
* @psalm-return Parser<T2>
* @see bind()
* @api
* @psalm-mutation-free
*/
public function bind(callable $f): Parser
{
return bind($this, $f);
}
/**
* Map a function over the parser (which in turn maps it over the result).
*
* @template T2
*
* @psalm-param pure-callable(T) : T2 $transform
*
* @psalm-return Parser<T2>
* @api
* @psalm-mutation-free
*/
public function map(callable $transform): Parser
{
return map($this, $transform);
}
/**
* Take the remaining input from the result and parse it.
*
* @api
* @psalm-mutation-free
*/
public function continueFrom(ParseResult $result): ParseResult
{
return $this->run($result->remainder());
}
/**
* Combine the parser with another parser of the same type, which will cause the results to be appended.
*
* @psalm-param Parser<T|null> $other
* @psalm-return Parser<T|null>
* @api
* @psalm-mutation-free
*/
public function append(Parser $other): Parser
{
return append($this, $other);
}
/**
* Combine the parser with another parser of the same type, which will cause the results to be appended.
*
* @psalm-param Parser<T|null> $other
* @psalm-return Parser<T|null>
* @api
* @psalm-mutation-free
*/
public function and(Parser $other): Parser
{
return append($this, $other);
}
/**
* Try to parse a string. Alias of `try(new StringStream($input))`.
*
* @TODO Try should fail when it doesn't consume the whole input.
*
* @psalm-param string $input
*
* @psalm-return ParseResult<T>
*
* @throws ParserHasFailed
* @api
*/
public function tryString(string $input): ParseResult
{
return $this->try(new StringStream($input));
}
/**
* Try to parse the input, or throw an exception.
*
* @TODO Try should fail when it doesn't consume the whole input.
*
* @psalm-return ParseResult<T>
*
* @throws ParserHasFailed
* @api
*/
public function try(Stream $input): ParseResult
{
$result = $this->run($input);
if ($result->isFail()) {
$result->throw();
}
return $result;
}
/**
* Sequential application. Given a parser which outputs a callable, return a new parser that applies the callable on the
* output of the second parser.
*
* The first parser must be of type Parser<callable(T1):T2>. {@see pure()} can be used to wrap a callable in a Parser.
*
* Callables with more than 1 argument need to be curried: pure(curry(fn($x, $y)))->apply($parser2)->apply($parser3)
*
* @template T2
* @template T3
* @psalm-param Parser<T2> $parser
* @psalm-return Parser<T3>
* @psalm-suppress MixedArgumentTypeCoercion
*
* @api
* @psalm-mutation-free
*/
public function apply(Parser $parser): Parser
{
return apply($this, $parser);
}
/**
* Sequence two parsers, and return the output of the first one, ignore the second.
*
* @api
* @psalm-mutation-free
*/
public function thenIgnore(Parser $other): Parser
{
return keepFirst($this, $other);
}
/**
* notFollowedBy only succeeds when $second fails. It never consumes any input.
*
* Example:
*
* `string("print")` will also match "printXYZ"
*
* `string("print")->notFollowedBy(alphaNumChar())` will match "print something" but not "printXYZ something"
*
* @psalm-param Parser<T2> $second
*
* @psalm-return Parser<T>
* @see notFollowedBy()
*
* @template T2
* @api
* @psalm-mutation-free
*/
public function notFollowedBy(Parser $second): Parser
{
return keepFirst($this, notFollowedBy($second));
}
/**
* The parser's label.
*
* @internal
* @psalm-mutation-free
*/
public function getLabel(): string
{
return $this->label;
}
/**
* Label a parser. When a parser fails, you'll see your label as the "expected" value. As a best practice, the
* labels should make sense to the person who provides the input for your parser. That's often an end user or a
* third party, so keep them in mind.
*
* @psalm-return Parser<T>
* @api
* @psalm-mutation-free
*/
public function label(string $label): Parser
{
$parserFn = $this->parserFunction;
$newParserFunction = static function (Stream $input) use ($parserFn, $label) : ParseResult {
/** @psalm-var ParseResult $result */
$result = ($parserFn)($input);
return ($result->isSuccess())
? $result
: new Fail($label, $result->got());
};
return new Parser($newParserFunction, $this->recursionStatus, $label);
}
/**
* If the parser is successful, call the $receiver function with the output of the parser. The resulting parser
* behaves identically to the original one. This combinator is useful for expressing side effects during the parsing
* process. It can be hooked into existing event publishing libraries by using $receiver as an adapter for those.
* Other use cases are logging, caching, performing an action whenever a value is matched in a long running input
* stream, ...
*
* @psalm-param callable(T): void $receiver
*
* @psalm-return Parser<T>
* @api
*/
public function emit(callable $receiver): Parser
{
return emit($this, $receiver);
}
/**
* Ignore the output of the parser and return the new output instead.
*
* @template T2
* @psalm-param T2 $output
* @psalm-return Parser<T2>
*
* @deprecated @TODO needs test
* @psalm-mutation-free
*/
public function voidLeft($output): Parser
{
return $this->map(
/**
* @psalm-param T $_
* @psalm-return T2
*/
fn($_) => $output
);
}
/**
* Make sure that the input ends after the parser has successfully completed. The output is the output of the
* original parser.
*
* Also useful in unit tests to make sure a parser doesn't consume more than you intended.
*
* Alias for $parser->thenIgnore(eof()).
*
* @api
* @psalm-return Parser<T>
* @psalm-mutation-free
*/
public function thenEof(): Parser
{
return keepFirst($this, eof());
// aka $this->thenIgnore(eof());
}
}

View File

@@ -0,0 +1,38 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Exception;
use Parsica\Parsica\Internal\Fail;
/**
* @api
*/
final class ParserHasFailed extends Exception
{
private Fail $parseResult;
/**
* @inheritDoc
*/
function __construct(Fail $parseResult)
{
$this->parseResult = $parseResult;
parent::__construct($this->parseResult->errorMessage());
}
function parseResult() : Fail
{
return $this->parseResult;
}
}

View File

@@ -0,0 +1,77 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\Position;
use Parsica\Parsica\Internal\TakeResult;
/**
* Represents an input stream. This allows us to have different types of input, each with their own optimizations.
*
* @psalm-immutable
*/
interface Stream
{
/**
* Extract a single token from the stream. Throw if the stream is empty.
*
* @throws EndOfStream
* @psalm-mutation-free
*/
public function take1(): TakeResult;
/**
* Try to extract a chunk of length $n, or if the stream is too short, the rest of the stream.
*
* A valid implementation should follow these rules:
*
* 1. If the requested length <= 0, the empty token and the original stream should be returned.
* 2. If the requested length > 0 and the stream is empty, throw EndOfStream.
* 3. In other cases, take a chunk of length $n (or shorter if the stream is not long enough) from the input stream
* and return the chunk along with the rest of the stream.
*
* @throws EndOfStream
* @psalm-mutation-free
*/
public function takeN(int $n): TakeResult;
/**
* Extract a chunk of the stream, by taking tokens as long as the predicate holds. Return the chunk and the rest of
* the stream.
*
* @TODO This method isn't strictly necessary but let's see.
*
* @psalm-param pure-callable(string):bool $predicate
* @psalm-mutation-free
*/
public function takeWhile(callable $predicate) : TakeResult;
/**
* @deprecated We will need to get rid of this again at some point, we can't assume all streams will be strings
* @psalm-mutation-free
*/
public function __toString(): string;
/**
* Test if the stream is at its end.
* @psalm-mutation-free
*/
public function isEOF(): bool;
/**
* The position of the parser in the stream.
*
* @internal
* @psalm-mutation-free
*/
public function position() : Position;
}

@@ -0,0 +1,133 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\EndOfStream;
use Parsica\Parsica\Internal\Position;
use Parsica\Parsica\Internal\TakeResult;
/**
* @psalm-immutable
*/
final class StringStream implements Stream
{
private string $string;
private Position $position;
/**
* @api
*/
public function __construct(string $string, ?Position $position = null)
{
$this->string = $string;
$this->position = $position ?? Position::initial();
}
/**
* @inheritDoc
* @internal
* @psalm-mutation-free
*/
public function take1(): TakeResult
{
if ($this->string === '') {
throw new EndOfStream("End of stream was reached in " . $this->position->pretty());
}
$token = mb_substr($this->string, 0, 1);
$position = $this->position->advance($token);
return new TakeResult(
$token,
new StringStream(mb_substr($this->string, 1), $position)
);
}
/**
* @inheritDoc
* @psalm-mutation-free
*/
public function isEOF(): bool
{
return $this->string === '';
}
/**
* @inheritDoc
* @psalm-mutation-free
*/
public function takeN(int $n): TakeResult
{
if ($n <= 0) {
return new TakeResult("", $this);
}
if ($this->string === '') {
throw new EndOfStream("End of stream was reached in " . $this->position->pretty());
}
$chunk = mb_substr($this->string, 0, $n);
return new TakeResult(
$chunk,
new StringStream(
mb_substr($this->string, $n),
$this->position->advance($chunk)
)
);
}
/**
* @psalm-param pure-callable(string) : bool $predicate
* @psalm-mutation-free
* @inheritDoc
*/
public function takeWhile(callable $predicate): TakeResult
{
if ($this->string === '') {
return new TakeResult("", $this);
}
$remaining = $this->string;
$nextToken = mb_substr($remaining, 0, 1);
$chunk = "";
while ($predicate($nextToken)) {
$chunk .= $nextToken;
$remaining = mb_substr($remaining, 1);
if ($remaining !== '') {
$nextToken = mb_substr($remaining, 0, 1);
} else {
break;
}
}
return new TakeResult(
$chunk,
new StringStream($remaining, $this->position->advance($chunk))
);
}
/**
* @psalm-mutation-free
*/
public function __toString(): string
{
return $this->string;
}
/**
* @inheritDoc
* @psalm-mutation-free
*/
public function position(): Position
{
return $this->position;
}
}

@@ -0,0 +1,190 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\Assert;
/**
* Parse a single character.
*
* @psalm-param string $c A single character
*
* @psalm-return Parser<string>
* @api
* @see charI()
* @psalm-pure
*/
function char(string $c): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::singleChar($c);
return satisfy(isEqual($c))->label("'$c'");
}
/**
* Parse a single character, case-insensitive and case-preserving. On success, it returns the string cased as the
* actually parsed input.
*
* e.g. charI('a')->run("ABC") will succeed with "A", not "a".
*
* @psalm-param string $c A single character
*
* @psalm-return Parser<string>
* @api
*
* @see char()
* @psalm-pure
*/
function charI(string $c): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::singleChar($c);
$lower = mb_strtolower($c);
$upper = mb_strtoupper($c);
$label = $lower==$upper ? "'$c'" : "'$lower' or '$upper'";
return satisfy(orPred(isEqual($lower), isEqual($upper)))->label($label);
}
/**
* Parse a control character (a non-printing character of the Latin-1 subset of Unicode).
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function controlChar(): Parser
{
return satisfy(isControl())->label("<controlChar>");
}
/**
* Parse an uppercase character A-Z.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function upperChar(): Parser
{
return satisfy(isUpper())->label("A-Z");
}
/**
* Parse a lowercase character a-z.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function lowerChar(): Parser
{
return satisfy(isLower())->label("a-z");
}
/**
* Parse an uppercase or lowercase character A-Z, a-z.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function alphaChar(): Parser
{
return satisfy(isAlpha())->label("A-Z or a-z");
}
/**
* Parse an alpha or numeric character A-Z, a-z, 0-9.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function alphaNumChar(): Parser
{
return satisfy(isAlphaNum())->label("A-Z or a-z or 0-9");
}
/**
* Parse a printable ASCII char.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function printChar(): Parser
{
return satisfy(isPrintable())->label("<printChar>");
}
/**
* Parse a single punctuation character !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function punctuationChar(): Parser
{
return satisfy(isPunctuation())->label("<punctuation>");
}
/**
* Parse 0-9. Returns the digit as a string. Use ->map('intval')
* or similar to cast it to a numeric type.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function digitChar(): Parser
{
return satisfy(isDigit())->label('0-9');
}
/**
* Parse a binary character 0 or 1.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function binDigitChar(): Parser
{
return satisfy(isCharCode([0x30, 0x31]))->label("'0' or '1'");
}
/**
* Parse an octal digit character 0-7.
*
* @psalm-return Parser<string>
*
* @api
* @psalm-pure
*/
function octDigitChar(): Parser
{
return satisfy(isCharCode(range(0x30, 0x37)))->label("0-7");
}
/**
* Parse a hexadecimal numeric character 0123456789abcdefABCDEF.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function hexDigitChar(): Parser
{
return satisfy(isHexDigit())->label("<hexadecimal>");
}

@@ -0,0 +1,679 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use InvalidArgumentException;
use Parsica\Parsica\Internal\Assert;
use Parsica\Parsica\Internal\Fail;
use Parsica\Parsica\Internal\Succeed;
use function Parsica\Parsica\Internal\FP\foldl;
/**
* Identity parser, returns the Parser as is.
*
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<T>
* @api
*
* @template T
* @psalm-pure
*/
function identity(Parser $parser): Parser
{
return $parser;
}
/**
* A parser that will have the argument as its output, no matter what the input was. It doesn't consume any input.
*
* @psalm-param T $output
*
* @psalm-return Parser<T>
* @api
*
* @template T
* @psalm-pure
*/
function pure($output): Parser
{
return Parser::make("<pure>", fn(Stream $input) => new Succeed($output, $input));
}
/**
* Optionally parse something, but still succeed if the thing is not there
*
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<T|null>
* @api
* @template T
* @psalm-pure
*/
function optional(Parser $parser): Parser
{
return either($parser, succeed())->label("optional " . $parser->getLabel());
}
/**
* Create a parser that takes the output from the first parser (if successful) and feeds it to the callable. The callable
* must return another parser. If the first parser fails, its failure is returned.
*
* This is a monadic bind aka flatmap.
*
* @psalm-param Parser<T1> $parser
* @psalm-param pure-callable(T1) : Parser<T2> $f
*
* @psalm-return Parser<T2>
* @api
* @template T1
* @template T2
* @psalm-pure
*/
function bind(Parser $parser, callable $f): Parser
{
/**
* @psalm-var pure-callable(Stream) : ParseResult<T2> $parserFunction
*/
$parserFunction = static function (Stream $input) use ($parser, $f): ParseResult {
$result = $parser->run($input)->map($f);
if ($result->isFail()) {
return $result;
}
$p2 = $result->output();
return $result->continueWith($p2);
};
$finalParser = Parser::make($parser->getLabel(), $parserFunction);
return $finalParser;
}
/**
* Sequential application. Given a parser which outputs a callable, return a new parser that applies the callable on the
* output of the second parser.
*
* The first parser must be of type Parser<callable(T1):T2>. {@see pure()} can be used to wrap a callable in a Parser.
*
* Callables with more than 1 argument need to be curried: pure(curry(fn($x, $y)))->apply($parser2)->apply($parser3)
*
* @template T1
* @template T2
* @psalm-param Parser<pure-callable(T1):T2> $parser1
* @psalm-param Parser<T1> $parser2
* @psalm-return Parser<T2>
* @api
* @psalm-pure
*/
function apply(Parser $parser1, Parser $parser2): Parser
{
/**
* @psalm-var pure-callable(Stream): ParseResult<T2>
*/
$parserFunction = static function (Stream $input) use ($parser2, $parser1): ParseResult {
$r1 = $parser1->run($input);
if ($r1->isFail()) {
return $r1;
}
$f = $r1->output();
Assert::isCallable($f, "apply() can only be used when the output of the first parser is a callable with 1 argument. Use currying for functions with more than 1 argument.");
// @todo assert that the arity of $f == 1
return $r1->continueWith($parser2)->map($f);
};
$parser = Parser::make($parser1->getLabel(), $parserFunction);
return $parser;
}
/**
* Parse something, followed by something else. Ignore the result of the first parser and return the result of the
* second parser.
*
* @psalm-param Parser<T1> $first
* @psalm-param Parser<T2> $second
*
* @psalm-return Parser<T2>
* @template T1
* @template T2
* @api
* @see Parser::sequence()
* @psalm-pure
*/
function sequence(Parser $first, Parser $second): Parser
{
return bind($first, /** @psalm-param mixed $_ */ fn($_) => $second);
}
/**
* Sequence two parsers, and return the output of the first one.
*
* @template T1
* @template T2
* @psalm-param Parser<T1> $first
* @psalm-param Parser<T2> $second
* @psalm-return Parser<T1>
* @api
* @psalm-pure
*/
function keepFirst(Parser $first, Parser $second): Parser
{
return bind(
$first,
/** @psalm-suppress MissingClosureParamType */
fn($a): Parser => sequence($second, pure($a))
);
}
/**
* Sequence two parsers, and return the output of the second one.
*
* @template T1
* @template T2
* @psalm-param Parser<T1> $first
* @psalm-param Parser<T2> $second
* @psalm-return Parser<T2>
* @api
* @psalm-pure
*/
function keepSecond(Parser $first, Parser $second): Parser
{
return sequence($first, $second);
}
/**
* Either parse the first thing or the second thing
*
* @psalm-param Parser<T1> $first
* @psalm-param Parser<T2> $second
*
* @psalm-return Parser<T1|T2>
* @api
*
* @see Parser::or()
*
* @template T1
* @template T2
* @psalm-pure
*/
function either(Parser $first, Parser $second): Parser
{
$label = $first->getLabel() . " or " . $second->getLabel();
/**
* @psalm-var pure-callable(Stream): ParseResult<T1|T2> $parserFunction
*/
$parserFunction = static function (Stream $input) use ($second, $first, $label): ParseResult {
// @todo Megaparsec doesn't do automatic rollback, for performance reasons, and requires the user to add try
// combinators. We could mimic that behaviour as it is probably more performant
$r1 = $first->run($input);
if ($r1->isSuccess()) {
return $r1;
}
$r2 = $second->run($input);
if ($r2->isSuccess()) {
return $r2;
}
return new Fail($label, $r2->got());
};
return Parser::make($label, $parserFunction);
}
/**
* Combine the parser with another parser of the same type, which will cause the results to be appended.
*
* @psalm-param Parser<T|null> $left
* @psalm-param Parser<T|null> $right
*
* @psalm-return Parser<T|null>
* @api
* @template T
* @psalm-pure
*/
function append(Parser $left, Parser $right): Parser
{
return Parser::make($right->getLabel(), static function (Stream $input) use ($left, $right): ParseResult {
$r1 = $left->run($input);
$r2 = $r1->continueWith($right);
return $r1->append($r2);
});
}
/**
* Append all the passed parsers.
*
* @psalm-param list<Parser<T|null>> $parsers
* @psalm-return Parser<T|null>
* @api
* @template T
* @psalm-suppress MixedReturnStatement
* @psalm-suppress MixedInferredReturnType
* @psalm-pure
*/
function assemble(Parser ...$parsers): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::atLeastOneArg($parsers, "assemble()");
$first = array_shift($parsers);
/** @psalm-suppress InvalidArgument */
return array_reduce($parsers, fn(Parser $p1, Parser $p2): Parser => append($p1, $p2), $first);
}
/**
* Parse into an array that consists of the results of all parsers.
*
* @psalm-param list<Parser<mixed>> $parsers
* @psalm-return Parser<mixed>
* @api
* @psalm-pure
*/
function collect(Parser ...$parsers): Parser
{
$toArray =
/**
* @psalm-param mixed $v
* @psalm-return list<mixed>
*/
fn($v): array => [$v];
$arrayParsers = array_map(
fn(Parser $parser): Parser => map($parser, $toArray),
$parsers
);
return assemble(...$arrayParsers);
}
/**
* Tries each parser one by one, returning the result of the first one that succeeds.
*
* @no-named-arguments
* @psalm-param non-empty-list<Parser<mixed>> $parsers
* @psalm-return Parser<mixed>
* @api
* @psalm-pure
*/
function any(Parser ...$parsers): Parser
{
if (empty($parsers)) {
throw new InvalidArgumentException("any() expects at least one parser");
}
$labels = array_map(fn(Parser $p): string => $p->getLabel(), $parsers);
$label = implode(' or ', $labels);
return foldl(
$parsers,
fn(Parser $first, Parser $second): Parser => either($first, $second),
fail("")
)->label($label);
}
/**
* Tries each parser one by one, returning the result of the first one that succeeds.
*
* Alias for {@see any()}
*
* @no-named-arguments
* @psalm-param non-empty-list<Parser<mixed>> $parsers
* @psalm-return Parser<mixed>
* @api
* @psalm-pure
*/
function choice(Parser ...$parsers): Parser
{
return any(...$parsers);
}
/**
* One or more repetitions of Parser, with the outputs appended.
*
* @api
* @psalm-param Parser<T> $parser
* @psalm-return Parser<T>
* @template T
* @psalm-suppress MixedArgumentTypeCoercion
* @psalm-pure
*/
function atLeastOne(Parser $parser): Parser
{
/**
* @psalm-var pure-callable(Stream): ParseResult<T> $parserFunction
*/
$parserFunction = static function (Stream $input) use ($parser): ParseResult {
$result = $parser->run($input);
if ($result->isFail()) {
return $result;
}
$final = new Succeed(null, $result->remainder());
while ($result->isSuccess()) {
$final = $final->append($result);
$result = $parser->continueFrom($result);
}
return $final;
};
return Parser::make(
"at least one " . $parser->getLabel(), $parserFunction
);
}
/**
* Zero or more repetitions of Parser, with the outputs appended.
*
* @TODO Untested
*
* @api
* @psalm-param Parser<T> $parser
* @psalm-return Parser<T>
* @template T
* @psalm-suppress MixedArgumentTypeCoercion
* @psalm-pure
*/
function zeroOrMore(Parser $parser): Parser
{
/** @var pure-callable(Stream):ParseResult<T> $parserFunction */
$parserFunction = static function (Stream $input) use ($parser): ParseResult {
$result = new Succeed(null, $input);
$final = $result;
while ($result->isSuccess()) {
$final = $final->append($result);
$result = $parser->continueFrom($result);
}
return $final;
};
return Parser::make(
"zero or more " . $parser->getLabel(), $parserFunction
);
}
/**
* Parse something exactly n times
*
* @template T
*
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<T>
* @api
* @psalm-pure
*/
function repeat(int $n, Parser $parser): Parser
{
return foldl(
array_fill(0, $n - 1, $parser),
fn(Parser $l, Parser $r): Parser => append($l, $r),
$parser
)->label("$n times " . $parser->getLabel());
}
/**
* Parse something exactly n times and return as an array
*
* @TODO This doesn't feel very elegant.
*
* @template T
*
* @psalm-param positive-int $n
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<list<T>>
* @api
* @psalm-pure
*/
function repeatList(int $n, Parser $parser): Parser
{
/** @psalm-var Parser<list<T>> $parser */
$parser = map(
$parser,
/**
* @psalm-param T $output
* @psalm-return list<T>
*/
fn($output): array => [$output]
);
$parsers = array_fill(0, $n - 1, $parser);
return foldl(
$parsers,
/**
* @psalm-param Parser<list<T>> $l
* @psalm-param Parser<list<T>> $r
* @psalm-return Parser<list<T>>
*
* @psalm-suppress InvalidReturnType
* @psalm-suppress InvalidReturnStatement
* @psalm-pure
*/
fn(Parser $l, Parser $r): Parser => append($l, $r),
$parser
)->label("$n times " . $parser->getLabel());
}
/**
* Parse something one or more times, and output an array of the successful outputs.
*
* @template T
*
* @psalm-param Parser<T> $parser
* @psalm-return Parser<list<T>>
*
* @api
* @psalm-pure
*/
function some(Parser $parser): Parser
{
return map(
collect($parser, many($parser)),
/**
* @psalm-param array{0: T, 1: list<T>} $o
* @psalm-return list<T>
*/
fn(array $o):array => array_merge([$o[0]], $o[1])
);
}
/**
* Parse something zero or more times, and output an array of the successful outputs.
*
* @template T
*
* @psalm-param Parser<T> $parser
* @psalm-return Parser<list<T>>
*
* @api
* @psalm-pure
*/
function many(Parser $parser): Parser
{
return Parser::make(
"many {$parser->getLabel()}",
function (Stream $remainder) use ($parser): ParseResult {
$result = [];
while (true) {
$lastResult = $parser->run($remainder);
if ($lastResult->isFail()) {
break;
}
$remainder = $lastResult->remainder();
$result[] = $lastResult->output();
}
/** @psalm-var ParseResult<list<T>> $succeed */
$succeed = new Succeed($result, $remainder);
return $succeed;
}
);
}
/**
* Parse $open, followed by $middle, followed by $close, and return the result of $middle. Useful for e.g. "(value)".
*
* @template TO
* @template TM
* @template TC
*
* @psalm-param Parser<TO> $open
* @psalm-param Parser<TC> $close
* @psalm-param Parser<TM> $middle
*
* @psalm-return Parser<TM>
* @api
* @psalm-pure
*/
function between(Parser $open, Parser $close, Parser $middle): Parser
{
return keepSecond($open, keepFirst($middle, $close));
}
/**
* Parses zero or more occurrences of $parser, separated by $separator. Returns a list of values.
*
* The sepBy parser always succeeds, even if it doesn't find anything. Use {@see sepBy1()} if you want it to find at
* least one value.
*
* @template TSeparator
* @template T
*
* @psalm-param Parser<TSeparator> $separator
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<list<T>>
*
* @api
* @psalm-pure
*/
function sepBy(Parser $separator, Parser $parser): Parser
{
return sepBy1($separator, $parser)->or(pure([]));
}
/**
* Parses one or more occurrences of $parser, separated by $separator. Returns a list of values.
*
* @template TS
* @template T
*
* @psalm-param Parser<TS> $separator
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<list<T>>
*
* @psalm-suppress MissingClosureReturnType
*
* @api
* @psalm-pure
*/
function sepBy1(Parser $separator, Parser $parser): Parser
{
/** @psalm-suppress MissingClosureParamType */
$prepend = fn($x) => fn(array $xs): array => array_merge([$x], $xs);
$label = $parser->getLabel() . ", separated by " . $separator->getLabel();
return pure($prepend)->apply($parser)->apply(many($separator->sequence($parser)))->label($label);
}
/**
* Parses 2 or more occurrences of $parser, separated by $separator. Returns a list of values.
*
* @template TS
* @template T
*
* @psalm-param Parser<TS> $separator
* @psalm-param Parser<T> $parser
*
* @psalm-return Parser<list<T>>
*
* @psalm-suppress MissingClosureReturnType
*
* @api
* @psalm-pure
*/
function sepBy2(Parser $separator, Parser $parser): Parser
{
/** @psalm-suppress MissingClosureParamType */
$prepend = fn($x) => fn(array $xs): array => array_merge([$x], $xs);
$label = "at least two of (" . $parser->getLabel() . "), separated by " . $separator->getLabel();
return pure($prepend)->apply(keepFirst($parser, $separator))->apply(sepBy1($separator, $parser))->label($label);
}
/**
* notFollowedBy only succeeds when $parser fails. It never consumes any input.
*
* Example:
*
* `string("print")` will also match "printXYZ"
*
* `keepFirst(string("print"), notFollowedBy(alphaNumChar()))` will match "print something" but not "printXYZ something"
*
* @template T
* @psalm-param Parser<T> $parser
* @psalm-return Parser<T>
* @see Parser::notFollowedBy()
*
* @api
* @psalm-pure
*/
function notFollowedBy(Parser $parser): Parser
{
$label = "notFollowedBy({$parser->getLabel()})";
/** @psalm-var Parser<string> $p */
$p = Parser::make($label, static function (Stream $input) use ($label, $parser): ParseResult {
$result = $parser->run($input);
return $result->isSuccess()
? new Fail($label, $input)
: new Succeed("", $input);
});
return $p;
}
/**
* Map a function over the parser (which in turn maps it over the result).
*
* @template T1
* @template T2
* @psalm-param pure-callable(T1) : T2 $transform
* @psalm-return Parser<T2>
* @api
* @psalm-pure
*/
function map(Parser $parser, callable $transform): Parser
{
return Parser::make($parser->getLabel(), fn(Stream $input): ParseResult => $parser->run($input)->map($transform));
}
/**
* If $parser succeeds (either consuming input or not), lookAhead behaves like $parser succeeded without consuming
* anything. If $parser fails, lookAhead has no effect, i.e. it will fail to consume input if $parser fails consuming
* input.
*
* @template T
* @psalm-param Parser<T> $parser
* @psalm-return Parser<T>
*
* @api
* @psalm-pure
*/
function lookAhead(Parser $parser): Parser
{
return Parser::make(
$parser->getLabel(),
static function (Stream $input) use ($parser): ParseResult {
$parseResult = $parser->run($input);
return $parseResult->isSuccess()
? new Succeed($parseResult->output(), $input)
: new Fail("lookAhead", $input);
}
);
}

@@ -0,0 +1,66 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
/**
* Parse an integer and return it as a string. Use ->map('intval')
* or similar to cast it to a numeric type.
*
* Example: "-123"
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function integer(): Parser
{
$zeroNine = digitChar();
$oneNine = oneOfS("123456789");
$minus = char('-');
$digits = takeWhile1(isDigit())->label('at least one 0-9');
/** @var Parser<string> $parser */
$parser = choice(
$minus->append($oneNine)->append($digits),
$minus->append($zeroNine),
$oneNine->append($digits),
$zeroNine
);
return $parser;
}
/**
* Parse a float and return it as a string. Use ->map('floatval')
* or similar to cast it to a numeric type.
*
* Example: -123.456E-789
*
* @psalm-return Parser<string>
* @psalm-suppress InvalidReturnType
* @psalm-suppress InvalidReturnStatement
* @api
* @psalm-pure
*/
function float(): Parser
{
$digits = takeWhile1(isDigit())->label('at least one 0-9');
$fraction = char('.')->append($digits);
$sign = char('+')->or(char('-'))->or(pure('+'));
$exponent = assemble(
charI('e')->map(fn(string $s) : string => strtoupper($s)),
$sign,
$digits
);
return assemble(
integer(),
optional($fraction),
optional($exponent)
)->label("float");
}

@@ -0,0 +1,249 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\Assert;
/**
* Creates an equality predicate
*
* @psalm-return pure-callable(string) : bool
*
* @api
* @psalm-pure
*/
function isEqual(string $x): callable
{
/** @psalm-suppress ImpureMethodCall */
Assert::singleChar($x);
return fn(string $y): bool => $x === $y;
}
/**
* Negates a predicate.
*
* @psalm-param pure-callable(string) : bool $predicate
*
* @psalm-return pure-callable(string) : bool
*
* @api
* @psalm-pure
*/
function notPred(callable $predicate): callable
{
return fn(string $x): bool => !$predicate($x);
}
/**
* Boolean And predicate.
*
* @psalm-param pure-callable(string) : bool $first
* @psalm-param pure-callable(string) : bool $second
*
* @psalm-return pure-callable(string) : bool
*
* @api
* @psalm-pure
*/
function andPred(callable $first, callable $second): callable
{
return fn(string $x): bool => $first($x) && $second($x);
}
/**
* Boolean Or predicate.
*
* @psalm-param pure-callable(string) : bool $first
* @psalm-param pure-callable(string) : bool $second
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function orPred(callable $first, callable $second): callable
{
return fn(string $x): bool => $first($x) || $second($x);
}
/**
* Predicate that checks if a character is in an array of character codes.
*
* @psalm-param list<int> $chars
*
* @psalm-return pure-callable(string) : bool
* @api
*
* @link https://doc.bccnsoft.com/docs/cppreference2018/en/c/string/wide/iswcntrl.html
* @psalm-pure
*/
function isCharCode(array $chars): callable
{
return fn(string $x): bool => in_array(mb_ord($x), $chars);
}
/**
* Returns true for a space character, a non-breaking space (U+00A0), and the control characters \t, \n, \r, \f, \v.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isSpace(): callable
{
return isCharCode([9, 10, 11, 12, 13, 32, 160]);
}
/**
* Like 'isSpace', but does not accept newlines and carriage returns.
*
* @psalm-return pure-callable(string) : bool
* @api
* @see isSpace
* @psalm-pure
*/
function isHSpace(): callable
{
return isCharCode([9, 11, 12, 32, 160]);
}
/**
* True for 0-9
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isDigit(): callable
{
return isCharCode(range(0x30, 0x39));
}
/**
* Control character predicate (a non-printing character of the Latin-1 subset of Unicode).
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isControl(): callable
{
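// array_merge, not +: the union operator would discard 0x7F because key 0 is already taken.
return isCharCode(array_merge(range(0x00, 0x1F), [0x7F]));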
}
/**
* Returns true for a space or a tab character
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isBlank() : callable
{
return isCharCode([0x9, 0x20]);
}
/**
* Returns true for a space character, and \t, \n, \r, \f, \v.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isWhitespace() : callable
{
return isCharCode([0x20, 0x9, 0xA, 0xB, 0xC, 0xD]);
}
/**
* Returns true for an uppercase character A-Z.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isUpper() : callable
{
return isCharCode(range(0x41, 0x5A));
}
/**
* Returns true for a lowercase character a-z.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isLower() : callable
{
return isCharCode(range(0x61, 0x7A));
}
/**
* Returns true for an uppercase or lowercase character A-Z, a-z.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isAlpha() : callable
{
return isCharCode(array_merge(range(0x41, 0x5A), range(0x61, 0x7A)));
}
/**
* Returns true for an alpha or numeric character A-Z, a-z, 0-9.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isAlphaNum() : callable
{
return isCharCode(array_merge(range(0x30, 0x39), range(0x41, 0x5A), range(0x61, 0x7A)));
}
/**
* Returns true if the given character is a hexadecimal numeric character 0123456789abcdefABCDEF.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isHexDigit() : callable
{
return isCharCode(array_merge(range(0x30, 0x39), range(0x41, 0x46), range(0x61, 0x66)));
}
/**
* Returns true if the given character is a printable ASCII char.
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isPrintable() : callable
{
return isCharCode(range(0x20, 0x7E));
}
/**
* Returns true if the given character is a punctuation character !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
*
* @psalm-return pure-callable(string) : bool
* @api
* @psalm-pure
*/
function isPunctuation() : callable
{
return isCharCode(array_merge(range(0x21, 0x2F), range(0x3A, 0x40), range(0x5B, 0x60), range(0x7B, 0x7E)));
}

@@ -0,0 +1,320 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\Assert;
use Parsica\Parsica\Internal\EndOfStream;
use Parsica\Parsica\Internal\Fail;
use Parsica\Parsica\Internal\Succeed;
/**
* A parser that satisfies a predicate on a single token. Useful as a building block for writing things like char(),
* digitChar(), ...
*
* @template T
*
* @psalm-param callable(string) : bool $predicate
*
* @psalm-return Parser<T>
* @psalm-pure
*/
function satisfy(callable $predicate): Parser
{
$label = "satisfy(predicate)";
/** @psalm-var pure-callable(Stream) : ParseResult $parserFunction */
$parserFunction = static function (Stream $input) use ($label, $predicate): ParseResult {
try {
$t = $input->take1();
} catch (EndOfStream $e) {
return new Fail($label, $input);
}
return $predicate($t->chunk()) ? new Succeed($t->chunk(), $t->stream()) : new Fail($label, $input);
};
return Parser::make($label, $parserFunction);
}
/**
* Skip 0 or more characters as long as the predicate holds.
*
* @template T
*
* @psalm-param pure-callable(string) : bool $predicate
* @psalm-return Parser<null>
* @psalm-pure
*/
function skipWhile(callable $predicate): Parser
{
return takeWhile($predicate)->followedBy(pure(null));
}
/**
* Skip 1 or more characters as long as the predicate holds.
*
* @template T
*
* @psalm-param pure-callable(string) : bool $predicate
*
* @psalm-return Parser<null>
* @psalm-pure
*/
function skipWhile1(callable $predicate): Parser
{
return takeWhile1($predicate)->followedBy(pure(null));
}
/**
* Keep parsing 0 or more characters as long as the predicate holds.
*
* @template T
* @psalm-param pure-callable(string) : bool $predicate
* @psalm-return Parser<T>
* @psalm-pure
*/
function takeWhile(callable $predicate): Parser
{
/** @psalm-pure */
$parserFunction = static function (Stream $input) use ($predicate): ParseResult {
$t = $input->takeWhile($predicate);
return new Succeed($t->chunk(), $t->stream());
};
return Parser::make(
"takeWhile(predicate)", $parserFunction
);
}
/**
* Keep parsing 1 or more characters as long as the predicate holds.
*
* @template T
*
* @psalm-param pure-callable(string) : bool $predicate
*
* @psalm-return Parser<T>
* @psalm-pure
*/
function takeWhile1(callable $predicate): Parser
{
$label = "takeWhile1(predicate)";
return Parser::make($label, static function (Stream $input) use ($label, $predicate): ParseResult {
try {
$t = $input->take1();
} catch (EndOfStream $e) {
return new Fail($label, $input);
}
if (!$predicate($t->chunk())) {
return new Fail($label, $input);
}
$t = $input->takeWhile($predicate);
return new Succeed($t->chunk(), $t->stream());
}
);
}
/**
* Parse and return a single character of anything.
*
* @template T
*
* @psalm-return Parser<T>
* @psalm-pure
*/
function anySingle(): Parser
{
return satisfy(
/** @psalm-param mixed $_ */
fn($_) => true
)->label("anySingle");
}
/**
* Parse and return a single character of anything.
*
* @TODO This is an alias of anySingle. Should we get rid of one of them?
* @psalm-return Parser<string>
* @psalm-pure
*/
function anything(): Parser
{
return satisfy(fn(string $_) => true)->label("anything");
}
/**
* Match any character but the given one.
*
* @psalm-return Parser<string>
* @api
* @template T
* @psalm-pure
*/
function anySingleBut(string $x): Parser
{
return satisfy(notPred(isEqual($x)))->label("anySingleBut($x)");
}
/**
* Succeeds if the current character is in the supplied list of characters. Returns the parsed character.
*
* @psalm-param list<string> $chars
*
* @psalm-return Parser<string>
* @api
* @template T
* @psalm-pure
*/
function oneOf(array $chars): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::singleChars($chars);
return satisfy(fn(string $x) => in_array($x, $chars))->label("one of " . implode('', $chars));
}
/**
* A compact form of 'oneOf'.
* oneOfS("abc") == oneOf(['a', 'b', 'c'])
*
* @psalm-param string $chars
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function oneOfS(string $chars): Parser
{
/** @psalm-var list<string> $split */
$split = mb_str_split($chars);
return oneOf($split);
}
/**
* The dual of 'oneOf'. Succeeds if the current character is not in the supplied list of characters. Returns the
* parsed character.
*
* @psalm-param list<string> $chars
*
* @psalm-return Parser<string>
* @api
* @template T
* @psalm-pure
*/
function noneOf(array $chars): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::singleChars($chars);
return satisfy(fn(string $x) => !in_array($x, $chars))
->label("noneOf(" . implode('', $chars) . ")");
}
/**
* A compact form of 'noneOf'.
* noneOfS("abc") == noneOf(['a', 'b', 'c'])
*
* @psalm-param string $chars
*
* @psalm-return Parser<string>
* @api
* @template T
* @psalm-pure
*/
function noneOfS(string $chars): Parser
{
/** @psalm-var list<string> $split */
$split = mb_str_split($chars);
return noneOf($split);
}
/**
* Consume the rest of the input and return it as a string. This parser never fails, but may return the empty string.
*
* @psalm-return Parser<string>
* @api
* @template T
* @psalm-pure
*/
function takeRest(): Parser
{
return takeWhile(fn(string $_): bool => true);
}
/**
* Parse nothing, but still succeed.
*
* This serves as the zero parser in `append()` operations.
*
* @psalm-return Parser<null>
*
* @api
* @psalm-pure
*/
function nothing(): Parser
{
/** @psalm-var pure-callable(Stream):ParseResult<null> $result */
$result = fn(Stream $input) : ParseResult => new Succeed(null, $input);
$parser = Parser::make("<nothing>", $result);
return $parser;
}
/**
* Parse everything; that is, consume the rest of the input until the end.
*
* @api
* @psalm-pure
*/
function everything(): Parser
{
return Parser::make("<everything>", fn(Stream $input) => new Succeed((string)$input, new StringStream("")));
}
/**
* Always succeed, no matter what the input was.
*
* @api
* @psalm-pure
*/
function succeed(): Parser
{
return Parser::make("<always succeed>", fn(Stream $input) => new Succeed(null, $input));
}
/**
* Always fail, no matter what the input was.
*
* @return Parser
* @api
* @psalm-pure
*/
function fail(string $label): Parser
{
return Parser::make($label, fn(Stream $input) => new Fail($label, $input));
}
/**
* Succeed only at the end of the input, without consuming anything.
*
* @psalm-return Parser<T>
* @api
* @template T
* @psalm-pure
*/
function eof(): Parser
{
$label = "<EOF>";
return Parser::make($label, fn(Stream $input): ParseResult => $input->isEOF()
? new Succeed("", $input)
: new Fail($label, $input)
);
}


@@ -0,0 +1,25 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
/**
* Create a recursive parser. Used in combination with Parser#recurse().
*
* @psalm-return Parser<T>
* @api
*
* @template T
* @psalm-pure
*/
function recursive(): Parser
{
return Parser::recursive();
}


@@ -0,0 +1,36 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
/**
* If the parser is successful, call the $receiver function with the output of the parser. The resulting parser
* behaves identically to the original one. This combinator is useful for expressing side effects during the parsing
* process. It can be hooked into existing event publishing libraries by using $receiver as an adapter for those. Other
* use cases are logging, caching, or performing an action whenever a value is matched in a long-running input stream.
*
* @template T
* @psalm-param Parser<T> $parser
* @psalm-param callable(T): void $receiver
* @psalm-return Parser<T>
* @api
*/
function emit(Parser $parser, callable $receiver): Parser
{
/** @psalm-var pure-callable(Stream):ParseResult $parserFunction */
$parserFunction = static function (Stream $input) use ($receiver, $parser): ParseResult {
$result = $parser->run($input);
if ($result->isSuccess()) {
$receiver($result->output());
}
return $result;
};
return Parser::make("emit", $parserFunction);
}

146
vendor/parsica-php/parsica/src/space.php vendored Normal file

@@ -0,0 +1,146 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
/**
* Parse a single space character.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function space(): Parser
{
return char(' ')->label("<space>");
}
/**
* Parse a single tab character.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function tab(): Parser
{
return char("\t")->label("<tab>");
}
/**
* Parse a space or tab.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function blank(): Parser
{
return satisfy(isBlank())->label("<blank>");
}
/**
* Parse a single whitespace character: a space, \t, \n, \r, \f, or \v.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function whitespace(): Parser
{
return satisfy(isWhitespace())->label("<whitespace>");
}
/**
* Parse a newline character.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function newline(): Parser
{
return char("\n")->label("<newline>");
}
/**
* Parse a carriage return character followed by a newline character ("\r\n"). Returns both characters.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function crlf(): Parser
{
return string("\r\n")->label("<crlf>");
}
/**
* Parse a newline or a crlf.
*
* @psalm-return Parser<string>
* @api
* @psalm-pure
*/
function eol(): Parser
{
return either(newline(), crlf())->label("<EOL>");
}
/**
* Skip zero or more white space characters.
*
* @psalm-return Parser<null>
* @api
* @psalm-pure
*/
function skipSpace(): Parser
{
return skipWhile(isSpace());
}
/**
* Like 'skipSpace', but does not accept newlines and carriage returns.
*
* @psalm-return Parser<null>
* @api
* @see skipSpace
* @psalm-pure
*/
function skipHSpace(): Parser
{
return skipWhile(isHSpace());
}
/**
* Skip one or more white space characters.
*
* @psalm-return Parser<null>
* @api
* @psalm-pure
*/
function skipSpace1(): Parser
{
return skipWhile1(isSpace())->label("<space>");
}
/**
* Like 'skipSpace1', but does not accept newlines and carriage returns.
*
* @psalm-return Parser<null>
* @api
* @see skipSpace1
* @psalm-pure
*/
function skipHSpace1(): Parser
{
return skipWhile1(isHSpace())->label("<space>");
}


@@ -0,0 +1,79 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Parsica\Parsica;
use Parsica\Parsica\Internal\Assert;
use Parsica\Parsica\Internal\EndOfStream;
use Parsica\Parsica\Internal\Fail;
use Parsica\Parsica\Internal\Succeed;
use function Parsica\Parsica\Internal\FP\foldl;
/**
* Parse a non-empty string.
*
* @psalm-return Parser<string>
* @api
* @see stringI()
* @psalm-pure
*/
function string(string $str): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmpty($str);
$len = mb_strlen($str);
$label = "'$str'";
/** @psalm-var Parser<string> $parser */
$parser = Parser::make($label, static function (Stream $input) use ($label, $len, $str): ParseResult {
try {
$t = $input->takeN($len);
} catch (EndOfStream $e) {
return new Fail($label, $input);
}
return $t->chunk() === $str
? new Succeed($str, $t->stream())
: new Fail($label, $input);
}
);
return $parser;
}
/**
* Parse a non-empty string, case-insensitive and case-preserving. On success, it returns the string cased as in the
* actually parsed input.
* E.g. stringI("foobar")->tryString("foObAr") will succeed with "foObAr".
*
* @TODO The implementation could be replaced using Stream::takeWhile
*
* @psalm-return Parser<string>
* @api
* @see string()
* @psalm-pure
*/
function stringI(string $str): Parser
{
/** @psalm-suppress ImpureMethodCall */
Assert::nonEmpty($str);
/** @psalm-var list<string> $split */
$split = mb_str_split($str);
$chars = array_map(
fn(string $c): Parser => charI($c),
$split
);
/** @psalm-var Parser<string> $parser */
$parser = foldl(
$chars,
/** @psalm-pure */
fn(Parser $l, Parser $r): Parser => append($l, $r),
succeed()
)->label("'$str'");
return $parser;
}


@@ -0,0 +1,290 @@
<?php declare(strict_types=1);
/**
* This code is forked from https://github.com/matteosister/php-curry, which is abandoned. It could be integrated into
* the rest of Parsica.
*/
namespace Tests\Parsica\Parsica\Curry;
use PHPUnit\Framework\TestCase;
use function Parsica\Parsica\Curry\__;
use function Parsica\Parsica\Curry\_is_fullfilled;
use function Parsica\Parsica\Curry\_rest;
use function Parsica\Parsica\Curry\curry;
use function Parsica\Parsica\Curry\curry_args;
use function Parsica\Parsica\Curry\curry_right;
use function Parsica\Parsica\Curry\curry_right_args;
final class CurryTest extends TestCase
{
/**
* @test
*/
public function curry_without_params()
{
$simpleFunction = curry(function () {
return 1;
});
$this->assertEquals(1, $simpleFunction());
}
/**
* @test
*/
public function curry_identity()
{
$identity = curry([new TestSubject(), 'identity'], 1);
$this->assertEquals(1, $identity(1));
}
/**
* @test
*/
public function curry_identity_function()
{
$func = curry(function ($v) {
return $v;
}, 'test string');
$this->assertEquals('test string', $func());
}
/**
* @test
*/
public function curry_with_one_later_param()
{
$curriedOne = curry([new TestSubject(), 'add2'], 1);
$this->assertInstanceOf('Closure', $curriedOne);
$this->assertEquals(2, $curriedOne(1));
}
/**
* @test
*/
public function curry_with_two_later_param()
{
$curriedTwo = curry([new TestSubject(), 'add4'], 1, 1);
$this->assertInstanceOf('Closure', $curriedTwo);
$this->assertEquals(4, $curriedTwo(1, 1));
}
/**
* @test
*/
public function curry_with_successive_calls()
{
$curriedTwo = curry([new TestSubject(), 'add4'], 1, 1);
$curriedThree = $curriedTwo(1);
$this->assertEquals(4, $curriedThree(1));
}
/**
* @test
*/
public function curry_right()
{
$divideBy10 = curry_right([new TestSubject(), 'divide2'], 10);
$this->assertInstanceOf('Closure', $divideBy10);
$this->assertEquals(10, $divideBy10(100));
}
/**
* @test
*/
public function curry_right_immediate()
{
$divide3 = curry_right([new TestSubject(), 'divide3'], 5, 2, 20);
$this->assertEquals(2, $divide3());
}
/**
* @test
*/
public function curry_left_immediate()
{
$divide3 = curry([new TestSubject(), 'divide3'], 20, 2, 4);
$this->assertEquals(2.5, $divide3());
}
/**
* @test
*/
public function curry_three_times()
{
$divideBy5 = curry([new TestSubject(), 'divide3'], 100);
$divideBy10And5 = $divideBy5(10);
$this->assertEquals(2, $divideBy10And5(5));
}
/**
* @test
*/
public function curry_right_three_times()
{
$divideBy5 = curry_right([new TestSubject(), 'divide3'], 5);
$divideBy10And5 = $divideBy5(10);
$this->assertEquals(2, $divideBy10And5(100));
}
/**
* @test
*/
public function curry_using_func_get_args()
{
$fnNoArgs = function () {
return func_get_args();
};
$curried = curry($fnNoArgs);
$curriedRight = curry_right($fnNoArgs);
$this->assertEquals([], $fnNoArgs());
$this->assertEquals([], $curried());
$this->assertEquals([], $curriedRight());
$this->assertEquals([1], $fnNoArgs(1));
$this->assertEquals([1], $curried(1));
$this->assertEquals([1], $curriedRight(1));
$this->assertEquals([1, 2, 'three'], $fnNoArgs(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curried(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curriedRight(1, 2, 'three'));
$fnOneArg = function ($x) {
return func_get_args();
};
$curried = curry($fnOneArg);
$curriedRight = curry_right($fnOneArg);
$this->assertEquals([1], $fnOneArg(1));
$this->assertEquals([1], $curried(1));
$this->assertEquals([1], $curriedRight(1));
$this->assertEquals([1, 2, 'three'], $fnOneArg(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curried(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curriedRight(1, 2, 'three'));
$fnTwoArgs = function ($x, $y) {
return func_get_args();
};
$curried = curry($fnTwoArgs);
$curriedRight = curry_right($fnTwoArgs);
$curriedOne = $curried(1);
$curriedRightOne = $curriedRight(2);
$curriedRightTwo = $curriedRight('three');
$this->assertEquals([1, 2], $fnTwoArgs(1, 2));
$this->assertEquals([1, 2], $curried(1, 2));
$this->assertEquals([1, 2], $curriedRight(2, 1));
$this->assertEquals([1, 2, 'three'], $fnTwoArgs(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curried(1, 2, 'three'));
$this->assertEquals([1, 2, 'three'], $curriedRight('three', 2, 1));
$this->assertEquals([1, 2], $curriedOne(2));
$this->assertEquals([1, 2], $curriedRightOne(1));
$this->assertEquals([1, 2, 'three'], $curriedOne(2, 'three'));
$this->assertEquals([1, 2, 'three'], $curriedRightTwo(2, 1));
}
/**
* @test
*/
public function curry_with_placeholders()
{
$minus = curry(function ($x, $y) {
return $x - $y;
});
$decrement = $minus(__(), 1);
$this->assertEquals(9, $decrement(10));
$introduce = curry(function ($name, $age, $job, $details = '') {
return "{$name}, {$age} years old, is a {$job} {$details}";
});
$introduceDeveloper = $introduce(__(), __(), 'Developer');
$this->assertEquals("Foo, 20 years old, is a Developer ", $introduceDeveloper('Foo', 20));
$introduceOld = $introduce(__(), 99, __());
$this->assertEquals("Foo, 99 years old, is a Developer and Cooker as well", $introduceOld('Foo', 'Developer', 'and Cooker as well'));
$introduceSkipName = $introduce(__());
$introduceSkipJob = $introduceSkipName(99, __());
$this->assertEquals("Foo, 99 years old, is a Cooker ", $introduceSkipJob('Foo', 'Cooker'));
$this->assertEquals("Foo, 99 years old, is a Cooker yumm !", $introduceSkipJob('Foo', 'Cooker', 'yumm !'));
$reduce = curry('array_reduce');
$add = function ($x, $y) {
return $x + $y;
};
$sum = $reduce(__(), $add);
$this->assertEquals(10, $sum([1, 2, 3, 4], 0));
}
/**
* @test
*/
public function rest()
{
$this->assertEquals([1], _rest([1, 1]));
$this->assertEquals(['a', 'b'], _rest([1, 'a', 'b']));
$this->assertEquals([], _rest([1]));
$this->assertEquals([], _rest([]));
}
/**
* @test
* @dataProvider provider_is_fullfilled
*/
public function is_fullfilled($isFullfilled, $args, $callable)
{
$this->assertSame($isFullfilled, _is_fullfilled($callable, $args));
}
public function provider_is_fullfilled()
{
return [
[false, [], function ($a) {}],
[true, [], function () {}],
[true, [1], function ($a) {}],
[false, [1], function ($a, $b) {}],
[false, [1], [new TestSubject(), 'add2']],
[true, [1, 2], [new TestSubject(), 'add2']],
[true, ['aaa', 'a'], 'strpos'],
];
}
}
final class TestSubject
{
public function identity($a)
{
return $a;
}
public function add2($a, $b)
{
return $a + $b;
}
public function divide2($a, $b)
{
return $a / $b;
}
public function divide3($a, $b, $c)
{
return $a / $b / $c;
}
public function add3($a, $b, $c)
{
return $a + $b + $c;
}
public function add4($a, $b, $c, $d)
{
return $a + $b + $c + $d;
}
}


@@ -0,0 +1,130 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Examples;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Parser;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\{between, char, choice, keepFirst, recursive, skipHSpace, string};
use function Parsica\Parsica\Expression\{binaryOperator, expression, leftAssoc, prefix, unaryOperator};
final class BooleanExpressionsTest extends TestCase
{
use ParserAssertions;
/** @test */
public function booleanExpressions()
{
$token = fn(Parser $parser) : Parser => keepFirst($parser, skipHSpace());
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));
$term = fn(): Parser => $token(choice(
string("TRUE")->map(fn($v) => new True_),
string("FALSE")->map(fn($v) => new False_),
));
$NOT = unaryOperator($token(string("NOT")), fn($v) => new Not_($v));
$AND = binaryOperator($token(string("AND")), fn($l, $r) => new And_($l, $r));
$OR = binaryOperator($token(string("OR")), fn($l, $r) => new Or_($l, $r));
$expr = recursive();
$expr->recurse(expression(
$parens($expr)->or($term()),
[
prefix($NOT),
leftAssoc($AND),
leftAssoc($OR),
]
));
$parser = $expr->thenEof();
$input = "TRUE AND NOT (FALSE AND FALSE)";
$expected =
new And_(
new True_(),
new Not_(
new And_(
new False_(),
new False_()
)
)
);
$this->assertParses($input, $parser, $expected);
$parser = $expr->thenEof();
$input = "TRUE AND NOT (FALSE OR TRUE AND FALSE)";
$expected =
new And_(
new True_,
new Not_(
new Or_(
new False_,
new And_(
new True_,
new False_
)
)
)
);
$this->assertParses($input, $parser, $expected);
// Now swapping precedence of AND and OR
$expr = recursive();
$expr->recurse(expression(
$parens($expr)->or($term()),
[
prefix($NOT),
leftAssoc($OR),
leftAssoc($AND),
]
));
$parser = $expr->thenEof();
$input = "TRUE AND NOT (FALSE OR TRUE AND FALSE)";
$expected =
new And_(
new True_,
new Not_(
new And_(
new Or_(
new False_,
new True_
),
new False_
)
)
);
$this->assertParses($input, $parser, $expected);
}
}
interface Boolean_ {}
class True_ implements Boolean_ {}
class False_ implements Boolean_ {}
class Not_ implements Boolean_ {
private Boolean_ $boolean;
function __construct(Boolean_ $boolean){$this->boolean = $boolean;}
}
class And_ implements Boolean_ {
private Boolean_ $l, $r;
function __construct(Boolean_ $l, Boolean_ $r){
$this->l = $l;
$this->r = $r;
}
}
class Or_ implements Boolean_ {
private Boolean_ $l, $r;
function __construct(Boolean_ $l, Boolean_ $r){
$this->l = $l;
$this->r = $r;
}
}


@@ -0,0 +1,63 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Examples;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Parser;
use function Parsica\Parsica\{atLeastOne, between, char, digitChar, keepFirst, recursive, skipHSpace, string};
use function Parsica\Parsica\Expression\{binaryOperator,
expression,
leftAssoc,
postfix,
prefix,
unaryOperator};
/**
* Parse expressions and calculate the result
*/
final class CalculatorTest extends TestCase
{
/** @test */
public function calculator()
{
$token = fn(Parser $parser) => keepFirst($parser, skipHSpace());
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));
$term = fn(): Parser => $token(atLeastOne(digitChar()));
$expr = recursive();
$expr->recurse(expression(
$parens($expr)->or($term()),
[
prefix(
unaryOperator(char('-'), fn($v) => -$v),
unaryOperator(char('+'), fn($v) => $v),
),
postfix(
unaryOperator($token(string('--')), fn($v) => $v - 1),
unaryOperator($token(string('++')), fn($v) => $v + 1),
),
leftAssoc(
binaryOperator($token(char('*')), fn($l, $r) => $l * $r),
binaryOperator($token(char('/')), fn($l, $r) => $l / $r),
),
leftAssoc(
binaryOperator($token(char('+')), fn($l, $r) => $l + $r),
binaryOperator($token(char('-')), fn($l, $r) => $l - $r),
),
]
));
$parser = $expr->thenEof();
$result = $parser->tryString("(3 - 2) + -1 - 3 * (1 + 1) / 6");
$this->assertEquals(-1, (string)$result->output());
}
}


@@ -0,0 +1,131 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Examples;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Parser;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\{alphaChar, between, char, collect, digitChar, skipHSpace1, space, string};
final class ExcelTest extends TestCase
{
use ParserAssertions;
/** @test */
public function spaceOrOperatorDependingOnContext()
{
// https://twitter.com/Mark_Baker/status/1309919606887374849?s=20
// and https://twitter.com/Mark_Baker/status/1309960902482026498?s=20
// `=SUM(B7:D7 C6:C8)` where space is the intersection operator for the
// intersection between the two ranges B7:D7 and C6:C8 (ie. C7),
// and `=A1 & B1` where the space is simply whitespace and should be ignored
$parser = $this->excelParser();
$input = "=SUM(B7:D7 C6:C8)";
$expected = new Sum(
new Intersection(
new Range(new Cell("B", "7"), new Cell("D", "7")),
new Range(new Cell("C", "6"), new Cell("C", "8")),
)
);
$this->assertParses($input, $parser, $expected);
$input = "=A1 & B1";
$expected = new Ampersand(
new Cell("A", "1"),
new Cell("B", "1"),
);
$this->assertParses($input, $parser, $expected);
}
private function excelParser(): Parser
{
$parens = fn(Parser $p): Parser => between(char('('), char(')'), $p);
$cell = collect(alphaChar(), digitChar())
->map(fn($o) => new Cell($o[0], $o[1]));
$range = collect($cell, char(':'), $cell)
->map(fn($o) => new Range($o[0], $o[2]));
$intersection = collect($range, space(), $range)
->map(fn($o) => new Intersection($o[0], $o[2]));
$sum = (string('=SUM')->followedBy($parens($intersection)))
->map(fn($o) => new Sum($o));
// consumes space before and after Parser $p
$token = fn(Parser $p): Parser => between(skipHSpace1(), skipHSpace1(), $p);
$ampersand = char('=')->followedBy(collect(
$cell,
$token(char('&')),
$cell
))->map(fn($o) => new Ampersand($o[0], $o[2]));
return $sum->or($ampersand);
}
}
class Cell
{
private $col;
private $row;
function __construct($col, $row)
{
$this->col = $col;
$this->row = $row;
}
}
class Range
{
private Cell $from;
private Cell $to;
function __construct(Cell $from, Cell $to)
{
$this->from = $from;
$this->to = $to;
}
}
class Intersection
{
private Range $l;
private Range $r;
function __construct(Range $l, Range $r)
{
$this->l = $l;
$this->r = $r;
}
}
class Sum
{
private Intersection $intersection;
function __construct(Intersection $intersection)
{
$this->intersection = $intersection;
}
}
class Ampersand
{
private Cell $l;
private Cell $r;
function __construct(Cell $l, Cell $r)
{
$this->l = $l;
$this->r = $r;
}
}


@@ -0,0 +1,47 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Examples;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use PHPUnit\Framework\TestCase;
use function Parsica\Parsica\any;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\digitChar;
use function Parsica\Parsica\repeat;
use function Parsica\Parsica\skipSpace;
use function Parsica\Parsica\string;
final class SimpleDateTest extends TestCase
{
use ParserAssertions;
/** @test */
public function simple_date()
{
$jan = (string("January")->or(string("Jan")))->map(fn($v) => 1);
$feb = (string("February")->or(string("Feb")))->map(fn($v) => 2);
$mar = (string("March")->or(string("Mar")))->map(fn($v) => 3);
// ... you get the gist
$month = any($jan, $feb, $mar);
$day = repeat(2, digitChar())->map('intval');
$p1 = collect(
$month->thenIgnore(skipSpace()),
$day
);
$this->assertParses("January 28", $p1, [1, 28]);
$this->assertParses("Jan 28", $p1, [1, 28]);
$this->assertParses("February 28", $p1, [2, 28]);
$this->assertParses("Feb 28", $p1, [2, 28]);
}
}


@@ -0,0 +1,172 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Expression;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Expression\{LeftAssoc, NonAssoc, Operator, Postfix, Prefix, RightAssoc};
use Parsica\Parsica\Parser;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\atLeastOne;
use function Parsica\Parsica\between;
use function Parsica\Parsica\char;
use function Parsica\Parsica\collect;
use function Parsica\Parsica\digitChar;
use function Parsica\Parsica\eof;
use function Parsica\Parsica\Expression\binaryOperator;
use function Parsica\Parsica\Expression\expression;
use function Parsica\Parsica\Expression\leftAssoc;
use function Parsica\Parsica\Expression\nonAssoc;
use function Parsica\Parsica\Expression\operator;
use function Parsica\Parsica\Expression\postfix;
use function Parsica\Parsica\Expression\prefix;
use function Parsica\Parsica\Expression\rightAssoc;
use function Parsica\Parsica\Expression\unaryOperator;
use function Parsica\Parsica\keepFirst;
use function Parsica\Parsica\recursive;
use function Parsica\Parsica\skipHSpace;
use function Parsica\Parsica\string;
final class ExpressionsTest extends TestCase
{
use ParserAssertions;
private Parser $expression;
protected function setUp() : void
{
/** Consumes whitespace */
$token = fn(Parser $parser) => keepFirst($parser, skipHSpace());
$parens = fn (Parser $parser): Parser => $token(between($token(char('(')), $token(char(')')), $parser));
$term = fn(): Parser => $token(atLeastOne(digitChar()));
$expr = recursive();
$primaryTermParser = $parens($expr)->or($term());
$expr->recurse(expression(
$primaryTermParser,
[
prefix(
unaryOperator(char('-'), fn($v) => "(-$v)"),
unaryOperator(char('+'), fn($v) => "(+$v)"),
),
postfix(
unaryOperator($token(string('--')), fn($v) => "($v--)"),
unaryOperator($token(string('++')), fn($v) => "($v++)"),
),
leftAssoc(
binaryOperator($token(char('*')), fn($l, $r) => "($l * $r)"),
binaryOperator($token(char('/')), fn($l, $r) => "($l / $r)"),
),
rightAssoc(
// imaginary right associative operator
binaryOperator($token(char('R')), fn($l, $r) => "($l R $r)"),
binaryOperator($token(string('R2')), fn($l, $r) => "($l R2 $r)"),
),
leftAssoc(
binaryOperator($token(char('-')), fn($l, $r) => "($l - $r)"),
binaryOperator($token(char('+')), fn($l, $r) => "($l + $r)"),
),
nonAssoc(
// imaginary non-associative operator
binaryOperator($token(char('§')), fn($l, $r) => "($l § $r)"),
)
]
));
$this->expression = $expr;
}
/**
* @test
* @dataProvider examples
*/
public function expression(string $input, string $expected)
{
$parser = $this->expression->thenEof();
$result = $parser->tryString($input);
$this->assertEquals($expected, (string)$result->output());
}
public function examples()
{
$examples = [
["1", "1"],
["1 + 1", "(1 + 1)"],
["1 * 1", "(1 * 1)"],
["(1 + 1) + 1", "((1 + 1) + 1)"],
["1 + (1 + 1)", "(1 + (1 + 1))"],
["1 * (1 + 1)", "(1 * (1 + 1))"],
["1 + (1 * 1)", "(1 + (1 * 1))"],
["(1 * 2) + (1 * 1)", "((1 * 2) + (1 * 1))"],
["1 + 2 + 3", "((1 + 2) + 3)"],
["1 * 2 * 3", "((1 * 2) * 3)"],
["1 * 2 + 3", "((1 * 2) + 3)"],
["1 + 2 * 3", "(1 + (2 * 3))"],
["4 + 5 + 2 * 3", "((4 + 5) + (2 * 3))"],
["4 + 5 * 2 * 3", "(4 + ((5 * 2) * 3))"],
["1 * 2 * 3 / 4 * 5", "((((1 * 2) * 3) / 4) * 5)"],
["1 / 2 / 3 * 4", "(((1 / 2) / 3) * 4)"],
["1 - 2 + 3", "((1 - 2) + 3)"],
["1 - 2 * 3", "(1 - (2 * 3))"],
["1 + 5 - 2 * 3 - 6", "(((1 + 5) - (2 * 3)) - 6)"],
["-1", "(-1)"],
["-1 + -2", "((-1) + (-2))"],
["-(-1)", "(-(-1))"],
["-(-(1))", "(-(-1))"],
// @todo crazy slow for some reason
// ["(-(-(1)))", "(-(-1))"],
["-1 * +1", "((-1) * (+1))"],
["1 § 2", "(1 § 2)"],
["1 + 5 § 2 * 3 - 6", "((1 + 5) § ((2 * 3) - 6))"],
["1 R 2 R 3", "(1 R (2 R 3))"],
["1 R 2 R 3 R 4", "(1 R (2 R (3 R 4)))"],
["1 - 2 * 3 R 4", "(1 - ((2 * 3) R 4))"],
["1 - 2 * 3 R 4 R 5", "(1 - ((2 * 3) R (4 R 5)))"],
["1++", "(1++)"],
["1++ + 2++", "((1++) + (2++))"],
["1--", "(1--)"],
["1-- + 2--", "((1--) + (2--))"],
["1++ + 2--", "((1++) + (2--))"],
["1-- + 2++", "((1--) + (2++))"],
];
return array_combine(array_column($examples, 0), $examples);
}
/**
* @test
* @dataProvider unparsableExamples
*/
public function unparsableExpressions(string $input)
{
$parser = $this->expression->thenEof();
$this->assertParseFails($input, $parser);
}
public function unparsableExamples()
{
$examples = [
["--1"],
["1--++"],
["1++--"],
["1 § 2 § 3"],
["1 § 2 * 3 § 4"],
["1 § 2 * 3 § 4 § 5"],
];
return array_combine(array_column($examples, 0), $examples);
}
}


@@ -0,0 +1,47 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Internal\FP;
use PHPUnit\Framework\TestCase;
use function Parsica\Parsica\Curry\curry;
final class CurryTest extends TestCase
{
/** @test */
public function curry()
{
$f = fn($a, $b, $c) => $a + $b + $c;
$curried = curry($f);
$this->assertIsCallable($curried);
$this->assertIsCallable($curried(1));
$this->assertIsCallable($curried(1)(2));
$this->assertEquals(6, $curried(1)(2)(3));
}
/** @test */
public function partial_application()
{
$f = fn($a, $b, $c) => $a + $b + $c;
$this->assertIsCallable(curry($f, 1));
$this->assertIsCallable(curry($f, 1, 2));
// I would expect this:
// $this->assertEquals(6, curry($f, 1, 2, 3));
// But we must add a () at the end, which I feel is a bug:
$this->assertIsCallable(curry($f, 1, 2, 3));
$this->assertEquals(6, curry($f, 1, 2, 3)());
}
}
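// A minimal, self-contained sketch of the currying behaviour exercised by the tests above.
// Assumption: sketchCurry() is a hypothetical helper written for illustration only. It is not
// the implementation behind Parsica\Parsica\Curry\curry(), and it ignores placeholders and
// right-to-left currying.
function sketchCurry(callable $f, ...$args): \Closure
{
    // Number of arguments $f needs before it can be called.
    $arity = (new \ReflectionFunction(\Closure::fromCallable($f)))->getNumberOfRequiredParameters();
    return function (...$more) use ($f, $args, $arity) {
        $all = array_merge($args, $more);
        // Enough arguments collected: call $f. Otherwise keep collecting.
        return count($all) >= $arity ? $f(...$all) : sketchCurry($f, ...$all);
    };
}
// Usage: with $f = fn($a, $b, $c) => $a + $b + $c, sketchCurry($f)(1)(2)(3) evaluates to 6.
// As in the partial_application test, sketchCurry($f, 1, 2, 3) still returns a closure that
// needs a final () before it yields a value.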

View File

@@ -0,0 +1,59 @@
<?php declare(strict_types=1);
/*
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Internal\FP;
use PHPUnit\Framework\TestCase;
use function Parsica\Parsica\Internal\FP\foldr;
final class FoldrTest extends TestCase
{
/** @test */
public function sum_implemented_as_foldr()
{
$actual = foldr([1, 2, 3], fn ($x, $y) => $x + $y, 0);
$this->assertSame(6, $actual);
}
/** @test */
public function associativity_is_correct()
{
$minus = fn($x, $y) => $x - $y;
$input = [1, 2, 3, 4, 5];
$init = 0;
// foldl: (((((0 - 1) - 2) - 3) - 4) - 5) = -15
// foldr: (1 - (2 - (3 - (4 - (5 - 0))))) = 3
$actual = array_reduce($input, $minus, $init);
$this->assertSame(-15, $actual);
$actual = foldr($input, $minus, $init);
$this->assertSame(3, $actual);
}
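    /**
     * A minimal sketch, not part of the upstream test suite. It assumes only the
     * foldr($input, $function, $initial) call shape used elsewhere in this file,
     * and compares it with a right fold hand-built from array_reduce().
     *
     * @test
     */
    public function foldr_matches_a_reversed_array_reduce()
    {
        $minus = fn($x, $y) => $x - $y;
        $input = [1, 2, 3, 4, 5];
        $init = 0;
        // Walk the input from the right and pass the accumulator as the second argument:
        // 5 - 0 = 5, 4 - 5 = -1, 3 - (-1) = 4, 2 - 4 = -2, 1 - (-2) = 3
        $reference = array_reduce(
            array_reverse($input),
            fn($acc, $x) => $minus($x, $acc),
            $init
        );
        $this->assertSame(3, $reference);
        $this->assertSame($reference, foldr($input, $minus, $init));
    }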
/** @test */
public function argument_order_is_correct()
{
$concat = fn($x, $y) => "$x$y";
$input = [1, 2, 3, 4, 5];
$init = "0";
// foldl: 012345
// foldr: 123450
$actual = array_reduce($input, $concat, $init);
$this->assertSame("012345", $actual);
$actual = foldr($input, $concat, $init);
$this->assertSame("123450", $actual);
}
}

View File

@@ -0,0 +1,65 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Internal;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Internal\Position;
use Parsica\Parsica\StringStream;
use function Parsica\Parsica\char;
final class PositionTest extends TestCase
{
/** @test */
public function update()
{
$position = Position::initial();
$this->assertEquals(1, $position->line());
$this->assertEquals(1, $position->column());
$position = $position->advance("a");
$this->assertEquals(1, $position->line());
$this->assertEquals(2, $position->column());
$position = $position->advance("\n");
$this->assertEquals(2, $position->line());
$this->assertEquals(1, $position->column());
$position = $position->advance("\n\n\nabc");
$this->assertEquals(5, $position->line());
$this->assertEquals(4, $position->column());
}
/** @test */
public function position_in_sequence()
{
$parser = char('a')->followedBy(char('b'));
$input = new StringStream("abc", Position::initial());
$result = $parser->run($input);
$expectedColumn = 3;
$actualColumn = $result->remainder()->position()->column();
$this->assertEquals($expectedColumn, $actualColumn);
}
/** @test */
public function position_with_tabs()
{
$expected = 10;
// All of these move the column position to 10
$position = Position::initial()->advance("123456789");
$this->assertEquals($expected, $position->column());
$position = Position::initial()->advance("\t56789");
$this->assertEquals($expected, $position->column());
$position = Position::initial()->advance("\t\t9");
$this->assertEquals($expected, $position->column());
$position = Position::initial()->advance("1\t56789");
$this->assertEquals($expected, $position->column());
$position = Position::initial()->advance("123\t56789");
$this->assertEquals($expected, $position->column());
}
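    /**
     * A hedged sketch, not Parsica's implementation: the expectations in
     * position_with_tabs() are consistent with tab stops every 4 columns
     * (columns 1, 5, 9, ...). $columnAfter is a hypothetical helper that
     * models that rule; it is only checked against the inputs used above.
     *
     * @test
     */
    public function tab_stops_every_four_columns_sketch()
    {
        $columnAfter = function (string $text, int $tabWidth = 4): int {
            $column = 1;
            foreach (mb_str_split($text) as $char) {
                // A tab jumps to the next tab stop; any other character moves one column.
                $column = $char === "\t"
                    ? $column + $tabWidth - (($column - 1) % $tabWidth)
                    : $column + 1;
            }
            return $column;
        };
        $inputs = ["123456789", "\t56789", "\t\t9", "1\t56789", "123\t56789"];
        foreach ($inputs as $text) {
            $this->assertEquals(
                $columnAfter($text),
                Position::initial()->advance($text)->column()
            );
        }
    }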
}

View File

@@ -0,0 +1,52 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Internal;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Internal\Position;
use Parsica\Parsica\StringStream;
final class StringStreamTest extends TestCase
{
/** @test */
public function take1()
{
$s = new StringStream("abc");
$t = $s->take1();
$this->assertEquals("a", $t->chunk());
$expectedPosition = new Position("<input>", 1, 2);
$expectedStream = new StringStream("bc", $expectedPosition);
$this->assertEquals($expectedStream, $t->stream());
}
/** @test */
public function takeN()
{
$s = new StringStream("abcde");
$t = $s->takeN(3);
$this->assertEquals("abc", $t->chunk());
$expectedPosition = new Position("<input>", 1, 4);
$expectedStream = new StringStream("de", $expectedPosition);
$this->assertEquals($expectedStream, $t->stream());
}
/** @test */
public function takeWhile()
{
$s = new StringStream("abc\nde");
$t = $s->takeWhile(fn($c) => $c !== "\n");
$this->assertEquals("abc", $t->chunk());
$expectedPosition = new Position("<input>", 1, 4);
$expectedStream = new StringStream("\nde", $expectedPosition);
$this->assertEquals($expectedStream, $t->stream());
}
}

View File

@@ -0,0 +1,161 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Issues;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Parser;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\alphaNumChar;
use function Parsica\Parsica\any;
use function Parsica\Parsica\atLeastOne;
use function Parsica\Parsica\between;
use function Parsica\Parsica\char;
use function Parsica\Parsica\either;
use function Parsica\Parsica\emit;
use function Parsica\Parsica\eof;
use function Parsica\Parsica\fail;
use function Parsica\Parsica\many;
use function Parsica\Parsica\succeed;
/**
* https://github.com/mathiasverraes/parsica/issues/6
*/
final class GH26_Test extends TestCase
{
use ParserAssertions;
private static function pathParser(): Parser
{
$sep = char('/')
->label('directory separator');
// Unix file names may contain other characters, such as spaces; adapt if needed
$name = atLeastOne(char('.')->or(char('_'))->or(alphaNumChar()))
->label("directory or filename");
$parser = many($sep->followedBy($name));
return $parser;
}
/** @test */
public function parsing_a_simple_path()
{
$parser = self::pathParser();
$input = "/a/b/c/file1";
$expected = ["a", "b", "c", "file1"];
$this->assertParses($input, $parser, $expected);
}
/**
* https://github.com/mathiasverraes/parsica/issues/6#issuecomment-653772920
*
* @test
*/
public function only_the_first_successful_parser_in_an_either_should_call_emit()
{
$x = new class {
public bool $first = false;
public bool $second = false;
};
$parser = either(
emit(
succeed(),
function ($output) use ($x) {
$x->first = true; // is called
}
),
emit(
succeed(),
function ($output) use ($x) {
$x->second = true; // is not called
}
)
);
$result = $parser->tryString('test');
$this->assertEquals(true, $x->first);
$this->assertEquals(false, $x->second, "Either should only call emit on the first successful parser");
}
/**
* @TODO Set $repeatParser to 500 and fix the performance issues.
*
* https://github.com/mathiasverraes/parsica/issues/6#issuecomment-653772920
*
* @test
*/
public function it_should_parse_500_times_in_under_100_ms()
{
// Number of times we run the parser
$repeatParser = 1; // should be 500, see the @TODO above
$propertyName = atLeastOne(alphaNumChar());
$type = emit(
either(
eof(),
char('@')
->followedBy($propertyName)
->thenIgnore(eof()),
),
function () {}
);
$map = emit(
char('.')->followedBy($propertyName),
function () {}
);
$list = emit(
between(
char('['),
char(']'),
either(
char('@')
->followedBy($propertyName)
->map(fn($value) => [
'discriminatorName' => $value,
'keepKeys' => true
]),
$propertyName
->map(fn($value) => [
'discriminatorName' => $value,
'keepKeys' => false
]),
)
),
function () {}
);
$root = emit(
char('$'),
function () {}
);
$rest = many(any($map, $list))->followedBy($type);
$parser = either(
fail("message"), // $context->preflightCacheParser(),
$root
)->followedBy($rest);
$start = microtime(true);
for ($i = 0; $i < $repeatParser; $i++) {
$parser->tryString('$.q.w[@1].e[2]@int');
}
$end = microtime(true);
$this->assertLessThan(0.1, $end - $start);
}
}

View File

@@ -0,0 +1,46 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\JSON\key_value;
final class ArrayTest extends TestCase
{
use ParserAssertions;
/**
* @test
* @dataProvider examples
*/
public function array(string $input, $expected)
{
$parser = JSON::array();
$this->assertParses($input, $parser, $expected);
}
public function examples()
{
return [
['[]', []],
['[ ] ', []],
['[ 1 ] ', [1.0]],
['[ true ] ', [true]],
['[ 1.23, "abc", null, false ] ', [ 1.23, "abc", null, false]],
];
}
}

View File

@@ -0,0 +1,64 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
final class JSONTest extends TestCase
{
public static function examples(): array
{
return [
['true'],
['false'],
['null'],
['"abc"'],
['{"a b":"c d"}'],
[' { " a b " : " c d " } '],
[' [ { " a b " : " c d " } ] '],
[' [ { " a b " : " c d " } , { "ef" : "gh" } ] '],
['"some weird chars \\n in \\t strings \\u9999 should do it"'],
['"this \\\\ is just a backslash"'],
[<<<JSON
[
-1.23,
null,
true,
[
[
{
"a": true
},
{
"b": false,
"c": -1.23456789E+123
}
]
]
]
JSON,
],
[file_get_contents(__DIR__ . '/../../composer.json')],
];
}
/**
* @test
* @dataProvider examples
*/
public function compare_to_json_decode(string $input)
{
$native = json_decode($input);
$parsica = JSON::json()->tryString($input)->output();
$this->assertEquals($native, $parsica);
}
}

View File

@@ -0,0 +1,38 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
final class NumberTest extends TestCase
{
use ParserAssertions;
/** @test */
public function number()
{
$this->assertParses("0", JSON::number(), 0.0);
$this->assertParses("0.1", JSON::number(), 0.1);
$this->assertParses("0.15", JSON::number(), 0.15);
$this->assertParses("0.10", JSON::number(), 0.1);
$this->assertParses("-0.1", JSON::number(), -0.1);
$this->assertParses("1.2345678", JSON::number(), 1.2345678);
$this->assertParses("-1.2345678", JSON::number(), -1.2345678);
$this->assertParses("-1.23456789E+123", JSON::number(), -1.23456789E+123);
$this->assertParses("-1.23456789e-123", JSON::number(), -1.23456789E-123);
$this->assertParses("-1E-123", JSON::number(), -1E-123);
$this->assertParses("-1E-123 ", JSON::number(), -1E-123);
}
}

View File

@@ -0,0 +1,45 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use Parsica\Parsica\StringStream;
final class ObjectTest extends TestCase
{
use ParserAssertions;
/** @test */
public function member()
{
$input = '"foo":"bar"';
$parser = JSON::member();
$this->assertParses($input, $parser, ["foo" => "bar"]);
}
/** @test */
public function object()
{
$input = '{"foo":"bar","bar":"foo"}';
$parser = JSON::object();
$result = $parser->run(new StringStream($input));
$this->assertParses($input, $parser, (object)["foo" => "bar", "bar" => "foo"]);
}
}

View File

@@ -0,0 +1,61 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
final class StringLiteralTest extends TestCase
{
use ParserAssertions;
public static function escapedChars(): array
{
return [
// label => [literal that will appear in json, expected character it results in]
"quotation mark" => ["\\\"", '"'],
"reverse solidus" => ['\\\\', '\\'],
"solidus" => ["\\/", '/'],
"backspace" => ["\\b", mb_chr(8)],
"formfeed" => ["\\f", mb_chr(12)],
"linefeed" => ["\\n", "\n"],
"carriage return" => ["\\r", "\r"],
"horizontal tab" => ["\\t", "\t"],
];
}
/** @test */
public function empty()
{
$this->assertParses('""', JSON::stringLiteral(), "");
}
/**
* @test
* @dataProvider escapedChars
*/
public function escapes(string $input, string $expected)
{
$this->assertParses('"' . $input . '"', JSON::stringLiteral(), $expected);
$this->assertParses('"a' . $input . '"', JSON::stringLiteral(), "a" . $expected);
$this->assertParses('"' . $input . 'a"', JSON::stringLiteral(), $expected . "a");
}
/** @test */
public function escape_hex()
{
$input = '"\\u0BB9\\u0BB2\\u0BCB\\u0020\\u0B89\\u0BB2\\u0B95\\u0BAE\\u0BCD"';
$this->assertParses($input, JSON::stringLiteral(), "ஹலோ உலகம்");
}
}

View File

@@ -0,0 +1,32 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\char;
use function Parsica\Parsica\JSON\token;
final class TokenTest extends TestCase
{
use ParserAssertions;
/** @test */
public function token()
{
$parser = JSON::token(char('a'));
$input = "a \n \tb";
$this->assertRemainder($input, $parser, "b");
}
}

View File

@@ -0,0 +1,79 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\JSON;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\JSON\JSON;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\JSON\ws;
final class WhitespaceTest extends TestCase
{
use ParserAssertions;
/** @test */
public function ws_empty()
{
$expected = null;
$input = "";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function ws_space()
{
$expected = null;
$input = " ";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function ws_tab()
{
$expected = null;
$input = "\t";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function ws_newline()
{
$expected = null;
$input = "\n";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function ws_carriage_return()
{
$expected = null;
$input = "\r";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function a_bunch_of_whitespace()
{
$expected = null;
$input = " \n \r \t a";
$parser = JSON::ws();
$this->assertParses($input, $parser, $expected);
$this->assertRemainder($input, $parser, "a");
}
}

View File

@@ -0,0 +1,80 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\PHPUnit;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\PHPUnit\ParserAssertions;
final class ParserTestCaseTest extends TestCase
{
use ParserAssertions;
/** @test */
public function strict_equality()
{
$this->assertEquals(1.23, "1.23",
"A string and a float are equal in PHP");
$this->assertNotSame(1.23, "1.23",
"A string and a float are not the same in PHP");
$this->assertSame(1.23, 1.23,
"Primitives are compared by value");
$this->assertEquals(new MyType(1.23), new MyType(1.23),
"Weak equality works for objects");
$this->assertNotSame(new MyType(1.23), new MyType(1.23),
"...but value object instances with the same value do not have equality");
$this->assertTrue((new MyType(1.23))->equals(new MyType(1.23)),
"We can solve it with an equals() method, but the user doesn't always have "
. "control of the types.");
$this->assertTrue(true,
"Therefore, we need something that will behave like assertSame for primitives, "
. "like assertEquals for objects of the same type, "
. "and fail for everything else.");
$this->assertStrictlyEquals(1.23, 1.23);
$this->assertStrictlyEquals(new MyType(1.23), new MyType(1.23));
/*
$this->assertStrictlyEquals(1.23, "1.23",
"should fail");
$this->assertStrictlyEquals("1.23", 1.23,
"should fail");
$this->assertStrictlyEquals(new MyType(1.23), new MyType(7.89),
"should fail");
*/
}
/** @test */
public function strictlyEquals_for_arrays()
{
$this->assertStrictlyEquals(
[1, new MyType(5.0)],
[1, new MyType(5.0)]
);
}
}
final class MyType
{
private float $x;
public function __construct(float $x)
{
$this->x = $x;
}
public function equals(MyType $other): bool
{
return $this->x === $other->x;
}
}

View File

@@ -0,0 +1,53 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\ParseResult;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Internal\Fail;
use Parsica\Parsica\Internal\Succeed;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use Parsica\Parsica\StringStream;
final class AppendTest extends TestCase
{
use ParserAssertions;
/** @test */
public function append_strings()
{
$remainder = new StringStream("");
$succeed1 = new Succeed("Parsed1", new StringStream("Remain1"));
$succeed2 = new Succeed("Parsed2", new StringStream("Remain2"));
$fail1 = new Fail("Expected1", new StringStream("Got1"));
$fail2 = new Fail("Expected2", new StringStream("Got2"));
$this->assertStrictlyEquals(new Succeed("Parsed1Parsed2", new StringStream("Remain2")), $succeed1->append($succeed2));
$this->assertStrictlyEquals(new Fail("Expected1", new StringStream("Got1")), $succeed1->append($fail1));
$this->assertStrictlyEquals(new Fail("Expected1", new StringStream("Got1")), $fail1->append($succeed2));
$this->assertStrictlyEquals(new Fail("Expected1", new StringStream("Got1")), $fail1->append($fail2));
}
/** @test */
public function append_with_null()
{
$null1 = new Succeed(null, new StringStream("Remain Null 1"));
$null2 = new Succeed(null, new StringStream("Remain Null 2"));
$string = new Succeed("String", new StringStream("Remain String"));
$first = $string->append($null1);
$this->assertStrictlyEquals(new Succeed("String", new StringStream("Remain Null 1")), $first);
$second = $null1->append($string);
$this->assertStrictlyEquals(new Succeed("String", new StringStream("Remain String")), $second);
$both = $null1->append($null2);
$this->assertStrictlyEquals(new Succeed(null, new StringStream("Remain Null 2")), $both);
}
}

View File

@@ -0,0 +1,277 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\ParseResult;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Internal\Position;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use Parsica\Parsica\StringStream;
use function Parsica\Parsica\char;
use function Parsica\Parsica\many;
use function Parsica\Parsica\newline;
use function Parsica\Parsica\repeat;
use function Parsica\Parsica\skipSpace;
use function Parsica\Parsica\string;
use function Parsica\Parsica\whitespace;
final class ErrorReportingTest extends TestCase
{
use ParserAssertions;
/** @test */
public function failing_on_the_first_token()
{
$parser = char('a');
$input = new StringStream("bcd");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:1
|
1 | bcd
| ^— column 1
Unexpected 'b'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function failing_with_an_advanced_position()
{
$parser = char('a');
$input = new StringStream("bcd", new Position("/path/to/file", 5, 10));
$result = $parser->run($input);
$expected =
<<<ERROR
/path/to/file:5:10
|
5 | ...bcd
| ^— column 10
Unexpected 'b'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function works_for_parsers_with_more_than_one_character()
{
$parser = string("abc");
$input = new StringStream("xyz", Position::initial("/path/to/file"));
$result = $parser->run($input);
$expected =
<<<ERROR
/path/to/file:1:1
|
1 | xyz
| ^— column 1
Unexpected 'x'
Expecting 'abc'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function advance_the_column_with_followedBy()
{
$parser = char('a')->sequence(char('b'));
$input = new StringStream("axy");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:2
|
1 | ...xy
| ^— column 2
Unexpected 'x'
Expecting 'b'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function works_with_custom_labels()
{
$parser = char('a')->sequence(char('b'))->label("a followed by b");
$input = new StringStream("axy");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:2
|
1 | ...xy
| ^— column 2
Unexpected 'x'
Expecting a followed by b
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function tabs_move_column_position()
{
$parser = skipSpace()->sequence(char('a'));
$input = new StringStream("\t\tbcdefgh");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:9
|
1 | ...bcdefgh
| ^— column 9
Unexpected 'b'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function line_numbers_space_out()
{
$parser = skipSpace()->sequence(char('a'));
$input = new StringStream(str_repeat("\n", 99) . "b");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:100:1
|
100 | b
| ^— column 1
Unexpected 'b'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function multiline_input()
{
$parser = many(newline())->sequence(char('a'));
$input = new StringStream("\n\n\nbcd\nxyz", Position::initial("/path/to/file"));
$result = $parser->run($input);
$expected =
<<<ERROR
/path/to/file:4:1
|
4 | bcd
| ^— column 1
Unexpected 'b'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function indicate_position()
{
$parser = repeat(5, char('a'))->sequence(char('b'));
$input = new StringStream("aaaaaXYZ");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:6
|
1 | ...XYZ
| ^— column 6
Unexpected 'X'
Expecting 'b'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function repeatN()
{
$parser = repeat(5, char('a'))->sequence(char('b'));
$input = new StringStream("aaaaXYZ");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:5
|
1 | ...XYZ
| ^— column 5
Unexpected 'X'
Expecting 5 times 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function indicate_shorter_position()
{
$parser = string("aa")->sequence(char('b'));
$input = new StringStream("aaXYZ");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:3
|
1 | ...XYZ
| ^— column 3
Unexpected 'X'
Expecting 'b'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function truncate_long_lines()
{
$parser = skipSpace()->sequence(string("Hello"))->sequence(char(','))->sequence(whitespace())->sequence(string("World"));
$input = new StringStream("\n\n\n\n\n\n\n\n\nHello World! This is a really long line of more than 80 characters, if you count the spaces.");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:10:6
|
10 | ... World! This is a really long line of more than 80 characters, if you...
| ^— column 6
Unexpected <space>
Expecting ','
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
/** @test */
public function dont_truncate_short_enough_lines()
{
$parser = char('a');
$input = new StringStream("1234567890123456789012345678901234567890123456789012345678901234567890123456");
$result = $parser->run($input);
$expected =
<<<ERROR
<input>:1:1
|
1 | 1234567890123456789012345678901234567890123456789012345678901234567890123456
| ^— column 1
Unexpected '1'
Expecting 'a'
ERROR;
$this->assertEquals($expected, $result->errorMessage());
}
}

View File

@@ -0,0 +1,40 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\ParseResult;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\Internal\Fail;
use Parsica\Parsica\Internal\Succeed;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use Parsica\Parsica\StringStream;
final class FunctorTest extends TestCase
{
use ParserAssertions;
/** @test */
public function map_over_ParseSuccess()
{
$succeed = new Succeed("parsed", new StringStream("remainder"));
$expected = new Succeed("PARSED", new StringStream("remainder"));
$this->assertEquals($expected, $succeed->map('strtoupper'));
}
/** @test */
public function map_over_ParseFailure()
{
$remainder = new StringStream("");
$fail = new Fail("expected", new StringStream("got"));
$expected = new Fail("expected", new StringStream("got"));
$this->assertEquals($expected, $fail->map('strtoupper'));
}
}

View File

@@ -0,0 +1,39 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\ParseResult;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\StringStream;
use function Parsica\Parsica\char;
final class ParseResultTest extends TestCase
{
/** @test */
public function ParseSuccess_continueWith()
{
$input = new StringStream("abc");
$success = char('a')->run($input);
$result = $success->continueWith(char('b'));
$this->assertTrue($result->isSuccess());
$this->assertEquals("c", $result->remainder());
}
/** @test */
public function ParseFailure_continueWith()
{
$input = new StringStream("abc");
$fail = char('x')->run($input);
$result = $fail->continueWith(char('a'));
$this->assertTrue($result->isFail());
}
}

View File

@@ -0,0 +1,128 @@
<?php declare(strict_types=1);
/**
* This file is part of the Parsica library.
*
* Copyright (c) 2020 Mathias Verraes <mathias@verraes.net>
*
* For the full copyright and license information, please view the LICENSE
* file that was distributed with this source code.
*/
namespace Tests\Parsica\Parsica\Parser;
use PHPUnit\Framework\TestCase;
use Parsica\Parsica\PHPUnit\ParserAssertions;
use function Parsica\Parsica\alphaChar;
use function Parsica\Parsica\char;
use function Parsica\Parsica\digitChar;
use function Parsica\Parsica\either;
use function Parsica\Parsica\eof;
use function Parsica\Parsica\ignore;
use function Parsica\Parsica\keepFirst;
use function Parsica\Parsica\many;
use function Parsica\Parsica\punctuationChar;
use function Parsica\Parsica\some;
use function Parsica\Parsica\string;
final class AlternativeTest extends TestCase
{
use ParserAssertions;
/** @test */
public function or()
{
$parser = char('a')->or(char('b'));
$this->assertParses("a123", $parser, "a");
$this->assertParses("b123", $parser, "b");
$this->assertParseFails("123", $parser);
}
/** @test */
public function alternatives_for_strings_with_similar_starts()
{
$jan =
either(
string("Jan")->thenIgnore(eof()),
string("January")->thenIgnore(eof()),
);
$this->assertParses("Jan", $jan, "Jan");
$this->assertParses("January", $jan, "January");
// Reverse order
$jan =
either(
string("January")->thenIgnore(eof()),
string("Jan")->thenIgnore(eof()),
);
$this->assertParses("Jan", $jan, "Jan");
$this->assertParses("January", $jan, "January");
}
/** @test */
public function or_order_matters()
{
// The order of clauses in an or() matters. With the following parser definition, the parser will consume
// "http" even if the string starts with "https", leaving "s://..." as the remainder.
$parser = string('http')->or(string('https'));
$input = "https://verraes.net";
$this->assertRemainder($input, $parser, "s://verraes.net");
// The solution is to consider the order of or clauses:
$parser = string('https')->or(string('http'));
$input = "https://verraes.net";
$this->assertParses($input, $parser, "https");
$this->assertRemainder($input, $parser, "://verraes.net");
}
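    /**
     * A minimal sketch, not part of the upstream test suite: the same ordering
     * advice written with either(), assuming it behaves like or() as the tests
     * above suggest. With the longer literal first, "https" is matched whole,
     * and a plain "http" input falls through to the second alternative.
     *
     * @test
     */
    public function either_with_the_longest_alternative_first()
    {
        $parser = either(string('https'), string('http'));
        $this->assertParses("https://verraes.net", $parser, "https");
        $this->assertRemainder("https://verraes.net", $parser, "://verraes.net");
        // string('https') fails on this input, so string('http') gets its turn.
        $this->assertParses("http://verraes.net", $parser, "http");
        $this->assertRemainder("http://verraes.net", $parser, "://verraes.net");
    }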
/** @test */
public function optional()
{
$parser = char('a')->optional();
$this->assertParses("", $parser, null, "EOF");
$this->assertParses("abc", $parser, "a");
$this->assertRemainder("abc", $parser, "bc");
$this->assertParses("bc", $parser, null);
$this->assertRemainder("bc", $parser, "bc");
}
/** @test */
public function many()
{
$parser = many(alphaChar());
$this->assertParses("123", $parser, []);
$this->assertParses("Hello", $parser, ["H", "e", "l", "l", "o"]);
$parser = many(alphaChar()->append(digitChar()));
$this->assertParses("1a2b3c", $parser, []);
$this->assertParses("a1b2c3", $parser, ["a1", "b2", "c3"]);
}
/** @test */
public function some()
{
$parser = many(
keepFirst(
some(alphaChar())->map(fn($a) => implode('', $a)),
punctuationChar()->optional()
)
);
$input = "abc,def,ghi";
$expected = ["abc","def","ghi"];
$this->assertParses($input, $parser, $expected);
}
/** @test */
public function some_2()
{
$parser = some(string("foo"));
$this->assertParseFails("bla", $parser);
$this->assertParses("foo", $parser, ["foo"]);
$this->assertParses("foobar", $parser, ["foo"]);
$this->assertParses("foofoo", $parser, ["foo", "foo"]);
$this->assertParses("foofoobar", $parser, ["foo", "foo"]);
}
}
