Parser library for Kotlin
MIT License
A Kotlin multiplatform parser combinator library.
Konbini is hosted on Jitpack, so you need to first att the Jitpack repository to your build file, then add a dependency on Konbini.
For build.gradle.kts
(Kotlin DSL):
repositories {
maven {
url = uri("https://jitpack.io")
}
}
dependencies {
implementation("cc.ekblad.konbini:konbini:0.1.2")
}
Konbini parsers are constructed from combinators: tiny functions which glue together other tiny functions into complex parsers.
Consider the following parser which parses and computes expressions containing integer literals and additions:
import cc.ekblad.konbini.Parser
import cc.ekblad.konbini.char
import cc.ekblad.konbini.integer
import cc.ekblad.konbini.oneOf
import cc.ekblad.konbini.parser
import cc.ekblad.konbini.whitespace
val addition: Parser<Long> = parser {
val lhs = expr()
whitespace()
char('+')
whitespace()
val rhs = expr()
return lhs + rhs
}
val expr: Parser<Long> = oneOf(integer, addition)
This example makes use five basic combinators. Three of them are leaf parsers, or atoms:
integer
, which parses any integer that can fits in a 64-bit signed Long
;char
, which parses any one out of zero or more characters passed to it as arguments; andwhitespace
, which parses zero or more whitespace characters.The remaining two are more interesting:
parser
lets you combine parsers into bigger parsers.parser
block, simply call it as a normal function.oneOf
takes zero or more parsers as arguments, and tries them all from left to rightoneOf
parser fails.oneOf
. Unlike many other parser generators and librariesIn addition to these, Konbini defines several other helpful parser combinators to quickly get your parser off the ground.
regex
matches regular expressions.doubleQuotedString
and singleQuotedString
matches quoted strings, including escape code handling.chainl
and chainr
make it easy to define left-recursive and right-recursive parsers respectively.Each Konbini parser has two extension methods which lets you apply them to arbitrary strings.
parse
applies the parser to a string, reading as much of the string as it can, and returns both the resultparseToEnd
does the same as parse
, but returns an error if the parser did not match the entire input string.Both methods can be configured to ignore whitespace at the start and end of its input for convenience.
Unlike our simple example parser, most real parsers don't calculate values on the fly; they build up a syntax tree of some kind. For a more realistic example, the following parser uses mostly built-in functionality to implement a complete JSON parser.
val comma = parser { whitespace() ; char(',') ; whitespace() }
val colon = parser { whitespace() ; char(':') ; whitespace() }
val pKeyValue = parser { doubleQuotedString().also { colon() } to pValue() }
val pAtom = oneOf(decimal, doubleQuotedString, boolean, string("null").map { null })
val pArray = bracket(
parser { char('[') ; whitespace() },
parser { whitespace() ; char(']') },
parser { chain(pValue, comma).terms },
)
val pDict = bracket(
parser { char('{') ; whitespace() },
parser { whitespace() ; char('}') },
parser { chain(pKeyValue, comma).terms.toMap() },
)
val pValue: Parser<Any?> = oneOf(pAtom, pDict, pArray)
This parser is capable of processing around 35 MB of JSON per second on an Apple M2; about the same speed as the parser used for the better-parse benchmark.