A tool set for CSS including fast detailed parser, walker, generator and lexer based on W3C specs and browser implementations
MIT License
Published by lahmatiy about 5 years ago
- Fixed npm audit issues
- Bundle is `dist/csstree.js` and `dist/csstree.min.js` now (instead of a single `dist/csstree.js` that was a min version)
- Renamed `grammar` into `definitionSyntax` (named per spec)
- Added `compact` option to the `generate()` method to avoid formatting (spaces) when possible
- Changed `dump()` method to produce syntaxes in compact form by default

Published by lahmatiy about 5 years ago
- Added `find()`, `findLast()` and `findAll()` methods (e.g. `csstree.find(ast, node => node.type === 'ClassSelector')`)

Published by lahmatiy about 5 years ago
This release improves syntax matching with new features and some fixes.

A couple of months ago, bracketed range notation was added to the Values and Units spec. The notation allows restricting numeric values to a range. For example, `<integer [0,∞]>` is for positive integers, and `<number [0,1]>` can be used for an alpha value.

Since the notation is a new thing in syntax definitions, it isn't used in the specs yet. However, there is a PR (https://github.com/w3c/csswg-drafts/pull/3894) that will bring it to some specs, and CSSTree is ready for this.

Right now, the notation made it possible to remove `<number-zero-one>`, `<number-one-or-greater>` and `<positive-integer>` from generic types and define them using a regular grammar.
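For illustration, those generic types can be expressed with bracketed range notation roughly like this (a sketch based on the descriptions above, not the exact grammar shipped with CSSTree):

```
<number-zero-one>       = <number [0,1]>
<number-one-or-greater> = <number [1,∞]>
<positive-integer>      = <integer [0,∞]>
```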
There are at least two productions that have a low priority in matching. This means such productions give other productions a chance to claim a token first, and claim it themselves only if no one else does. This release introduces a solution for such productions. It's hardcoded at the moment, but can be exposed if needed (i.e. if more such productions turn up).

The first production is `<custom-ident>`. The Values and Units spec states:

> When parsing positionally-ambiguous keywords in a property value, a `<custom-ident>` production can only claim the keyword if no other unfulfilled production can claim it.

This rule applies to properties like `<'animation'>`, `<'transition'>` and `<'list-style'>`. Previously this was solved in different ways:

- `<'animation'>` – not an issue, since `<custom-ident>` goes last; however, the order of terms could change in the future
- `<'transition'>` – there was a patch for `<single-transition>` that changed the order of terms
- `<'list-style'>` – had no fixes and just didn't work in some cases (see #101)

Now all of those and the rest of the syntaxes work as expected.
The second production is a bit tricky. It's about the "unitless zero" for the `<length>` production. The spec states:

> ... if a `0` could be parsed as either a `<number>` or a `<length>` in a property (such as `line-height`), it must parse as a `<number>`.

This rule applies to properties like `<'line-height'>` or `<'flex'>`. Now it works per spec too (try it here):
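For example (using the `<'line-height'>` grammar as commonly defined; a sketch for illustration, not CSSTree's output):

```
<'line-height'> = normal | <number> | <length> | <percentage>

line-height: 0;     /* 0 is claimed by <number>, not by <length> */
line-height: 0px;   /* 0px is a <length> */
```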
- Bumped `mdn/data` to `2.0.4` (#99)
- Removed `<number-zero-one>`, `<number-one-or-greater>` and `<positive-integer>` from generic types. In fact, the types moved to a patch, because they can be expressed in a regular grammar now that bracketed range notation is implemented
- Changed `<custom-ident>` production matching to claim a keyword only if no other unfulfilled production can claim it (#101)
- Changed `<length>` production matching to claim a "unitless zero" only if no other unfulfilled production can claim it
- Fixed `||`- and `&&`-group matching: matching continues from the beginning on a term match (#85)
- Fixed matching of `var()` occurrences when a value is a string (such values can't be matched against a syntax currently and fail with a specific error that validation tools can use to ignore them)
- Fixed `<declaration-value>` and `<any-value>` matching when a value contains a function, parentheses or braces

Published by lahmatiy about 5 years ago
- Added `isBOM()` function
- Added `charCodeCategory()` function
- Removed `firstCharOffset()` function (use `isBOM()` instead)
- Removed `CHARCODE` dictionary
- Removed `INPUT_STREAM_CODE*` dictionaries
- Removed leftover `debugger` (#104)

Published by lahmatiy over 5 years ago
Published by lahmatiy over 5 years ago
- Fixed matching for `||`- and `&&`-groups (#103)

Published by lahmatiy over 5 years ago
This release took too long to be released. But it was worth the wait, because it unlocks new possibilities and ways for further improvements.

CSSTree tends to stay as close to the specifications as is reasonable. It deviates from the specs because they are generally targeted at user agents (browsers) rather than source processing tools like CSSTree.

Previously, CSSTree's tokenizer used its own set of token types, selected for better performance and for convenience in building the AST. However, this restricted further improvement of the parser, lexer and even generator, since tokens are the basis of CSS. That's not obvious at first glance, but if you dig deep into the specs you'll find that CSS syntax is described in terms of tokens and their productions, serialization relies on tokens, even `var()` substitution takes place at the level of tokens, and so on. Using its own token type set meant that many rules described in the CSS specs couldn't be implemented as designed. That's why CSSTree's tokenizer was previously too far from the specs.

In this release the tokenizer was reworked to use the token type set defined by CSS Syntax Module Level 3. The algorithms described by the spec were adopted by the tokenizer implementation, and the code is annotated with excerpts from the specification. This allowed the tokenizer to be very close to the spec and helped to fix numerous edge cases.
Current deviations from the CSS Syntax Module Level 3:
Changing the token type set led to a significant alteration of the parser implementation. The most dramatic changes are in the `AnPlusB` and `UnicodeRange` implementations, because those two microsyntaxes are really hard. Nevertheless, in general, most things became simpler. The parser also continues to relax at the parse stage, delegating more syntax checking to the lexer. As a result, some parsing errors no longer occur, so tools using CSSTree have a chance to use the AST even for partially invalid CSS.

This release doesn't change the AST format. However, the format will certainly change in upcoming releases to be closer to the token type set. That will reduce parse errors further and expand what tools can do.
The lexer was slightly refactored. The most significant change: syntax matching now relies on real CSS tokens produced by the tokenizer rather than tokens generated from the AST. In other words, the AST is translated to a string and then split into tokens by the tokenizer. Consequences of this:

- Matching methods can take a string as a value (`lexer.matchProperty('border', 'red 1px dotted')`). So parsing into an AST is not required anymore, which is good news for tools that use CSSTree for validation but have another AST format or no AST at all.
- Some syntaxes in `mdn/data` were fixed by CSSTree's patch recently; fortunately, the patch is no longer needed (difference with `mdn/data`).

Work on the lexer is not completed yet. This version removes some restrictions and is ready for further improvements like at-rule and selector matching, better support for mathematical expressions (`calc()` and friends), `attr()`/`toggle()`/`var()` fallback checking, multiple errors, suggestions, improving matching performance and so on.
- Bumped `mdn/data` to `~2.0.3`
- Updated patches for `mdn/data` due to the lack of some generic types and specific lexer restrictions (since the lexer was reworked, see below)
- `Tokenizer` class was split into several abstractions:
  - Added `TokenStream` class
  - Added `OffsetToLocation` class
  - Added `tokenize()` function that creates a `TokenStream` instance for a given string, or updates a `TokenStream` instance passed as the second parameter
  - Removed `Tokenizer` class
- Removed `Raw` token type
- Renamed `Identifier` token type to `Ident`
- Added token types: `Hash`, `BadString`, `BadUrl`, `Delim`, `Percentage`, `Dimension`, `Colon`, `Semicolon`, `Comma`, `LeftSquareBracket`, `RightSquareBracket`, `LeftParenthesis`, `RightParenthesis`, `LeftCurlyBracket`, `RightCurlyBracket`
- Replaced `Punctuator` with the `Delim` token type, which excludes specific characters that have their own token types like `Colon`, `Semicolon` etc.
- Removed `findCommentEnd`, `findStringEnd`, `findDecimalNumberEnd`, `findNumberEnd`, `findEscapeEnd`, `findIdentifierEnd` and `findUrlRawEnd` helper functions
- Removed `SYMBOL_TYPE`, `PUNCTUATION` and `STOP_URL_RAW` dictionaries
- Added `isDigit`, `isHexDigit`, `isUppercaseLetter`, `isLowercaseLetter`, `isLetter`, `isNonAscii`, `isNameStart`, `isName`, `isNonPrintable`, `isNewline`, `isWhiteSpace`, `isValidEscape`, `isIdentifierStart`, `isNumberStart`, `consumeEscaped`, `consumeName`, `consumeNumber` and `consumeBadUrlRemnants` helper functions
- Changed `HexColor` consumption to relax checking of a value, i.e. now `value` is a sequence of one or more name chars
- Added `&` as a property hack
- Changed `var()` parsing to only check that the first argument is an identifier (not a custom property name as before)
- Changed `Lexer#match()`, `Lexer#matchType()` and `Lexer#matchProperty()` methods to take a string as a value, besides an AST
- Changed `Lexer#match()` method to take a string as a syntax, besides a syntax descriptor
- Removed `<attr()>`, `<url>` (moved to a patch) and `<progid>` types
- Added `<ident-token>`, `<function-token>`, `<at-keyword-token>`, `<hash-token>`, `<string-token>`, `<bad-string-token>`, `<url-token>`, `<bad-url-token>`, `<delim-token>`, `<number-token>`, `<percentage-token>`, `<dimension-token>`, `<whitespace-token>`, `<CDO-token>`, `<CDC-token>`, `<colon-token>`, `<semicolon-token>`, `<comma-token>`, `<[-token>`, `<]-token>`, `<(-token>`, `<)-token>`, `<{-token>` and `<}-token>` types
- Added `<an-plus-b>`, `<urange>`, `<custom-property-name>`, `<declaration-value>`, `<any-value>` and `<zero>` types
- Renamed `<unicode-range>` to `<urange>` as per spec
- Renamed `<expression>` (IE legacy extension) to `<-ms-legacy-expression>`; it may be removed in upcoming releases

Published by lahmatiy over 6 years ago
This release brings a brand new syntax matching approach. Syntax matching is an important feature that allows CSSTree to provide a meaning for each component in a declaration value, e.g. which component of a declaration value is a color, which is a length and so on. You can see an example of a matching result on CSSTree's syntax reference page:

Syntax matching is now based on CSS tokens and uses a state machine approach, which fixes all the problems it had before (see https://github.com/csstree/csstree/issues/67 for the list of issues).

Previously, syntax matching was based on AST nodes. Although it is possible to do syntax matching that way, it has several disadvantages:

- A `Function` node contains a function name and a list of children, but it also produces parentheses that aren't stored in the AST. This introduced many hacks and workarounds, and even they were not enough, since the approach doesn't work for nodes like `Brackets`. It also forces the matching algorithm to know a lot about node types and their features.

Starting with this release, the AST (CSS parse result) is converted to a token stream before matching (using CSSTree's generator with a special decorator function). The syntax description tree is also converted into a so-called match graph (see details below). These transformations allow both trees to work in the same terms – CSS tokens.

This change makes the matching algorithm much simpler. Now it knows nothing about the AST structure, and the hacks and workarounds were removed. Moreover, syntaxes like `<line-names>` (contains brackets) and `<calc()>` (contains operators in nested syntaxes) can now be matched (previously syntax matching failed for them).

Since syntax matching moved from AST nodes to CSS tokens, the syntax description tree format also changed. For instance, functions are now represented as a token sequence. This allows handling syntaxes that contain a group with several function tokens inside, like this one:
```
<color-adjuster> =
  [red( | green( | blue( | alpha( | a(] ['+' | '-']? [<number> | <percentage>] ) |
  [red( | green( | blue( | alpha( | a(] '*' <percentage> ) |
  ...
```
Despite the fact that the `<color-mod()>` syntax was recently removed from CSS Color Module Level 4, such syntaxes can appear in the future, since they are valid (even if they look odd).

As a result of the format changes, all syntaxes in mdn/data can now be parsed, even those invalid from the standpoint of the CSS Values and Units Module Level 3 spec. Thanks to this, some errors in syntaxes were found and fixed (https://github.com/mdn/data/pull/221, https://github.com/mdn/data/pull/226), and some suggestions on syntax optimisation were made (https://github.com/mdn/data/pull/223, https://github.com/mdn/data/issues/230).

As mentioned above, the syntax tree is now transformed into a match graph. This happens on the first match for a syntax, and the result is then reused. A match graph represents a graph of simple actions (states) and transitions between them. Some complicated things, like multipliers, are translated into a set of nodes and edges. You can explore the match graph built for any syntax on CSSTree's syntax reference page, e.g. the match graph for `<'animation-name'>`:
There were some challenges during implementation; the most notable of them:

Matching of `&&`- and `||`-groups. This was actually a technical blocker that suspended the move to a match graph. Finally, a solution was found: split a group into smaller ones by removing terms one by one. For example, `a && b && c` can be represented as follows (pseudo code):

```
if match a
then [b && c]
else if match b
then [a && c]
else if match c
then [a && b]
else MISMATCH
```

So the size of a group reduces by one on each step; the smaller groups are then processed the same way until a group consists of a single term.
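The splitting idea can be sketched with a small self-contained function (illustrative only; CSSTree builds a static match graph rather than recursing like this, and matches real CSS tokens rather than strings):

```javascript
// Match an unordered group "a && b && c" against a token list:
// every term must match exactly once, in any order. Each step tries
// each remaining term first, then matches the rest of the group
// against the rest of the input - the same "remove a term, recurse
// on a smaller group" splitting described above.
function matchAllUnordered(terms, tokens) {
  if (terms.length === 0) return tokens.length === 0;
  return terms.some((term, i) => {
    if (tokens[0] !== term) return false; // this term can't claim the token
    const rest = terms.slice(0, i).concat(terms.slice(i + 1));
    return matchAllUnordered(rest, tokens.slice(1));
  });
}

matchAllUnordered(['a', 'b', 'c'], ['b', 'c', 'a']); // true
matchAllUnordered(['a', 'b', 'c'], ['a', 'b']);      // false
```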
In the same way, `a && b` unfolds into:

```
if match a
then if match b
     then MATCH
     else MISMATCH
else if match b
then if match a
     then MATCH
     else MISMATCH
else MISMATCH
```
It works fine, but only for small groups, since it produces at least N! (factorial) nodes, where N is the number of terms in a group. Fortunately, not many syntaxes contain a `&&`- or `||`-group with a large number of terms. However, the `font-variant` syntax contains a group of 20 terms, which means at least 2,432,902,008,176,640,000 nodes in a graph. That's huge, and we can't create that number of objects due to memory limits. So an alternative solution was introduced for groups with more than 5 terms: it uses a special buffer and iterates over the terms in a loop. The solution is not ideal, but there are just 9 such groups (with 6 or more terms) across all syntaxes, so it should be OK for now.
Matching of commas, as in `a?, b?, c?`. We can match `a, b, c`, `a, c`, `b`, `b, c` and so on. But input like `, b, c`, `a, , c` or `a,` is not allowed. In other words, a comma must not hang and must not be followed by another comma. And when a comma matches an input, it should report a positive match even when there is no comma token in the input. This was a blocker that could have cancelled the whole approach.

Nevertheless, the problem was solved in an elegant way, by checking adjacent tokens for several patterns. It's the most non-trivial part of the new syntax matching: a few lines of code that work well only together with the other parts of the implementation, so it may look like magic.
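The comma rule itself is simple to state. A toy check (not CSSTree's token-based implementation) that rejects hanging or doubled commas:

```javascript
// A comma must not hang at either end of the value and must not be
// followed by another comma: "a, c" is fine; ", b, c", "a, , c" and
// "a," are not.
function hasHangingComma(value) {
  return value.split(',').some(part => part.trim() === '');
}

hasHangingComma('a, c');   // false
hasHangingComma('a, , c'); // true
hasHangingComma('a,');     // true
```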
Another improvement in syntax matching is replacing the recursion-based algorithm with a state machine approach. This allows checking all possible alternatives during syntax matching. Previously, if nothing matched on a chosen path, the algorithm simply exited with a mismatch result. The new algorithm returns to a branching point and chooses an alternative path when possible. This fixes the following:

Syntaxes like `<bg-position>`:

```
<bg-position> =
  [ left | center | right | top | bottom | <length-percentage> ] |
  [ left | center | right | <length-percentage> ] [ top | center | bottom | <length-percentage> ] |
  [ center | [ left | right ] <length-percentage>? ] && [ center | [ top | bottom ] <length-percentage>? ]
```

This syntax didn't work before, since it defines the shortest form first and matching fell into that path with no chance to use an alternative one. (Reversing the order of the groups in this syntax made it work with the old algorithm.)
Another example is the new syntax for `<rgb()>`:

```
rgb() = rgb( <percentage>{3} [ / <alpha-value> ]? ) |
        rgb( <number>{3} [ / <alpha-value> ]? ) |
        rgb( <percentage>#{3} , <alpha-value>? ) |
        rgb( <number>#{3} , <alpha-value>? )
```

The old algorithm didn't exit a function's content once it matched a function, and couldn't handle such syntaxes; making matching work for syntaxes like this one required an adaptation (a patch as a workaround). Now patches are not required.
Matching for syntaxes not compatible with greedy algorithms. For instance, the syntax of `composes` (CSS Modules) is defined as `<custom-ident>+ from <string>`, and the old matching algorithm failed on it because `from` is a valid value for `<custom-ident>` and was captured by `<custom-ident>+` with no alternatives. The new algorithm is not greedy: on the first try it takes the minimum count of tokens allowed by a syntax and, when possible, increases that count on each return to the branching point. Syntaxes like `composes` can now be matched as well.
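The difference between a greedy and a backtracking matcher can be sketched for this exact case (hypothetical token objects, not CSSTree's implementation):

```javascript
// Match "<custom-ident>+ from <string>" over a flat token list.
// A greedy matcher would let <custom-ident>+ swallow the 'from'
// keyword and fail; here we try the shortest ident run first and
// grow it on each retry, like returning to a branching point.
function matchComposes(tokens) {
  for (let count = 1; count < tokens.length; count++) {
    const head = tokens.slice(0, count);
    if (!head.every(t => t.type === 'ident')) break;
    const rest = tokens.slice(count);
    if (rest.length === 2 &&
        rest[0].type === 'ident' && rest[0].value === 'from' &&
        rest[1].type === 'string') {
      return true;
    }
  }
  return false;
}

matchComposes([
  { type: 'ident', value: 'a' },
  { type: 'ident', value: 'from' },
  { type: 'string', value: '"./file.css"' }
]); // true: 'from' is kept for the keyword, not eaten by <custom-ident>+
```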
A state machine approach gives some other benefits, like precise error locations. Previously, the location of a problem could be confusing:

```
SyntaxMatchError: Mismatch
  syntax: ...
  value: rgb(1,2)
  ------------^
```

And now it's more helpful:

```
SyntaxMatchError: Mismatch
  syntax: ...
  value: rgb(1,2)
  ---------------^
```
Further improvements to syntax matching can improve error handling and perhaps provide some sort of suggestions.

The new syntax matching approach requires more memory and time, because of the AST to token stream transformation and checking all possible alternatives. However, the new approach is more effective in itself and has room for further optimisations. Usually it takes the same or up to ~50% more time (depending on the syntax and the matched value) compared with the previous algorithm, so that's not a big deal.

The main goal of the release was to make it all work, so not every possible optimisation was implemented; more will come in upcoming releases.
- Added `Token` node type to represent a single code point (`<delim-token>`)
- Added `Multiplier` that wraps a single node (`term` property)
- Added `AtKeyword` to represent `<at-keyword-token>`
- Removed `Slash` and `Percent` node types; they are replaced by a node with the `Token` type
- Changed `Function` to represent `<function-token>` with no children
- Removed `multiplier` property from `Group`
- Changed `generate()` method:
  - It takes `options` as the second argument now (`generate(node, forceBraces, decorator)` -> `generate(node, options)`). Two options are supported: `forceBraces` and `decorator`
  - A function passed as the second argument is treated as the `decorate` option value, i.e. `generate(node, fn)` -> `generate(node, { decorate: fn })`
- Renamed `Atrule` const to `AtKeyword`
Published by lahmatiy over 6 years ago
- Renamed `lexer.grammar.translate()` method into `generate()`
- Added `<'-webkit-font-smoothing'>` and `<'-moz-osx-font-smoothing'>` syntaxes (#75)
- Fixed `<'overflow'>` property syntax (#76)
- Bumped `mdn-data` to `~1.1.0` and fixed issues with some updated property syntaxes

Published by lahmatiy almost 7 years ago
Most of the changes in this release relate to the rework of the generator and the walker. Instead of plenty of methods, there is now a single method for each: `generate()` for the generator and `walk()` for the walker. Both methods take two arguments, `ast` and `options` (optional for the generator). This makes the API much simpler (see details in Translate AST to string and AST traversal):
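The shape of the unified traversal API can be sketched with a minimal, self-contained walker over a plain-object AST (CSSTree itself walks `List`-based children and supports more options such as `visit` and `reverse`, so this is a simplified model, not the library's implementation):

```javascript
// Minimal sketch of a walk(ast, options) style traversal.
// A bare function is treated as the enter handler, mirroring the
// walk(ast, fn) ~ walk(ast, { enter: fn }) equivalence.
function walk(node, options) {
  const { enter, leave } =
    typeof options === 'function' ? { enter: options } : options;

  if (enter) enter(node);
  if (Array.isArray(node.children)) {
    for (const child of node.children) walk(child, options);
  }
  if (leave) leave(node);
}

const ast = {
  type: 'StyleSheet',
  children: [
    { type: 'Rule', children: [{ type: 'Declaration', children: [] }] }
  ]
};

const visited = [];
walk(ast, { enter: node => visited.push(node.type) });
// visited: ['StyleSheet', 'Rule', 'Declaration']
```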
Also, the `List` class API was extended, and some util methods such as `keyword()` and `property()` were changed to be more useful.
- Changed node `generate()` methods invocation: methods now take a node as a single argument and a context (i.e. `this`) that has `chunk()`, `node()` and `children()` methods
- Renamed `translate()` to `generate()` and changed it to take an `options` argument
- Removed `translateMarkup(ast, enter, leave)` method, use `generate(ast, { decorator: (handlers) => { ... }})` instead
- Removed `translateWithSourceMap(ast)`, use `generate(ast, { sourceMap: true })` instead
- Changed `walk()` to take an `options` argument instead of a handler, with `enter`, `leave`, `visit` and `reverse` options (`walk(ast, fn)` still works and is equivalent to `walk(ast, { enter: fn })`)
- Removed `walkUp(ast, fn)`, use `walk(ast, { leave: fn })` instead
- Removed `walkRules(ast, fn)`, use `walk(ast, { visit: 'Rule', enter: fn })` instead
- Removed `walkRulesRight(ast, fn)`, use `walk(ast, { visit: 'Rule', reverse: true, enter: fn })` instead
- Removed `walkDeclarations(ast, fn)`, use `walk(ast, { visit: 'Declaration', enter: fn })` instead (note: `reverse: true` will fail on arrays since they have no `forEachRight()` method)
- Added `List#forEach()` method
- Added `List#forEachRight()` method
- Added `List#filter()` method
- Changed `List#map()` method to return a `List` instance instead of an `Array`
- Added `List#push()` method, similar to `List#appendData()` but returns nothing
- Added `List#pop()` method
- Added `List#unshift()` method, similar to `List#prependData()` but returns nothing
- Added `List#shift()` method
- Added `List#prependList()` method
- Changed `List#insert()`, `List#insertData()`, `List#appendList()` and `List#insertList()` methods to return the list the operation was performed on
- Changed `keyword()` method:
  - Changed `name` field to include a vendor prefix
  - Added `basename` field to contain the name without a vendor prefix
  - Added `custom` field that contains `true` when a keyword is a custom property reference
- Changed `property()` method:
  - Changed `name` field to include a vendor prefix
  - Added `basename` field to contain the name without any prefixes, i.e. a hack and a vendor prefix
- Added `vendorPrefix()` method
- Added `isCustomProperty()` method

Published by lahmatiy almost 7 years ago
This journey started a couple of months ago with `1.0.0-alpha20`, which added a tolerant parsing mode as an experimental feature, available behind the `tolerant` option. Over 5 releases, the feature was tested on various data, and numerous errors and edge cases were fixed. The last necessary changes were made in this release, which makes the feature ready for use. So, I'm proud to say, the CSSTree parser is tolerant to errors by default now.

That's a significant change, and it complies with CSS Syntax Module Level 3, which says:

> When errors occur in CSS, the parser attempts to recover gracefully, throwing away only the minimum amount of content before returning to parsing as normal. This is because errors aren't always mistakes - new syntax looks like an error to an old parser, and it's useful to be able to add new syntax to the language without worrying about stylesheets that include it being completely broken in older UAs.

In other words, a spec-compliant CSS parser should be able to parse any text as CSS with no errors. CSSTree is now such a parser! 🎉

The only thing in which the CSSTree parser departs from the specification is that it doesn't throw away bad content, but wraps it in `Raw` nodes, which allows processing it later. This discrepancy is due to the fact that the specification is written for UAs that extract meaning from CSS, so incomprehensible parts simply make no sense to them and can be ignored. CSSTree has a wider range of tasks, and most of them are related to processing source code: tasks such as locating errors, error correction, preprocessing, and so on.

Tolerant mode means you don't need to wrap `csstree.parse()` in try/catch. To collect parse errors, an `onParseError` handler should be set in the parse options:
```js
var csstree = require('css-tree');

csstree.parse('I must! be tolerant to errors', {
    onParseError: function(e) {
        console.error(e.formattedMessage);
    }
});

// Parse error: Unexpected input
//  1 |I must! be tolerant to errors
// -------------^
// Parse error: LeftCurlyBracket is expected
//  1 |I must! be tolerant to errors
// ------------------------------------^
```
If you need the old parser behaviour, just throw an exception inside the `onParseError` handler; that immediately stops parsing:
```js
try {
    csstree.parse('I must! be tolerant to errors', {
        onParseError: function(e) {
            throw e;
        }
    });
} catch(e) {
    console.error(e.formattedMessage);
}

// Parse error: Unexpected input
//  1 |I must! be tolerant to errors
// -------------^
```
- Added `Tokenizer#isBalanceEdge()` method
- Removed `Tokenizer.endsWith()` method
- Removed `tolerant` parser option (no parsing modes anymore)
- Removed `property` parser option (value parsing does not depend on the property name anymore)
- Fixed parsing of `Brackets`, `Function` and `Parentheses` when EOF is reached (bad content turns into a `Raw` node)
- Changed a bad declaration to turn into a `Raw` node (not a declaration as before)
- Added a `Raw` node to represent a declaration value
- Changed the `Value` parse handler to return a node only with type `Value` (previously it returned a `Raw` node in some cases)
- Fixed `onParseError()` not being invoked on parse errors in a selector and a declaration value
- Changed `onParseError()` to stop parsing if the handler throws an exception
- Changed `grammar.walk()` to invoke the passed handler on entering a node rather than on leaving it
- Changed `grammar.walk()` to take a walk handler pair as an object, i.e. `walk(node, { enter: fn, leave: fn })`
- Changed `Lexer#match*()` methods to take a node of any type, as long as it has a `children` field
- Added `Lexer#match(syntax, node)` method
- Changed `Lexer#matchType()` method to stop returning a positive result for the CSS-wide keywords

Published by lahmatiy about 7 years ago
- … `onParseError()` handler
- Fixed bad content turning into a `Raw` node in tolerant mode instead of being ignored
- Fixed a `Rule` node being consumed as part of a selector instead of being ignored
- Changed `parseAtrulePrelude` behaviour to match `parseRulePrelude`
- Fixed `Raw` node wrapping into `AtrulePrelude` when `parseAtrulePrelude` is disabled
- Fixed `translateWithSourceMap()`: flattening the string (because of mixing string building and indexing into it) turned it into a quadratic algorithm (approximate numbers can be found in the quiz created by this case)
- … `property()`

Published by lahmatiy about 7 years ago
`selector` was renamed to `prelude` for rules. The reasons: the spec names this part so, and this branch can contain not only a selector (`SelectorList`) but also a raw payload (`Raw`). What's changed:

- Renamed `Rule.selector` to `Rule.prelude`
- Renamed `parseSelector` parser option to `parseRulePrelude`
- … `SelectorList`
- … `Lexer#checkStructure()`

Published by lahmatiy about 7 years ago
- Fixed `Tokenizer#getRawLength()`'s false positive balance match to the end of input in some cases (#56)
- … walker functions (`walk()`, `walkUp()` etc.)
- Renamed `expression` to `prelude` for at-rules (since the spec names it so):
  - `AtruleExpression` node type → `AtrulePrelude`
  - `Atrule.expression` field → `Atrule.prelude`
  - `parseAtruleExpression` parser option → `parseAtrulePrelude`
  - `atruleExpression` parse context → `atrulePrelude`
  - `atruleExpression` walk context reference → `atrulePrelude`

Published by lahmatiy about 7 years ago
- Fixed consumption of a `{}`-block in tolerant mode for `DeclarationList`
- Parser can be used standalone now via `require('css-tree/lib/parser')` (#47)
- Changed the generator to produce `+n` when `AnPlusB.a` is `+1` to be "round-trip" with the parser
- Generator can be used standalone now via `require('css-tree/lib/generator')`
- Walker can be used standalone now via `require('css-tree/lib/walker')` (#47)
- Added `default` keyword to the list of invalid values for `<custom-ident>` (since it is reserved per spec)
- Convertors (`toPlainObject()` and `fromPlainObject()`) moved to `lib/convertor` (entry point is `require('css-tree/lib/convertor')`)

Published by lahmatiy about 7 years ago
- Added `Raw` token type
- Changed `url()` consumption with a raw value as the url to be more spec-compliant
- Changed `Tokenizer#balance` array computation on token layout
- Added `Tokenizer#getRawLength()` to compute a raw length with respect to block balance
- Added `Tokenizer#getTokenStart(offset)` method to get a token start offset by token index
- Added `idx` and `balance` fields to each token of the `Tokenizer#dump()` method result
- Added `onParseError` option
- Reworked `Raw` node consumption to use a new approach; since now, a `Raw` node is built in the `parser#Raw()` function only
- Changed `parser#Raw()`, it takes 5 parameters now (this might change in the future)
- Changed `parser#tolerantParse()` to pass a start token index to the fallback function instead of a source offset
- Fixed `AtruleExpression` consumption in tolerant mode
- Fixed converting an empty `AtruleExpression` node into `null`
- Changed the `AtruleExpression` handler to always return a node (before it could return `null` in some cases)
- … `#` multiplier
- … `SyntaxReferenceError`
- … `syntax.fork()`

Published by lahmatiy about 7 years ago
- Added `Atrule` token type (`<at-rule-token>` per spec)
- Added `Function` token type (`<function-token>` per spec)
- Added `Url` token type
- Replaced `Tokenizer#getTypes()` method with `Tokenizer#dump()` to get all tokens as an array
- Renamed `Tokenizer.TYPE.Whitespace` to `Tokenizer.TYPE.WhiteSpace`
- Renamed `Tokenizer.findWhitespaceEnd()` to `Tokenizer.findWhiteSpaceEnd()`
- Added tolerant parsing mode (`tolerant: true` option). In this mode parse errors never occur and any invalid part of CSS turns into a `Raw` node. Current safe points: `Atrule`, `AtruleExpression`, `Rule`, `Selector` and `Declaration`. The feature is experimental and further improvements are planned
- Changed `Atrule.expression` to contain an `AtruleExpression` node or `null` only (other node types are wrapped into an `AtruleExpression` node)
- Renamed `AttributeSelector.operator` to `AttributeSelector.matcher`
- The `translate()` method can now take a function as the second argument, which receives every generated chunk. When no function is passed, a default handler is used: it concats all the chunks and the method returns a string
- Added `x` unit to the `<resolution>` generic type
- … (zero or more multipliers)
- Added `ASTNode` node type to contain a reference to an AST node
- … `#` multiplier
- Changed the `translate()` function to take a handler as a third argument (optional). That handler receives the result of node translation and can be used for decoration purposes. See example
- Added `SyntaxParseError` to grammar export
- Changed `Sequence` for the `Group` node type (`Sequence` node type removed)
- Added `explicit` boolean property for `Group`
- Renamed the `nonEmpty` `Group` property to `disallowEmpty`
- Changed handling of a `Group` when it contains a single `Group` term (this `Group` is returned as a result)
- Changed `Lexer#matchProperty()` and `Lexer#matchType()` to return an object instead of a match tree. The match tree is stored in the `matched` field when the AST matched the grammar successfully; otherwise an error is stored in the `error` field. The result object also has some methods to test an AST node against the match tree: `getTrace()`, `isType()`, `isProperty()` and `isKeyword()`
- Added `Lexer#matchDeclaration()` method
- Removed `Lexer#lastMatchError` (the error is stored in the match result object, in the `error` field)
- Removed `Lexer#findValueSegments()`, `Lexer#findDeclarationValueSegments()` and `Lexer#findAllSegments`
- Added `SyntaxReferenceError` for unknown property and type references
- Renamed a field of the `property()` function result: `variable` → `custom`
- Fixed position fields (`line` and `column`) of `Error` and an exception on an attempt to write them in iOS Safari

Published by lahmatiy over 7 years ago
Extended the `List` class with new methods:

- `List#prepend(item)`
- `List#prependData(data)`
- `List#insertData(data)`
- `List#insertList(list)`
- `List#replace(item, itemOrList)`

Published by lahmatiy over 7 years ago
- Fixed `atrule` walk context (#39)
- Fixed `AnPlusB`, `AttributeSelector`, `Function`, `MediaFeature` and `Ratio` (1e95877)
- Improved `List` exception messages (@strarsis, #42)

Published by lahmatiy over 7 years ago
The main goal of CSSTree is to provide standard CSS parsing that is as good as possible. However, use cases have shown that it would be useful to easily extend the syntax, to make it possible to experiment with new CSS modules and features. Therefore, in this release CSSTree makes the first step towards extensibility through a new concept called a `syntax`.

A `syntax` is a set of tools: parser, walkers, lexer, generators and other functions for dealing with some variant of CSS syntax. By default it's the standard CSS syntax with implementors' features (e.g. hacks and extensions). This syntax may be extended via the `fork()` method, which returns a new syntax (a fork) with modified functionality (when needed) but the same API.

The approach allows experimenting with new CSS features that haven't been implemented by browsers yet, and provides support for CSS extensions (like CSS Modules, and even SCSS or Less syntaxes) at a new level. To reach that goal, a syntax is described in a declarative way with minimal effort from the developer. The initial CSS syntax definition speaks for itself. It will be completed and improved in upcoming releases.

It's hard enough to understand how good a parser is. There are several problems here; the most notable is the lack of appropriate test suites for testing a parser across specs and implementations. As you may know, browser adoption of CSS is not consistent, it changes fast, and don't forget about legacy. There are too many things to care about.

That's the reason the Real Web CSS project was created. The project's scripts take the Alexa Top 250 websites, crawl their CSS, and try to parse and validate it. The results can be found in this table. As you can see, there are various issues around the Web: many websites have broken CSS and validation warnings. This test also revealed weaknesses in CSSTree, and most of them were fixed by this release.

This simple test on real Web CSS has already exposed many problems on sites and in CSSTree. And that's just the beginning. We believe it will help make the Web and CSSTree better.
- Changed `exports` to expose a default `syntax`
- Added `createSyntax()` method to create a new syntax from scratch
- Added `fork()` method to create a new syntax based on a given one via extension
- Added `mediaQueryList` and `mediaQuery` parsing contexts
- Added `CDO` and `CDC` node types
- … (`#` and `+`)
- … `@font-face` at-rule
- Added `chroma()` to legacy IE filter functions
- Changed `HexColor` to consume hex only
- Added support for `\0` and `\9` hacks (#2)
- Several changes around `Ratio` term parsing
- Changed the `important` field to be `true` in case the identifier equals `important`, and a string otherwise (supports hacks like `!ie`)
- … `Parser` class
- Added `readSelectorSequence()`, `readSequenceFallback()` and `readSelectorSequenceFallback` methods (`AtruleExpression`, `Selector` and `Value`)
- Added `translateMarkup(ast, before, after)` method for complex cases
- Changed `translateWithSourceMap` to be more flexible (based on `translateMarkup`, additional work to be done in upcoming releases)
- Added `checkStructure(ast)` method to check AST structure based on the syntax definition
- Updated `mdn/data`; added `<'offset-position'>` syntax
- Extended the `<position>` property with `-webkit-sticky` (@sergejmueller, #37)
- Added a tool (`gen:syntax`) to generate the AST format reference page (`docs/ast.md`) using the syntax definition