hyperless

🧼 HTML parser and various utilities

MIT License


parseHTML

Parse an HTML document or fragment into a traversable node tree.

import {parseHTML} from '@dbushell/hyperless';
const root = parseHTML('<h1>Hello, World!</h1>');

The Node API is subject to change.

parseAttributes

Parse an HTML attribute string into a case-insensitive deduplicated key/value map.

import {parseAttributes} from '@dbushell/hyperless';
const map = parseAttributes('a="1" b="2" c d="d" D="d" e=e');

HTML entity encoding is handled automatically by default.
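To make "case-insensitive deduplicated" concrete, here is a minimal, self-contained sketch of the idea. It is not the hyperless implementation: `sketchParseAttributes` is a hypothetical name, the "first occurrence wins" rule is an assumption, and entity handling is left out entirely.

```typescript
// Illustrative sketch only: NOT the hyperless implementation.
// `sketchParseAttributes` is a hypothetical helper name.
function sketchParseAttributes(input: string): Map<string, string> {
  const map = new Map<string, string>();
  // A name, optionally followed by ="double", ='single', or =unquoted.
  const pattern = /([^\s=]+)(?:\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"']+)))?/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(input)) !== null) {
    // Lowercase the name so `d` and `D` share one key.
    const name = match[1].toLowerCase();
    // A bare attribute such as `c` gets an empty string value.
    const value = match[2] ?? match[3] ?? match[4] ?? '';
    // Assumption: the first occurrence wins; later duplicates are ignored.
    if (!map.has(name)) map.set(name, value);
  }
  return map;
}

const attrs = sketchParseAttributes('a="1" b="2" c d="d" D="d" e=e');
console.log(attrs.size);
```

With the same input string as the example above, the sketch ends up with one entry each for `a`, `b`, `c`, `d` and `e`. Whether the real function keeps the first or the last duplicate is not stated here, so check the package before relying on either.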

Utilities

stripTags

Remove HTML and return text content with a few niceties.

import {stripTags} from '@dbushell/hyperless';
// Pass a chunk of HTML
const text = stripTags('<p>Ceci n’est pas une paragraphe.</p>');

Text in <blockquote> and <q> elements is wrapped in quotation marks.
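The quotation-mark behaviour can be pictured with a naive, regex-based sketch. This is not how hyperless works internally: `sketchStripTags` is a hypothetical name, the choice of curly quotes is an assumption, and regex stripping will not cope with every kind of markup.

```typescript
// Illustrative sketch only: NOT the hyperless implementation.
// `sketchStripTags` is a hypothetical helper name.
function sketchStripTags(html: string): string {
  return html
    // Assumption: quoted elements are wrapped in curly quotation marks.
    .replace(/<(blockquote|q)\b[^>]*>/gi, '“')
    .replace(/<\/(blockquote|q)\s*>/gi, '”')
    // Drop every remaining tag.
    .replace(/<[^>]*>/g, '')
    // Collapse whitespace runs and trim the ends.
    .replace(/\s+/g, ' ')
    .trim();
}

const plain = sketchStripTags('<p>She said <q>hello</q> and left.</p>');
console.log(plain);
```

A real parser-based approach avoids the usual regex pitfalls (comments, attributes containing `>`), which is presumably why the package ships its own parser.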

excerpt

Generate a text excerpt from HTML content.

import {excerpt} from '@dbushell/hyperless';
// Pass a chunk of HTML
const html = '<p>First sentence. Second sentence.</p>';
const text = excerpt(html);

Output is context-aware: it is trimmed to the nearest sentence, or word, to fit the maximum length as closely as possible.

An optional maxLength can be passed as the second argument (default: 300 characters).
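The "nearest sentence, or word" trimming can be sketched as follows. This is a simplified illustration on plain text, not the hyperless implementation: `sketchExcerpt` is a hypothetical name, and treating ". ", "! " and "? " as sentence ends is an assumption.

```typescript
// Illustrative sketch only: NOT the hyperless implementation.
// `sketchExcerpt` is a hypothetical helper that works on plain text.
function sketchExcerpt(text: string, maxLength = 300): string {
  if (text.length <= maxLength) return text;
  // Look one character past the limit so a sentence or word that ends
  // exactly at the limit is still detected.
  const window = text.slice(0, maxLength + 1);
  // Prefer the last sentence end inside the window.
  const sentenceEnd = Math.max(
    window.lastIndexOf('. '),
    window.lastIndexOf('! '),
    window.lastIndexOf('? '),
  );
  if (sentenceEnd > 0) return window.slice(0, sentenceEnd + 1);
  // Otherwise fall back to the last whole word.
  const wordEnd = window.lastIndexOf(' ');
  return wordEnd > 0 ? window.slice(0, wordEnd) : text.slice(0, maxLength);
}

const bySentence = sketchExcerpt('One short sentence. Another sentence that runs long.', 30);
const byWord = sketchExcerpt('alpha beta gamma delta', 12);
console.log(bySentence, '|', byWord);
```

In this sketch the result never exceeds `maxLength`; the sentence boundary is tried first and the word boundary is only a fallback.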


MIT License | Copyright © 2024 David Bushell