Turn markdown files into structured, queryable data with JS. Build markdown-powered docs, blogs, and sites quickly and reliably.
MIT License
MarkdownDB is a JavaScript library that turns markdown files into a structured, queryable database (SQL-based and simple JSON). It helps you build rich markdown-powered sites easily and reliably. Specifically it:
- Extracts structured data like:
  - Tags in the body like #abc (v0.5)
  - Links between files like [hello](abc.md) or wikilink style [[xyz]], so we can compute backlinks or dead links etc. (see #4) (v0.2)
  - Tasks such as "- [ ] this is a task" (see Obsidian Dataview) (v0.4)
- Data enhancement and validation
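To make the extracted features listed above concrete, here is a minimal sketch of what gets pulled out of a small document. It is an illustration only: MarkdownDB works on a parsed syntax tree, whereas this sketch uses simplified regular expressions, and the `extractFeatures` helper and its return shape are hypothetical names chosen for the example.

```javascript
// Illustrative stand-in for syntax-tree based extraction (NOT mddb's code).
const sample = [
  "# Notes #abc",
  "See [hello](abc.md) and [[xyz]].",
  "- [ ] this is a task",
  "- [x] this one is done",
].join("\n");

// Hypothetical helper: returns tags, link targets, and tasks found in the text.
function extractFeatures(markdown) {
  // Body tags such as #abc (a "#" followed directly by a letter).
  const tags = [...markdown.matchAll(/(?:^|\s)#([A-Za-z][\w-]*)/g)].map((m) => m[1]);
  // Standard links such as [hello](abc.md); capture the target.
  const mdLinks = [...markdown.matchAll(/\[[^\]]*\]\(([^)]+)\)/g)].map((m) => m[1]);
  // Wikilinks such as [[xyz]] or [[xyz|alias]]; capture the target.
  const wikiLinks = [...markdown.matchAll(/\[\[([^\]|]+)(?:\|[^\]]*)?\]\]/g)].map((m) => m[1]);
  // Task list items such as "- [ ] text" or "- [x] text".
  const tasks = [...markdown.matchAll(/^[ \t]*[-*] \[( |x|X)\] (.+)$/gm)].map((m) => ({
    description: m[2],
    checked: m[1] !== " ",
  }));
  return { tags, links: [...mdLinks, ...wikiLinks], tasks };
}

console.log(extractFeatures(sample));
```

With link targets collected per file, backlinks and dead links can be derived by comparing those targets against the set of files that were indexed.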
For example, take a folder of blog posts. Each file can have a YAML frontmatter header with metadata like title, date, tags, etc.
---
title: My first blog post
date: 2021-01-01
tags: [a, b, c]
author: John Doe
---
# My first blog post
This is my first blog post.
I'm using MarkdownDB to manage my blog posts.
Use the npm mddb package to index Markdown files into an SQLite database. This will create a markdown.db file in the current directory. You can preview it with any SQLite viewer, e.g. https://sqlitebrowser.org/.
# npx mddb <path-to-folder-with-your-md-files>
npx mddb ./blog
To monitor files for changes and update the database accordingly, add the --watch flag to the command:
npx mddb ./blog --watch
This command will continuously watch for any modifications in the specified folder (./blog), automatically rebuilding the database whenever a change is detected.
E.g. get all the files with the tag a:
SELECT files.*
FROM files
INNER JOIN file_tags ON files._id = file_tags.file
WHERE file_tags.tag = 'a'
Use our Node API to query your data for your blog, wiki, docs, digital garden, or anything you want!
Install the mddb package in your project:
npm install mddb
Now, once the data is in the database, you can add the following script to your project (e.g. in the /lib folder). It will allow you to establish a single connection to the database and use it across your app.
// @/lib/mddb.mjs
import { MarkdownDB } from "mddb";
const dbPath = "markdown.db";
const client = new MarkdownDB({
client: "sqlite3",
connection: {
filename: dbPath,
},
});
const clientPromise = client.init();
export default clientPromise;
Now, you can import it across your project to query the database, e.g.:
import clientPromise from "@/lib/mddb.mjs";
const mddb = await clientPromise;
const blogs = await mddb.getFiles({
folder: "blog",
extensions: ["md", "mdx"],
});
This feature lets you define functions that compute additional fields to include in the database.
First, define a function that computes the additional field. In this example, the function addTitle extracts the title from the first heading in the AST (Abstract Syntax Tree) of a Markdown file.
const addTitle = (fileInfo, ast) => {
// Find the first header node in the AST
const headerNode = ast.children.find((node) => node.type === "heading");
// Extract the text content from the header node
const title = headerNode
? headerNode.children.map((child) => child.value).join("")
: "";
// Add the title property to the fileInfo
fileInfo.title = title;
};
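One limitation worth noting in the function above: nodes such as emphasis or links carry their text in nested children rather than in `value`, so a heading like "# My *first* post" would lose the word "first". The sketch below is one possible variant, not part of the library; `toText` and `addTitleDeep` are hypothetical names, and the AST is hand-written on the assumption that it follows the mdast shape produced by remark-parse.

```javascript
// Hypothetical helper: recursively collect text from an mdast-style node.
const toText = (node) =>
  typeof node.value === "string"
    ? node.value
    : (node.children || []).map(toText).join("");

// Variant of addTitle that also handles headings with nested formatting.
const addTitleDeep = (fileInfo, ast) => {
  const headerNode = ast.children.find((node) => node.type === "heading");
  fileInfo.title = headerNode ? toText(headerNode) : "";
};

// Hand-written AST for "# My *first* post" (assumed mdast shape).
const ast = {
  type: "root",
  children: [
    {
      type: "heading",
      depth: 1,
      children: [
        { type: "text", value: "My " },
        { type: "emphasis", children: [{ type: "text", value: "first" }] },
        { type: "text", value: " post" },
      ],
    },
  ],
};

const fileInfo = {};
addTitleDeep(fileInfo, ast);
console.log(fileInfo.title); // "My first post"
```

Because it has the same (fileInfo, ast) signature, this variant can be passed wherever addTitle is used.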
Now, use the client.indexFolder method to scan and index the folder containing your Markdown files. Pass the addTitle function in the computedFields option array to include the computed title in the database.
client.indexFolder({
folderPath: "PATH_TO_FOLDER",
customConfig: { computedFields: [addTitle] },
});
markdowndb.config.js
Here's an example markdowndb.config.js with custom configurations:
export default {
computedFields: [
(fileInfo, ast) => {
// Your custom logic here
},
],
include: ["docs/**/*.md"], // Include only files matching this pattern
exclude: ["drafts/**/*.md"], // Exclude those files matching this pattern
};
prebuild script
{
"name": "my-mddb-app",
"scripts": {
...
"mddb": "mddb <path-to-your-content-folder>",
"prebuild": "npm run mddb"
},
...
}
For example, in your Next.js project's pages, you could do:
// @/pages/blog/index.js
import React from "react";
import clientPromise from "@/lib/mddb.mjs";
export default function Blog({ blogs }) {
return (
<div>
<h1>Blog</h1>
<ul>
{blogs.map((blog) => (
<li key={blog.url_path}>
<a href={blog.url_path}>{blog.title}</a>
</li>
))}
</ul>
</div>
);
}
export const getStaticProps = async () => {
const mddb = await clientPromise;
// get all files that are not marked as draft in the frontmatter
const blogFiles = await mddb.getFiles({
frontmatter: {
draft: false,
},
});
const blogsList = blogFiles.map(({ metadata, url_path }) => ({
...metadata,
url_path,
}));
return {
props: {
blogs: blogsList,
},
};
};
Retrieve a file by URL path:
mddb.getFileByUrl("urlPath");
Currently used file path -> url resolver function:
const defaultFilePathToUrl = (filePath: string) => {
let url = filePath
.replace(/\.(mdx|md)/, "") // remove file extension
.replace(/\\/g, "/") // replace windows backslash with forward slash
.replace(/(\/)?index$/, ""); // remove index at the end for index.md files
url = url.length > 0 ? url : "/"; // for home page
return encodeURI(url);
};
🚧 The resolver function will be configurable in the future.
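To see what the resolver shown above produces, the following sketch runs a plain-JavaScript copy of it (the TypeScript type annotation is removed, nothing else is changed) against a few sample paths. The sample paths are made up for illustration.

```javascript
// Plain-JavaScript copy of the resolver shown above.
const defaultFilePathToUrl = (filePath) => {
  let url = filePath
    .replace(/\.(mdx|md)/, "") // remove file extension
    .replace(/\\/g, "/") // replace windows backslash with forward slash
    .replace(/(\/)?index$/, ""); // remove index at the end for index.md files
  url = url.length > 0 ? url : "/"; // for home page
  return encodeURI(url);
};

// Hypothetical sample paths, relative to the content folder.
console.log(defaultFilePathToUrl("index.md")); // "/"
console.log(defaultFilePathToUrl("blog/index.md")); // "blog"
console.log(defaultFilePathToUrl("blog/my post.md")); // "blog/my%20post"
console.log(defaultFilePathToUrl("docs\\guide.mdx")); // "docs/guide"
```

Note that the resulting paths have no leading slash except for the home page, so a value passed to getFileByUrl should presumably follow the same form.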
Retrieve a file by its database ID:
mddb.getFileById("fileID");
Get all indexed files:
mddb.getFiles();
By file types:
You can specify the type of a document in its frontmatter. You can then get all the files of this type, e.g. all blog type documents.
mddb.getFiles({ filetypes: ["blog", "article"] }); // files of either blog or article type
By tags:
mddb.getFiles({ tags: ["tag1", "tag2"] }); // files tagged with either tag1 or tag2
By file extensions:
mddb.getFiles({ extensions: ["mdx", "md"] }); // all md and mdx files
By frontmatter fields:
You can query by multiple frontmatter fields at once.
At the moment, only exact matches are supported. However, false values do not need to be set explicitly. I.e. if you set draft: true on some blog posts and want to get all the posts that are not drafts, you don't have to explicitly set draft: false on the others.
mddb.getFiles({
frontmatter: {
key1: "value1",
key2: true,
key3: 123,
key4: ["a", "b", "c"], // this will match exactly ["a", "b", "c"]
},
});
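The matching rules described above (exact matches only, with a missing field treated like false) can be sketched over plain in-memory objects. This is not how the library implements the query; `matchesFrontmatter` is a hypothetical helper and the sample posts are made up, intended only to show which files a given query would be expected to return.

```javascript
// Hypothetical helper (not part of mddb): mimics the documented matching rules.
const matchesFrontmatter = (metadata, query) =>
  Object.entries(query).every(([key, expected]) => {
    const actual = metadata[key];
    // A query for `false` also matches files where the field is absent.
    if (expected === false) return actual === false || actual === undefined;
    // Everything else must match exactly, including array contents and order.
    return JSON.stringify(actual) === JSON.stringify(expected);
  });

// Made-up frontmatter of three posts.
const posts = [
  { title: "Published", tags: ["a", "b", "c"] },
  { title: "Explicit", draft: false, tags: ["a"] },
  { title: "Draft", draft: true, tags: ["a", "b"] },
];

const notDrafts = posts.filter((p) => matchesFrontmatter(p, { draft: false }));
console.log(notDrafts.map((p) => p.title)); // ["Published", "Explicit"]

const exactTags = posts.filter((p) => matchesFrontmatter(p, { tags: ["a", "b", "c"] }));
console.log(exactTags.map((p) => p.title)); // ["Published"]
```

The second query shows why exact matching matters for arrays: a post tagged only with a and b does not match ["a", "b", "c"]; use the tags option instead to match any of several tags.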
By folder:
Get all files in a subfolder (path relative to your content folder).
mddb.getFiles({ folder: "path" });
Combined conditions:
mddb.getFiles({ tags: ["tag1"], filetypes: ["blog"], extensions: ["md"] });
Retrieve all tags:
mddb.getTags();
Get links (forward or backward) related to a file:
mddb.getLinks({ fileId: "ID", direction: "forward" });
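The difference between the two directions can be illustrated with a small in-memory link table. This is a sketch under assumptions: the `from`/`to` record shape, the file IDs, and the `getLinksSketch` helper are hypothetical and chosen for the example, not taken from the library.

```javascript
// Hypothetical link records: each one points from a source file to a target file.
const links = [
  { from: "a", to: "b" },
  { from: "a", to: "c" },
  { from: "c", to: "b" },
];

// Hypothetical helper: "forward" returns links leaving the file,
// "backward" returns links pointing at it (its backlinks).
const getLinksSketch = ({ fileId, direction = "forward" }) =>
  links.filter((link) =>
    direction === "forward" ? link.from === fileId : link.to === fileId
  );

console.log(getLinksSketch({ fileId: "a", direction: "forward" }));
// [{ from: "a", to: "b" }, { from: "a", to: "c" }]
console.log(getLinksSketch({ fileId: "b", direction: "backward" }));
// [{ from: "a", to: "b" }, { from: "c", to: "b" }]
```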
graph TD
markdown --remark-parse--> st[syntax tree]
st --extract features--> jsobj1[TS Object eg. File plus Metadata plus Tags plus Links]
jsobj1 --computing--> jsobj[TS Objects]
jsobj --convert to sql--> sqlite[SQLite markdown.db]
jsobj --write to disk--> json[JSON on disk in .markdowndb folder]
jsobj --tests--> testoutput[Test results]