Improve Confluence to Notion HTML exporting using the unofficial Notion API
MIT License
Improve Notion pages imported from a Confluence HTML export using the unofficial Notion API to fix up common issues.
The current state of exporting from Confluence into Notion is that images and attachments are broken, titles are broken, and there are a bunch of annoying formatting issues. This fixes some of those.
Compatible with Python 3.5 and above.
Currently handles:
Does not handle:
Space Settings -> Content Tools -> Export -> HTML -> Custom Export -> Deselect All
Select the pages you want -> Export -> Download Here -> Unzip
Save the path of this folder; it will be passed to the script.
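Before going further, it can help to confirm that the saved path really points at the unzipped export. The following is a minimal sketch, not part of the script: `list_export_pages` is a hypothetical helper, and it assumes the exported pages sit as `.html` files directly inside the folder.

```python
from pathlib import Path


def list_export_pages(export_dir):
    """Return the sorted .html files directly inside an export folder.

    Hypothetical helper for a quick sanity check; it assumes the
    exported pages are not nested in subfolders.
    """
    folder = Path(export_dir).expanduser()
    if not folder.is_dir():
        raise NotADirectoryError("{} is not a directory".format(folder))
    # Keep regular files only, matching the extension case-insensitively.
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".html"
    )
```

Calling `list_export_pages("~/Downloads/Confluence_Export")` should return a non-empty list if the export unzipped where you expect.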
... -> Import -> HTML -> Select all HTML pages in the unzipped folder
This will create a page like "Import Mar 18, 2020". You should see the subpages underneath along with an index page.
Save the URL of this page; it will be passed to the script.
In a browser session where you're logged into Notion.so, open:
chrome://settings/cookies/detail?site=www.notion.so
Find the token_v2 entry and copy the Content.
Set this as an environment variable as follows:
export NOTION_TOKEN=8e8ec87b5cf11f4e354fb1d145f78ca...
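To check from Python that the variable is visible, something like the sketch below works. `read_notion_token` is a hypothetical helper written for illustration; how the script itself reads the token is not shown here.

```python
import os


def read_notion_token(environ=None):
    """Return the NOTION_TOKEN value, or raise if it is missing or empty.

    Hypothetical helper: `environ` defaults to os.environ and is only
    a parameter so the function can be exercised with a plain dict.
    """
    env = os.environ if environ is None else environ
    token = env.get("NOTION_TOKEN", "").strip()
    if not token:
        raise RuntimeError("NOTION_TOKEN is not set; export it first")
    return token
```

Note that the variable must be exported in the same shell session that later runs the script.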
Install requirements (just the unofficial Notion API):
sudo pip3 install -r requirements.txt
Test with dry run:
python3 confluence_to_notion.py --confluence ~/Downloads/Confluence_Export --notion https://www.notion.so/blah/Path-To-Page --dry-run
If that's looking good and printing the right titles and images, run without --dry-run.
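For readers curious how a command line with these three flags can be wired up, here is a minimal argparse sketch. Only the flag names come from the command above; `build_parser` and the help texts are assumptions, and the real script may parse its arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the command above.

    Illustrative only: the actual script's argument handling is not
    shown in this README.
    """
    parser = argparse.ArgumentParser(
        description="Fix up Notion pages imported from a Confluence HTML export."
    )
    parser.add_argument("--confluence", required=True,
                        help="path to the unzipped Confluence export folder")
    parser.add_argument("--notion", required=True,
                        help="URL of the Notion import page")
    # store_true makes the flag default to False when it is omitted.
    parser.add_argument("--dry-run", action="store_true",
                        help="print what would change without modifying anything")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.confluence, args.notion, args.dry_run)
```

Using `action="store_true"` means a dry run has to be requested explicitly, which matches the usage shown above where the full run is the same command without the flag.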
No need to wait for the dry run to traverse all of the pages; if it's looking reasonable, you can just abort.
The full run can take a while as it recursively parses all blocks and uploads a bunch of images to S3 through Notion.
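The recursive traversal is the main reason for the run time: every block under the import page is visited once. A rough sketch of that kind of depth-first walk is below. It assumes only that each block exposes a `children` sequence; `walk_blocks` is a hypothetical name and this is not the script's actual code.

```python
def walk_blocks(block):
    """Yield a block and then all of its descendants, depth first.

    Assumes each block has a `children` attribute (possibly missing
    or empty); works with any object shaped like that.
    """
    yield block
    for child in getattr(block, "children", None) or []:
        # Recurse so nested pages and their blocks are visited too.
        for descendant in walk_blocks(child):
            yield descendant
```

Because the walk is a generator, blocks are processed one at a time as they are reached, which fits with pages becoming inspectable while the run is still in progress.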
Notion will live-update as the script runs, so you can inspect the pages that have already been processed.