browser-plugin

Workflow automation plugin.

MIT License

Browser Plugin

Note: This project is currently in alpha and prone to breaking changes.

Overview

This project is a browser extension for creating web-based automations on the fly. Capture different actions automatically for later replay or create fine-tuned steps in an intuitive, no-code way.

Building

Because this project is still in its early stages, it has not been released on any extension store. Load the extension manually instead. The project uses pnpm to manage dependencies and to build. Run the following:

$ pnpm install
$ pnpm build

This produces a dist/chrome-mv3 directory containing the unpacked extension. In Chrome, open Manage Extensions, click Load unpacked (enable Developer mode if the button is not shown), and select this directory. Consider pinning the newly installed extension.

Usage

The side panel contains four top-level tabs. We touch on each in turn.

Builder

This is the primary workhorse of the application. Here you can provide a sequence of steps that can be replayed automatically later on. As of now, we have the following steps available:


Extracting

Triggering this step turns your cursor into a crosshair and outlines each element you hover over. On each click, the plugin records which element you targeted. Give each capture a name so that it can be interpolated into subsequent steps.


Navigating

Navigates to the specified URL. You can use interpolated values in the required field. For example, you can specify a URL like:

https://github.com/{PROFILE}

if you have a PROFILE interpolation available to you.
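
The substitution described above can be sketched as a simple placeholder replacement. This is a minimal illustration, not the plugin's actual implementation: the `interpolate` helper, the `Interpolations` type, and the choice to leave unknown placeholders untouched are all assumptions made for the example.

```typescript
// Hypothetical map from capture name to captured value.
type Interpolations = Record<string, string>;

// Hypothetical helper: replaces each {NAME} placeholder with its captured value.
function interpolate(template: string, values: Interpolations): string {
  return template.replace(/\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match: string, name: string) => {
    // Assumption: unknown placeholders are left as-is so missing captures are easy to spot.
    return Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match;
  });
}

const profileUrl = interpolate("https://github.com/{PROFILE}", { PROFILE: "octocat" });
console.log(profileUrl); // https://github.com/octocat
```

Whether captured values should be URL-encoded before substitution depends on what a capture contains, so the sketch leaves them unmodified.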


OpenAI

Allows sending a custom chat completion request to OpenAI. The response is always formatted as a JSON object, though you are free to specify what the keys of the object are. These key/value pairs will be available for interpolating into subsequent steps.

Furthermore, you can use any interpolated values already available in both the system and user prompt.
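
As a rough sketch of how such a request could be assembled, the example below builds a chat completion body that asks for a JSON object and then flattens the reply into key/value pairs. The function names, the default model, and the wording added to the system prompt are assumptions for illustration; only the `response_format: { type: "json_object" }` option and the message roles come from OpenAI's chat completions API.

```typescript
// Request body for POST https://api.openai.com/v1/chat/completions
// (sent with an "Authorization: Bearer <API key>" header).
interface ChatRequest {
  model: string;
  response_format: { type: "json_object" };
  messages: { role: "system" | "user"; content: string }[];
}

// Hypothetical helper: the model name and prompt suffix are assumptions.
function buildChatRequest(
  systemPrompt: string,
  userPrompt: string,
  keys: string[],
  model: string = "gpt-4o-mini",
): ChatRequest {
  return {
    model,
    // JSON mode: the reply is constrained to a single JSON object.
    response_format: { type: "json_object" },
    messages: [
      {
        role: "system",
        // JSON mode expects the word "JSON" to appear somewhere in the messages.
        content: `${systemPrompt}\nRespond with a JSON object containing the keys: ${keys.join(", ")}.`,
      },
      { role: "user", content: userPrompt },
    ],
  };
}

// Hypothetical helper: turns the reply text into string values for later interpolation.
function parseCompletion(content: string): Record<string, string> {
  const parsed: unknown = JSON.parse(content);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object in the completion");
  }
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(parsed as Record<string, unknown>)) {
    result[key] = typeof value === "string" ? value : JSON.stringify(value);
  }
  return result;
}
```

Prompts would be interpolated before being passed to `buildChatRequest`, and the pairs returned by `parseCompletion` would then be available to later steps.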


Recording

Lets you interact with the webpage as normal while the plugin automatically captures your clicks and key presses.
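
One way to picture the capture is as a small recorder that turns click and keyboard events into replayable steps. The sketch below is an assumption-heavy illustration: the `RecordedStep` shape, the `EventLike` interface, and the naive selector logic are hypothetical and are not taken from the plugin's source.

```typescript
// Hypothetical step shapes; the plugin's real format may differ.
type RecordedStep =
  | { kind: "click"; selector: string }
  | { kind: "keypress"; selector: string; key: string };

// Minimal structural view of the DOM events of interest (assumed for illustration).
interface EventLike {
  type: string;
  key?: string;
  target: { id?: string; tagName: string };
}

// Naive selector: prefer the element id, otherwise fall back to the tag name.
function selectorFor(target: EventLike["target"]): string {
  return target.id ? `#${target.id}` : target.tagName.toLowerCase();
}

class Recorder {
  private steps: RecordedStep[] = [];

  // Records clicks and key presses; every other event type is ignored.
  handle(event: EventLike): void {
    if (event.type === "click") {
      this.steps.push({ kind: "click", selector: selectorFor(event.target) });
    } else if (event.type === "keydown" && event.key !== undefined) {
      this.steps.push({ kind: "keypress", selector: selectorFor(event.target), key: event.key });
    }
  }

  // Returns a copy so callers cannot mutate the recorder's internal list.
  finish(): RecordedStep[] {
    return [...this.steps];
  }
}
```

In a content script, `handle` could be fed from `click` and `keydown` listeners registered on the document, after adapting each DOM event to the `EventLike` shape. A real recorder would also need a more robust selector strategy than this one.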


Library

A repository of all previously saved scripts. From here you can edit, delete, or run a script.

Runner

After you run a script from your library, this tab populates with details on each step as it executes. It also surfaces any errors encountered during execution.

Settings

The OpenAI step in the Builder tab sends a custom chat completion request to OpenAI, which requires an API key. You can configure the API key from here.

Development

We use the WXT framework to leave room for cross-browser functionality, though efforts are currently focused on Chrome and Chromium-based browsers. As such, the file organization mirrors WXT's recommendations:

  • assets
  • components/ui
    • We use the shadcn/ui component framework. If you are looking to
      contribute, confirm that any custom component does not already have a
      functional equivalent. The list of available components can be found in
      the shadcn/ui documentation.
  • components/icons
    • SVG wrapped within a React component.
  • entrypoints/background.ts
    • Corresponds to our service worker.
  • entrypoints/*.content
    • Our content scripts. Fluidic works by injecting three different scripts into
      every page. The scripts themselves do not do meaningful work unless certain
      actions are invoked from the side panel.
  • entrypoints/sidepanel
    • The primary entrypoint of the project. Contains most UI elements and
      recording/replaying functionality.
  • public
    • Browser extension icons.
  • utils
    • A collection of utility functions/types that can be useful across entrypoints.

Formatting

Formatting is handled by Prettier. A pre-commit hook in .githooks formats all files matching *.jsx? and *.tsx? (that is, .js, .jsx, .ts, and .tsx files) before each commit. Install it via:

$ git config --local core.hooksPath .githooks/