About
This is a repository meant to provide fast access to reading ~~.con~~ files from
various languages. The goal is to be able to quickly generate visuals in a
variety of interpreted languages.
** Usage
This is C++ libary built with ~~meson~~. The most automated way to get started is:
#+begin_src bash
micromamba create -n readcon_dev -f conda-lock.yml
micromamba activate readcon_dev
meson setup bbdir --prefix=$CONDA_PREFIX --libdir=lib --native-file nativeFiles/mold.ini --native-file nativeFiles/ccache_gnu.ini -Dwith_tests=true
meson compile -C bbdir
meson test -C bbdir
#+end_src
** Features

Fast reader for both single ~~.con~~ and trajectory ~~.con~~ files
Pure C++17 core implementation, with optional helpers
- ~~fmt~~ is used optionally for some debug printing
- ~~range-v3~~ can be used for more efficiency (views instead of copies)
Apache Arrow wrapper

** Rationale One of the main drawbacks of visualization is the need to read in specific file formats which may not map cleanly to delimited file specifications like ~~csv~~ files. Additionally, languages which are meant for interactive visuals, like ~~Python~~ or R often have sub-par file I/O capabilities for arbitrary formats. To this end, a core ~~C++~~ library with an Apache Arraow compatibility layer is a reasonable solution which allows minimal, expressive bindings to supported Arrow languages, while also providing simple enough inter-operability with unsupported languages like ~~Fortran~~.

*** Existing solutions It is tempting to reach for or recreate an entire ecosystem of discipline specific code, as for example, the Atomic Simulation Environment (ASE) or ~~AtomsBase.jl~~, both of which are excellent pedagogical aids with a rich machninery of helpers. Often these are not designed for interoperability or speed.

** Design Decisions Currently this implements the ~~con~~ format specification as written out by eON, so some assumptions are made about the input files, not all of which are currently tested / guaranteed to throw (contributions are welcome for additional sanity checks). *** Single Frames

The first 9 lines are the header
The remaining lines can be inferred from the header
*** Multiple Frames
Often, as for example when running a Nudged Elastic Band, ~~eON~~ will write out
multiple units of ~~con~~ like data into a single file.
The ~~con~~ like data have no whitespace between them!

That is we expect: #+begin_src bash Random Number Seed Time 15.345600 21.702000 100.000000 90.000000 90.000000 90.000000 0 0 218 0 1 2 2 2 63.546000 1.007930 Cu Coordinates of Component 1 0.63940000000000108 0.90450000000000019 6.97529999999999539 1 0 3.19699999999999873 0.90450000000000019 6.97529999999999539 1 1 H Coordinates of Component 2 8.68229999999999968 9.94699999999999740 11.73299999999999343 0 2 7.94209999999999550 9.94699999999999740 11.73299999999999343 0 3 Random Number Seed Time 15.345600 21.702000 100.000000 90.000000 90.000000 90.000000 0 0 218 0 1 2 2 2 63.546000 1.007930 Cu Coordinates of Component 1 0.63940000000000108 0.90450000000000019 6.97529999999999539 1 0 3.19699999999999873 0.90450000000000019 6.97529999999999539 1 1 H Coordinates of Component 2 8.85495714285713653 9.94699999999999740 11.16538571428571380 0 2 7.76944285714285154 9.94699999999999740 11.16538571428571380 0 3 #+end_src

Nothing else. No whitespace or lines between the ~~con~~ entries. **** Why? We read the entire file to memory and then map it to a set of strings. The first 9 lines are parsed to figure out how many lines are needed for the rest of the (first) frame, and then this logic is repeated en-masse until the lines run out.

Better memory management / streaming files / more errors and sanity checks are all welcome as pull requests. ** Development *** Developing locally A ~~pre-commit~~ job is setup on CI to enforce consistent styles, so it is best to set it up locally as well (using [[https://pypa.github.io/pipx][pipx]] for isolation):

#+begin_src sh