readCon

A tiny library for reading con files

MIT License

Stars
1
  • About
    This is a repository meant to provide fast access to reading .con files from
    various languages. The goal is to be able to quickly generate visuals in a
    variety of interpreted languages.
    ** Usage
    This is C++ libary built with meson. The most automated way to get started is:
    #+begin_src bash
    micromamba create -n readcon_dev -f conda-lock.yml
    micromamba activate readcon_dev
    meson setup bbdir --prefix=$CONDA_PREFIX --libdir=lib --native-file nativeFiles/mold.ini --native-file nativeFiles/ccache_gnu.ini -Dwith_tests=true
    meson compile -C bbdir
    meson test -C bbdir
    #+end_src
    ** Features
  • Fast reader for both single .con and trajectory .con files
  • Pure C++17 core implementation, with optional helpers
    • fmt is used optionally for some debug printing
    • range-v3 can be used for more efficiency (views instead of copies)
  • Apache Arrow wrapper

** Rationale One of the main drawbacks of visualization is the need to read in specific file formats which may not map cleanly to delimited file specifications like csv files. Additionally, languages which are meant for interactive visuals, like Python or R often have sub-par file I/O capabilities for arbitrary formats. To this end, a core C++ library with an Apache Arraow compatibility layer is a reasonable solution which allows minimal, expressive bindings to supported Arrow languages, while also providing simple enough inter-operability with unsupported languages like Fortran.

*** Existing solutions It is tempting to reach for or recreate an entire ecosystem of discipline specific code, as for example, the Atomic Simulation Environment (ASE) or AtomsBase.jl, both of which are excellent pedagogical aids with a rich machninery of helpers. Often these are not designed for interoperability or speed.

** Design Decisions Currently this implements the con format specification as written out by eON, so some assumptions are made about the input files, not all of which are currently tested / guaranteed to throw (contributions are welcome for additional sanity checks). *** Single Frames

  • The first 9 lines are the header
  • The remaining lines can be inferred from the header
    *** Multiple Frames
    Often, as for example when running a Nudged Elastic Band, eON will write out
    multiple units of con like data into a single file.
  • The con like data have no whitespace between them!

That is we expect: #+begin_src bash Random Number Seed Time 15.345600 21.702000 100.000000 90.000000 90.000000 90.000000 0 0 218 0 1 2 2 2 63.546000 1.007930 Cu Coordinates of Component 1 0.63940000000000108 0.90450000000000019 6.97529999999999539 1 0 3.19699999999999873 0.90450000000000019 6.97529999999999539 1 1 H Coordinates of Component 2 8.68229999999999968 9.94699999999999740 11.73299999999999343 0 2 7.94209999999999550 9.94699999999999740 11.73299999999999343 0 3 Random Number Seed Time 15.345600 21.702000 100.000000 90.000000 90.000000 90.000000 0 0 218 0 1 2 2 2 63.546000 1.007930 Cu Coordinates of Component 1 0.63940000000000108 0.90450000000000019 6.97529999999999539 1 0 3.19699999999999873 0.90450000000000019 6.97529999999999539 1 1 H Coordinates of Component 2 8.85495714285713653 9.94699999999999740 11.16538571428571380 0 2 7.76944285714285154 9.94699999999999740 11.16538571428571380 0 3 #+end_src

Nothing else. No whitespace or lines between the con entries. **** Why? We read the entire file to memory and then map it to a set of strings. The first 9 lines are parsed to figure out how many lines are needed for the rest of the (first) frame, and then this logic is repeated en-masse until the lines run out.

Better memory management / streaming files / more errors and sanity checks are all welcome as pull requests. ** Development *** Developing locally A pre-commit job is setup on CI to enforce consistent styles, so it is best to set it up locally as well (using [[https://pypa.github.io/pipx][pipx]] for isolation):

#+begin_src sh

Run before commiting

pipx run pre-commit run --all-files

Or install the git hook to enforce this

pipx run pre-commit install #+end_src

  • License
    MIT.