Shared-memory Python object namespace with Apache Plasma. Built because of Plotly Dash, useful anywhere.
MIT License
brain-plasma is a high-level wrapper for the Apache Plasma PlasmaClient API with an added naming and namespacing system. Only supported on Mac/Linux with Python 3.5+.
Basic idea: the brain has a list of names that it has "learned", each attached to an object in Plasma. Learn, recall, and delete stored objects, or use the brain like a dictionary with bracket notation, e.g. brain['x'] and del brain['x'].
pip install pyarrow
pip install brain-plasma
plasma_store -m 50000000 -s /tmp/plasma
import pandas as pd
from brain_plasma import Brain
brain = Brain()
this = 'a text object'
those = pd.DataFrame(dict(this=[1,2,3,4],that=[4,5,6,7]))
brain['this'] = this
brain['this']
>>> 'a text object'
brain.names()
>>> ['this']
del brain['this'] # remove the object from the brain's memory
brain['this']
>>> # error, that name/object no longer exists
brain.names()
>>> [] # object is gone
Namespaces
# change namespace
brain['this'] = 'default text object'
brain.set_namespace('newname')
brain['this'] = 'newname text object'
brain.set_namespace('default')
brain['this']
>>> 'default text object'
brain.names(namespace='all')
>>> ['this','this']
The API/features in brain-plasma should be considered alpha stage and might change without notice.
Brain object share memory or not; the namespace is only stored in Plasma and checked each time any object is referenced.
plasma_store
Current Drawbacks
RELEASE WITH BREAKING CHANGES: v0.3
brain_plasma.mock
brain_plasma.compatibility import v02Brain
brain_plasma.exceptions
RELEASE WITH BREAKING CHANGES: v0.2
learn() to ('name', thing), which is more intuitive (but you should always use bracket notation)
len(brain), del brain['this'], and 'this' in brain are now available (implemented __len__, __delitem__, and __contains__)
NOTE: stability problems should be resolved in v0.2.
Test with pytest.
pip install pytest
git clone https://github.com/russellromney/brain-plasma
cd brain-plasma
pytest
brain_plasma.Brain
Initialization
# simple
brain = Brain()
# show defaults
brain = Brain(path='/tmp/plasma', namespace='default')
Parameters:
path - which path to use to connect to the plasma store
namespace - which namespace to use
Brain.client
The underlying PlasmaClient object. Created at instantiation. Requires plasma_store to be running locally.
Brain.path
The path to the PlasmaClient connection folder. Default is /tmp/plasma but can be changed with brain = Brain(path='/my/new/path')
Brain.bytes
int - number of bytes in plasma_store
Brain.mb
str - number of mb available, e.g. '50 MB'
Brain.namespace
str - the name of the current namespace
Store an item's value and recall the value with bracket notation:
brain['this'] = 5
x = brain['this']
i.e. Brain.__setitem__ and Brain.__getitem__
Delete a name and its stored value with del brain['this']
Brain.learn(name, thing, description=False)
Store object thing in Plasma; reference it later with name.
Brain.recall(name)
Get the value of the object named name from Plasma.
Brain.forget(name)
Delete the object named name from Plasma, along with its index object.
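The relationship between bracket notation and learn/recall/forget can be pictured with a small stand-in. This is a minimal sketch, not brain-plasma's implementation: DictBrain is a hypothetical class backed by a plain dict instead of Plasma, and it assumes the bracket operators simply delegate to the named methods.

```python
class DictBrain:
    """Hypothetical stand-in for Brain: stores values in a dict, not in Plasma."""

    def __init__(self):
        self._store = {}

    def learn(self, name, thing, description=False):
        # Store `thing` under `name`, keeping the optional description with it.
        self._store[name] = {"value": thing, "description": description}

    def recall(self, name):
        # Raises KeyError for unknown names (the real error type is an assumption).
        return self._store[name]["value"]

    def forget(self, name):
        del self._store[name]

    def names(self):
        return list(self._store)

    # Bracket notation and built-ins delegate to the methods above.
    def __setitem__(self, name, thing):
        self.learn(name, thing)

    def __getitem__(self, name):
        return self.recall(name)

    def __delitem__(self, name):
        self.forget(name)

    def __contains__(self, name):
        return name in self._store

    def __len__(self):
        return len(self._store)


brain = DictBrain()
brain["this"] = 5      # same effect as brain.learn("this", 5)
x = brain["this"]      # same effect as brain.recall("this")
del brain["this"]      # same effect as brain.forget("this")
```

The only thing bracket assignment cannot express in this sketch is the description argument, which is one reason to keep learn() around.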
Since v0.2. Lightweight namespaces within a single plasma_store instance. Object names are unique within a namespace but can be duplicated across namespaces. Namespaces can be created and removed at any time, along with all of their objects and names.
IMPORTANT: Namespace names must be between 5 and 15 characters long. This is because namespace strings are used as the prefix of the plasma.ObjectID for all objects in a given namespace, and must allow enough room for at least 6 unique random characters to ensure ObjectID uniqueness with near certainty. The set of namespaces is stored in a dedicated namespace object whose ObjectID is plasma.ObjectID(b'brain_namespaces_set').
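The prefix scheme can be sketched without Plasma at all. The helper below is hypothetical (make_object_id_bytes is not part of brain-plasma) and assumes ASCII namespace names; the one hard fact it relies on is that a Plasma ObjectID is exactly 20 bytes. The random suffix simply fills whatever space the prefix leaves. How brain-plasma actually composes its IDs is not shown in this document.

```python
import secrets
import string

OBJECT_ID_LENGTH = 20  # a Plasma ObjectID is exactly 20 bytes
ALPHABET = string.ascii_letters + string.digits


def make_object_id_bytes(namespace: str) -> bytes:
    """Hypothetical helper: namespace prefix + random suffix, 20 bytes total."""
    if not 5 <= len(namespace) <= 15:
        raise ValueError("namespace must be 5 to 15 characters long")
    prefix = namespace.encode("ascii")  # assumes ASCII-only namespace names
    suffix_length = OBJECT_ID_LENGTH - len(prefix)
    suffix = "".join(secrets.choice(ALPHABET) for _ in range(suffix_length))
    return prefix + suffix.encode("ascii")


id_bytes = make_object_id_bytes("default")
print(len(id_bytes), id_bytes[:7])
```

With pyarrow's Plasma module installed, bytes like these could be passed to plasma.ObjectID(...), the same way the namespace set uses b'brain_namespaces_set'.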
Brain.set_namespace(namespace=None)
Changes self.namespace to namespace and adds namespace to the unique namespace object if it does not already exist. Returns the name of the namespace if successful. If namespace is not specified, simply returns the name of the current namespace.
Brain.namespaces()
Returns set of unique namespaces.
Brain.remove_namespace(namespace=None)
Removes the namespace namespace and all of the objects in it. If namespace is not specified, removes the current namespace, i.e. self.namespace.
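The namespace behaviour described above can be modelled with a plain dictionary keyed by (namespace, name). This is a sketch under assumptions, not the library's code: NamespacedStore is hypothetical, refusing to remove 'default' is inferred only from the existence of BrainNamespaceRemoveDefaultError, and falling back to 'default' after removing the current namespace is a guess.

```python
class NamespacedStore:
    """Hypothetical dict-backed model of namespaces; not the real Brain."""

    def __init__(self, namespace="default"):
        self.namespace = namespace
        self._namespaces = {namespace}
        self._objects = {}  # maps (namespace, name) -> value

    def set_namespace(self, namespace=None):
        # With no argument, just report the current namespace.
        if namespace is not None:
            self._namespaces.add(namespace)
            self.namespace = namespace
        return self.namespace

    def namespaces(self):
        return set(self._namespaces)

    def remove_namespace(self, namespace=None):
        target = self.namespace if namespace is None else namespace
        if target == "default":
            # Assumption: the default namespace cannot be removed.
            raise ValueError("cannot remove the default namespace")
        # Drop every object stored under the target namespace.
        self._objects = {k: v for k, v in self._objects.items() if k[0] != target}
        self._namespaces.discard(target)
        if self.namespace == target:
            self.namespace = "default"  # assumption: fall back to default

    def __setitem__(self, name, value):
        self._objects[(self.namespace, name)] = value

    def __getitem__(self, name):
        return self._objects[(self.namespace, name)]


store = NamespacedStore()
store["this"] = "default text object"
store.set_namespace("newname")
store["this"] = "newname text object"
store.set_namespace("default")
print(store["this"])
```

Because the namespace is part of the key, the same name can hold different values in different namespaces, and removing a namespace is a single filter over the keys.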
Brain.object_id(name: str)
Get the ObjectID of the value stored under name.
Brain.object_ids()
Get a dictionary mapping each name in the namespace to the ObjectID of its value. Allows for more granular work with PlasmaClient.
Brain.names(namespace: str="default")
Get a list of all names that Brain knows in the specified namespace.
If namespace='all', returns the names from all available namespaces.
Use 'name' in brain as a shortcut for checking whether a name is known.
Brain.ids()
Get a list of all the plasma.ObjectID instances that brain knows the name of.
Brain.metadata(*names, output='dict')
Get the metadata for the names you list.
Returns a single metadata dict if you provide only one name. Otherwise returns a dict of name:metadata pairs, or a list of metadata dicts if output='list'.
Metadata object structure:
{
name: str (variable name),
metadata_id: bytes (bytes of the ObjectID for the index object),
value_id: bytes (bytes of ObjectID for the value),
description: str (False if not assigned),
namespace: str (the object's namespace)
}
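The return-shape rules for Brain.metadata can be made concrete with a small sketch. Both functions here are hypothetical: shape_metadata only mirrors the rules as described in this section (one name gives a single dict, several names give a dict or a list), and fake_metadata builds records with made-up 20-byte IDs in the structure listed above.

```python
def shape_metadata(known, *names, output="dict"):
    """Hypothetical helper; `known` maps each name to its metadata dict."""
    selected = [known[name] for name in names]
    if len(selected) == 1:
        return selected[0]  # a single name yields a single dict
    if output == "list":
        return selected
    return {meta["name"]: meta for meta in selected}


def fake_metadata(name, namespace="default"):
    # Made-up IDs for illustration only; real IDs come from Plasma.
    return {
        "name": name,
        "metadata_id": (namespace + "meta").encode().ljust(20, b"0"),
        "value_id": (namespace + "val").encode().ljust(20, b"1"),
        "description": False,
        "namespace": namespace,
    }


known = {name: fake_metadata(name) for name in ("this", "that")}
one = shape_metadata(known, "this")
as_dict = shape_metadata(known, "this", "that")
as_list = shape_metadata(known, "this", "that", output="list")
```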
Store Metadata
Brain.size()
Calls brain.client.store_capacity() and returns an int: the total capacity of the plasma_store in bytes, e.g. 50000000
Brain.used()
Calculates how many bytes the plasma_store is using.
Brain.free()
Calculates how many bytes of the plasma_store are not used.
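The three numbers are tied together by simple arithmetic. The sketch below assumes free = capacity - used and that the '50 MB' string shown for Brain.mb uses decimal megabytes (50000000 bytes); both helper names are hypothetical and neither is taken from brain-plasma.

```python
def free_bytes(capacity: int, used: int) -> int:
    """Bytes not yet used, assuming free = capacity - used."""
    return capacity - used


def format_mb(n_bytes: int) -> str:
    """Format a byte count as whole decimal megabytes."""
    return f"{round(n_bytes / 1_000_000)} MB"


capacity = 50_000_000  # matches plasma_store -m 50000000
used = 12_500_000      # made-up usage figure
print(free_bytes(capacity, used), format_mb(capacity))
```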
Managing connection state
Brain.sleep()
Disconnect Brain.client from Plasma. You must call Brain.wake_up() before using the Brain again.
Brain.wake_up()
Reconnect Brain.client to Plasma.
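A toy model of the connection state may help. SleepyBrain and DisconnectedError are hypothetical stand-ins; that a sleeping brain raises on access, and that stored values survive a sleep/wake cycle because they live in the store rather than in the client, are assumptions and not documented behaviour.

```python
class DisconnectedError(Exception):
    """Stand-in for an error such as BrainClientDisconnectedError."""


class SleepyBrain:
    """Hypothetical model of sleep()/wake_up(); values live in a plain dict."""

    def __init__(self):
        self._connected = True
        self._store = {}

    def sleep(self):
        self._connected = False

    def wake_up(self):
        self._connected = True

    def _require_connection(self):
        if not self._connected:
            raise DisconnectedError("call wake_up() before using the brain")

    def __setitem__(self, name, value):
        self._require_connection()
        self._store[name] = value

    def __getitem__(self, name):
        self._require_connection()
        return self._store[name]


brain = SleepyBrain()
brain["this"] = 5
brain.sleep()
try:
    brain["this"]
except DisconnectedError as error:
    print("asleep:", error)
brain.wake_up()
print(brain["this"])
```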
v0.3 introduces custom exceptions for each type of problem the user may encounter (within limits). Import and use like:
from brain_plasma.exceptions import (
BrainNamespaceNameError,
BrainNamespaceNotExistError,
BrainNamespaceRemoveDefaultError,
BrainNameLengthError,
BrainNameTypeError,
BrainClientDisconnectedError,
BrainRemoveOldNameValueError,
BrainLearnNameError,
BrainUpdateNameError,
)
Apache PlasmaClient API reference: https://arrow.apache.org/docs/python/generated/pyarrow.plasma.PlasmaClient.html#pyarrow.plasma.PlasmaClient
Apache Plasma docs: https://arrow.apache.org/docs/python/plasma.html#
TODO
Made with ❤️ by Russell Romney in Madison, WI. Thanks for the contributions from @tcbegley (so far)