Open Source Ecosystems

This Python package focus on the deployment of gesture control systems. It ease dataset creation, models evaluation, and processing pipeline deployment. The critical element in the proposed processing architecture is the intermediate representation of human bodies as key points to perform efficient classification. In addition to the main application, the package contains two datasets for body/hands pose classificaiton, several classification models, and data augmentation tools that can be accessed through an API. Feel free to check-out the drone-gesture-control repository for a deployment example on Jetson Nano using this package.

Getting Started

Step 1 - Install the package

Using PyPi

Run the following command to install the whole package in the desired Python environment:

pip install pose-classification-kit[app]

If you don't plan to use the application but just want access to the datasets and pre-trained models:

pip install pose-classification-kit

From source

Ensure that Poetry is installed for Python 3.7 and above on your system.

Git clone the repository

git clone https://github.com/ArthurFDLR/pose-classification-kit.git
cd pose-classification-kit

Create an adequate venv virtual environment
```
python -m poetry install
```

Step 2 - Install OpenPose

The dataset creation and real-time model evaluation application heavily rely on the pose estimation system OpenPose. It must be installed on your system to allow real-time gesture classification. This step is not requiered if you don't plan to use the application.

Follow OpenPose installation instructions.
Once the installation is completed, change the variable OPENPOSE_PATH ( .\pose-classification-kit\config.py) to the location of the OpenPose installation folder on your system.

Step 3 - Launch application

You should now be able to run the application if you installed all optionnal dependancies. See the usage section about how to use the app.

pose-classification-app

Step 4 - Create new classification models

The .\examples folder contains Jupyter Notebook detailing the use of the API to create new classification models. Note that these Notebooks can be executed on Google Colab.

Demonstrations

User guide

Real-time pose classification

The video stream of the selected camera is fed to OpenPose at all times. The analysis results are displayed on the left side of the application. You have to choose one of the available models in the drop-down at the bottom of the analysis pannel. Keypoints extracted from the video by OpenPose are automatically normalized and fed to the classifier.

Create and manipulate datasets

First, you either have to load or create a new set of samples for a specific label and hand side. To do so, respectively choose Open (Ctrl+O) or Create new (Ctrl+N) in Dataset of the menu bar. You have to specify the hand side, the label, and the newly created samples set' accuracy threshold. A configuration window will ask for the label and the newly created samples set's accuracy threshold in case of creating a new class. The accuracy threshold defines the minimum accuracy of hand keypoints detection from OpenPose of any sample in the set. This accuracy is displayed on top of the keypoints graph.

Now that a set is loaded in the application, you can record new samples from your video feed or inspect the set and delete inadequate samples. When your done, save the set through Dataset -> Save (Ctrl+S).

Additional scripts

Some functionalities are currently unavailable through the GUI:

You can export all dataset samples from .\pose_classification_kit\datasets\Body and .\pose_classification_kit\datasets\Hands in two respective CSV files.
```
export-datasets
```
You can generate videos similar to this one (.\pose-classification-kit\scripts\video_creation.py might need some modification to fit your use case).

Currently not functional
```
video-overlay
```

Documentation

Body datasets

There is a total of 20 body dataset classes which contains between 500 and $600$ samples each for a total of 10680 entries. Even if the number of samples from one class to the other varies in the raw dataset, the API yields a balanced dataset of 503 samples per class. Also, by default, 20% of these are reserved for final testing of the model. Each entry in the dataset is an array of 25 2D coordinates. The mapping of these keypoints follows the BODY25 body model. We created the dataset using the BODY25 representation as it is one of the most comprehensive standard body models. However, some pose estimation models, such as the one used on the Jetson Nano, use an 18 keypoints representation (BODY18). The seven missing keypoints do not strongly influence classification as 6 of them are used for feet representation, and the last one is a central hip keypoint. Still, the dataset must be converted to the BODY18 representation. This is done by reindexing the samples based on the comparison of the mapping of both body models. You can choose which body model to use when importing the dataset with the API.

Data augmentation

The data augmentation tool currently support the following operations:

Scaling: a random scaling factor drawn from a normal distribution of mean 0 and standard deviation is applied to all sample coordinates.
Rotation: a rotation of an angle randomly drawn from a normal distribution of mean 0 and standard deviation is applied to the sample.
Noise: Gaussian noise of standard deviation is added to coordinates of the sample.
Remove keypoints: a pre-defined or random list of keypoints are removed (coordinates set to 0) from the sample.

from pose_classification_kit.datasets import BODY18, bodyDataset, dataAugmentation

dataset = bodyDataset(testSplit=.2, shuffle=True, bodyModel=BODY18)
x_train = dataset['x_train']
y_train = dataset['y_train_onehot']
x, y = [x_train], [y_train]

# Scaling augmentation
x[len(x):],y[len(y):] = tuple(zip(dataAugmentation(
    x_train, y_train,
    augmentation_ratio=.1,
    scaling_factor_standard_deviation=.08,
)))

# Rotation augmentation
x[len(x):],y[len(y):] = tuple(zip(dataAugmentation(
    x_train, y_train,
    augmentation_ratio=.1,
    rotation_angle_standard_deviation=10,
)))

# Upper-body augmentation
lowerBody_keypoints = np.where(np.isin(BODY18.mapping,[
    "left_knee", "right_knee", "left_ankle", "right_ankle"
]))[0]
x[len(x):],y[len(y):] = tuple(zip(dataAugmentation(
    x_train, y_train,
    augmentation_ratio=.15,
    remove_specific_keypoints=lowerBody_keypoints,
    random_noise_standard_deviation=.03
)))
lowerBody_keypoints = np.where(np.isin(BODY18.mapping,[
    "left_knee", "right_knee", "left_ankle", "right_ankle", "left_hip", "right_hip",
]))[0]
x[len(x):],y[len(y):] = tuple(zip(dataAugmentation(
    x_train, y_train,
    augmentation_ratio=.15,
    remove_specific_keypoints=lowerBody_keypoints,
    random_noise_standard_deviation=.03
)))        

# Random partial input augmentation
x[len(x):],y[len(y):] = tuple(zip(dataAugmentation(
    x_train, y_train,
    augmentation_ratio=.2,
    remove_rand_keypoints_nbr=2,
    random_noise_standard_deviation=.03
)))

x_train_augmented = np.concatenate(x, axis=0)
y_train_augmented = np.concatenate(y, axis=0)