GAN-based models to flash-simulate the LHCb PID detectors
GPL-3.0 License
PIDGAN is a Python package built upon TensorFlow 2 that provides ready-to-use implementations of several GAN algorithms (listed in this table). The package was originally designed to simplify the training and optimization of GAN-based models for the Particle Identification (PID) system of the LHCb experiment. Today, PIDGAN is a versatile package that can be employed in a wide range of High Energy Physics (HEP) applications and, more generally, whenever one works with tabular data and aims to learn the conditional probability distributions of a set of target features. This package is one of the building blocks of a Flash Simulation framework for the LHCb experiment.
algorithms
callbacks
metrics
optimization
players
classifiers
discriminators
generators
utils
Keras 3 introduced appealing new features, but at the cost of breaking backward compatibility with previous versions, as reported in https://github.com/mbarbetti/pidgan/issues/4. PIDGAN has been massively rewritten to be compatible with the new multi-backend Keras 3 and to make code execution as similar as possible on TensorFlow < 2.16 (with Keras 2) and TensorFlow >= 2.16 (with Keras 3).
The migration to Keras 3 aims to be as transparent as possible for the user, which means keeping compatibility with Keras 2 and requiring no changes to existing scripts. To achieve this, the vast majority of PIDGAN classes and functions needed minor changes or small adjustments to align with the new package design.
Problem. The scale factor used for the learning rate scheduling was defined as
decayed = (1 - alpha) * (cosine_decay + alpha)
instead of
decayed = (1 - alpha) * cosine_decay + alpha
Solution. The scale factor has been corrected according to the TensorFlow definition.
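The difference between the two expressions can be checked numerically. The following is a minimal sketch in plain Python, assuming the cosine term used by TensorFlow's cosine decay schedule, 0.5 * (1 + cos(pi * step / decay_steps)); the function names and the values of alpha and decay_steps are illustrative and are not taken from the PIDGAN code.

```python
import math


def cosine_term(step, decay_steps):
    # Cosine term as in TensorFlow's cosine decay schedule (step is clipped).
    step = min(step, decay_steps)
    return 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))


def buggy_scale(step, decay_steps, alpha):
    # Misplaced parenthesis: alpha is also multiplied by (1 - alpha).
    return (1.0 - alpha) * (cosine_term(step, decay_steps) + alpha)


def cosine_decay_scale(step, decay_steps, alpha):
    # Corrected form: the scale goes from 1 at step 0 to alpha at decay_steps.
    return (1.0 - alpha) * cosine_term(step, decay_steps) + alpha


if __name__ == "__main__":
    for step in (0, 50, 100):
        print(step, buggy_scale(step, 100, 0.1), cosine_decay_scale(step, 100, 0.1))
```

With the corrected form the learning rate starts at its initial value and ends at alpha times that value, whereas the old expression starts slightly below the initial value and ends below the intended floor.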
Relying on TensorFlow and Keras as backends, pidgan is a Python package designed to simplify the implementation and training of GAN-based models intended for High Energy Physics (HEP) applications.
invertColumnTransformer
Problem. When the column indices passed to a transformer of scikit-learn's ColumnTransformer aren't adjacent, this custom function behaved unexpectedly, mixing up the output columns.
Solution. The function has been rewritten from scratch, following a logical procedure that should prevent new issues with the inversion of the ColumnTransformer.
Since regularization terms applied to either the generator or the discriminator can be strongly data-dependent, computing them during the test step as well can produce loss values significantly different from those obtained in the train step. Hence, the GAN algorithms were updated so that the various _compute_*_loss methods take an additional boolean argument, called test, that skips the computation of any regularization term during the test steps.
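The pattern can be illustrated with a framework-free toy loss. This is only a sketch of the idea: the function name, the form of the adversarial term, and the penalty are all hypothetical, and only the boolean `test` argument comes from the release note.

```python
def compute_d_loss(real_scores, fake_scores, penalty_weight=10.0, test=False):
    """Toy discriminator loss with an optional regularization term."""
    n_real, n_fake = len(real_scores), len(fake_scores)
    # Bare adversarial term: low when real scores are high and fake scores low.
    base = sum(fake_scores) / n_fake - sum(real_scores) / n_real
    if test:
        # Test step: report the bare loss, without any regularization.
        return base
    # Train step: add a data-dependent penalty (hypothetical form).
    scores = list(real_scores) + list(fake_scores)
    penalty = sum(s ** 2 for s in scores) / len(scores)
    return base + penalty_weight * penalty


train_loss = compute_d_loss([0.9, 0.8], [0.1, 0.3])
test_loss = compute_d_loss([0.9, 0.8], [0.1, 0.3], test=True)
```

On the same batch the two values differ only by the penalty, so the loss reported in the test step tracks the adversarial term alone.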
This is the first release for Zenodo.
Published by mbarbetti 11 months ago
algorithms
callbacks
metrics
optimization
players
classifiers
- AuxClassifier ✨
- AuxMultiClassifier ✨
- Classifier ✨
- MultiClassifier ✨
- ResClassifier ✨
- ResMultiClassifier ✨
discriminators
- AuxDiscriminator ✨
- Discriminator ✨
- ResDiscriminator ✨
generators
- Generator ✨
- ResGenerator ✨
utils
The Generator player is implemented as a neural network built with TensorFlow's sequential model. To prevent the vanishing gradient problem even with deep models, the use of skip connections has been enabled: the ResGenerator player supports skip connections thanks to TensorFlow's functional API.
The Discriminator player is implemented as a neural network built with TensorFlow's sequential model. To prevent the vanishing gradient problem even with deep models, the use of skip connections has been enabled: the ResDiscriminator and AuxDiscriminator players support skip connections thanks to TensorFlow's functional API.
The Classifier and MultiClassifier players are implemented as neural networks built with TensorFlow's sequential model. To prevent the vanishing gradient problem even with deep models, the use of skip connections has been enabled: the ResClassifier, AuxClassifier, ResMultiClassifier and AuxMultiClassifier players support skip connections thanks to TensorFlow's functional API.
Published by mbarbetti 11 months ago
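The skip connections mentioned for the residual players can be sketched with the Keras functional API as follows. This is a generic illustration, not the architecture of any PIDGAN player: the function name `build_res_mlp`, the layer sizes, and the number of blocks are assumptions.

```python
import tensorflow as tf


def build_res_mlp(input_dim, output_dim, num_blocks=3, units=64):
    """Small MLP whose hidden blocks are bypassed by skip connections."""
    inputs = tf.keras.Input(shape=(input_dim,))
    x = tf.keras.layers.Dense(units, activation="relu")(inputs)
    for _ in range(num_blocks):
        h = tf.keras.layers.Dense(units, activation="relu")(x)
        # Skip connection: add the block input to the block output.
        x = tf.keras.layers.Add()([x, h])
    outputs = tf.keras.layers.Dense(output_dim)(x)
    return tf.keras.Model(inputs=inputs, outputs=outputs)


model = build_res_mlp(input_dim=8, output_dim=3)
```

A sequential model cannot express the `Add` of two branches, which is why the functional API is needed as soon as a layer's input must be reused further down the network.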
Generator
Problem. When calling the generate() method with seed=None, the generator player always produced the same output.
Solution. The default seed value passed to the tf.random.set_seed() method has been removed.
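The effect of a hidden default seed can be reproduced with Python's standard `random` module, used here as a stand-in for `tf.random.set_seed()`. The function names and the `DEFAULT_SEED` constant are hypothetical and only mirror the behavior described above.

```python
import random

DEFAULT_SEED = 42  # hypothetical fallback value, for illustration only


def generate_buggy(n, seed=None):
    # A fallback seed makes seed=None deterministic, which is the bug.
    rng = random.Random(DEFAULT_SEED if seed is None else seed)
    return [rng.random() for _ in range(n)]


def generate_fixed(n, seed=None):
    # seed=None lets the generator pick fresh entropy on every call.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]
```

With the fixed version, passing an explicit seed still gives reproducible output, while omitting it yields a different sample on each call.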
Published by mbarbetti 12 months ago
Relying on TensorFlow and Keras as backends, pidgan is a Python package designed to simplify the implementation and training of GAN-based models intended for High Energy Physics (HEP) applications. Originally designed to develop parameterizations to flash-simulate the LHCb Particle Identification system, pidgan can be used to describe a wide range of LHCb sub-detectors and succeeds in reproducing the high-level response of a generic HEP experiment. The pidgan package will be publicly presented during the Fifth ML-INFN Hackathon: Advanced Level where it will be used to parameterize high energy particle jets as detected and reconstructed by the CMS experiment.