I replace original utterance encoder and knowledge encoder with pretrained BERT. For decoder, I apply Hierarchical Gated Fusion Unit (HGFU) [Yao et al. 2017] and I only use three number of knowledges for the sake of code simplicity.
$ python train.py -pre_epoch 5 -n_epoch 15 -n_batch 32
$ python demo.py
# example
Type first Knowledge: i'm very athletic.
Type second Knowledge: i wear contacts.
Type third Knowledge: i have brown hair.
you: hi ! i work as a gourmet cook .
bot(response): i don't like carrots . i throw them away . # reponse can change based on training.