This repo contains all the code and data used in our paper: https://arxiv.org/abs/2408.12748
Large language models (LLMs) are highly capable but face latency challenges in real-time applications such as online hallucination detection. To overcome this issue, we propose a novel framework that leverages a small language model (SLM) classifier for initial detection, followed by an LLM acting as a constrained reasoner to generate detailed explanations for detected hallucinated content. This study optimizes real-time, interpretable hallucination detection by introducing effective prompting techniques that align LLM-generated explanations with SLM decisions. Empirical results demonstrate the effectiveness of our approach, thereby enhancing the overall user experience.
Please use the following command to install all the required Python packages:

```
pip install -r requirements.txt
```
We leverage the Azure OpenAI Service to conduct the experiments. We use GPT-4 Turbo as our model deployment, with `temperature=0` and `top_p=0.6`. To avoid sharing keys in the repository, we read the key securely from Azure Key Vault: please make sure to assign yourself the Key Vault Secrets User role in IAM, and save your key into Key Vault Secrets under a `SECRET_NAME`. Then specify the resource details in `aoai_config.json`.
If you call GPT via other methods, please revise the code accordingly, mostly `aoaiutil.py`.
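For reference, here is a minimal sketch of this setup; the `aoai_config.json` keys used below (`vault_url`, `secret_name`, `endpoint`, `api_version`, `deployment`) are illustrative assumptions, so match them to the actual config file:

```python
# Minimal sketch: read the API key from Azure Key Vault and call the
# GPT-4 Turbo deployment with temperature=0 and top_p=0.6.
# The aoai_config.json keys below are illustrative assumptions.
import json

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
from openai import AzureOpenAI

with open("aoai_config.json") as f:
    cfg = json.load(f)

# Requires the "Key Vault Secrets User" role assignment in IAM.
credential = DefaultAzureCredential()
secrets = SecretClient(vault_url=cfg["vault_url"], credential=credential)
api_key = secrets.get_secret(cfg["secret_name"]).value

client = AzureOpenAI(
    azure_endpoint=cfg["endpoint"],
    api_version=cfg["api_version"],
    api_key=api_key,
)

response = client.chat.completions.create(
    model=cfg["deployment"],  # GPT-4 Turbo deployment name
    temperature=0,
    top_p=0.6,
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```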
**Run Constrained Reasoner:** Run the `reason_analysis.sh` bash script (pass `0` to run all data, or `n > 0` to sample `n` hypotheses). The results are saved in the `results` folder.

**Human Review:** The generated reasons are then reviewed by humans; the labelled results are stored in the `results/labelled` folder (see Results below).
**Run Analysis:** Use the `analyze_reasoning_results.ipynb` notebook.

**Original Data:** The original data is in the `data` folder, which contains a `groundingsources` folder and a hypothesis file. The `groundingsources` folder includes all grounding source files, named as `EncounterID.txt`. The hypothesis file has the following columns (a loading sketch follows the list):
- `EncounterID`: used to match the grounding source files in the folder to the corresponding hypotheses in the hypothesis file.
- `SentenceID`: index of the hypotheses within the same encounter.
- `Sentence`: the hypotheses to be judged.
- `IsHallucination`: ground truth indicating whether the hypothesis is hallucinated (`1` for hallucination).
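For illustration, a minimal sketch of loading the data under these conventions (the hypothesis file name `hypotheses.csv` and CSV format are assumptions; adjust to the actual file in `data`):

```python
# Minimal loading sketch; "hypotheses.csv" is an assumed file name.
import os

import pandas as pd

data_dir = "data"
df = pd.read_csv(os.path.join(data_dir, "hypotheses.csv"))

for _, row in df.iterrows():
    # EncounterID matches each hypothesis to its grounding source file.
    source_path = os.path.join(
        data_dir, "groundingsources", f"{row['EncounterID']}.txt"
    )
    with open(source_path, encoding="utf-8") as f:
        grounding_source = f.read()
    hypothesis = row["Sentence"]               # the hypothesis to be judged
    is_hallucination = row["IsHallucination"]  # 1 = hallucination
```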
**Results:** The results are saved in the `results` folder. The column `GPTreason` contains the generated reasons. The `GPTJudgement` column records the constrained reasoner's judgement: `"1"` means the output indeed explains a hallucination, while `"0"` means the constrained reasoner disagrees with the upstream decision and gives reasons why the text is a non-hallucination. Human-labelled data are in the `results/labelled` folder.
[TODO]