TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness [paper]

May 5, 2024

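Environment: trustscore_environment.yml, exported with conda env export > trustscore_environment.yml. To recreate the environment from that file, run conda env create -f trustscore_environment.yml.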

Behavior_consistency_example.ipynb: Example notebook showing how $Trust_{BC}$, the behavior consistency score, works.

qa_human_check.json: Contains the mixed QA data used in this project, the predictions of Flan-T5-XXL, LLAMA-7B, and GPT-3.5-turbo, and the human evaluations of those predictions.
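
The layout of qa_human_check.json is not described here, so a reasonable first step is to look at its top-level shape before writing any analysis code. The following is a minimal sketch that makes no assumption about the schema: `summarize_json` is a hypothetical helper (not part of this repository) that only reports the container type, its size, and the keys it finds.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and report its top-level shape without assuming a schema."""
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {
        "type": type(data).__name__,
        "size": len(data) if isinstance(data, (list, dict)) else None,
    }
    # For a list of records, report the keys of the first record;
    # for a dict, report its own top-level keys.
    if isinstance(data, list) and data and isinstance(data[0], dict):
        summary["keys"] = sorted(data[0].keys())
    elif isinstance(data, dict):
        summary["keys"] = sorted(data.keys())
    else:
        summary["keys"] = []
    return summary


if __name__ == "__main__":
    # Assumes the file sits in the current working directory.
    data_path = Path("qa_human_check.json")
    if data_path.exists():
        print(summarize_json(data_path))
```

The reported keys should show which fields hold the questions, the model predictions, and the human judgments, and the field names can then be used directly.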