TrustEval

Video Tutorials

Watch step-by-step tutorials on our YouTube channel:

https://github.com/user-attachments/assets/489501b9-69ae-467f-9be3-e4a02a7f9019

Table of Contents

  • Overview
  • Features
  • Getting Started
  • Supported Models
  • Performance Features
  • Usage
  • Trustworthiness Report
  • Contributing
  • Citation
  • License

๐Ÿ“ Overview

TrustEval is a dynamic and comprehensive framework for evaluating the trustworthiness of Generative Foundation Models (GenFMs) across dimensions including safety, fairness, robustness, privacy, and truthfulness.


👾 Features

  • Dynamic Dataset Generation: Automatically generate datasets tailored for evaluation tasks.
  • Multi-Model Compatibility: Evaluate LLMs, VLMs, T2I models, and more.
  • Local & API Model Support: Run models locally or via API endpoints for maximum flexibility.
  • Multi-GPU Acceleration: Concurrent inference across multiple GPUs for a 7-8x speed improvement.
  • Advanced T2I Models: Support for Stable Diffusion 3.5, FLUX.1-dev, HunyuanDiT, Kolors, and more.
  • Performance Optimizations: Memory-efficient attention via xformers, attention slicing, and optimized pipelines.
  • Customizable Metrics: Configure workflows with flexible metrics and evaluation methods.
  • Metadata-Driven Pipelines: Design and execute test cases efficiently using metadata.
  • Comprehensive Dimensions: Evaluate models across safety, fairness, robustness, privacy, and truthfulness.
  • Detailed Reports: Generate interactive, easy-to-interpret evaluation reports.

📖 The documentation can be viewed directly with a browser at TrustGen/docs/html/index.html.

🚀 Getting Started

1. Set Up a Conda Environment

Create and activate a new environment with Python 3.10:

conda create -n trustgen_env python=3.10
conda activate trustgen_env

2. Install Dependencies

Basic Installation (API Models Only):

Install the package with basic dependencies for API-based models:

pip install .

Full Installation (Including Local Models):

If you want to run local Text-to-Image models (such as Stable Diffusion, FLUX, HunyuanDiT, etc.), install with additional dependencies:

pip install -e ".[local]"

This will install additional packages required for local model inference:

  • diffusers==0.31.0 - Hugging Face Diffusers library for T2I models
  • torch>=2.1.0 - PyTorch for deep learning
  • transformers>=4.41.2 - Transformers library
  • datasets>=2.15.0 - Dataset utilities
  • accelerate==0.30.1 - Hardware acceleration utilities
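
After a full installation, you can confirm that the local stack can see your hardware with a quick sanity check. This is an optional, illustrative snippet (not part of the toolkit itself):

import torch
import diffusers
import transformers

# Print installed versions and detected GPUs to verify the [local] extras.
print(f"diffusers {diffusers.__version__}, transformers {transformers.__version__}")
print(f"CUDA available: {torch.cuda.is_available()} "
      f"({torch.cuda.device_count()} GPU(s) detected)")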

🤖 Supported Models

TrustEval supports a comprehensive range of foundation models across different modalities and providers:

Recommended API Providers: For API-based models, we recommend OpenRouter, DeepInfra, and Replicate for reliable and cost-effective inference.

๐Ÿ“ Large Language Models (LLMs)

OpenAI Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| GPT-4o | gpt-4o | Chat + Vision | OpenAI API |
| GPT-4o Mini | gpt-4o-mini | Chat + Vision | OpenAI API |
| GPT-3.5 Turbo | gpt-3.5-turbo | Chat | OpenAI API |
| o1 | o1 | Reasoning | OpenAI API |
| o1 Mini | o1-mini | Reasoning | OpenAI API |
| o1 Preview | o1-preview | Reasoning | OpenAI API |

Anthropic Claude Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| Claude 3.5 Sonnet | claude-3.5-sonnet | Chat + Vision | Anthropic API |
| Claude 3 Haiku | claude-3-haiku | Chat + Vision | Anthropic API |
| Claude 3 Opus | claude-3-opus | Chat | Anthropic API |

Meta LLaMA Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| LLaMA 2 13B | llama-2-13B | Chat | DeepInfra API / Local |
| LLaMA 3 8B | llama-3-8B | Chat | DeepInfra API |
| LLaMA 3 70B | llama-3-70B | Chat | DeepInfra API |
| LLaMA 3.1 8B | llama-3.1-8B | Chat | DeepInfra API |
| LLaMA 3.1 70B | llama-3.1-70B | Chat | DeepInfra API |
| LLaMA 3.2 11B Vision | llama-3.2-11B-V | Chat + Vision | DeepInfra API |
| LLaMA 3.2 90B Vision | llama-3.2-90B-V | Chat + Vision | DeepInfra API |

Chinese Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| GLM-4 | glm-4 | Chat | Zhipu API |
| GLM-4 Plus | glm-4-plus | Chat | Zhipu API |
| GLM-4V | glm-4v | Chat + Vision | Zhipu API |
| GLM-4V Plus | glm-4v-plus | Chat + Vision | Zhipu API |
| DeepSeek Chat | deepseek-chat | Chat | DeepSeek API |
| Qwen 2.5 72B | qwen-2.5-72B | Chat | DeepInfra API |
| QwQ 32B | qwq-32B | Reasoning | DeepInfra API |
| Qwen VL Max | qwen-vl-max-0809 | Chat + Vision | Qwen API |
| Qwen 2 VL 72B | qwen-2-vl-72B | Chat + Vision | OpenRouter API |
| Yi Lightning | yi-lightning | Chat | Yi API |

Google Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| Gemini 1.5 Flash | gemini-1.5-flash | Chat + Vision | Google API |
| Gemini 1.5 Pro | gemini-1.5-pro | Chat + Vision | Google API |
| Gemma 2 27B | gemma-2-27B | Chat | DeepInfra API |

Mistral Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| Mistral 7B | mistral-7B | Chat | DeepInfra API |
| Mixtral 8x7B | mistral-8x7B | Chat | DeepInfra API |
| Mixtral 8x22B | mistral-8x22B | Chat | DeepInfra API |

Other Models

| Model | Name | Type | Provider |
|-------|------|------|----------|
| Command R | command-r | Chat | Cohere API |
| Command R Plus | command-r-plus | Chat | Cohere API |
| InternLM 72B | internLM-72B | Chat + Vision | InternLM API |

🎨 Text-to-Image Models (T2I)

Local T2I Models (Requires pip install -e ".[local]")

| Model | Name |
|-------|------|
| Stable Diffusion 3.5 Large | sd-3.5-large |
| Stable Diffusion 3.5 Turbo | sd-3.5-large-turbo |
| Stable Diffusion XL | stable-diffusion-xl-base-1.0 |
| Stable Diffusion 3 Medium | stable-diffusion-3-medium |
| FLUX.1-dev | FLUX.1-dev |
| HunyuanDiT | HunyuanDiT |
| Kolors | kolors |
| Playground v2.5 | playground-v2.5 |

API T2I Models

| Model | Name | Provider |
|-------|------|----------|
| CogView3-Plus | cogview-3-plus | Zhipu API |
| DALL-E 3 | dalle3 | OpenAI API |
| FLUX 1.1 Pro | flux-1.1-pro | Replicate API |
| FLUX Schnell | flux_schnell | Replicate API |

๐Ÿ” Embedding Models

| Model | Name | Provider |
|-------|------|----------|
| Text Embedding Ada 002 | text-embedding-ada-002 | OpenAI API |

🚀 Performance Features

  • Multi-GPU Support: Automatic load balancing across 1-8 GPUs for local models
  • Memory Optimization: xformers attention, slicing, CPU offloading
  • Concurrent Inference: 7-8x speed improvement with multiple GPUs
  • Auto-Detection: Automatic hardware detection and optimization
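
To make the multi-GPU pattern concrete, here is an illustrative sketch of prompt-level data parallelism with diffusers: one pipeline replica per GPU, with prompts distributed across them. This is not TrustEval's internal implementation, just the general pattern the features above describe (model ID and prompts are placeholders):

# Illustrative sketch only: one pipeline replica per GPU, prompts
# distributed round-robin and run concurrently.
import torch
from concurrent.futures import ThreadPoolExecutor
from diffusers import DiffusionPipeline

def load_replica(device):
    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
    )
    pipe.enable_attention_slicing()  # memory optimization noted above
    return pipe.to(device)

devices = [f"cuda:{i}" for i in range(torch.cuda.device_count())]
replicas = [load_replica(d) for d in devices]
prompts = ["a red bicycle", "a snowy mountain", "a glass of tea", "a paper crane"]

def run(job):
    pipe, prompt = job
    return pipe(prompt).images[0]

# Each GPU holds its own replica, so the workers below do not contend for a
# single model, and throughput scales roughly with the number of GPUs.
jobs = [(replicas[i % len(replicas)], p) for i, p in enumerate(prompts)]
with ThreadPoolExecutor(max_workers=len(replicas)) as ex:
    images = list(ex.map(run, jobs))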

🤖 Usage

Configure API Keys

Run the configuration script to set up your API keys:

python trusteval/src/configuration.py


Quick Start

The following example demonstrates an Advanced AI Risk Evaluation workflow.

Step 0: Set Your Project Base Directory

import os
base_dir = os.getcwd() + '/advanced_ai_risk'

Step 1: Download Metadata

from trusteval import download_metadata

download_metadata(
    section='advanced_ai_risk',
    output_path=base_dir
)

Step 2: Generate Datasets Dynamically

from trusteval.dimension.ai_risk import dynamic_dataset_generator

dynamic_dataset_generator(
    base_dir=base_dir,
)

Step 3: Apply Contextual Variations

from trusteval import contextual_variator_cli

contextual_variator_cli(
    dataset_folder=base_dir
)

Step 4: Generate Model Responses

from trusteval import generate_responses

request_type = ['llm']  # Options: 'llm', 'vlm', 't2i'
async_list = ['your_async_model']
sync_list = ['your_sync_model']

await generate_responses(
    data_folder=base_dir,
    request_type=request_type,
    async_list=async_list,
    sync_list=sync_list,
)
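
Note that generate_responses is a coroutine, so the bare await above assumes an async-aware environment such as a Jupyter notebook. In a plain Python script, wrap the call with asyncio.run, as in this minimal sketch:

import asyncio

# Same call as above, made runnable from a regular script.
asyncio.run(generate_responses(
    data_folder=base_dir,
    request_type=request_type,
    async_list=async_list,
    sync_list=sync_list,
))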

Step 5: Evaluate and Generate Reports

  1. Judge the Responses

    from trusteval import judge_responses
    
    target_models = ['your_target_model1', 'your_target_model2']
    judge_type = 'llm'  # Options: 'llm', 'vlm', 't2i'
    judge_key = 'your_judge_key'
    async_judge_model = ['your_async_model']
    
    await judge_responses(
        data_folder=base_dir,
        async_judge_model=async_judge_model,
        target_models=target_models,
        judge_type=judge_type,
    )
    
  2. Generate Evaluation Metrics

    from trusteval import lm_metric
    
    lm_metric(
        base_dir=base_dir,
        aspect='ai_risk',
        model_list=target_models,
    )
    
  3. Generate Final Report

    from trusteval import report_generator
    
    report_generator(
        base_dir=base_dir,
        aspect='ai_risk',
        model_list=target_models,
    )
    

Your report.html will be saved in the base_dir folder. For additional examples, check the examples folder.
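
For reference, the steps above can be collected into a single script. This sketch simply restates the calls from the walkthrough (with the same placeholder model names) and wraps the coroutines for use outside a notebook:

import asyncio
import os

from trusteval import (
    download_metadata, contextual_variator_cli, generate_responses,
    judge_responses, lm_metric, report_generator,
)
from trusteval.dimension.ai_risk import dynamic_dataset_generator

base_dir = os.path.join(os.getcwd(), 'advanced_ai_risk')
target_models = ['your_target_model1', 'your_target_model2']

# Steps 0-3: metadata, dynamic dataset generation, contextual variation.
download_metadata(section='advanced_ai_risk', output_path=base_dir)
dynamic_dataset_generator(base_dir=base_dir)
contextual_variator_cli(dataset_folder=base_dir)

# Step 4: collect model responses.
asyncio.run(generate_responses(
    data_folder=base_dir, request_type=['llm'],
    async_list=['your_async_model'], sync_list=['your_sync_model'],
))

# Step 5: judge, score, and report.
asyncio.run(judge_responses(
    data_folder=base_dir, async_judge_model=['your_async_model'],
    target_models=target_models, judge_type='llm',
))
lm_metric(base_dir=base_dir, aspect='ai_risk', model_list=target_models)
report_generator(base_dir=base_dir, aspect='ai_risk', model_list=target_models)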

Using the Hugging Face Dataset Download Script

This script downloads and processes TrustGen datasets from the Hugging Face Hub, with options to fetch a specific subset or all available subsets.

Basic Usage

To run the script and download a specific subset:

python huggingface_dataset_download_example.py --subset ai_risk_llm

You can also download all available subsets by using --subset all or omitting the subset parameter (as "all" is the default).
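
For example, to fetch every subset in one run:

python huggingface_dataset_download_example.py --subset all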

Setting the base_dir in Your Evaluation Script

When running your evaluation script after downloading the dataset, make sure to set the base_dir parameter to point to the location where you downloaded the dataset:

base_dir = os.path.abspath("./download_datasets/fairness_llm/")

This ensures your evaluation script can find the downloaded dataset files.

Next Steps in the Pipeline

  1. If the dimension you're working with includes a contextual_variator step, proceed to that step next.
  2. Otherwise, you can skip directly to the generate_responses step.

Troubleshooting

If you encounter execution errors, it may be due to inconsistencies between the downloaded dataset filenames and the expected names in file_config.json. In such cases, simply rename the files manually to match the names specified in the configuration file.
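
If you would rather not compare names by hand, a short script can surface the mismatches. This is a minimal sketch that assumes file_config.json is a JSON array whose entries carry a file_name key — inspect the actual file and adapt the key lookup if its structure differs:

import json
import os

base_dir = "./download_datasets/fairness_llm"

# Assumption: file_config.json is a JSON array of entries with a
# "file_name" key. Adjust this extraction to the real structure.
with open(os.path.join(base_dir, "file_config.json")) as f:
    expected = {entry["file_name"] for entry in json.load(f)}

present = set(os.listdir(base_dir))
print("Expected but missing:", sorted(expected - present))
print("Present but unexpected:", sorted(present - expected))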

Trustworthiness Report

A detailed trustworthiness evaluation report is generated for each dimension. The reports are presented as interactive web pages, which can be opened in a browser to explore the results. The report includes the following sections:

The data shown in the images below is simulated and does not reflect actual results.

Test Model Results

Displays the evaluation scores for each model, with a breakdown of average scores across evaluation dimensions.

Model Performance Summary

Summarizes the model's performance in the evaluated dimension using LLM-generated summaries, highlighting comparisons with other models.

Error Case Study

Presents error cases for the evaluated dimension, including input/output examples and detailed judgments.

Leaderboard

Shows the evaluation results for all models, along with visualized comparisons to previous versions (e.g., our v1.0 results).

Contributing

We welcome contributions from the community! To contribute:

  1. Fork the repository.
  2. Create a feature branch (git checkout -b feature-name).
  3. Commit your changes (git commit -m 'Add feature').
  4. Push to your branch (git push origin feature-name).
  5. Open a pull request.

Citation

@article{huang2025trustgen,
    title={On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective},
    author={Yue Huang and Chujie Gao and Siyuan Wu and Haoran Wang and Xiangqi Wang and Yujun Zhou and Yanbo Wang and Jiayi Ye and Jiawen Shi and Qihui Zhang and Yuan Li and Han Bao and Zhaoyi Liu and Tianrui Guan and Dongping Chen and Ruoxi Chen and Kehan Guo and Andy Zou and Bryan Hooi Kuen-Yew and Caiming Xiong and Elias Stengel-Eskin and Hongyang Zhang and Hongzhi Yin and Huan Zhang and Huaxiu Yao and Jaehong Yoon and Jieyu Zhang and Kai Shu and Kaijie Zhu and Ranjay Krishna and Swabha Swayamdipta and Taiwei Shi and Weijia Shi and Xiang Li and Yiwei Li and Yuexing Hao and Zhihao Jia and Zhize Li and Xiuying Chen and Zhengzhong Tu and Xiyang Hu and Tianyi Zhou and Jieyu Zhao and Lichao Sun and Furong Huang and Or Cohen Sasson and Prasanna Sattigeri and Anka Reuel and Max Lamparth and Yue Zhao and Nouha Dziri and Yu Su and Huan Sun and Heng Ji and Chaowei Xiao and Mohit Bansal and Nitesh V. Chawla and Jian Pei and Jianfeng Gao and Michael Backes and Philip S. Yu and Neil Zhenqiang Gong and Pin-Yu Chen and Bo Li and Xiangliang Zhang},
    journal={arXiv preprint arXiv:2502.14296},
    year={2025}
}

License

This project is licensed under the CC BY-NC 4.0 license.