mxnv_acc.md

November 7, 2025

Average accuracy across hellaswag, lambada_openai, mmlu, piqa, and winogrande.
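As a minimal sketch of how one averaged score could be produced from per-task results, the snippet below takes an unweighted mean over the five tasks. The unweighted aggregation, the `average_accuracy` helper, and the example scores are assumptions for illustration; the source does not state how the tasks are weighted, and the example values are placeholders, not measured results.

```python
from statistics import mean

# Task names as listed in this document.
TASKS = ["hellaswag", "lambada_openai", "mmlu", "piqa", "winogrande"]


def average_accuracy(task_scores):
    """Unweighted mean accuracy over the five tasks (assumed aggregation)."""
    missing = [task for task in TASKS if task not in task_scores]
    if missing:
        raise KeyError(f"missing task scores: {missing}")
    return mean(task_scores[task] for task in TASKS)


# Placeholder per-task accuracies, NOT measured results.
placeholder_scores = {
    "hellaswag": 0.6,
    "lambada_openai": 0.7,
    "mmlu": 0.6,
    "piqa": 0.8,
    "winogrande": 0.7,
}

if __name__ == "__main__":
    print(f"{average_accuracy(placeholder_scores):.4f}")
```

Failing loudly on a missing task keeps a partial run from being reported as a five-task average.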

We ran these evaluations with a fake model because we currently have no access to devices that can run the real models. We have verified that, in most cases, the fake model's results closely match those of the real model.

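| mxfp4 g32 | llama3.1-8B-Instruct | Qwen2-7.5-Instruct | Phi4 | Qwen3-32B |
|---|---|---|---|---|
| RTN | 0.6212 | 0.6550 | 0.7167 | 0.6901 |
| AutoRound | 0.6686 | 0.6758 | 0.7247 | 0.7211 |
| AutoRound+alg_ext | 0.6732 | 0.6809 | 0.7225 | 0.7201 |

| nvfp4 g16 | llama3.1-8B-Instruct | Qwen2-7.5-Instruct | Phi4 | Qwen3-32B |
|---|---|---|---|---|
| RTN | 0.6876 | 0.6906 | 0.7296 | 0.7164 |
| AutoRound | 0.6918 | 0.6973 | 0.7306 | 0.7306 |
| AutoRound+alg_ext | 0.6965 | 0.6989 | 0.7318 | 0.7295 |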
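
To make the gaps between methods easier to read, the sketch below recomputes each method's gain over the RTN baseline from the averages reported above. The numbers are copied from the tables; the `RESULTS` layout and the `gains_over_rtn` helper are hypothetical names introduced only for this illustration.

```python
# Column order used in the tables above.
MODELS = ["llama3.1-8B-Instruct", "Qwen2-7.5-Instruct", "Phi4", "Qwen3-32B"]

# Average accuracies copied from the tables above, one list per method.
RESULTS = {
    "mxfp4 g32": {
        "RTN": [0.6212, 0.6550, 0.7167, 0.6901],
        "AutoRound": [0.6686, 0.6758, 0.7247, 0.7211],
        "AutoRound+alg_ext": [0.6732, 0.6809, 0.7225, 0.7201],
    },
    "nvfp4 g16": {
        "RTN": [0.6876, 0.6906, 0.7296, 0.7164],
        "AutoRound": [0.6918, 0.6973, 0.7306, 0.7306],
        "AutoRound+alg_ext": [0.6965, 0.6989, 0.7318, 0.7295],
    },
}


def gains_over_rtn(fmt):
    """Return {method: [delta per model]} relative to the RTN row of one format."""
    baseline = RESULTS[fmt]["RTN"]
    return {
        method: [round(score - base, 4) for score, base in zip(scores, baseline)]
        for method, scores in RESULTS[fmt].items()
        if method != "RTN"
    }


if __name__ == "__main__":
    for fmt in RESULTS:
        print(fmt)
        for method, deltas in gains_over_rtn(fmt).items():
            row = ", ".join(f"{m}: {d:+.4f}" for m, d in zip(MODELS, deltas))
            print(f"  {method}: {row}")
```

Deltas are rounded to four decimals to match the precision of the tables.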