Acknowledgements
June 11, 2026
CuTe DSL OSS kernels
Several open-source kernels in this repository originated in NVIDIA's CuTe DSL kernel library. We gratefully acknowledge the contributors who helped develop, bring up, and test these kernels.
Contributors include:
- Rachit Garg <rachitg@nvidia.com>
- Jack Yang <jackyang@nvidia.com>
- Hao Sheng <hsheng@nvidia.com>
- Alex Li <alel@nvidia.com>
- Aragorn Guan <aragorng@nvidia.com>
- Bangyu Shen <bangyus@nvidia.com>
- Caleb Du <cadu@nvidia.com>
- Xiao Song <xiaos@nvidia.com>
- Siddhartha Raman <sraman@nvidia.com>
This acknowledgement covers the CuTe DSL kernel work now represented by these
modules in python/cudnn/:
- gemm_amax
- gemm_dsrelu
- gemm_srelu
- gemm_swiglu
- grouped_gemm/grouped_gemm_dglu
- grouped_gemm/grouped_gemm_dsrelu
- grouped_gemm/grouped_gemm_dswiglu
- grouped_gemm/grouped_gemm_glu
- grouped_gemm/grouped_gemm_glu_hadamard
- grouped_gemm/grouped_gemm_quant
- grouped_gemm/grouped_gemm_srelu
- grouped_gemm/grouped_gemm_swiglu
- grouped_gemm/grouped_gemm_wgrad
- discrete_grouped_gemm/discrete_grouped_gemm_dswiglu
- discrete_grouped_gemm/discrete_grouped_gemm_swiglu
- rmsnorm_rht_amax
Thank you also to the broader CUTLASS/CuTe DSL and infrastructure teams who supported the original kernel development.