LLM-Based Fortran to C++ Translation Framework
May 7, 2026
Public-facing summary of research on LLM-assisted legacy scientific code translation, compiler-in-the-loop feedback, and evaluation of open-source LLMs for Fortran-to-C++ modernization.
Overview
This repository provides a public-facing summary and reimplementation outline of research I conducted as a Graduate Research Intern in the CCS-3 Division at Los Alamos National Laboratory.
The work contributed to the NAACL 2025 paper:
"LLM-Assisted Translation of Legacy FORTRAN Code to C++: A Cross-Platform Study"
The research studied how large language models can assist with translating legacy Fortran scientific computing code into modern C++, with emphasis on:
- open-source LLM evaluation,
- controlled prompting strategies,
- code translation quality,
- compiler feedback loops,
- and functional validation.
Important Repository Note
The original research code, internal datasets, experiment artifacts, and detailed LANL documentation are not included in this repository.
This is intentional.
The original implementation and data were developed during work at Los Alamos National Laboratory and remain subject to internal review, approval, and release constraints. This repository therefore serves as a public project summary and reimplementation-safe description of the research contributions, rather than a full release of the internal experimental framework.
No restricted LANL code, data, internal reports, or non-public experiment artifacts are included here.
Research Motivation
Large scientific computing codebases still rely heavily on legacy Fortran. Many of these systems are:
- long-lived,
- performance-sensitive,
- difficult to modernize manually,
- domain-specific,
- and expensive to validate.
Modern C++ is often preferred for maintainability, interoperability, tooling, and integration with newer software ecosystems. However, translating scientific Fortran code to C++ is difficult because translation must preserve:
- numerical behavior,
- memory semantics,
- array indexing patterns,
- control flow,
- compiler compatibility,
- and domain-specific logic.
This project explored whether LLMs can support this modernization process and how their outputs should be evaluated.
Research Questions
The project investigated questions such as:
- How well can open-source LLMs translate legacy Fortran code into C++ under controlled prompting conditions?
- How should translation quality be measured beyond surface-level similarity?
- Can compiler feedback improve translation correctness?
- What failure modes appear when LLMs translate scientific computing code?
- How do model size, prompting strategy, and session context affect translation performance?
My Contributions
During this research, I contributed to the design and implementation of an evaluation workflow for LLM-based code translation.
Publicly describable contributions include:
Evaluation Framework
- Designed and implemented components of a framework for evaluating multiple open-source LLMs on Fortran-to-C++ translation.
- Supported controlled prompting experiments for comparing translation behavior across models.
- Helped structure standardized test cases covering different Fortran programming patterns and translation challenges (a minimal sketch of one such case follows this list).
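To make this concrete, here is a minimal Python sketch of how a standardized test case and a zero-shot prompt might be structured. The `TranslationCase` type, its field names, and the prompt wording are illustrative assumptions, not the internal LANL framework.

```python
from dataclasses import dataclass

@dataclass
class TranslationCase:
    """One standardized test case: a Fortran snippet plus a C++ reference."""
    name: str            # e.g. "do_loop_array_sum" (hypothetical)
    fortran_source: str  # legacy Fortran input
    cpp_reference: str   # hand-written reference translation

def build_prompt(case: TranslationCase) -> str:
    """Zero-shot prompt: fixed instructions plus the Fortran source."""
    return (
        "Translate the following legacy Fortran code into modern, "
        "idiomatic C++. Preserve numerical behavior, array indexing, "
        "and control flow.\n\n" + case.fortran_source
    )
```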
Quantitative Assessment
- Worked with metrics such as CodeBLEU to assess structural and semantic similarity between generated C++ translations and reference implementations (a sketch of metric computation follows this list).
- Supported evaluation approaches for measuring translation quality beyond exact string matching.
- Helped compare model outputs across different architectures, model sizes, and prompting configurations.
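CodeBLEU combines n-gram overlap with AST and data-flow matching, so translations that differ in surface form but preserve structure still score well. Below is a minimal sketch using the open-source `codebleu` Python package; whether this particular package matches the internal tooling is an assumption.

```python
# pip install codebleu -- a public CodeBLEU implementation, used here
# purely for illustration.
from codebleu import calc_codebleu

reference_cpp = "for (int i = 0; i < n; ++i) { total += x[i]; }"
generated_cpp = "for (int i = 0; i < n; i++) total = total + x[i];"

result = calc_codebleu(
    references=[[reference_cpp]],  # one list of references per prediction
    predictions=[generated_cpp],
    lang="cpp",                    # target-language grammar for syntax matching
)
print(result["codebleu"])          # composite score in [0, 1]
```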
Compiler-in-the-Loop Validation
- Developed and evaluated feedback-loop ideas using compiler diagnostics.
- Explored how GCC/GFortran compiler errors could be used to guide iterative correction.
- Investigated agentic workflows where model outputs are refined using tool feedback, as sketched after this list.
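A simplified version of the compiler-feedback loop is sketched below. The `compile_check` and `repair_prompt` helpers, the g++ flags, and the prompt wording are all illustrative assumptions.

```python
import pathlib
import subprocess
import tempfile

def compile_check(cpp_source: str) -> tuple[bool, str]:
    """Try to compile generated C++ with g++; return (ok, diagnostics)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = pathlib.Path(tmp) / "candidate.cpp"
        src.write_text(cpp_source)
        proc = subprocess.run(
            ["g++", "-std=c++17", "-c", str(src), "-o", str(src.with_suffix(".o"))],
            capture_output=True,
            text=True,
        )
        return proc.returncode == 0, proc.stderr

def repair_prompt(cpp_source: str, diagnostics: str) -> str:
    """Follow-up prompt that feeds compiler errors back to the model."""
    return (
        "The following C++ translation failed to compile.\n"
        f"Compiler diagnostics:\n{diagnostics}\n"
        f"Code:\n{cpp_source}\n"
        "Return a corrected version that compiles."
    )
```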
Failure Analysis
- Analyzed common translation failures, including syntax errors, semantic mismatches, incorrect control flow, and numerical inconsistencies (an illustrative taxonomy follows this list).
- Studied how prompting strategy and session context affected translation quality.
- Contributed to interpretation of model behavior across code translation experiments.
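For a reimplementation, the failure categories above can be captured as a simple taxonomy for labeling model outputs; the enum below is an illustrative sketch, not the internal labeling scheme.

```python
from enum import Enum, auto

class FailureMode(Enum):
    """Illustrative taxonomy mirroring the failure categories above."""
    SYNTAX_ERROR = auto()             # generated C++ does not parse or compile
    SEMANTIC_MISMATCH = auto()        # compiles, but meaning diverges
    CONTROL_FLOW_ERROR = auto()       # loops or branches translated incorrectly
    NUMERICAL_INCONSISTENCY = auto()  # outputs differ beyond tolerance
```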
System-Level Workflow
At a high level, the research workflow can be represented as:
```mermaid
flowchart LR
    A[Legacy Fortran Code] --> B[Prompt Construction]
    B --> C[Open-Source LLM]
    C --> D[Generated C++ Translation]
    D --> E[Static / Structural Metrics]
    D --> F[Compiler Validation]
    F --> G[Compiler Error Feedback]
    G --> B
    D --> H[Functional / Qualitative Analysis]
```
The key idea was to evaluate code translation as more than text generation. Generated translations need to be checked for structural similarity, compilability, and functional correctness.
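Composing the hypothetical helpers sketched earlier, the loop in the flowchart might look like the following; the retry budget and the `generate` callable (any function that sends a prompt to an LLM and returns text) are assumptions.

```python
MAX_ROUNDS = 3  # illustrative retry budget, not the studied configuration

def translate_with_feedback(case, generate):
    """Prompt -> LLM -> compile -> repair, as in the flowchart above.

    Reuses the hypothetical build_prompt, compile_check, and
    repair_prompt helpers sketched in earlier sections.
    """
    prompt = build_prompt(case)
    candidate = ""
    for _ in range(MAX_ROUNDS):
        candidate = generate(prompt)
        ok, diagnostics = compile_check(candidate)
        if ok:
            return candidate  # hand off to metrics and functional analysis
        prompt = repair_prompt(candidate, diagnostics)
    return candidate          # best effort once the budget is exhausted
```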
Evaluation Dimensions
The project considered several complementary evaluation dimensions.
| Dimension | Purpose |
|---|---|
| CodeBLEU / structural similarity | Measures overlap in code structure and semantics |
| Compilation success | Checks whether generated C++ compiles successfully |
| Compiler diagnostics | Identifies syntax, type, and compatibility issues |
| Functional equivalence | Assesses whether translated code preserves intended behavior |
| Prompting strategy | Compares zero-shot and context/session-based prompting |
| Model comparison | Evaluates behavior across open-source LLMs of different sizes |
Technical Scope
The research involved:
| Area | Details |
|---|---|
| Source language | Legacy Fortran |
| Target language | C++ |
| Models | Open-source LLMs in the 7B–34B parameter range |
| Prompting | Zero-shot and session-maintained prompting strategies |
| Evaluation | CodeBLEU, custom metrics, compiler validation, qualitative failure analysis |
| Tooling | Python, Hugging Face Transformers, Ollama, GCC, GFortran |
| Analysis | Translation quality, consistency, error patterns, compiler-feedback behavior |
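To illustrate the two prompting modes listed above, here is a sketch using the `ollama` Python client to drive a locally served open-source model; the model tag and prompt wording are illustrative assumptions.

```python
import ollama  # pip install ollama; assumes a local Ollama server is running

MODEL = "codellama:13b"  # illustrative tag for a 7B-34B open-source code model

def zero_shot_translate(fortran_source: str) -> str:
    """Zero-shot: each request is self-contained, with no prior turns."""
    resp = ollama.chat(model=MODEL, messages=[
        {"role": "user",
         "content": f"Translate this Fortran to C++:\n{fortran_source}"},
    ])
    return resp["message"]["content"]

def session_translate(history: list, fortran_source: str) -> str:
    """Session-maintained: earlier turns stay in the message list, so
    later requests can build on prior context."""
    history.append({"role": "user",
                    "content": f"Translate this Fortran to C++:\n{fortran_source}"})
    resp = ollama.chat(model=MODEL, messages=history)
    history.append({"role": "assistant", "content": resp["message"]["content"]})
    return resp["message"]["content"]
```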
Publication
This work contributed to the following publication:
Nishath Rajiv Ranasinghe, Shawn M. Jones, Michal Kucer, Ayan Biswas, Daniel O’Malley, Alexander Most, Selma Liliane Wanna, and Ajay Sreekumar.
"LLM-Assisted Translation of Legacy FORTRAN Code to C++: A Cross-Platform Study."
North American Chapter of the Association for Computational Linguistics (NAACL), 2025.
Why This Work Matters
LLM-based code translation is promising, but scientific computing sets a higher bar than ordinary code generation.
A translated scientific program must do more than look plausible. It must:
- compile,
- preserve numerical behavior,
- respect language-specific semantics,
- maintain performance-sensitive structures,
- and be understandable to domain scientists and software maintainers.
This project explored the reliability boundaries of LLMs in that setting and studied how evaluation frameworks can better capture translation quality.
Repository Status
This repository is intentionally minimal.
| Component | Status |
|---|---|
| Public README summary | Available |
| Internal LANL code | Not released |
| Internal datasets/test cases | Not released |
| Internal figures/results | Not released |
| Published paper reference | Included |
| Reimplementation-safe methodology summary | Included |
Future public additions may include:
- toy examples using synthetic Fortran snippets,
- a simplified compiler-feedback demo,
- a public-safe evaluation template,
- or links to the final published paper page.
Technologies Referenced
- Python
- Hugging Face Transformers
- Ollama
- Open-source LLMs
- CodeBLEU
- GCC
- GFortran
- C++
- Fortran
- Matplotlib / Plotly for internal analysis workflows
Citation
If referencing this work, please cite the NAACL 2025 paper; the BibTeX entry below is provisional until the official citation page is available.
```bibtex
@inproceedings{ranasinghe2025llmfortran,
  title     = {LLM-Assisted Translation of Legacy FORTRAN Code to C++: A Cross-Platform Study},
  author    = {Ranasinghe, Nishath Rajiv and Jones, Shawn M. and Kucer, Michal and Biswas, Ayan and O'Malley, Daniel and Most, Alexander and Wanna, Selma Liliane and Sreekumar, Ajay},
  booktitle = {Proceedings of the North American Chapter of the Association for Computational Linguistics},
  year      = {2025}
}
```
License
This repository summary is released under the MIT License.
No restricted LANL code, data, internal documentation, or non-public research artifacts are included.