Code Generation Example (CodeGen)

November 5, 2025

Overview

The Code Generation (CodeGen) example demonstrates an AI application that assists developers by generating code from natural language prompts or existing code context. It uses Large Language Models (LLMs) trained on large datasets of source code repositories and programming documentation.

This example showcases how developers can quickly deploy and utilize a CodeGen service, potentially integrating it into their IDEs or development workflows to accelerate tasks like code completion, translation, summarization, refactoring, and error detection.

Problem Motivation

Writing, understanding, and maintaining code can be time-consuming and complex. Developers often perform repetitive coding tasks, struggle with translating between languages, or need assistance understanding large codebases. CodeGen LLMs address this by automating code generation, providing intelligent suggestions, and assisting with various code-related tasks, thereby boosting productivity and reducing development friction. This OPEA example provides a blueprint for deploying such capabilities using optimized components.

Architecture

High-Level Diagram

The CodeGen application follows a microservice-based architecture enabling scalability and flexibility. User requests are processed through a gateway, which orchestrates interactions between various backend services, including the core LLM for code generation and potentially retrieval-augmented generation (RAG) components for context-aware responses.
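From a client's point of view, the gateway is the only endpoint that needs to be known. The Python sketch below builds and sends a single request. The port (`7778`), the path (`/v1/codegen`), and the `messages` payload field are assumptions for illustration and are not stated in this document; check the deployment guide for your platform for the actual values.

```python
import json
import urllib.request

# Assumed values for illustration only; confirm them in your deployment guide.
ASSUMED_GATEWAY_PORT = 7778
ASSUMED_GATEWAY_PATH = "/v1/codegen"


def build_codegen_request(host_ip: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the CodeGen gateway."""
    url = f"http://{host_ip}:{ASSUMED_GATEWAY_PORT}{ASSUMED_GATEWAY_PATH}"
    body = json.dumps({"messages": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_codegen_request(host_ip: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request and return the raw response body as text."""
    request = build_codegen_request(host_ip, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from sending makes it possible to inspect or log the exact payload without a running deployment.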

[Figure: High-level Architecture]

OPEA Microservices Diagram

This example utilizes several microservices from the OPEA GenAIComps repository. The diagram below illustrates the interaction between these components for a typical CodeGen request, potentially involving RAG using a vector database.

---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 25px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style CodeGen-MegaService stroke:#000000
    %% Subgraphs %%
    subgraph CodeGen-MegaService["CodeGen-MegaService"]
        direction LR
        EM([Embedding<br>MicroService]):::blue
        RET([Retrieval<br>MicroService]):::blue
        RER([Agents]):::blue
        LLM([LLM<br>MicroService]):::blue
    end
    subgraph User Interface
        direction LR
        a([Submit Query Tab]):::orchid
        UI([UI server]):::orchid
        Ingest([Manage Resources]):::orchid
    end

    CLIP_EM{{Embedding<br>service}}
    VDB{{Vector DB}}
    V_RET{{Retriever<br>service}}
    Ingest{{Ingest data}}
    DP([Data Preparation]):::blue
    LLM_gen{{LLM Serving}}
    GW([CodeGen GateWay]):::orange

    %% Data Preparation flow
    direction LR
    Ingest[Ingest data] --> UI
    UI --> DP
    DP <-.-> CLIP_EM

    %% Questions interaction
    direction LR
    a[User Input Query] --> UI
    UI --> GW
    GW <==> CodeGen-MegaService
    EM ==> RET
    RET ==> RER
    RER ==> LLM


    %% Embedding service flow
    direction LR
    EM <-.-> CLIP_EM
    RET <-.-> V_RET
    LLM <-.-> LLM_gen

    direction TB
    %% Vector DB interaction
    V_RET <-.->VDB
    DP <-.->VDB
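
The request path in the diagram (Embedding, Retrieval, Agents, LLM) can be read as a chain of stages, each enriching a shared request state. The following is a toy Python sketch of that flow only: every function and field name is hypothetical, the stages are in-memory stubs, and it does not use the GenAIComps API.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class RequestState:
    """Hypothetical state passed between stages of the pipeline."""
    query: str
    embedding: List[float] = field(default_factory=list)
    context: List[str] = field(default_factory=list)
    prompt: str = ""
    answer: str = ""


Stage = Callable[[RequestState], RequestState]

# Stand-in for the vector database: (embedding, snippet) pairs.
TOY_INDEX = [
    ([1.0, 0.0], "def add(a, b): return a + b"),
    ([0.0, 1.0], "def read_file(path): return open(path).read()"),
]


def embed(state: RequestState) -> RequestState:
    # Stub embedding: a 2-d vector keyed on one keyword.
    state.embedding = [1.0, 0.0] if "add" in state.query else [0.0, 1.0]
    return state


def retrieve(state: RequestState) -> RequestState:
    # Pick the snippet whose embedding has the largest dot product.
    def score(entry):
        return sum(a * b for a, b in zip(entry[0], state.embedding))
    state.context = [max(TOY_INDEX, key=score)[1]]
    return state


def agent(state: RequestState) -> RequestState:
    # Stub agent step: assemble the final prompt from context and query.
    state.prompt = "Context:\n" + "\n".join(state.context) + "\n\nTask: " + state.query
    return state


def llm(state: RequestState) -> RequestState:
    # Stub LLM: reports the prompt length instead of generating code.
    state.answer = f"<generated from a {len(state.prompt)}-character prompt>"
    return state


def run_pipeline(query: str, stages: List[Stage]) -> RequestState:
    """Run each stage in order, threading the state through."""
    state = RequestState(query=query)
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline("add two numbers", [embed, retrieve, agent, llm])
```

In the real deployment each stage is a separate microservice backed by its own serving component, so the function calls here stand in for network requests.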

Deployment Options

This CodeGen example can be deployed manually on various hardware platforms using Docker Compose or Kubernetes. Select the appropriate guide based on your target environment:

| Hardware | Deployment Mode | Guide Link |
| --- | --- | --- |
| Intel Xeon CPU | Single Node (Docker) | Xeon Docker Compose Guide |
| Intel Xeon CPU | Single Node (Docker) with Monitoring | Xeon Docker Compose with Monitoring Guide |
| Intel Gaudi HPU | Single Node (Docker) | Gaudi Docker Compose Guide |
| Intel Gaudi HPU | Single Node (Docker) with Monitoring | Gaudi Docker Compose with Monitoring Guide |
| AMD EPYC CPU | Single Node (Docker) | EPYC Docker Compose Guide |
| AMD ROCm GPU | Single Node (Docker) | ROCm Docker Compose Guide |
| Intel Xeon CPU | Kubernetes (Helm) | Kubernetes Helm Guide |
| Intel Gaudi HPU | Kubernetes (Helm) | Kubernetes Helm Guide |
| Intel Xeon CPU | Kubernetes (GMC) | Kubernetes GMC Guide |
| Intel Gaudi HPU | Kubernetes (GMC) | Kubernetes GMC Guide |

Note: To build custom microservice images, use the resources in GenAIComps.

Monitoring

The CodeGen example supports monitoring on Intel Xeon and Intel Gaudi platforms. The monitoring stack includes:

  • Prometheus: For metrics collection and querying
  • Grafana: For visualization and dashboards
  • Node Exporter: For system metrics collection

Monitoring Features

  • Real-time metrics collection from all CodeGen microservices
  • Pre-configured dashboards for:
    • vLLM/TGI performance metrics
    • CodeGen MegaService metrics
    • System resource utilization
    • Node-level metrics
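
One way to confirm that metrics are being collected is to ask Prometheus which scrape targets are up. The Python sketch below uses the Prometheus HTTP API instant-query endpoint (`/api/v1/query`) with the built-in `up` metric. The helper names are hypothetical, and the instance names returned by a real deployment depend on its Prometheus configuration.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict


def instant_query_url(prometheus_base: str, promql: str) -> str:
    """Build the URL for a Prometheus instant query."""
    query_string = urllib.parse.urlencode({"query": promql})
    return f"{prometheus_base.rstrip('/')}/api/v1/query?{query_string}"


def parse_up_targets(payload: dict) -> Dict[str, bool]:
    """Map each scraped instance to True (up) or False (down)."""
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    targets = {}
    for sample in payload["data"]["result"]:
        instance = sample["metric"].get("instance", "<unknown>")
        # Each sample value is a [timestamp, "string value"] pair.
        targets[instance] = sample["value"][1] == "1"
    return targets


def fetch_up_targets(prometheus_base: str, timeout: float = 10.0) -> Dict[str, bool]:
    """Query a live Prometheus server (requires network access)."""
    url = instant_query_url(prometheus_base, "up")
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_up_targets(json.load(response))
```

Any instance mapped to `False` is a target that Prometheus is configured to scrape but could not reach on its last attempt.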

Enabling Monitoring

Monitoring can be enabled by using the compose.monitoring.yaml file along with the main compose file:

# For Intel Xeon (run from the Xeon Docker Compose directory)
docker compose -f compose.yaml -f compose.monitoring.yaml up -d

# For Intel Gaudi (run from the Gaudi Docker Compose directory)
docker compose -f compose.yaml -f compose.monitoring.yaml up -d
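
Passing several `-f` files to `docker compose` layers them in order, so the monitoring file extends the base stack rather than replacing it. A small Python wrapper, sketched below, can make the monitoring overlay optional in deployment scripts. The function names are hypothetical, and the sketch assumes it is run from the directory that holds both compose files.

```python
import subprocess
from typing import List

BASE_FILE = "compose.yaml"
MONITORING_FILE = "compose.monitoring.yaml"


def compose_up_command(with_monitoring: bool) -> List[str]:
    """Return the docker compose argument list, optionally adding monitoring."""
    files = [BASE_FILE]
    if with_monitoring:
        files.append(MONITORING_FILE)
    command = ["docker", "compose"]
    for name in files:
        command += ["-f", name]
    return command + ["up", "-d"]


def deploy(with_monitoring: bool = True) -> None:
    """Run the command; raises CalledProcessError if docker compose fails."""
    subprocess.run(compose_up_command(with_monitoring), check=True)
```

Calling `deploy(with_monitoring=False)` starts only the base stack, which is equivalent to omitting the second `-f` argument.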

Accessing Monitoring Services

Once deployed with monitoring, you can access:

  • Prometheus: http://${HOST_IP}:9090
  • Grafana: http://${HOST_IP}:3000 (username: admin, password: admin)
  • Node Exporter: http://${HOST_IP}:9100
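
A quick scripted check can confirm that the three services respond after deployment. The Python sketch below probes Prometheus's `/-/healthy` endpoint, Grafana's `/api/health` endpoint, and Node Exporter's `/metrics` page on the ports listed above. The helper names are hypothetical, and `host_ip` stands in for the `${HOST_IP}` value of your deployment.

```python
import urllib.error
import urllib.request
from typing import Dict


def monitoring_health_urls(host_ip: str) -> Dict[str, str]:
    """Return one probe URL per monitoring service."""
    return {
        "prometheus": f"http://{host_ip}:9090/-/healthy",
        "grafana": f"http://{host_ip}:3000/api/health",
        "node-exporter": f"http://{host_ip}:9100/metrics",
    }


def check_monitoring(host_ip: str, timeout: float = 5.0) -> Dict[str, bool]:
    """Probe each service; True means it answered with HTTP 200."""
    results = {}
    for name, url in monitoring_health_urls(host_ip).items():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                results[name] = response.status == 200
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or a non-2xx HTTP error.
            results[name] = False
    return results
```

These probes need no login, so they work even before the Grafana default password has been changed.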

Benchmarking

Guides for evaluating the performance and accuracy of this CodeGen deployment are available:

| Benchmark Type | Guide Link |
| --- | --- |
| Accuracy | Accuracy Benchmark Guide |
| Performance | Performance Benchmark Guide |

Automated Deployment using Terraform

Intel® Optimized Cloud Modules for Terraform provide an automated way to deploy this CodeGen example on various Cloud Service Providers (CSPs).

| Cloud Provider | Intel Architecture | Intel Optimized Cloud Module for Terraform | Comments |
| --- | --- | --- | --- |
| AWS | 4th Gen Intel Xeon with Intel AMX | AWS Deployment | Available |
| GCP | 4th/5th Gen Intel Xeon | GCP Deployment | Available |
| Azure | 4th/5th Gen Intel Xeon | Work-in-progress | Coming Soon |
| Intel Tiber AI Cloud | 5th Gen Intel Xeon with Intel AMX | Work-in-progress | Coming Soon |

Validated Configurations

| Deploy Method | LLM Engine | LLM Model | Hardware |
| --- | --- | --- | --- |
| Docker Compose | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | Intel Gaudi |
| Docker Compose | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | Intel Xeon |
| Docker Compose | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | AMD EPYC |
| Docker Compose | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | AMD ROCm |
| Helm Charts | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | Intel Gaudi |
| Helm Charts | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | Intel Xeon |
| Helm Charts | vLLM, TGI | Qwen/Qwen2.5-Coder-7B-Instruct | AMD ROCm |

Contribution

We welcome contributions to the OPEA project. Please refer to the contribution guidelines for more information.