Cheetah Platform - Apache Flink Examples
December 19, 2025
A collection of example Apache Flink streaming jobs demonstrating various data processing patterns and best practices for the Cheetah data platform.
Overview
This repository contains multiple real-world examples of Apache Flink jobs, each showcasing different streaming data processing concepts:
- AvroToJson - Convert Avro-serialized messages to JSON format
- EnrichStream - Enrich streaming data with additional information
- ExternalLookup - Perform lookups against external APIs during stream processing
- FlinkStates - Demonstrate Flink's stateful processing capabilities (ValueState, ReducingState, AggregatingState, ListState, MapState)
- JsonToAvro - Convert JSON messages to Avro format with schema registry integration
- KeySerializationSchema - Custom serialization for Kafka message keys
- MultipleSideOutput - Split streams into multiple outputs based on conditions
- Observability - Monitor and instrument Flink jobs with metrics and logging
- SerializationErrorCatch - Handle serialization errors gracefully
- SerializationErrorSideOutput - Route serialization errors to a separate output stream
- TransformAndStore - Transform data and store results in multiple sinks
- TumblingWindow - Implement time-based windowing operations
Prerequisites
Local Infrastructure
All examples require the local development infrastructure from the cheetah-development-infrastructure repository. This provides:
- Apache Kafka for message streaming
- Schema Registry for Avro schemas
- Keycloak for authentication
- Redpanda Console for Kafka inspection
Start the infrastructure:
# Clone the infrastructure repo
git clone https://github.com/trifork/cheetah-development-infrastructure
cd cheetah-development-infrastructure
# Start Kafka and related services
docker compose --profile kafka up -d
GitHub Package Authentication
The Flink jobs depend on Maven packages from the Cheetah Maven Repository on GitHub. You'll need to:
- Create a GitHub Personal Access Token with the read:packages scope at https://github.com/settings/tokens/new
- Set environment variables:
export GITHUB_ACTOR=your-github-username
export GITHUB_TOKEN=your-github-token
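Before building, it can help to confirm that both variables are actually set. The following is a minimal sketch, not part of the repository: `check_github_env` is a hypothetical helper name, and it only checks that the variables are non-empty in the current shell, not that they are exported or that the token is valid.

```shell
# Hypothetical helper (not part of the repository): report which of the
# required GitHub credentials are unset or empty in the current shell.
check_github_env() {
  local missing=0 name value
  for name in GITHUB_ACTOR GITHUB_TOKEN; do
    # Read the variable whose name is stored in $name; empty if unset
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # 0 when both variables are set, 1 otherwise
  return "$missing"
}
```

One possible use is to guard the build, for example `check_github_env && docker compose up --build`, so that a missing credential is reported before the Maven packages are fetched.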
Kafka Topics
Each example requires specific Kafka topics to be created before running. Refer to each job's README for the required topics and setup instructions.
Getting Started
Each example is self-contained with:
- /src - Java source code for the Flink job
- /ComponentTest - .NET integration tests
- docker-compose.yaml - Configuration for running locally
- README.md - Detailed documentation and setup instructions
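The layout above can be checked with a few lines of shell before running an example. This is a sketch under the assumption that every example follows the same four-entry layout; `check_example_layout` is a hypothetical helper name and is not provided by the repository.

```shell
# Hypothetical helper: verify that a directory contains the four entries
# each example is described as having.
check_example_layout() {
  local dir="$1" entry status=0
  for entry in src ComponentTest docker-compose.yaml README.md; do
    if [ ! -e "$dir/$entry" ]; then
      echo "$dir: missing $entry" >&2
      status=1
    fi
  done
  # 0 when all entries exist, 1 if any is missing
  return "$status"
}
```

For example, `check_example_layout AvroToJson` run from the repository root prints nothing and returns 0 when the directory is complete.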
To run an example:
# Navigate to an example directory
cd AvroToJson
# Build and start the Flink job
docker compose up --build
For detailed information about each job, including specific configuration, testing procedures, and implementation details, see the README in each example's directory.
Testing
Each example includes:
- Unit tests - JUnit tests in /src/test (automatically run during Maven build)
- Component tests - .NET integration tests in /ComponentTest that verify end-to-end behavior
Run component tests via Docker Compose or directly with dotnet test.
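Running the component tests directly might look like the sketch below. It assumes that the job and the local infrastructure are already running, and that the ComponentTest directory holds a single project or solution so that `dotnet test` needs no arguments; `run_component_tests` is a hypothetical helper name. Check each example's README for the exact procedure.

```shell
# Hypothetical helper: run the .NET component tests of one example.
# Assumes the Flink job and the local infrastructure are already up.
run_component_tests() {
  local example="$1"
  if [ ! -d "$example/ComponentTest" ]; then
    echo "$example: no ComponentTest directory" >&2
    return 1
  fi
  # Subshell so the caller's working directory is left unchanged
  ( cd "$example/ComponentTest" && dotnet test )
}
```

From the repository root this would be called as `run_component_tests AvroToJson`.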