Awesome Performance Engineering
April 17, 2026
The discipline that ensures systems deliver fast, reliable, and cost-efficient experiences at any scale, combining observability and performance testing.
Contents
- Observability
- Metrics Collection & Time-Series Storage
- Distributed Tracing
- Log Management & Log Pipelines
- Observability Pipelines and Telemetry Processing
- Visualization & Dashboards
- Profiling & Continuous Performance Analysis
- Alerting & Incident Response
- Observability Platforms (Integrated)
- Monitoring Suites (Operations-Oriented)
- Service Mesh Observability
- Database Observability
- Real User Monitoring (RUM) & Frontend Observability
- AI-Augmented Observability
- SLO Management
- Synthetic Monitoring
- Legacy & Historical
- Performance Testing
- Load & Stress Testing
- HTTP Benchmarking & Micro-Benchmarking
- API Testing & Contract Testing
- gRPC & Protocol-Specific Testing
- Browser & Frontend Performance
- Service Virtualization and Mocking
- Synthetic Data Generation
- Database Performance Testing & Benchmarking
- System & Infrastructure Benchmarking
- Chaos Engineering & Fault Injection
- Network Simulation & Traffic Shaping
- CI/CD Integration & Performance Gates
- Results Analysis & Reporting
- Cloud Provider Services
- Developer-Centric Platforms
- Enterprise Platforms
- Tools & Integrations
- Related
Indicators: ⭐ Widely adopted · 🟢 Active · 🔵 Cloud-native · 🟠 Commercial · 🚀 High performance
Observability
Metrics Collection & Time-Series Storage
- Prometheus - ⭐🟢🔵 Pull-based cloud-native metrics platform with dimensional data model and PromQL query language.
- VictoriaMetrics - ⭐🟢🚀 High-performance, cost-efficient Prometheus-compatible TSDB with high-cardinality and long-retention support.
- Thanos - ⭐🟢🔵 Long-term storage, global query view, and high availability layer for Prometheus via sidecar architecture.
- Mimir - ⭐🟢🔵🚀 Horizontally scalable, multi-tenant Prometheus-compatible TSDB from Grafana Labs.
- InfluxDB - 🟢🚀 Purpose-built time-series database with high write throughput and a Rust-based engine (v3).
- Grafana Alloy - ⭐🟢🔵 OpenTelemetry-native telemetry collector supporting metrics, logs, traces, and profiles.
- Telegraf - 🟢 Plugin-driven agent for collecting and reporting metrics with 300+ input plugins.
- StatsD - Lightweight, UDP-based metrics aggregation daemon with broad application support.
- Netdata - ⭐🟢🚀 Real-time per-second monitoring with built-in anomaly detection and zero-configuration agent.
Distributed Tracing
- OpenTelemetry - ⭐🟢🔵 Open standard for distributed tracing, metrics, and logs with language-specific SDKs and auto-instrumentation.
- Jaeger - ⭐🟢🔵 CNCF graduated distributed tracing backend and UI, originally from Uber.
- Grafana Tempo - ⭐🟢🔵 High-scale tracing backend requiring only object storage, with native Grafana integration.
- Zipkin - 🟢 Pioneering distributed tracing system (Twitter, 2012) with a simple architecture.
- Apache SkyWalking - ⭐🟢🔵 Observability platform with bytecode-injection-based tracing, popular in the Java ecosystem.
- SigNoz - 🟢🔵 Open-source OpenTelemetry-native observability platform with unified metrics, traces, and logs.
- Pinpoint - Bytecode-instrumentation-based APM and tracing for Java and PHP with zero-code-change approach.
Log Management & Log Pipelines
- Grafana Loki - ⭐🟢🔵 Label-based log aggregation that indexes metadata instead of content for cost-efficient storage at scale.
- Fluent Bit - ⭐🟢🔵🚀 Lightweight, high-performance log processor and forwarder for edge and containerized environments.
- Fluentd - 🟢🔵 CNCF graduated unified logging layer with 1000+ plugins for complex routing.
- Elasticsearch - ⭐🟢🟠 Distributed search and analytics engine with powerful full-text search capabilities.
- OpenSearch - 🟢🔵 Community-driven, Apache-2.0-licensed fork of Elasticsearch, backed by AWS.
- Logstash - Flexible log ingestion and transformation pipeline, part of the Elastic Stack.
- Graylog - 🟢🟠 Centralized log management with built-in alerting and dashboards.
- rsyslog - 🟢🚀 High-performance system logging daemon handling millions of messages per second.
Observability Pipelines and Telemetry Processing
- OpenTelemetry Collector - ⭐🟢🔵 Standard telemetry processing pipeline with receivers, processors, and exporters for any signal.
- Vector - 🟢🚀 End-to-end observability data routing and transformation with programmable VRL transforms.
- Logstash - ETL-style processing for observability data with powerful filter plugins.
- Cribl Stream - 🟠🚀 Commercial observability pipeline for routing, reducing, and enriching telemetry data.
Visualization & Dashboards
- Grafana - ⭐🟢 Open-source observability dashboard platform supporting 100+ data sources with alerting and annotations.
- Kibana - 🟢🟠 Visualization and log exploration for Elasticsearch and OpenSearch data.
- OpenSearch Dashboards - 🟢🔵 Open-source fork of Kibana for OpenSearch.
- Apache Superset - 🟢 SQL-first analytics and dashboarding platform for ad-hoc data exploration.
- Perses - 🟢🔵 CNCF sandbox dashboards-as-code project with native PromQL and TraceQL support.
Profiling & Continuous Performance Analysis
- Parca - ⭐🟢🔵 eBPF-based continuous profiling platform with zero-instrumentation and differential flame graphs (CNCF sandbox).
- Grafana Pyroscope - ⭐🟢🔵 Continuous profiling with flame graph visualization and multi-language support.
- async-profiler - 🟢🚀 Low-overhead JVM sampling profiler capturing CPU, allocation, and lock contention profiles.
- perf - 🚀 Linux kernel performance analysis tool with hardware counters, tracepoints, and sampling.
- bpftrace - 🟢🚀 High-level tracing language for Linux eBPF with dynamic kernel and user-space tracing.
- bcc (BPF Compiler Collection) - 🟢🚀 Toolkit for creating eBPF-based tracing programs with dozens of ready-to-use tools.
- Grafana Beyla - 🟢🔵🚀 eBPF-based zero-code auto-instrumentation generating RED metrics and distributed traces.
- Perfetto - 🟢 System-wide tracing and profiling toolkit from Google for Android, Chrome, and general system analysis.
Alerting & Incident Response
- Alertmanager - ⭐🟢 Prometheus-native alert handling with grouping, silencing, inhibition, and routing.
- Grafana OnCall - 🟢🔵 Open-source on-call management and alert routing with native Grafana integration.
- Keep - 🟢🔵 Open-source alert management platform consolidating alerts from multiple sources.
- Alerta - 🟢 Unified alert correlation and management across multiple monitoring systems.
- PagerDuty - 🟠 Industry-standard incident response and on-call management platform.
- Opsgenie - 🟠 Alerting and escalation platform, part of the Atlassian suite.
- Rootly - 🟠 AI-assisted incident management with automated timelines and postmortem generation.
Observability Platforms (Integrated)
- Datadog - 🟠 SaaS observability platform with AI-powered anomaly detection and root-cause analysis.
- Dynatrace - 🟠 AI-driven observability with automatic topology discovery and root-cause analysis (Davis AI).
- New Relic - 🟠 Developer-centric observability platform with NRQL query language and a generous free tier.
- Splunk Observability - 🟠 Observability built on Splunk's machine data analytics platform.
- Elastic Observability - 🟠 Observability solution built on the Elastic Stack with self-managed and cloud options.
- Honeycomb - 🟠 Observability platform for high-cardinality event data with BubbleUp automated correlation.
- Grafana Cloud - 🟠 Managed Grafana stack (Mimir, Loki, Tempo, Pyroscope) with a generous free tier.
- Instana (IBM) - 🟠 Automatic infrastructure and application discovery with real-time observability.
- AppDynamics (Splunk/Cisco) - 🟠 Enterprise APM with business transaction monitoring and code-level diagnostics.
- Chronosphere - 🟠 Cloud-native observability platform focused on metrics at scale with cost control.
- Lightstep / ServiceNow Cloud Observability - 🟠 OpenTelemetry-native observability platform, now part of ServiceNow.
- Sematext - 🟢🟠 SaaS observability platform with OpenTelemetry-native support and topology discovery.
Monitoring Suites (Operations-Oriented)
- Zabbix - 🟢 Enterprise-grade monitoring platform with agent-based and agentless monitoring.
- Nagios - 🟢 Pioneering open-source check-based monitoring with an enormous plugin ecosystem.
- Icinga - 🟢 Modern evolution of Nagios with improved APIs, configuration management, and scalability.
- Checkmk - 🟢🟠 Infrastructure and application monitoring with auto-discovery for large environments.
Service Mesh Observability
- Kiali - 🟢🔵 Observability console for Istio with topology visualization and traffic flow analysis.
- Linkerd Viz - 🟢🔵 Built-in telemetry and dashboard for Linkerd service mesh.
- Hubble - 🟢🔵🚀 eBPF-powered network observability for Cilium with L3/L4/L7 flow visibility.
Database Observability
- PMM (Percona Monitoring and Management) - 🟢 Open-source database performance monitoring for MySQL, PostgreSQL, and MongoDB.
- pgwatch - 🟢 PostgreSQL-specific monitoring and metrics collection.
- pg_stat_monitor - 🟢 PostgreSQL extension for enhanced query performance monitoring.
- VividCortex / SolarWinds DPM - 🟠 SaaS query-level database performance monitoring.
- Datadog DBM - 🟠 Database monitoring with query-level explain plans, wait event analysis, and trace correlation.
Real User Monitoring (RUM) & Frontend Observability
- Sentry - 🟢 Error tracking and performance monitoring with session replay and Web Vitals.
- Grafana Faro - 🟢🔵 Open-source frontend observability SDK capturing errors, performance, and user events.
- OpenTelemetry Browser SDK - 🟢 OTel instrumentation for web applications capturing page loads and resource timings.
- LogRocket - 🟠 Session replay combined with frontend performance monitoring.
AI-Augmented Observability
- Dynatrace Davis AI - 🟠 Deterministic and causal AI for topology-aware automatic root-cause analysis.
- Datadog Watchdog - 🟠 ML-driven anomaly detection across metrics, logs, and APM data.
- Moogsoft - 🟠 AIOps platform for alert correlation, noise reduction, and incident clustering.
- New Relic AI - 🟠 Applied intelligence with anomaly detection, incident correlation, and natural-language querying.
- Honeycomb BubbleUp - 🟠 Automated outlier correlation across high-cardinality dimensions.
- Coroot - 🟢🔵 Open-source eBPF-powered observability with automated service map discovery.
SLO Management
- Sloth - 🟢🔵 SLO generation for Prometheus with YAML definitions and multi-window multi-burn-rate alerts.
- Pyrra - 🟢🔵 Kubernetes-native SLO management generating Prometheus recording rules and alerts.
- OpenSLO - 🟢 Open, vendor-neutral specification for defining SLOs as code.
- Nobl9 - 🟠 Enterprise SLO platform with unified tracking and error budget management.
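The multi-window, multi-burn-rate alerting mentioned in the Sloth entry can be illustrated with a short sketch. This is not the implementation used by Sloth or Pyrra; the function names and the example threshold are assumptions chosen for illustration. Burn rate is the observed error ratio divided by the error budget (1 minus the SLO target), and the alert fires only when both a long and a short window exceed the threshold, so it clears soon after errors stop.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how many times faster than sustainable the error budget is being spent."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    error_budget = 1.0 - slo_target  # e.g. 0.001 for a 99.9% SLO
    return error_ratio / error_budget


def should_alert(long_window_ratio: float, short_window_ratio: float,
                 slo_target: float, threshold: float) -> bool:
    """Fire only if both windows burn the budget at or above the threshold."""
    return (burn_rate(long_window_ratio, slo_target) >= threshold
            and burn_rate(short_window_ratio, slo_target) >= threshold)


# Hypothetical numbers: 99.9% SLO, 2% errors in both windows, threshold 14.4.
print(should_alert(0.02, 0.02, slo_target=0.999, threshold=14.4))   # True
# The short window has recovered, so the alert no longer fires.
print(should_alert(0.02, 0.001, slo_target=0.999, threshold=14.4))  # False
```

In practice the error ratios would come from recording rules over different time ranges; the values above are made up.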
Synthetic Monitoring
- Checkly - 🟢🔵 Monitoring as code for APIs and browsers with Playwright-based synthetic checks.
- Grafana Synthetic Monitoring - 🟢🔵 Probe-based multi-location synthetic monitoring integrated into Grafana Cloud.
- Uptime Kuma - ⭐🟢 Self-hosted monitoring tool with HTTP, TCP, DNS, and keyword checks.
- Sematext - 🟢🟠 Playwright-based synthetic checks with CI/CD integration and SSL monitoring.
Legacy & Historical
- Graphite - Pioneering time-series storage and graphing system with Whisper backend and Carbon collector.
- Redash - SQL-first data visualization and collaboration connecting to many data sources.
Performance Testing
Load & Stress Testing
- k6 - ⭐🟢🔵 Modern load testing tool with JavaScript ES6 scripting and native Prometheus/Grafana integration.
- Gatling - ⭐🟢🚀 High-performance load testing framework with Scala/Java/Kotlin DSL and detailed HTML reports.
- Locust - ⭐🟢 Python-based load testing framework defining user behavior in plain Python code.
- Apache JMeter - ⭐🟢 Load testing tool with GUI and extensive protocol support (HTTP, JDBC, JMS, LDAP, SOAP).
- Artillery - 🟢🔵 Node.js-based load testing toolkit with YAML scenarios supporting HTTP, WebSocket, and Socket.io.
- NBomber - 🟢 Load testing framework for .NET with C#/F# scripting.
- Tsung - 🚀 Erlang-based distributed load testing tool handling massive concurrent connections across multiple protocols.
- GoReplay (gor) - 🟢🚀 Capture and replay production HTTP traffic for load testing with real traffic patterns.
- Anteon (formerly Ddosify) - 🔵 eBPF-based Kubernetes performance testing platform with distributed load generation.
- NeoLoad - 🟠 Enterprise performance testing platform with codeless and as-code options.
- LoadRunner / OpenText - 🟠 Enterprise performance testing platform with broad protocol support.
HTTP Benchmarking & Micro-Benchmarking
- wrk2 - 🚀 Constant-throughput HTTP benchmarking with accurate latency histograms that avoids coordinated omission.
- wrk - 🚀 HTTP benchmarking tool with Lua scripting for quick relative performance comparisons.
- Vegeta - 🟢🚀 HTTP load testing tool with constant request rate mode and built-in plotting.
- hey - 🟢 Simple HTTP load generator, a modern alternative to ApacheBench (ab).
- oha - 🟢🚀 Rust-based HTTP load generator with real-time TUI.
- bombardier - 🟢🚀 Fast, cross-platform HTTP benchmarking tool with detailed latency reporting.
- Hyperfoil - 🟢🔵🚀 Distributed benchmarking framework designed to avoid coordinated omission.
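Coordinated omission, named in the wrk2 and Hyperfoil entries, is easiest to see in a toy model. The sketch below is not how either tool is implemented; the function name and the numbers are illustrative assumptions. A single-connection load generator intends to send one request per fixed interval. When one response stalls, later requests are sent late, and a naive measurement (from actual send time) hides that queueing delay, while a corrected measurement (from intended send time) includes it.

```python
def measure_latencies(service_times, interval):
    """Simulate one connection sending requests at a fixed intended rate.

    Returns (naive, corrected): latency measured from the actual send time
    versus latency measured from the intended send time.
    """
    naive, corrected = [], []
    connection_free_at = 0.0
    for i, service_time in enumerate(service_times):
        intended_send = i * interval
        # The request cannot go out until the previous response has arrived.
        actual_send = max(intended_send, connection_free_at)
        finished = actual_send + service_time
        naive.append(finished - actual_send)
        corrected.append(finished - intended_send)
        connection_free_at = finished
    return naive, corrected


# Hypothetical workload: the second request stalls for 5 time units.
naive, corrected = measure_latencies([1.0, 5.0, 1.0, 1.0], interval=2.0)
print(naive)      # [1.0, 5.0, 1.0, 1.0]
print(corrected)  # [1.0, 5.0, 4.0, 3.0]
```

The naive series reports a single slow request, whereas the corrected series shows that the two requests queued behind it were also slow from the caller's point of view.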
API Testing & Contract Testing
- Hurl - 🟢 Plain-text HTTP request runner for API testing in CI with assertions and chaining.
- Postman - ⭐🟢🟠 API development and testing platform with Newman CLI for CI/CD integration.
- REST-assured - 🟢 Java DSL for testing REST APIs with fluent syntax and JUnit/TestNG integration.
- Karate - 🟢 BDD-style API testing framework combining API testing, mocking, and performance testing.
- Step CI - 🟢 Open-source YAML-based API testing and monitoring framework for CI/CD.
- Pact - 🟢 Contract testing framework ensuring provider-consumer compatibility for HTTP APIs and messaging.
- Dredd - API testing tool that validates implementations against OpenAPI and API Blueprint specifications.
gRPC & Protocol-Specific Testing
- ghz - 🟢🚀 gRPC benchmarking and load testing tool supporting unary and streaming RPCs.
- k6 + xk6-grpc - 🟢🔵 k6 extension for scriptable gRPC load testing scenarios.
- k6 + xk6-kafka - 🟢🔵 k6 extension for Apache Kafka load testing at scale.
- kafka-producer-perf-test / kafka-consumer-perf-test - 🟢 Built-in Kafka benchmarking tools for producer and consumer throughput.
- RabbitMQ PerfTest - 🟢 Official RabbitMQ benchmarking tool for throughput and latency measurement.
- k6 + xk6-websockets - 🟢🔵 Built-in k6 WebSocket support for testing real-time and bidirectional protocols.
Browser & Frontend Performance
- Lighthouse - ⭐🟢 Google's auditing tool for performance, accessibility, and SEO with actionable scores.
- WebPageTest - ⭐🟢 Web performance analysis with filmstrip views, waterfall charts, and multi-location testing.
- Playwright - ⭐🟢 Browser automation framework with built-in performance timing APIs for Chromium, Firefox, and WebKit.
- Sitespeed.io - 🟢 Open-source web performance monitoring integrating Lighthouse, WebPageTest, and Grafana dashboards.
- Puppeteer - 🟢 Chrome DevTools Protocol API enabling programmatic access to performance traces and network interception.
- Yellow Lab Tools - 🟢 Frontend code quality and performance auditing for JavaScript, CSS, and rendering issues.
- SpeedCurve - 🟠 Continuous frontend performance monitoring with Core Web Vitals tracking and competitive benchmarking.
Service Virtualization and Mocking
- WireMock - ⭐🟢🔵 HTTP mock server with request matching, stateful behavior, response templating, and fault injection.
- Mountebank - 🟢 Multi-protocol service virtualization supporting HTTP, HTTPS, TCP, and SMTP.
- Hoverfly - 🟢🔵 Lightweight service virtualization with capture-and-replay mode for API simulation.
- MockServer - 🟢 HTTP/HTTPS mock server with expectation-based matching and callback actions.
- Microcks - 🟢🔵 Kubernetes-native API mocking and testing importing OpenAPI, AsyncAPI, gRPC, and GraphQL contracts.
Synthetic Data Generation
- Faker - ⭐🟢 Realistic fake data generation for JavaScript/TypeScript with massive locale support.
- DataFaker - 🟢 Modern Java data generation library with expression-based generation.
- Mimesis - 🟢🚀 High-performance fake data generator for Python with strong locale support.
- Neosync - 🔵 Open-source platform for anonymizing production data and generating synthetic datasets.
Database Performance Testing & Benchmarking
- HammerDB - ⭐🟢 Open-source database benchmarking tool supporting TPC-C and TPC-H workloads across major databases.
- sysbench - ⭐🟢🚀 Scriptable multi-threaded benchmark tool for OLTP, CPU, memory, and I/O tests.
- pgbench - 🟢 PostgreSQL built-in benchmarking tool with custom scripts for workload simulation.
- YCSB (Yahoo! Cloud Serving Benchmark) - ⭐🟢 Framework for benchmarking NoSQL and NewSQL databases with standard workloads.
- BenchBase (formerly OLTPBench) - 🟢 Multi-DBMS benchmarking framework supporting TPC-C, TPC-H, and YCSB workloads.
- mysqlslap - MySQL built-in load emulation client for quick benchmarks.
System & Infrastructure Benchmarking
- fio - ⭐🟢🚀 Reference I/O benchmarking tool with configurable workloads and multiple engines (libaio, io_uring).
- stress-ng - 🟢🚀 System stress testing tool with 300+ methods covering CPU, memory, I/O, and network.
- Phoronix Test Suite - 🟢 Comprehensive benchmarking platform with 500+ test profiles and result comparison.
- iperf3 - ⭐🟢🚀 Network bandwidth measurement tool for TCP/UDP throughput testing.
Chaos Engineering & Fault Injection
- Litmus - ⭐🟢🔵 CNCF incubating Kubernetes chaos engineering platform with extensive experiment library.
- Chaos Mesh - ⭐🟢🔵 CNCF incubating Kubernetes-native chaos platform with pod, network, and I/O fault injection.
- Gremlin - 🟠 Enterprise chaos engineering platform with managed experiments and safety controls.
- Chaos Monkey - ⭐🟢 Netflix's pioneering chaos tool that randomly terminates instances in production.
- Pumba - 🟢🔵 Chaos testing for Docker containers with network delay and packet loss injection.
- Steadybit - 🟠🔵 Enterprise reliability platform combining chaos engineering with resilience validation.
- AWS Fault Injection Service - 🟠🔵 Managed fault injection for AWS resources with native service integration.
Network Simulation & Traffic Shaping
- tc (Traffic Control) - Linux kernel traffic shaping with netem qdisc for network emulation.
- Comcast - CLI tool for simulating bad network conditions wrapping tc/pfctl.
- Clumsy - 🟢 Windows network condition simulator for packet drop, lag, throttle, and reordering.
CI/CD Integration & Performance Gates
- Gatling Enterprise - 🟠 Managed Gatling execution with CI/CD integrations and historical comparison.
- Lighthouse CI - 🟢 Run Lighthouse in CI with performance budgets, baseline comparison, and trend tracking.
- Taurus - 🟢 YAML-based automation wrapper for JMeter, Gatling, and Locust with unified reporting.
Results Analysis & Reporting
- k6 HTML Report - 🟢 Standalone HTML report generator for k6 test results.
- HdrHistogram - 🟢🚀 High Dynamic Range Histogram for accurate latency measurement capturing the full distribution.
- Gatling Reports - 🟢 Built-in HTML reports with percentile distributions and response time series.
- Apache JMeter Dashboard - 🟢 Built-in HTML dashboard generating APDEX scores and response time distributions.
- Taurus Reporting - 🟢 Unified reporting across multiple load testing engines with BlazeMeter integration.
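The APDEX score mentioned in the JMeter Dashboard entry can be sketched in a few lines. This follows the commonly cited Apdex convention (satisfied at or below a threshold T, tolerating up to 4T, tolerating samples counted at half weight) and is not taken from JMeter's source; the function name and sample values are assumptions, and individual tools may let you configure the two thresholds independently.

```python
def apdex(samples_ms, threshold_ms):
    """Return the Apdex score (0.0 to 1.0) for a list of response times in ms."""
    if not samples_ms:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for s in samples_ms if s <= threshold_ms)
    tolerating = sum(1 for s in samples_ms
                     if threshold_ms < s <= 4 * threshold_ms)
    # Frustrated samples (slower than 4T) contribute nothing to the score.
    return (satisfied + tolerating / 2) / len(samples_ms)


# Hypothetical samples with T = 500 ms: 2 satisfied, 2 tolerating, 1 frustrated.
print(apdex([100, 200, 600, 1500, 3000], threshold_ms=500))  # 0.6
```

A single score like this is convenient for CI gates, but it discards the shape of the distribution, so it is best read alongside percentiles.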
Cloud Provider Services
- Azure App Testing - 🟠🔵 Microsoft's managed load testing service supporting JMeter and Locust with multi-region simulation.
- AWS Distributed Load Testing - 🟠🔵 Distributed load testing architecture on AWS via CloudFormation supporting JMeter, k6, and Locust.
Developer-Centric Platforms
- Grafana k6 Cloud - 🟠 Managed k6 execution with multi-region load zones and real-time Grafana visualization.
- OctoPerf - 🟠 SaaS performance testing platform built on JMeter with distributed load generation.
Enterprise Platforms
- BlazeMeter - 🟠 Cloud performance testing platform supporting JMeter, Gatling, Locust, Selenium, and Playwright.
Tools & Integrations
- Datadog Synthetic Monitoring + AI - 🟠 Synthetic API and browser tests with ML-powered anomaly detection and APM correlation.
- Dynatrace Load Testing Integration - 🟠 Automated CI/CD quality gates using AI-based performance evaluation against baselines.
Related
- awesome-sre-tools - SRE and production engineering tools.
- sre-learning-resources - Learning paths for modern SRE skills.
- awesome-monitoring - Monitoring and observability tools.
- awesome-testing - Software testing methodologies and frameworks.
- awesome-chaos-engineering - Chaos engineering platforms and resources.
- awesome-kubernetes - Kubernetes tooling and cloud-native patterns.
- awesome-k8s-security - Kubernetes security and hardening.
- awesome-scalability - Scalability patterns and architectures.