Data Sources
February 13, 2026
Overview
spotinfo combines multiple data sources to provide comprehensive AWS EC2 Spot Instance information, including pricing, interruption rates, and placement scores.
Primary Data Sources
1. AWS Spot Instance Advisor Data
- Source: AWS Spot Advisor JSON feed
- Maintained by: AWS team
- Update frequency: Regularly updated by AWS
- Contains:
- Instance specifications (vCPU, memory, EMR compatibility)
- Interruption frequency ranges
- Savings percentages compared to on-demand pricing
- Regional availability data
2. AWS Spot Pricing Data
- Source: AWS Spot Pricing JS callback file
- Maintained by: AWS team
- Update frequency: Regularly updated by AWS
- Contains:
- Current spot prices by region and instance type
- Operating system pricing variations (Linux/Windows)
- Historical pricing trends
3. AWS EC2 Live Spot Pricing API
- Source: AWS DescribeSpotPriceHistory API
- Access: Real-time API calls (requires ec2:DescribeSpotPriceHistory permission)
- Purpose: Fills in pricing for newer instance types (e.g., m8g, r8g) missing from the static feed
- Trigger: Only called when instances have advisor data but $0 pricing from the static feed
- Contains:
- Current spot prices per instance type and region
- Prices from the last hour of trading
- Behavior:
- Fetches prices in parallel across regions
- Batches requests (up to 50 instance types per call)
- Gracefully degrades: if the API is unavailable, prices remain $0
- Results marked with live_price: true in output
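The trigger and batching rules above can be sketched in Go as follows. This is a minimal illustration rather than spotinfo's actual code: the instance type, its fields, and the helper names needsLivePrice and batch are all hypothetical, and only the batch size of 50 comes from the text.

```go
package main

import "fmt"

// instance is a hypothetical merged record; the field names are illustrative.
type instance struct {
	Type       string
	HasAdvisor bool    // advisor data exists for this type
	Price      float64 // static-feed price; 0 means the feed had none
}

// needsLivePrice returns the types that have advisor data but a $0 static price.
func needsLivePrice(all []instance) []string {
	var out []string
	for _, in := range all {
		if in.HasAdvisor && in.Price == 0 {
			out = append(out, in.Type)
		}
	}
	return out
}

// batch splits types into consecutive groups of at most size elements.
func batch(types []string, size int) [][]string {
	var out [][]string
	for size > 0 && len(types) > 0 {
		n := size
		if len(types) < n {
			n = len(types)
		}
		out = append(out, types[:n])
		types = types[n:]
	}
	return out
}

func main() {
	all := []instance{
		{Type: "m5.large", HasAdvisor: true, Price: 0.035},
		{Type: "m8g.large", HasAdvisor: true, Price: 0},
		{Type: "r8g.large", HasAdvisor: true, Price: 0},
	}
	// 50 is the per-call limit stated above.
	for i, b := range batch(needsLivePrice(all), 50) {
		fmt.Printf("request %d: %v\n", i+1, b)
	}
}
```

Each batch would then become one API request per region; instance types that already have a static price are never sent.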
4. AWS Spot Placement Scores API
- Source: AWS GetSpotPlacementScores API
- Access: Real-time API calls (requires IAM permissions)
- Contains:
- Regional placement scores (1-10 scale)
- Availability zone-level placement scores
- Likelihood of successful spot instance launch
- Contextual scoring based on request composition
Data Flow Architecture
graph TB
A[AWS Spot Advisor<br/>JSON Feed] --> D[Data Aggregation]
B[AWS Spot Pricing<br/>JS Feed] --> D
B2[AWS EC2 Live Pricing<br/>DescribeSpotPriceHistory] --> D
C[AWS Placement Scores<br/>API] --> D
D --> E[spotinfo Engine]
E --> F[Embedded Data<br/>Fallback]
E --> G[Cached Results]
G --> H[CLI Output]
F --> H
style A fill:#e1f5fe
style B fill:#e1f5fe
style B2 fill:#fff3e0
style C fill:#fff3e0
style F fill:#f3e5f5
style G fill:#e8f5e8
Network Resilience
Embedded Data
- Purpose: Ensure functionality without network connectivity
- Implementation: Data is embedded into the binary during build
- Coverage: Complete spot advisor and pricing data snapshot
- Update process: Refreshed during each build via make update-data
Fallback Strategy
- Primary: Fetch fresh data from AWS feeds
- Secondary: Use embedded data if network unavailable
- Live Pricing: For instance types with $0 in the static feed, fetch current prices via the EC2 DescribeSpotPriceHistory API (requires AWS credentials)
- Placement Scores: Graceful degradation to mock scores if the API is inaccessible
Data Processing Pipeline
1. Data Fetching
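// Pseudo-code flow: try each AWS feed first, then fall back to the embedded snapshot.
// fetchFromURL and the loadEmbedded* helpers are placeholders that return nil on failure.
func fetchData() (advisorData, pricingData []byte) {
	advisorData = fetchFromURL("https://spot-bid-advisor.s3.amazonaws.com/spot-advisor-data.json")
	if advisorData == nil {
		advisorData = loadEmbeddedAdvisorData()
	}
	pricingData = fetchFromURL("http://spot-price.s3.amazonaws.com/spot.js")
	if pricingData == nil {
		pricingData = loadEmbeddedPricingData()
	}
	return advisorData, pricingData
}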
2. Data Transformation
- JSON parsing: Convert AWS JSON format to internal structures
- Price extraction: Parse JavaScript callback format for pricing
- Data normalization: Standardize formats across sources
- Validation: Ensure data integrity and completeness
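As an illustration of the price extraction step, the sketch below strips a JSONP-style callback(...) wrapper and parses what remains as JSON. It assumes the file is a single callback wrapper around valid JSON. The sample payload is hypothetical and far simpler than the real feed, and unwrapCallback is an illustrative name, not a function from spotinfo.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// unwrapCallback returns the text between the first "(" and the last ")",
// which is the JSON payload of a wrapper such as callback({...});
func unwrapCallback(js string) (string, error) {
	start := strings.Index(js, "(")
	end := strings.LastIndex(js, ")")
	if start < 0 || end <= start {
		return "", errors.New("no callback wrapper found")
	}
	return strings.TrimSpace(js[start+1 : end]), nil
}

func main() {
	// Hypothetical, heavily simplified payload; the real feed is nested more deeply.
	js := `callback({"region":"us-east-1","prices":{"m5.large":"0.0351"}});`

	payload, err := unwrapCallback(js)
	if err != nil {
		fmt.Println("unwrap:", err)
		return
	}

	var doc struct {
		Region string            `json:"region"`
		Prices map[string]string `json:"prices"`
	}
	if err := json.Unmarshal([]byte(payload), &doc); err != nil {
		fmt.Println("parse:", err)
		return
	}
	fmt.Println(doc.Region, doc.Prices["m5.large"])
}
```

Prices are kept as strings here and would be converted to numbers during normalization.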
3. Data Enrichment
- Instance type mapping: Combine advisor and pricing data
- Score integration: Add placement scores when requested
- Regional filtering: Apply user-specified region constraints
- Specification filtering: Apply CPU, memory, and price filters
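The enrichment steps amount to a join on instance type followed by filtering. The sketch below shows one way this could look; all type and field names are hypothetical, region filtering is left out for brevity, and treating a zero MaxPrice as "no limit" is an assumption of this example.

```go
package main

import (
	"fmt"
	"sort"
)

// advisorEntry and enriched are hypothetical shapes for the two data sets.
type advisorEntry struct {
	VCPU      int
	MemoryGiB float64
	Savings   int // percent saved versus on-demand
}

type enriched struct {
	Type      string
	VCPU      int
	MemoryGiB float64
	Savings   int
	Price     float64
}

// filter holds minimum specs and a maximum price; a zero MaxPrice means no limit.
type filter struct {
	MinVCPU      int
	MinMemoryGiB float64
	MaxPrice     float64
}

// enrich joins advisor and pricing data by instance type, applies the filter,
// and returns the matches sorted by type name.
func enrich(advisor map[string]advisorEntry, prices map[string]float64, f filter) []enriched {
	var out []enriched
	for t, a := range advisor {
		p := prices[t] // 0 when the static feed has no price for this type
		if a.VCPU < f.MinVCPU || a.MemoryGiB < f.MinMemoryGiB {
			continue
		}
		if f.MaxPrice > 0 && p > f.MaxPrice {
			continue
		}
		out = append(out, enriched{Type: t, VCPU: a.VCPU, MemoryGiB: a.MemoryGiB, Savings: a.Savings, Price: p})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Type < out[j].Type })
	return out
}

func main() {
	advisor := map[string]advisorEntry{
		"t3.micro": {VCPU: 2, MemoryGiB: 1, Savings: 70},
		"m5.large": {VCPU: 2, MemoryGiB: 8, Savings: 60},
	}
	prices := map[string]float64{"t3.micro": 0.003, "m5.large": 0.035}
	for _, e := range enrich(advisor, prices, filter{MinMemoryGiB: 4}) {
		fmt.Printf("%+v\n", e)
	}
}
```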
Cache Strategy
Placement Score Caching
- Cache duration: 10 minutes
- Cache key format: region:az_flag:instance_types
- Purpose: Reduce AWS API calls and improve performance
- Implementation: LRU cache with expiration
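A rough sketch of the key format and the 10-minute expiry is shown below. It is deliberately simplified: the cache is a plain map with expiry and omits LRU eviction and locking, and sorting and comma-joining the instance types in the key is an assumption, as are all names in the example. The manualClock type exists only so that expiry can be demonstrated without waiting.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"time"
)

const scoreTTL = 10 * time.Minute // cache duration stated above

// cacheKey builds a region:az_flag:instance_types key.
// Sorting the types (an assumption) makes the key independent of argument order.
func cacheKey(region string, perAZ bool, types []string) string {
	sorted := append([]string(nil), types...)
	sort.Strings(sorted)
	return fmt.Sprintf("%s:%t:%s", region, perAZ, strings.Join(sorted, ","))
}

type entry struct {
	score   int
	expires time.Time
}

// ttlCache is a map with per-entry expiry; no LRU eviction, not safe for concurrent use.
type ttlCache struct {
	ttl   time.Duration
	now   func() time.Time // injectable clock
	items map[string]entry
}

func newTTLCache(ttl time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{ttl: ttl, now: now, items: make(map[string]entry)}
}

func (c *ttlCache) set(key string, score int) {
	c.items[key] = entry{score: score, expires: c.now().Add(c.ttl)}
}

// get returns the cached score, dropping the entry if it has expired.
func (c *ttlCache) get(key string) (int, bool) {
	e, ok := c.items[key]
	if !ok || !c.now().Before(e.expires) {
		delete(c.items, key)
		return 0, false
	}
	return e.score, true
}

// manualClock only moves when advance is called.
type manualClock struct{ t time.Time }

func (m *manualClock) now() time.Time          { return m.t }
func (m *manualClock) advance(d time.Duration) { m.t = m.t.Add(d) }

func main() {
	clock := &manualClock{t: time.Now()}
	cache := newTTLCache(scoreTTL, clock.now)
	key := cacheKey("us-east-1", false, []string{"m5.large", "c5.large"})

	cache.set(key, 8)
	_, hit := cache.get(key)
	fmt.Println(key, "hit:", hit)

	clock.advance(scoreTTL + time.Second)
	_, hit = cache.get(key)
	fmt.Println(key, "hit after expiry:", hit)
}
```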
Data Freshness Tracking
- Timestamp tracking: Record when data was last fetched
- Freshness indicators: Visual indicators for stale data (>30 minutes)
- JSON metadata: Include score_fetched_at timestamps in output
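The staleness rule reduces to a comparison against a 30-minute threshold. In the sketch below, only the score_fetched_at field name and the threshold come from the text; the struct, the other JSON field names, and isStale are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

const staleAfter = 30 * time.Minute // threshold stated above

// scoreResult is a hypothetical output record; only score_fetched_at is named in the text.
type scoreResult struct {
	InstanceType   string    `json:"instance_type"`
	Score          int       `json:"score"`
	ScoreFetchedAt time.Time `json:"score_fetched_at"`
}

// isStale reports whether the data is more than 30 minutes old at time now.
func isStale(fetchedAt, now time.Time) bool {
	return now.Sub(fetchedAt) > staleAfter
}

func main() {
	now := time.Now().UTC()
	r := scoreResult{
		InstanceType:   "m5.large",
		Score:          8,
		ScoreFetchedAt: now.Add(-45 * time.Minute),
	}
	out, err := json.Marshal(r)
	if err != nil {
		fmt.Println("marshal:", err)
		return
	}
	fmt.Println(string(out))
	fmt.Println("stale:", isStale(r.ScoreFetchedAt, now))
}
```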
Data Accuracy and Limitations
Spot Advisor Data
- Accuracy: High - directly from AWS
- Limitations:
- Static snapshot updated periodically by AWS
- May not reflect real-time market conditions
- Regional variations in update frequency
Spot Pricing Data
- Accuracy: High - current market prices
- Limitations:
- Prices change frequently
- Some regions may have delayed updates
- Embedded data becomes stale over time
Live Spot Pricing (EC2 API)
- Accuracy: Real-time from AWS API
- Limitations:
- Requires ec2:DescribeSpotPriceHistory IAM permission
- Only triggered for instance types missing from the static feed
- Adds latency (parallel fetches with 10s timeout per region)
- Prices marked with live_price: true in output to distinguish from static data
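The latency cost of parallel fetches with a per-region timeout can be pictured with the sketch below. It is not spotinfo's implementation: fetchFunc stands in for a real DescribeSpotPriceHistory call, stubFetch simulates one region that never answers, and all names are hypothetical. Only the 10-second timeout comes from the text, and the demo shortens it so that it finishes quickly.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

const regionTimeout = 10 * time.Second // per-region timeout stated above

// fetchFunc stands in for a per-region DescribeSpotPriceHistory call (hypothetical signature).
type fetchFunc func(ctx context.Context, region string) (map[string]float64, error)

// fetchAll queries every region concurrently, each with its own timeout.
// Regions that fail or time out are left out, so their prices stay at $0.
func fetchAll(regions []string, timeout time.Duration, fetch fetchFunc) map[string]map[string]float64 {
	var (
		mu sync.Mutex
		wg sync.WaitGroup
	)
	out := make(map[string]map[string]float64)
	for _, region := range regions {
		wg.Add(1)
		go func(region string) {
			defer wg.Done()
			ctx, cancel := context.WithTimeout(context.Background(), timeout)
			defer cancel()
			prices, err := fetch(ctx, region)
			if err != nil {
				return // graceful degradation: skip this region
			}
			mu.Lock()
			out[region] = prices
			mu.Unlock()
		}(region)
	}
	wg.Wait()
	return out
}

// stubFetch simulates the API: "slow-region" hangs until its context expires.
func stubFetch(ctx context.Context, region string) (map[string]float64, error) {
	if region == "slow-region" {
		<-ctx.Done()
		return nil, ctx.Err()
	}
	return map[string]float64{"m8g.large": 0.04}, nil
}

func main() {
	// Timeout shortened to 50ms so the demo does not wait the full 10 seconds.
	got := fetchAll([]string{"us-east-1", "slow-region"}, regionTimeout/200, stubFetch)
	fmt.Println(got)
}
```

Because the regions run concurrently, the total wait is bounded by the slowest region's timeout rather than the sum of all of them.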
Placement Scores
- Accuracy: Real-time from AWS API
- Limitations:
- Requires proper IAM permissions
- May be restricted by Service Control Policies
- Contextual scoring can be confusing to users
- API rate limits apply
Data Update Process
Build-Time Updates
# Update embedded data during build
make update-data # Updates spot advisor data
make update-price # Updates spot pricing data
make build # Embeds fresh data in binary
Runtime Data Flow
- Startup: Load embedded data as baseline
- Network fetch: Attempt to fetch fresh data from AWS feeds
- Merge: Combine fresh data with embedded fallback
- API calls: Fetch placement scores on demand (if enabled)
- Cache: Store results for performance optimization
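The merge step can be thought of as overlaying fresh data on the embedded baseline. The sketch below assumes that a fresh value wins whenever both sources have an entry for the same key; the text does not state the precedence rule, and the function name and map shape are hypothetical.

```go
package main

import "fmt"

// merge copies the embedded baseline and overlays any fresh values on top.
// A nil fresh map (for example after a failed fetch) leaves the baseline unchanged.
func merge(embedded, fresh map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(embedded)+len(fresh))
	for k, v := range embedded {
		out[k] = v
	}
	for k, v := range fresh {
		out[k] = v
	}
	return out
}

func main() {
	embedded := map[string]float64{"m5.large": 0.034, "c5.large": 0.031}
	fresh := map[string]float64{"m5.large": 0.036}
	fmt.Println(merge(embedded, fresh))
	fmt.Println(merge(embedded, nil)) // offline: embedded data only
}
```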
Monitoring and Observability
Data Source Health
- Connection testing: Verify AWS feed accessibility
- Data validation: Ensure JSON structure integrity
- Fallback detection: Log when embedded data is used
Performance Metrics
- Fetch duration: Monitor AWS feed response times
- Cache hit rate: Track placement score cache effectiveness
- API quota usage: Monitor placement score API consumption
Security Considerations
API Access
- IAM permissions: ec2:DescribeSpotPriceHistory (live pricing), ec2:GetSpotPlacementScores (placement scores)
- Credential management: Uses AWS SDK default credential chain
- Network security: HTTPS for advisor data, HTTP for pricing (AWS provided)
- Optional: Both API features degrade gracefully without credentials
Data Privacy
- No personal data: All data is public AWS pricing information
- No data retention: Only temporary caching for performance
- No external transmission: Data stays within AWS and local system
Troubleshooting Data Issues
Common Problems
Stale pricing data:
# Force fresh data fetch
make update-data update-price build
Missing placement scores:
# Verify API permissions
aws ec2 get-spot-placement-scores --instance-types t3.micro --target-capacity 1 --region us-east-1
Network connectivity issues:
- Tool automatically falls back to embedded data
- Check network connectivity to spot-bid-advisor.s3.amazonaws.com
- Verify firewall settings for outbound HTTPS
Permission errors:
- Check IAM policy includes ec2:GetSpotPlacementScores
- Verify no Service Control Policy blocks the action
- Test with AWS CLI: aws sts get-caller-identity
See Also
- AWS Spot Placement Scores - Detailed placement score documentation
- Troubleshooting - Common issues and solutions
- Usage Guide - Command reference and examples