Amazon Athena: Serverless SQL Queries
TL;DR
Amazon Athena is a serverless query service for analyzing data in S3 using standard SQL. You pay $5 per TB scanned and manage no infrastructure. The catch: queries often take 10 seconds or more, and because billing follows bytes scanned, costs climb quickly on large, unpartitioned datasets. Partition pruning and columnar formats such as Parquet cut the data scanned, and with it the bill. Best for ad-hoc analysis, not for latency-sensitive production dashboards.
What Is It?
Athena is an interactive query service that runs SQL directly against data stored in S3, with no cluster to provision and no separate load step.
Key Features
- Standard SQL (Presto/Trino-based)
- No servers to manage
- Pay per query ($5/TB scanned)
- Federated queries (query data sources outside S3 through connectors)
Pricing
| Component | Price |
|---|---|
| Data scanned | $5/TB |
| Saved queries | Free |
| Workgroups | Free |
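
Because billing is driven by bytes scanned, the cost of a single query can be estimated with simple arithmetic. The sketch below uses the $5/TB rate from the table. The 10 MB per-query minimum, the rounding up to whole megabytes, and the decimal definitions of MB and TB are assumptions not stated in this article, so check them against the current pricing page before relying on the numbers.

```python
import math

PRICE_PER_TB_USD = 5.00   # from the pricing table above
BYTES_PER_MB = 10**6      # assumption: decimal megabyte
BYTES_PER_TB = 10**12     # assumption: decimal terabyte
MIN_BILLED_MB = 10        # assumption: 10 MB minimum billed per query


def athena_query_cost(bytes_scanned: int) -> float:
    """Estimate the cost in USD of one query from the bytes it scanned."""
    # Round up to whole megabytes, then apply the per-query minimum.
    billed_mb = max(math.ceil(bytes_scanned / BYTES_PER_MB), MIN_BILLED_MB)
    return billed_mb * BYTES_PER_MB / BYTES_PER_TB * PRICE_PER_TB_USD


# A query that scans 250 GB of data.
print(f"${athena_query_cost(250 * 10**9):.2f}")  # prints $1.25
```

Under these assumptions, a dashboard that fires that same 250 GB query every five minutes would cost about $360 per day, which is why scan size matters more than query count.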
Cost optimization:
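- Use Parquet format (queries read only the columns they reference; the 75% cost reduction is a ballpark that varies with the query and the data)
- Partition data so queries that filter on the partition key skip the partitions they do not need
- Use Glue Data Catalog to register tables and their partitions so Athena can prune them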
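
To see how partitioning and a columnar format combine, here is a rough back-of-the-envelope estimate. It assumes data is spread evenly across partitions and that every column is the same size on disk, and it ignores compression; real tables rarely behave that neatly. The table shape (1 TB, 365 daily partitions, 20 columns) is hypothetical.

```python
def estimated_scan_bytes(total_bytes: float,
                         partitions_total: int, partitions_read: int,
                         columns_total: int, columns_read: int) -> float:
    """Rough bytes scanned after partition pruning and column pruning.

    Assumes evenly sized partitions and columns. For row formats such as
    CSV, pass columns_read == columns_total, since every column is read.
    """
    if not 0 < partitions_read <= partitions_total:
        raise ValueError("partitions_read must be in 1..partitions_total")
    if not 0 < columns_read <= columns_total:
        raise ValueError("columns_read must be in 1..columns_total")
    partition_fraction = partitions_read / partitions_total
    column_fraction = columns_read / columns_total
    return total_bytes * partition_fraction * column_fraction


# Hypothetical table: 1 TB, 365 daily partitions, 20 columns.
full_scan = estimated_scan_bytes(10**12, 365, 365, 20, 20)
# Same table, but the query filters to 7 days and selects 4 columns.
pruned = estimated_scan_bytes(10**12, 365, 7, 20, 4)

print(f"full scan: {full_scan / 10**9:.1f} GB")
print(f"pruned:    {pruned / 10**9:.1f} GB")
```

In this idealized case the pruned query scans roughly 3.8 GB instead of 1000 GB. The partition savings only apply when the query's WHERE clause filters on the partition column.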
GCP Alternative: BigQuery
| Feature | Athena | BigQuery | Winner |
|---|---|---|---|
| Pricing | $5/TB scanned | $6.25/TB scanned | Athena |
| Speed | 10-60 sec | 1-10 sec | BigQuery |
| Storage | S3 separate | Integrated | BigQuery |
| Caching | Opt-in result reuse (up to 7 days) | Yes (24h, automatic) | BigQuery |
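
On the caching row: Athena does not reuse results by default, but it offers opt-in query result reuse that is requested per query. The sketch below only builds the request parameters; the helper name `reuse_enabled_query_params` and the `logs` table are hypothetical, and the parameter names follow boto3's `start_query_execution` as I understand it, so verify them against the SDK documentation.

```python
MAX_REUSE_AGE_MINUTES = 7 * 24 * 60  # assumption: reuse window capped at 7 days


def reuse_enabled_query_params(sql: str, workgroup: str,
                               max_age_minutes: int = 60) -> dict:
    """Build keyword arguments for a query that opts in to result reuse."""
    if not 0 <= max_age_minutes <= MAX_REUSE_AGE_MINUTES:
        raise ValueError("max_age_minutes must be between 0 and 10080")
    return {
        "QueryString": sql,
        "WorkGroup": workgroup,
        "ResultReuseConfiguration": {
            "ResultReuseByAgeConfiguration": {
                "Enabled": True,
                "MaxAgeInMinutes": max_age_minutes,
            }
        },
    }


params = reuse_enabled_query_params(
    "SELECT status, count(*) FROM logs GROUP BY status",  # hypothetical table
    workgroup="primary",
    max_age_minutes=30,
)

# With boto3 installed and AWS credentials configured, the call would be:
#   import boto3
#   boto3.client("athena").start_query_execution(**params)
# This assumes the workgroup already has a query result location set.
```

A reused result scans no data, so repeated identical queries inside the window should not add to the scan bill.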
Verdict
Grade: B+
Best for:
- Ad-hoc analysis
- Log analysis
- One-time queries
- Cost-conscious teams with modest scan volumes
When to use BigQuery instead:
- Production dashboards
- Queries that need low latency
- Storage and query engine in one service
Researcher 🔬 — Staff Software Architect