The Performance Challenge
When our data warehouse queries started taking minutes instead of seconds, it was time for serious optimization. Working with terabytes of data across multiple fact and dimension tables, we needed a systematic approach to improve performance while maintaining data integrity and availability.
Understanding Redshift Architecture
Before optimizing, it's essential to understand how Redshift works:
- Columnar Storage: Data is stored by column, not row
- Massively Parallel Processing (MPP): Queries execute across multiple nodes
- Data Distribution: How data is distributed across compute nodes affects performance
- Query Compilation: Queries are compiled and cached for reuse
Distribution Keys: The Foundation
Choosing the right distribution key is perhaps the most critical optimization decision.
Distribution Styles
- KEY Distribution: Distributes rows based on a single column's value
- ALL Distribution: Copies entire table to all nodes (small dimension tables)
- EVEN Distribution: Distributes rows evenly across slices
- AUTO Distribution: Lets Redshift pick a style and change it as the table grows
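These styles are set in the table DDL. A minimal sketch, assuming hypothetical `sales` fact and `store` dimension tables:

```sql
-- Fact table: distribute on the join column so matching rows co-locate
CREATE TABLE sales (
    sale_id  BIGINT,
    store_id INT,
    sale_ts  TIMESTAMP,
    amount   DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (store_id);

-- Small dimension table: replicate to every node to avoid redistribution
CREATE TABLE store (
    store_id   INT,
    store_name VARCHAR(100)
)
DISTSTYLE ALL;
```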
Best Practices
- Choose columns used frequently in joins as distribution keys
- Aim for even data distribution to avoid skew
- Use ALL distribution for small dimension tables (under 3M rows)
- Monitor distribution using SVV_TABLE_INFO
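One way to check for skew via SVV_TABLE_INFO (a sketch; which columns matter is a judgment call, but these are standard columns of the system view):

```sql
-- Tables with the highest row skew across slices; skew_rows near 1 is ideal
SELECT "table", diststyle, skew_rows, tbl_rows
FROM svv_table_info
ORDER BY skew_rows DESC
LIMIT 10;
```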
Sort Keys: Accelerating Queries
Sort keys determine the physical order of data storage, enabling zone maps for efficient data pruning.
Types of Sort Keys
- Compound Sort Keys: Multiple columns in order of significance
- Interleaved Sort Keys: Equal weight to all columns (use sparingly)
Selection Strategy
We improved query performance dramatically by:
- Using timestamp columns for time-series data
- Choosing columns used in WHERE clauses and range filters
- Avoiding interleaved sort keys unless absolutely necessary
- Running regular VACUUM operations to maintain sort order
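For time-series data, that strategy typically translates into a compound sort key that leads with the timestamp. A sketch with illustrative table and column names:

```sql
-- Compound sort key: zone maps prune blocks when queries filter on event_ts
CREATE TABLE events (
    event_id BIGINT,
    user_id  INT,
    event_ts TIMESTAMP
)
DISTSTYLE KEY
DISTKEY (user_id)
COMPOUND SORTKEY (event_ts, user_id);
```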
Compression Encoding
Proper compression reduces storage costs and improves I/O performance.
Automatic Compression
Redshift can automatically choose encodings during COPY operations:
- Let Redshift analyze sample data
- Review recommendations using ANALYZE COMPRESSION
- Apply encodings manually for fine-tuning
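A sketch of that workflow, assuming a hypothetical `sales` table (the S3 path and IAM role are placeholders):

```sql
-- Let COPY pick encodings automatically when loading an empty table
COPY sales FROM 's3://my-bucket/sales/'         -- hypothetical path
IAM_ROLE 'arn:aws:iam::123456789012:role/load'  -- hypothetical role
COMPUPDATE ON;

-- Review suggested encodings for an already-loaded table
ANALYZE COMPRESSION sales;
```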
Encoding Types
- RAW: No compression (rarely used)
- AZ64: High compression with fast encode/decode; supports numeric, date, and timestamp types (not CHAR/VARCHAR)
- LZO: Fast compression/decompression
- ZSTD: High compression ratio
- DELTA: Great for sequential numeric values
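Applying encodings manually looks like this (a sketch; the per-column choices are illustrative, and note that AZ64 does not apply to character columns):

```sql
CREATE TABLE sales (
    sale_id  BIGINT       ENCODE az64,
    sale_ts  TIMESTAMP    ENCODE az64,
    category VARCHAR(50)  ENCODE zstd,  -- AZ64 not supported for VARCHAR
    notes    VARCHAR(500) ENCODE lzo
);
```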
Query Optimization Techniques
1. Minimize Data Movement
Data redistribution is expensive. Strategies to minimize it:
- Join tables on their distribution keys
- Filter data early in the query
- Use subqueries judiciously
- Avoid cross-joins at all costs
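If both tables are distributed on the join column (a hypothetical schema with `store_id` as the distribution key on each side), a join like this needs no redistribution:

```sql
-- Both sides distributed on store_id: the join runs locally on each slice
SELECT st.store_name, SUM(s.amount) AS revenue
FROM sales s
JOIN store st ON st.store_id = s.store_id  -- join on the distribution key
WHERE s.sale_ts >= '2024-01-01'            -- filter early to cut rows moved
GROUP BY st.store_name;
```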
2. Use Materialized Views
For frequently executed aggregations:
- Create materialized views for complex aggregations
- Enable automatic query rewrite
- Schedule refreshes during low-traffic periods
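A minimal sketch, with hypothetical view and table names (AUTO REFRESH delegates refresh scheduling to Redshift; automatic query rewrite applies to eligible materialized views without extra configuration):

```sql
CREATE MATERIALIZED VIEW daily_revenue
AUTO REFRESH YES
AS
SELECT DATE_TRUNC('day', sale_ts) AS sale_day,
       SUM(amount) AS revenue
FROM sales
GROUP BY 1;

-- Or refresh manually from a scheduled job during low-traffic hours
REFRESH MATERIALIZED VIEW daily_revenue;
```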
3. Leverage Result Caching
Redshift caches query results automatically:
- Identical queries return cached results instantly
- Cache remains valid until underlying data changes
- Monitor cache usage through STL_QUERY and SVL_QLOG
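One way to see whether queries were served from the result cache is SVL_QLOG, whose source_query column points at the original query when a result came from cache (my reading of the system view's documented columns):

```sql
-- Rows where source_query is non-null were answered from the result cache
SELECT userid, query, substring, source_query
FROM svl_qlog
WHERE source_query IS NOT NULL
ORDER BY query DESC
LIMIT 20;
```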
Table Maintenance
VACUUM Operations
Regular VACUUM operations are crucial:
- VACUUM DELETE ONLY: Reclaims space from deleted rows
- VACUUM SORT ONLY: Re-sorts unsorted data
- VACUUM REINDEX: Rebuilds interleaved sort keys
- Schedule during maintenance windows
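In SQL these look like (table names hypothetical):

```sql
VACUUM DELETE ONLY sales;  -- reclaim space from deleted/updated rows
VACUUM SORT ONLY sales;    -- re-sort rows without reclaiming space
VACUUM REINDEX events;     -- re-analyze and rebuild interleaved sort keys
VACUUM FULL sales;         -- default: reclaim and sort in one pass
```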
ANALYZE Statistics
Keep query planner statistics current:
- Run ANALYZE after large data loads
- Enable automatic ANALYZE
- Monitor using SVV_TABLE_INFO
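SVV_TABLE_INFO exposes a stats_off column (how stale the planner statistics are, as a percentage) that can drive ANALYZE scheduling, for example:

```sql
-- Tables whose planner statistics are most out of date
SELECT "table", stats_off, unsorted
FROM svv_table_info
WHERE stats_off > 10
ORDER BY stats_off DESC;

ANALYZE sales;  -- refresh statistics after a large load (hypothetical table)
```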
Monitoring and Troubleshooting
Key System Tables
- STL_QUERY: Query execution history
- SVL_QUERY_REPORT: Detailed query execution steps
- STL_ALERT_EVENT_LOG: Performance warnings
- SVV_TABLE_INFO: Table statistics and skew
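For example, STL_ALERT_EVENT_LOG surfaces issues the planner flagged, such as missing statistics or nested-loop joins (column names here are from the standard system table, to the best of my recollection):

```sql
-- Recent planner alerts, with the event text and suggested fix
SELECT query, event, solution, event_time
FROM stl_alert_event_log
ORDER BY event_time DESC
LIMIT 20;
```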
Query Execution Plans
Use EXPLAIN to understand query execution:
- Identify expensive operations
- Look for data redistribution steps
- Check for sequential scans on large tables
- Verify join types and order
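A typical workflow is to prefix the query with EXPLAIN and read the join step attributes, where DS_DIST_NONE means no redistribution while broadcast or two-sided distribution signals expensive data movement (query shape is illustrative):

```sql
EXPLAIN
SELECT st.store_name, SUM(s.amount)
FROM sales s
JOIN store st ON st.store_id = s.store_id
GROUP BY st.store_name;
-- In the plan, prefer "DS_DIST_NONE" or "DS_DIST_ALL_NONE" on join steps;
-- "DS_BCAST_INNER" or "DS_DIST_BOTH" indicate costly redistribution.
```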
Workload Management (WLM)
Proper WLM configuration ensures fair resource allocation:
Queue Configuration
- Separate queues for ETL, reporting, and ad-hoc queries
- Assign appropriate memory to each queue
- Set concurrency limits based on workload
- Use query monitoring rules to catch runaway queries
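Query monitoring rules live in the WLM JSON configuration of the cluster's parameter group. A sketch of a queue with a rule that aborts runaway queries (queue name, concurrency, and threshold are illustrative, not recommendations):

```json
[
  {
    "query_group": ["adhoc"],
    "query_concurrency": 5,
    "rules": [
      {
        "rule_name": "abort_long_running",
        "predicate": [
          { "metric_name": "query_execution_time", "operator": ">", "value": 1800 }
        ],
        "action": "abort"
      }
    ]
  }
]
```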
Automatic WLM
Consider enabling automatic WLM:
- Dynamic memory allocation
- Priority-based scheduling
- Less manual configuration
Results Achieved
After implementing these optimizations:
- 10x faster queries: Average query time reduced from 45s to 4s
- 50% cost reduction: Through better compression and node optimization
- 99.9% cache hit rate: For frequently accessed queries
- Zero downtime: All optimizations applied without service interruption
Key Takeaways
- Distribution and sort keys are the foundation of performance
- Regular maintenance (VACUUM, ANALYZE) is essential
- Monitor query patterns and optimize based on actual usage
- Use system tables to identify and fix performance issues
- Proper WLM configuration prevents resource contention
Conclusion
Optimizing Redshift requires understanding its architecture and systematically addressing performance bottlenecks. The effort is worthwhile—our 10x performance improvement transformed user experience and significantly reduced costs. Start with distribution and sort keys, maintain your tables regularly, and monitor continuously for best results.