The Performance Challenge
When our data warehouse queries started taking minutes instead of seconds, it was time for serious optimization. Working with terabytes of data across multiple fact and dimension tables, we needed a systematic approach to improve performance while maintaining data integrity and availability.
Understanding Redshift Architecture
Before optimizing, it's essential to understand how Redshift works:
- Columnar Storage: Data is stored by column, not row
- Massively Parallel Processing (MPP): Queries execute across multiple nodes
- Data Distribution: How data is distributed across compute nodes affects performance
- Query Compilation: Queries are compiled and cached for reuse
Distribution Keys: The Foundation
Choosing the right distribution key is perhaps the most critical optimization decision.
Distribution Styles
- KEY Distribution: Distributes rows based on a single column's value
- ALL Distribution: Copies entire table to all nodes (small dimension tables)
- EVEN Distribution: Distributes rows evenly across slices
- AUTO Distribution: Lets Redshift pick a style and change it as the table grows
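These styles are set in the table DDL. A minimal sketch, assuming hypothetical `sales` fact and `store` dimension tables:

```sql
-- Fact table: distribute on the join column so matching rows co-locate
CREATE TABLE sales (
    sale_id  BIGINT,
    store_id INT,
    sale_ts  TIMESTAMP,
    amount   DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (store_id);

-- Small dimension table: replicate to every node to avoid redistribution
CREATE TABLE store (
    store_id   INT,
    store_name VARCHAR(100)
)
DISTSTYLE ALL;
```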
Best Practices
- Choose columns used frequently in joins as distribution keys
- Aim for even data distribution to avoid skew
- Use ALL distribution for small dimension tables (under 3M rows)
- Monitor distribution using SVV_TABLE_INFO
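One way to check for skew via SVV_TABLE_INFO (a sketch; which columns matter is a judgment call, but these are standard columns of the system view):

```sql
-- Tables with the highest row skew across slices; skew_rows near 1 is ideal
SELECT "table", diststyle, skew_rows, tbl_rows
FROM svv_table_info
ORDER BY skew_rows DESC
LIMIT 10;
```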
Sort Keys: Accelerating Queries
Sort keys determine the physical order of data storage, enabling zone maps for efficient data pruning.
Types of Sort Keys
- Compound Sort Keys: Multiple columns in order of significance
- Interleaved Sort Keys: Equal weight to all columns (use sparingly)
Selection Strategy
We improved query performance dramatically by:
- Using timestamp columns for time-series data
- Choosing columns used in WHERE clauses and range filters
- Avoiding interleaved sort keys unless absolutely necessary
- Running regular VACUUM operations to maintain sort order
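For time-series data, that strategy typically translates into a compound sort key that leads with the timestamp. A sketch with illustrative table and column names:

```sql
-- Compound sort key: zone maps prune blocks when queries filter on event_ts
CREATE TABLE events (
    event_id BIGINT,
    user_id  INT,
    event_ts TIMESTAMP
)
DISTSTYLE KEY
DISTKEY (user_id)
COMPOUND SORTKEY (event_ts, user_id);
```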
Compression Encoding
Proper compression reduces storage costs and improves I/O performance.
Automatic Compression
Redshift can automatically choose encodings during COPY operations:
- Let Redshift analyze sample data
- Review recommendations using ANALYZE COMPRESSION
- Apply encodings manually for fine-tuning
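A sketch of that workflow, assuming a hypothetical `sales` table (the S3 path and IAM role are placeholders):

```sql
-- Let COPY pick encodings automatically when loading an empty table
COPY sales FROM 's3://my-bucket/sales/'         -- hypothetical path
IAM_ROLE 'arn:aws:iam::123456789012:role/load'  -- hypothetical role
COMPUPDATE ON;

-- Review suggested encodings for an already-loaded table
ANALYZE COMPRESSION sales;
```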
Encoding Types
- RAW: No compression (rarely used)
- AZ64: High compression with fast encode/decode; supports numeric, date, and timestamp types (not CHAR/VARCHAR)
- LZO: Fast compression/decompression
- ZSTD: High compression ratio
- DELTA: Great for sequential numeric values
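Applying encodings manually looks like this (a sketch; the per-column choices are illustrative, and note that AZ64 does not apply to character columns):

```sql
CREATE TABLE sales (
    sale_id  BIGINT       ENCODE az64,
    sale_ts  TIMESTAMP    ENCODE az64,
    category VARCHAR(50)  ENCODE zstd,  -- AZ64 not supported for VARCHAR
    notes    VARCHAR(500) ENCODE lzo
);
```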
Query Optimization Techniques
1. Minimize Data Movement
Data redistribution is expensive. Strategies to minimize it:
- Join tables on their distribution keys
- Filter data early in the query
- Use subqueries judiciously
- Avoid cross-joins at all costs
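If both tables are distributed on the join column (a hypothetical schema with `store_id` as the distribution key on each side), a join like this needs no redistribution:

```sql
-- Both sides distributed on store_id: the join runs locally on each slice
SELECT st.store_name, SUM(s.amount) AS revenue
FROM sales s
JOIN store st ON st.store_id = s.store_id  -- join on the distribution key
WHERE s.sale_ts >= '2024-01-01'            -- filter early to cut rows moved
GROUP BY st.store_name;
```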
2. Use Materialized Views
For frequently executed aggregations:
- Create materialized views for complex aggregations
- Enable automatic query rewrite
- Schedule refreshes during low-traffic periods
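A minimal sketch, with hypothetical view and table names (AUTO REFRESH delegates refresh scheduling to Redshift; automatic query rewrite applies to eligible materialized views without extra configuration):

```sql
CREATE MATERIALIZED VIEW daily_revenue
AUTO REFRESH YES
AS
SELECT DATE_TRUNC('day', sale_ts) AS sale_day,
       SUM(amount) AS revenue
FROM sales
GROUP BY 1;

-- Or refresh manually from a scheduled job during low-traffic hours
REFRESH MATERIALIZED VIEW daily_revenue;
```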
3. Leverage Result Caching
Redshift caches query results automatically:
- Identical queries return cached results instantly
- Cache remains valid until underlying data changes
- Monitor cache usage through STL_QUERY and SVL_QLOG
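One way to see whether queries were served from the result cache is SVL_QLOG, whose source_query column points at the original query when a result came from cache (my reading of the system view's documented columns):

```sql
-- Rows where source_query is non-null were answered from the result cache
SELECT userid, query, substring, source_query
FROM svl_qlog
WHERE source_query IS NOT NULL
ORDER BY query DESC
LIMIT 20;
```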
Table Maintenance
VACUUM Operations
Regular VACUUM operations are crucial:
- VACUUM DELETE ONLY: Reclaims space from deleted rows
- VACUUM SORT ONLY: Re-sorts unsorted data
- VACUUM REINDEX: Rebuilds interleaved sort keys
- Schedule during maintenance windows
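In SQL these look like (table names hypothetical):

```sql
VACUUM DELETE ONLY sales;  -- reclaim space from deleted/updated rows
VACUUM SORT ONLY sales;    -- re-sort rows without reclaiming space
VACUUM REINDEX events;     -- re-analyze and rebuild interleaved sort keys
VACUUM FULL sales;         -- default: reclaim and sort in one pass
```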
ANALYZE Statistics
Keep query planner statistics current:
- Run ANALYZE after large data loads
- Enable automatic ANALYZE
- Monitor using SVV_TABLE_INFO
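SVV_TABLE_INFO exposes a stats_off column (how stale the planner statistics are, as a percentage) that can drive ANALYZE scheduling, for example:

```sql
-- Tables whose planner statistics are most out of date
SELECT "table", stats_off, unsorted
FROM svv_table_info
WHERE stats_off > 10
ORDER BY stats_off DESC;

ANALYZE sales;  -- refresh statistics after a large load (hypothetical table)
```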
Monitoring and Troubleshooting
Key System Tables
- STL_QUERY: Query execution history
- SVL_QUERY_REPORT: Detailed query execution steps
- STL_ALERT_EVENT_LOG: Performance warnings
- SVV_TABLE_INFO: Table statistics and skew
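For example, STL_ALERT_EVENT_LOG surfaces issues the planner flagged, such as missing statistics or nested-loop joins (column names here are from the standard system table, to the best of my recollection):

```sql
-- Recent planner alerts, with the event text and suggested fix
SELECT query, event, solution, event_time
FROM stl_alert_event_log
ORDER BY event_time DESC
LIMIT 20;
```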
Query Execution Plans
Use EXPLAIN to understand query execution:
- Identify expensive operations
- Look for data redistribution steps
- Check for sequential scans on large tables
- Verify join types and order
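A typical workflow is to prefix the query with EXPLAIN and read the join step attributes, where DS_DIST_NONE means no redistribution while broadcast or two-sided distribution signals expensive data movement (query shape is illustrative):

```sql
EXPLAIN
SELECT st.store_name, SUM(s.amount)
FROM sales s
JOIN store st ON st.store_id = s.store_id
GROUP BY st.store_name;
-- In the plan, prefer "DS_DIST_NONE" or "DS_DIST_ALL_NONE" on join steps;
-- "DS_BCAST_INNER" or "DS_DIST_BOTH" indicate costly redistribution.
```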
Workload Management (WLM)
Proper WLM configuration ensures fair resource allocation:
Queue Configuration
- Separate queues for ETL, reporting, and ad-hoc queries
- Assign appropriate memory to each queue
- Set concurrency limits based on workload
- Use query monitoring rules to catch runaway queries
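Query monitoring rules live in the WLM JSON configuration of the cluster's parameter group. A sketch of a queue with a rule that aborts runaway queries (queue name, concurrency, and threshold are illustrative, not recommendations):

```json
[
  {
    "query_group": ["adhoc"],
    "query_concurrency": 5,
    "rules": [
      {
        "rule_name": "abort_long_running",
        "predicate": [
          { "metric_name": "query_execution_time", "operator": ">", "value": 1800 }
        ],
        "action": "abort"
      }
    ]
  }
]
```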
Automatic WLM
Consider enabling automatic WLM:
- Dynamic memory allocation
- Priority-based scheduling
- Less manual configuration
Results Achieved
After implementing these optimizations:
- 10x faster queries: Average query time reduced from 45s to 4s
- 50% cost reduction: Through better compression and node optimization
- 99.9% cache hit rate: For frequently accessed queries
- Zero downtime: All optimizations applied without service interruption
Key Takeaways
- Distribution and sort keys are the foundation of performance
- Regular maintenance (VACUUM, ANALYZE) is essential
- Monitor query patterns and optimize based on actual usage
- Use system tables to identify and fix performance issues
- Proper WLM configuration prevents resource contention
Conclusion
Optimizing Redshift requires understanding its architecture and systematically addressing performance bottlenecks. The effort is worthwhile—our 10x performance improvement transformed user experience and significantly reduced costs. Start with distribution and sort keys, maintain your tables regularly, and monitor continuously for best results.