Blossom Graphs provide basic monitoring and performance metrics to help you identify bottlenecks and make informed scaling decisions.
Key Metrics
- CPU Usage: Current and historical CPU utilization across all cores
- Load Average: 1-minute, 5-minute, and 15-minute load averages
- Memory Usage: RAM consumption and available memory
- Disk Usage: Storage utilization and available space
- Disk I/O: Read/write throughput and IOPS
- Network Traffic: Inbound and outbound data transfer rates
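Load average is easier to interpret relative to the number of CPU cores. The following is a minimal Python sketch of that normalization, not part of Blossom itself; the function names `load_per_core` and `current_load_per_core` are hypothetical, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_core(load_averages, cpu_count):
    """Divide each load average by the number of CPU cores.

    A value near or above 1.0 suggests that, on average, every core
    had runnable work queued during that window.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(round(load / cpu_count, 2) for load in load_averages)


def current_load_per_core():
    """Read the 1-, 5-, and 15-minute load averages of this machine."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return load_per_core(os.getloadavg(), cores)
```

For example, a 1-minute load average of 2.0 on a 2-core server works out to 1.0 per core, which is roughly full utilization.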
Performance Decisions
If your server is under sustained CPU or memory pressure, consider these options:
- Upgrade Server Specs: Add more CPU and memory to your existing server
- Dedicated Server Roles: Move different process types onto separate dedicated servers. For example, run worker processes, which tend to do more intensive computation, on their own servers so that heavy processing does not degrade web requests on the same machine.
- App Performance: The problem might not be server resources at all. It could be an application performance issue, such as an infinite loop or a large database query that is not limited or paginated.
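As an illustration of the pagination point above, here is a minimal sketch that reads a large table in fixed-size pages instead of loading every row at once. It uses Python's built-in `sqlite3` module purely for demonstration. The `events` table, its columns, and the function name `iter_rows_in_pages` are hypothetical, and the sketch assumes `id` is a positive, increasing integer key.

```python
import sqlite3


def iter_rows_in_pages(conn, page_size=1000):
    """Yield rows from the hypothetical `events` table one page at a time.

    Uses keyset pagination: each query resumes after the last id seen,
    so at most `page_size` rows are fetched per query.
    """
    last_id = 0  # assumes ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, payload FROM events WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, page_size),
        ).fetchall()
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]
```

The same idea applies to other databases and ORMs: bound every query with a limit and resume from the last key seen.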
Tips
- Check graphs after changes like app deployments or server changes
- Try to establish normal operating ranges for your applications
- For worker processes, make sure the number of threads or processes does not exceed what the server specs can support. Example: a server with 2 CPUs can probably support only 1 Docker Compose process with 2 threads.
- Also useful: Debug High CPU Usage
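The worker sizing tip above can be turned into a rough calculation. This minimal sketch assumes a rule of thumb of about one thread per CPU core for CPU-bound work. That rule generalizes the 2-CPU example and is an assumption, not a Blossom limit; I/O-bound workers can often run more threads. The function name `max_worker_processes` is hypothetical.

```python
def max_worker_processes(cpu_count, threads_per_process):
    """Estimate how many worker processes fit on a server.

    Assumes roughly one thread per CPU core (a rule of thumb for
    CPU-bound work) and always allows at least one process.
    """
    if cpu_count < 1 or threads_per_process < 1:
        raise ValueError("cpu_count and threads_per_process must be at least 1")
    return max(1, cpu_count // threads_per_process)
```

With 2 CPUs and 2 threads per process this gives 1 process, which matches the example in the tip. Treat the result as a starting point and confirm it against the graphs after deploying.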