Imagine that you have a website that has 10,000 users making database requests simultaneously β accounting for some 20,000 transactions per second. How big should your connection pool be? You might be surprised that the question is not how big but rather how small!
For an application doing 20,000 transactions per second, determining the exact number of connection pool connections to open with the database depends on various factors:
-
Transaction Duration: The time taken to complete a single transaction. If transactions are short-lived, you might require more connections to handle the high throughput.
-
Connection Handling Overhead: Opening and closing connections have overhead. You might need more connections if the cost of creating a connection is relatively high.
-
Database Performance: The performance of your PostgreSQL database and the maximum number of concurrent connections it can handle effectively.
-
Resource Limitations: The hardware resources available, especially the number of CPU cores. Itβs generally recommended to have a few connections per CPU core.
-
Connection Pool Mechanism: Some connection pool libraries are more efficient in managing connections than others.
Given these factors, you might initially start with a connection pool size around 2-4 times the number of CPU cores on your server and see how the application and database perform. For a system with 4 CPU cores, this would mean starting with a connection pool size of around 8-16 connections.
However, you should closely monitor the application and database performance under this load and adjust the connection pool size as needed. Itβs important to strike a balance between resource utilization and performance to achieve optimal results.
As people say, no source is enough for a badly written program so it totally depends upon your programming strategy
Also how we can distribute the connection pool in case of multiple instances (scaling)?
I think 2 to 4 times the number of CPU is a good enough number to start with and we can duplicate it in that scaling itself.
If my system is auto scalable, then I would definitely start with 1-2 connection pool per CPU and then monitor my application under load to see whether it is able to handle the load properly or getting any connection pool error.
Correct me if my understanding is wrong.