Background
1 million database connections, absolutely crazy, how ordinary people do this kind of thing.
Yes, what does it mean for a database to support 1 million connections? Can't I use a connection pool?
In addition to holding a game of mind, can also understand the operating system layer of some knowledge, he le?
Wall
Based on how I measured kernel resources for PostgreSQL, which I wrote a few days ago, we can assess how the operating system should be configured to support 1 million database client connections.
https://yq.aliyun.com/articles/58690
But can you actually do it?
Take PostgreSQL 9.5 For example, 1 million connections, how much signal is required?
How many sets of semaphores are required?
Semmni >= (max_connections + max_worker_processes + autovacuum_max_workers + 5)/16
1 million connections, Semmni >= 62500
How much signal do you need?
Semmns >= (max_connections + max_worker_processes + autovacuum_max_workers + 5)/16) * 17 + other program requirements
1 million connections, Semmns >= 1062500
How much signal does each group need?
SEMMSL >= 17
The test environment is as follows
CentOS 6.x x64, 512GB memory.
Kernel.sem = 18 2147483647 2147483646 512000000
Max number of arrays = 512000000
Max semaphores per array = 18
Max semaphores system wide = 2147483647
Max Ops per semop call = 2147483646
Above the kernel configuration, the semaphore fully meets the 1 million connection requirements.
So does the database start?
VI postgresql.conf
Max_connections = 1000000
Pg_ctl start
Failed to start.
Cause analysis
The error comes from the following code:
Failed to create SEM using Semget.
Src/backend/port/sysv_sema.c
/*
Internalipcsemaphorecreate *
77 *
* Attempt to create a new semaphore set with the specified key.
The * would fail (return-1) if such a set already exists.
80 *
Bayi * If We fail with a failure code other than Collision-with-existing-set,
* Print out an error and abort. Other types of errors suggest nonrecoverable
* problems.
84 */
Static Ipcsemaphoreid
Internalipcsemaphorecreate (ipcsemaphorekey semkey, int numsems)
87 {
the int semid;
89
Semid = Semget (Semkey, Numsems, Ipc_creat | Ipc_excl | Ipcprotection);
91
if (Semid < 0)
93 {
int saved_errno = errno;
95
96/*
Fail quietly if error indicates a collision with existing set. One
Would expect eexist, given that we said Ipc_excl, but perhaps we
Could get a permission violation instead? Also, eidrm might occur
The * if is slated to destruction but not gone yet.
101 */
102 if (Saved_errno = = Eexist | | saved_errno = = eacces
EIDRM #ifdef
104 | | Saved_errno = = Eidrm
#endif
106)
return-1;
108
109/*
The * Else complain and abort
111 */
112 Ereport (FATAL,
113 (ErrMsg ("Could not create semaphores:%m"),
114 Errdetail ("Failed system call is Semget (%lu,%d, 0%o)."
Semkey (unsigned long), Numsems,
116 Ipc_creat | Ipc_excl | Ipcprotection),
117 (Saved_errno = = ENOSPC)?
118 Errhint ("This error does *not* mean" to have run out of disk spaces. "
119 "It occurs when either the system limit for the maximum number of"
Semaphore sets (Semmni), or the system wide maximum number of "
121 "semaphores (Semmns), would be exceeded. You are need to raise the "
122 "respective kernel parameter. Alternatively, reduce PostgreSQL ' s "
123 "Consumption of semaphores by reducing its max_connections parameter.\n"
124 "The PostgreSQL documentation contains more information about"
"Configuring your system for PostgreSQL.") : 0));
126}
127
128 return Semid;
129}
Semget failure is not a kernel.sem configuration issue, but a macro limitation of the operating system kernel.
The number of groups of SEM can not be greater than SEMVMX, that is, the maximum open 50多万个 connection.
As follows
Kernels/xxx.x86_64/include/uapi/linux/sem.h
#define semmni 128 /* <= IPCMNI MAX # semaphore identifiers */
#define semmsl 250 & nbsp; /* <= 8 max num of semaphores per ID */
#define semmns (SEMMNI*SEMMSL)/* <= I Nt_max MAX # of semaphores in System */
#define semopm 32 /* <= 1 Max num of OPS per semop call */
#define semvmx 32767 & nbsp; /* <= 32767 semaphore Maximum value */
#define semaem semvmx /* Adjust on exit max value */
* Unused * *
#define Semume SEMOPM/* Max Num ' undo entries per process * *
#define SEMMNU semmns/* num of Undo Structures system Wide * *
#define SEMMAP semmns/* Entries in semaphore map * *
#define Semusz/* sizeof struct Sem_undo * *
More than 32,767 array will be an error
$ ipcs-u
------Semaphore Status--------
Used arrays = 32768
Allocated semaphores = 557056
$ pg_ctl Start
Fatal:could not create semaphores:no spaces left on device
detail:failed system call is Semget (1953769, 17, 03600).
Hint:this error does *not* mean that you have run out of disk spaces.
It occurs when either the system limit for the maximum number of semaphore sets (Semmni),
or the System wide maximum number of semaphores (Semmns), would be exceeded.
You are need to raise the respective kernel parameter.
Alternatively, reduce PostgreSQL ' s consumption of semaphores by reducing its max_connections parameter.
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.
You can also see current values using Ipcs-l
Semaphore max value = 32767
This value can only be modified by recompiling the kernel.
This is also seen in the manuals of Semctl and SEMOP.
Mans Semctl
SEMVMX Maximum value for semval:implementation dependent (32767).
Mans SEMOP
SEMVMX Maximum allowable value for semval:implementation dependent (32767).
The implementation has no intrinsic limits for the Adjust on exit maximum value (SEMAEM), the System wide maximum number O F Undo Structures (Semmnu) and the per-process maximum number of undo entries system
Parameters.
Is the 1 million connection over?
Of course you can.
For example, modifying the SEMVMX and recompiling the kernel is obviously a path, but is there any other way?
It's a spring, PostgreSQL supports POSIX sem
As we saw earlier, the error code is
Internalipcsemaphorecreate@src/backend/port/sysv_sema.c
Look at the corresponding header file, found that PG support several ways to create semaphores, it is really a good spring:
SysV, POSIX (named, unamed), Win32
The corresponding header file source code is as follows
Src/include/storage/pg_sema.h
/*
* Pgsemaphoredata and pointer type Pgsemaphore are the data structure
* Representing an individual semaphore. The contents of Pgsemaphoredata
* Vary across implementations and must never is touched by platform-
* Independent code. Pgsemaphoredata structures are always allocated
* In shared memory (to support implementations where the data changes during
* Lock/unlock).
*
* Pg_config.h must define exactly one of the use_xxx_semaphores symbols.
*/
#ifdef Use_named_posix_semaphores
#include <semaphore.h>
typedef sem_t *PGSEMAPHOREDATA;
#endif
#ifdef Use_unnamed_posix_semaphores
#include <semaphore.h>
typedef sem_t PGSEMAPHOREDATA;
#endif
#ifdef Use_sysv_semaphores
typedef struct PGSEMAPHOREDATA
{
int semid; /* Semaphore Set Identifier * *
int semnum; /* semaphore number within set * * *
} Pgsemaphoredata;
#endif
#ifdef Use_win32_semaphores
typedef HANDLE PGSEMAPHOREDATA;
#endif
where POSIX named and unamed use the following system calls, respectively
POSIX named way to create a signal:
Mysem = Sem_open (semname, O_creat | O_EXCL,
(mode_t) Ipcprotection, (unsigned) 1);
POSIX unamed way to create a signal:
Sem_init (SEM, 1, 1)
POSIX source code is as follows, pay attention to the use of macros
Src/backend/port/posix_sema.c
#ifdef Use_named_posix_semaphores
/*
* Posixsemaphorecreate
*
* Attempt to create a new named semaphore.
*
* If we fail with a failure code other than Collision-with-existing-sema,
* Print out an error and abort. Other types of errors suggest nonrecoverable
* problems.
*/
Static sem_t *
Posixsemaphorecreate (void)
{
int Semkey;
Char semname[64];
Sem_t *mysem;
for (;;)
{
Semkey = nextsemkey++;
snprintf (semname, sizeof (Semname), "/pgsql-%d", Semkey);
Mysem = Sem_open (semname, O_creat | O_EXCL,
(mode_t) Ipcprotection, (unsigned) 1);
#ifdef sem_failed
if (Mysem!= (sem_t *) sem_failed)
Break
#else
if (Mysem!= (sem_t *) (-1))
Break
#endif
/* Loop If error indicates a collision * *
if (errno = = Eexist | | errno = eacces | | errno = = EINTR)
Continue
/*
* Else complain and abort
*/
Elog (FATAL, "Sem_open" (\ "%s\") failed:%m ", semname);
}
/*
* Unlink the semaphore immediately, so it can ' t be accessed externally.
* This also ensures that it would go away if we crash.
*/
Sem_unlink (Semname);
return Mysem;
}
#else/*! Use_named_posix_semaphores *
/*
* Posixsemaphorecreate
*
* Attempt to create a new unnamed semaphore.
*/
static void
Posixsemaphorecreate (sem_t * sem)
{
if (Sem_init (SEM, 1, 1) < 0)
Elog (FATAL, "Sem_init failed:%m");
}
#endif/* use_named_posix_semaphores * *
As you can see from the src/include/storage/pg_sema.h, you must have a specified use_xxx_semaphores symbols in the pg_config.h.
This symbol is not directly set Pg_config.h, is set when configure, will automatically add to Pg_config.h.
Select semaphore implementation type.
if test "$PORTNAME"!= "Win32"; Then
if test X "$USE _named_posix_semaphores" = x "1"; Then
$as _echo "#define Use_named_posix_semaphores 1" >>confdefs.h
sema_implementation= "SRC/BACKEND/PORT/POSIX_SEMA.C"
Else
if test X "$USE _unnamed_posix_semaphores" = x "1"; Then
$as _echo "#define Use_unnamed_posix_semaphores 1" >>confdefs.h
sema_implementation= "SRC/BACKEND/PORT/POSIX_SEMA.C"
Else
$as _echo "#define Use_sysv_semaphores 1" >>confdefs.h
sema_implementation= "SRC/BACKEND/PORT/SYSV_SEMA.C"
Fi
Fi
Else
$as _echo "#define Use_win32_semaphores 1" >>confdefs.h
sema_implementation= "SRC/BACKEND/PORT/WIN32_SEMA.C"
Fi
Use SYSV by default, if you want to use a different SEM method.
You can do that.
Export Use_unnamed_posix_semaphores=1
Libs=-lpthread./configure--prefix=/home/digoal/pgsql9.5
Remember to add-lpthread, or the error
/bin/ld:port/pg_sema.o:undefined reference to symbol ' sem_close@ @GLIBC_2.2.5 '
/bin/ld:note: ' sem_close@ @GLIBC_2.2.5 ' is defined into dso/lib64/libpthread.so.0 so try adding it to the linker command Li Ne
/lib64/libpthread.so.0:could not read Symbols:invalid operation
Collect2:error:ld returned 1 exit status
MAKE[2]: * * * [postgres] Error 1
MAKE[2]: Leaving directory '/home/digoal/postgresql-9.5.3/src/backend '
MAKE[1]: * * * [All-backend-recurse] Error 2
MAKE[1]: Leaving directory '/HOME/DIGOAL/POSTGRESQL-9.5.3/SRC '
Make: * * * [World-src-recurse] Error 2
Through the Linux programming Help documentation for these system calls, learn about POSIX semaphore management
Mans Sem_overview
Man Sem_init, unnamed SEM
Man Sem_open, named Sem
Einval value was greater than Sem_value_max.
You can learn that Sem_open is also restricted by SEMVMX.
So in order to implement PostgreSQL support 1 million connections, or even more, without modifying the kernel.
Must use Use_unnamed_posix_semaphores
Start, let PostgreSQL support 1 million connections
Compiling with Use_unnamed_posix_semaphores
Export Use_unnamed_posix_semaphores=1
Libs=-lpthread./configure--prefix=/home/digoal/pgsql9.5
Make World-j 32
Make Install-world-j 32
Modify parameters to allow 1 million connections
VI postgresql.conf
Max_connections = 1000000
Restart the database
Pg_ctl restart-m Fast
Test 1 million database concurrent connections
Pgbench is a good test tool, but it restricts 1024 connections and needs to be modified to support 1 million connectivity tests.
Code
VI src/bin/pgbench/pgbench.c
#ifdef WIN32
#define FD_SETSIZE 1024/* Set before winsock2.h is included/*
#endif/*! WIN32 *
* Max number of clients allowed * *
#ifdef fd_setsize
#define MaxClients (FD_SETSIZE-10)
#else
#define MAXCLIENTS 1024
#endif
Case ' C ':
Benchmarking_option_set = true;
nclients = Atoi (Optarg);
if (nclients <= 0)//change here | | Nclients > MaxClients)
{
fprintf (stderr, "Invalid number of clients: \"%s\ "\ n",
OPTARG);
Exit (1);
}
Test table
postgres=# Create unlogged table test (id int, info text);
CREATE TABLE
Test scripts
VI Test.sql
\setrandom S 1 100
Select Pg_sleep (: s);
INSERT into test values (1, ' Test ');
Continue to hit the wall
Start the pressure test, encounter the first problem
Pgbench-m prepared-n-r-f/test.sql-c 999900-j 1-t 10000
Need at least 999903 open files, but system limit is 655360
Reduce number of clients, or use Limit/ulimit to increase the system limit.
This problem is OK, is open the number of files limited, change some restrictions can be solved
Modify Ulimit
Vi/etc/security/limits.conf
* Soft Nofile 1048576
* Hard Nofile 1048576
* Soft Noproc 10000000
* Hard Noproc 10000000
* Soft Memlock Unlimited
* Hard Memlock Unlimited
modifying kernel parameters
Sysctl-w fs.file-max=419430400000
Re-test, the problem again, because pgbench use IP address link PG, causing pgbench dynamic port exhaustion.
Pgbench-m prepared-n-r-f/test.sql-c 999900-j 1-t 10000
Connection to database "Postgres" failed:
Could not connect to Server:cannot assign requested address
is the server running on host "127.0.0.1" and accepting
TCP/IP connections on port 1921?
Transaction Type:custom Query
Scaling Factor:1
Query mode:prepared
Number of clients:999900
Number of Threads:1
duration:10000 s
Number of transactions actually processed:0
Replace with a UNIX socket connection to resolve.
Pgbench-m prepared-n-r-f/test.sql-c 999900-j 1-t 10000-h $PGDATA
Connection to database "Postgres" failed:
Could not fork the new process for Connection:cannot allocate memory
Could not fork the new process for Connection:cannot allocate memory
Transaction Type:custom Query
Scaling Factor:1
Query mode:prepared
Number of clients:999900
Number of Threads:1
duration:10000 s
Number of transactions actually processed:0
Can't fork new process, followed by a cannot allocate memory such a hint, I saw the current configuration
vm.swappiness = 0
Vm.overcommit_memory = 0
So I added a swap partition, and I changed a few parameters.
DD If=/dev/zero of=./swap1 bs=1024k count=102400 oflag=direct
Mkswap./swap1
Swapon./swap1
Sysctl-w Vm.overcommit_memory=1 (always over commit)
Sysctl-w Vm.swappiness=1
Re-test, found that there are still problems
Pgbench-m prepared-n-F./test.sql-c 999900-j 1-t 10000-h $PGDATA
Could not fork the new process for Connection:cannot allocate memory
Use the following means to observe and find fault in about connecting to 65535 times
Sar-r 1 10000
Psql
Select COUNT (*) from pg_stat_activity;
\watch 1
Support for the million-link target reached
Found the source, the kernel limits the
kernel.pid_max=65535
Modify this kernel parameter
Sysctl-w kernel.pid_max=4096000
Re-Test
Pgbench-m prepared-n-F./test.sql-c 999900-j 1-t 10000-h $PGDATA
Continue to observe
Psql
Select COUNT (*) from pg_stat_activity;
\watch 1
Sar-r 1 10000
When connected to 260,000, memory used about 330GB, each connection 1MB or so.
It looks like there should be no problem, as long as the memory is enough to handle the 1 million connection.
Summary
In order for PostgreSQL to support 1 million concurrent connections, there is a sufficient amount of resources (mainly memory).
The database itself needs to be compiled as well, and the operating system kernel needs some tweaking.
Use POSIX unname sem when compiling PostgreSQL.
Export Use_unnamed_posix_semaphores=1
Libs=-lpthread./configure--prefix=/home/digoal/pgsql9.5
Make World-j 32
Make Install-world-j 32
If you are not going to use unnamed POSIX sem, be sure to recompile the operating system kernel and increase the SEMVMX.
To open the number of files limit
Ulimit
Vi/etc/security/limits.conf
* Soft Nofile 1048576
* Hard Nofile 1048576
* Soft Noproc 10000000
* Hard Noproc 10000000
modifying kernel parameters
Sysctl-w fs.file-max=419430400000
Using UNIX sockets
Breakthrough Pgbench test, dynamic port quantity limit.
Each connection is approximately 1mb,100 and requires approximately 1TB of RAM, which needs to be given sufficient memory.
Enable swap
DD If=/dev/zero of=./swap1 bs=1024k count=102400 oflag=direct
Mkswap./swap1
Swapon./swap1
.
.
Sysctl-w vm.overcommit_memory=0
Sysctl-w Vm.swappiness=1
Oom actually happened, and hang for a long time.
[67504.841109] Memory cgroup out of Memory:kill process 385438 (PIDOF) score 721 or sacrifice child
[67504.850566] killed process 385438, UID 0, (pidof) total-vm:982240kb, anon-rss:978356kb, file-rss:544kb
[67517.496404] pidof invoked oom-killer:gfp_mask=0xd0, order=0, oom_adj=0, oom_score_adj=0
[67517.496407] pidof cpuset=/mems_allowed=0
[67517.496410] pid:385469, comm:pidof tainted:g---------------H
Limit of maximum PID value
Increase
Sysctl-w kernel.pid_max=4096000
Limitations of Pgbench Clients
Modify the source code, support unlimited connections.
IPCS does not count POSIX sem information, so after using POSIX sem, IPCS does not see how much SEM is used.
System V and POSIX sem
NOTES
System V semaphores (Semget (2), SEMOP (2), etc.) are an older semaphore api.
&nb Sp POSIX semaphores provide a simpler, and better designed interface than System V semaphores;
on the other hand posix semaphores are less widely available (especially O n older systems) than System V semaphores.