listen() 的 backlog 及 TCP 相关参数 =================================== .. versionadded:: @2014-05-17 创建 listen 函数的原型为:`int listen(int sockfd, int backlog);`, 其中 backlog 的解释在 `man 2 listen` 里说明如下: :: The backlog argument defines the maximum length to which the queue of pending connections for sockfd may grow. If a connection request arrives when the queue is full, the client may receive an error with an indication of ECONNREFUSED or, if the underlying protocol supports retransmission, the request may be ignored so that a later reattempt at connection succeeds. The behavior of the backlog argument on TCP sockets changed with Linux 2.2. Now it specifies the queue length for completely established sock- ets waiting to be accepted, instead of the number of incomplete connec- tion requests. The maximum length of the queue for incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncook- ies are enabled there is no logical maximum length and this setting is ignored. See tcp(7) for more information. If the backlog argument is greater than the value in /proc/sys/net/core/somaxconn, then it is silently truncated to that value; the default value in this file is 128. In kernels before 2.4.25, this limit was a hard coded value, SOMAXCONN, with the value 128. 可见在 2.2 之后,backlog 对应于已经完成握手但还未被 accept 的连接数。 但该值的大小不能超过 net.core.somaxconn。 :: somaxconn - INTEGER Limit of socket listen() backlog, known in userspace as SOMAXCONN. Defaults to 128. See also tcp_max_syn_backlog for additional tuning for TCP sockets. 内核里的代码在 net/socket.c: .. code-block:: c SYSCALL_DEFINE2(listen, int, fd, int, backlog) { ... ... if (sock) { somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn; if ((unsigned int)backlog > somaxconn) backlog = somaxconn; ... ... } 未完成 TCP 握手的的连接队列长度由另一个参数 net.ipv4.tcp_max_syn_backlog 控制。 :: tcp_max_syn_backlog - INTEGER Maximal number of remembered connection requests, which have not received an acknowledgment from connecting client. The minimal value is 128 for low memory machines, and it will increase in proportion to the memory of machine. If server suffers from overload, try increasing this number. `man 7 tcp` 里的解释如下: :: tcp_max_syn_backlog (integer; default: see below; since Linux 2.2) The maximum number of queued connection requests which have still not received an acknowledgement from the connecting client. If this number is exceeded, the kernel will begin dropping requests. The default value of 256 is increased to 1024 when the memory present in the system is adequate or greater (>= 128Mb), and reduced to 128 for those systems with very low memory (<= 32Mb). It is recommended that if this needs to be increased above 1024, TCP_SYNQ_HSIZE in include/net/tcp.h be modified to keep TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog, and the kernel be recom- piled. 该值的实际大小参考 net/core/request_sock.c 里的代码: .. code-block:: c int reqsk_queue_alloc(struct request_sock_queue *queue, unsigned int nr_table_entries) { ... ... nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog); nr_table_entries = max_t(u32, nr_table_entries, 8); nr_table_entries = roundup_pow_of_two(nr_table_entries + 1); ... ... } nr_table_entries 的初始值为 listen 的 backlog 值,经过一系列调整后该值大小为 2 的幂,最小值为 16。上述的 nr_table_entries + 1 也正是某些文档里推荐 listen 的 backlog 值为 511 的原因。在 2.6.20 之前该值为固定值,相应的 commit 参考 `这里 `_ 。 在 listen(2) 的文档里也提到了,如果打开了 :index:`syncookie` (net.ipv4.tcp_syncookies=1) 选项,则 net.ipv4.tcp_max_syn_backlog 的值等价于被忽略。 事实上上述代码的相关分析可以从 net/ipv4/tcp_ipv4.c 里的函数 tcp_syn_flood_action() 处开始切入。SYN Cookie 工作时,内核会输出如下信息: :: TCP: TCP: Possible SYN flooding on port 80. Sending cookies. Check SNMP counters. 和 :index:`SYN Flood` 防护相关的几个参数附带说明如下: :: tcp_synack_retries - INTEGER Number of times SYNACKs for a passive TCP connection attempt will be retransmitted. Should not be higher than 255. Default value is 5, which corresponds to 31seconds till the last retransmission with the current initial RTO of 1second. With this the final timeout for a passive TCP connection will happen after 63seconds. tcp_syncookies - BOOLEAN Only valid when the kernel was compiled with CONFIG_SYN_COOKIES Send out syncookies when the syn backlog queue of a socket overflows. This is to prevent against the common 'SYN flood attack' Default: 1 Note, that syncookies is fallback facility. It MUST NOT be used to help highly loaded servers to stand against legal connection rate. If you see SYN flood warnings in your logs, but investigation>shows that they occur because of overload with legal connections, you should tune another parameters until this warning disappear. See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow. syncookies seriously violate TCP protocol, do not allow to use TCP extensions, can result in serious degradation of some services (f.e. SMTP relaying), visible not by you, but your clients and relays, contacting you. While you see SYN flood warnings in logs not being really flooded, your server is seriously misconfigured. If you want to test which effects syncookies have to your network connections you can set this knob to 2 to enable unconditionally generation of syncookies.