The listen() backlog and related TCP parameters
New in version @2014-05-17: Created
The prototype of listen is int listen(int sockfd, int backlog); the backlog argument is described in man 2 listen as follows:
The backlog argument defines the maximum length to which the queue of
pending connections for sockfd may grow. If a connection request
arrives when the queue is full, the client may receive an error with an
indication of ECONNREFUSED or, if the underlying protocol supports
retransmission, the request may be ignored so that a later reattempt at
connection succeeds.
The behavior of the backlog argument on TCP sockets changed with Linux
2.2. Now it specifies the queue length for completely established sockets
waiting to be accepted, instead of the number of incomplete connection
requests. The maximum length of the queue for incomplete sockets
can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog. When syncookies
are enabled there is no logical maximum length and this setting is
ignored. See tcp(7) for more information.
If the backlog argument is greater than the value in
/proc/sys/net/core/somaxconn, then it is silently truncated to that
value; the default value in this file is 128. In kernels before 2.4.25,
this limit was a hard coded value, SOMAXCONN, with the value 128.
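As a minimal sketch of how the backlog argument is passed in practice, the following C helper opens a TCP listening socket on the loopback interface. The helper name open_listener and the choice of port 0 (let the kernel pick a free port) are assumptions made for illustration; they do not come from the man page.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP listener on 127.0.0.1 with the
 * requested backlog. Port 0 asks the kernel to pick a free port.
 * Returns the listening fd, or -1 on error. */
int open_listener(int backlog)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);

    /* The backlog passed here is what the kernel may later truncate. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would typically invoke open_listener(511) and then accept() on the returned descriptor. Note that listen() does not report whether the backlog was truncated.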
So since Linux 2.2, backlog is the number of connections that have completed the handshake but have not yet been accepted. Its value cannot exceed net.core.somaxconn.
somaxconn - INTEGER
Limit of socket listen() backlog, known in userspace as SOMAXCONN.
Defaults to 128. See also tcp_max_syn_backlog for additional tuning
for TCP sockets.
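To check the cap that applies on a given host, the value can be read from /proc/sys/net/core/somaxconn. The sketch below assumes the file holds a single decimal integer; the helper name read_sysctl_int is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: read a single integer from a sysctl-style file
 * such as /proc/sys/net/core/somaxconn.
 * Returns 0 on success and stores the value in *out, -1 on failure. */
int read_sysctl_int(const char *path, int *out)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int value;
    int ok = (fscanf(fp, "%d", &value) == 1);
    fclose(fp);

    if (!ok)
        return -1;
    *out = value;
    return 0;
}
```

On a Linux host, read_sysctl_int("/proc/sys/net/core/somaxconn", &v) should store the current limit in v. The same helper works for /proc/sys/net/ipv4/tcp_max_syn_backlog.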
The corresponding kernel code is in net/socket.c:
SYSCALL_DEFINE2(listen, int, fd, int, backlog)
{
	... ...
	if (sock) {
		somaxconn = sock_net(sock->sk)->core.sysctl_somaxconn;
		if ((unsigned int)backlog > somaxconn)
			backlog = somaxconn;
		... ...
}
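The clamp in this excerpt can be modeled in user space. The following is only a sketch of the comparison shown above, not kernel code; the function name effective_backlog is an assumption. It makes one detail visible: because of the cast to unsigned int, a negative backlog compares as a very large value and is clamped to somaxconn too.

```c
/* User-space model of the clamp in the listen() excerpt.
 * The cast to unsigned int mirrors the kernel comparison, so a
 * negative backlog is treated as a huge value and gets clamped. */
int effective_backlog(int backlog, int somaxconn)
{
    if ((unsigned int)backlog > (unsigned int)somaxconn)
        backlog = somaxconn;
    return backlog;
}
```

With somaxconn at 128, a requested backlog of 511 becomes 128, while a requested backlog of 64 is left unchanged.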
The length of the queue for connections that have not yet completed the TCP handshake is controlled by another parameter, net.ipv4.tcp_max_syn_backlog.
tcp_max_syn_backlog - INTEGER
Maximal number of remembered connection requests, which have not
received an acknowledgment from connecting client.
The minimal value is 128 for low memory machines, and it will
increase in proportion to the memory of machine.
If server suffers from overload, try increasing this number.
man 7 tcp explains it as follows:
tcp_max_syn_backlog (integer; default: see below; since Linux 2.2)
The maximum number of queued connection requests which have still
not received an acknowledgement from the connecting client. If
this number is exceeded, the kernel will begin dropping requests.
The default value of 256 is increased to 1024 when the memory
present in the system is adequate or greater (>= 128Mb), and
reduced to 128 for those systems with very low memory (<= 32Mb).
It is recommended that if this needs to be increased above 1024,
TCP_SYNQ_HSIZE in include/net/tcp.h be modified to keep
TCP_SYNQ_HSIZE*16<=tcp_max_syn_backlog, and the kernel be
recompiled.
For the value actually used, see the code in net/core/request_sock.c:
int reqsk_queue_alloc(struct request_sock_queue *queue,
		      unsigned int nr_table_entries)
{
	... ...
	nr_table_entries = min_t(u32, nr_table_entries, sysctl_max_syn_backlog);
	nr_table_entries = max_t(u32, nr_table_entries, 8);
	nr_table_entries = roundup_pow_of_two(nr_table_entries + 1);
	... ...
}
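nr_table_entries starts out as the backlog value passed to listen; after the adjustments above it is a power of two with a minimum of 16. The nr_table_entries + 1 step is also why some documents recommend a listen backlog of 511: 511 + 1 = 512 is already a power of two, whereas a backlog of 512 would round up to 1024. Before 2.6.20 this value was fixed; see the corresponding commit here.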
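The three adjustment lines in reqsk_queue_alloc() can be reproduced in user space to see how a given backlog maps to a table size. This is a sketch only: syn_table_entries and roundup_pow2 are hypothetical names, and roundup_pow2 is a simple stand-in for the kernel's roundup_pow_of_two().

```c
#include <stdint.h>

/* Round v up to the next power of two (v itself if it already is one).
 * Assumes 1 <= v <= 2^31 so the shift cannot overflow. */
static uint32_t roundup_pow2(uint32_t v)
{
    uint32_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* User-space model of the sizing logic: clamp to max_syn_backlog,
 * enforce a floor of 8, then round (n + 1) up to a power of two. */
uint32_t syn_table_entries(uint32_t backlog, uint32_t max_syn_backlog)
{
    uint32_t n = backlog < max_syn_backlog ? backlog : max_syn_backlog;
    if (n < 8)
        n = 8;
    return roundup_pow2(n + 1);
}
```

With max_syn_backlog set to 1024, a backlog of 511 yields 512 entries, a backlog of 512 yields 1024, and a backlog of 0 yields the minimum of 16.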
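As the listen(2) documentation also notes, when syncookies are enabled (net.ipv4.tcp_syncookies=1), the value of net.ipv4.tcp_max_syn_backlog is effectively ignored. A good entry point for analyzing the relevant code is the function tcp_syn_flood_action() in net/ipv4/tcp_ipv4.c. When SYN cookies are in use, the kernel logs the following message: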
TCP: TCP: Possible SYN flooding on port 80. Sending cookies. Check SNMP counters.
For reference, several parameters related to SYN flood protection are described below:
tcp_synack_retries - INTEGER
Number of times SYNACKs for a passive TCP connection attempt will
be retransmitted. Should not be higher than 255. Default value
is 5, which corresponds to 31seconds till the last retransmission
with the current initial RTO of 1second. With this the final timeout
for a passive TCP connection will happen after 63seconds.
tcp_syncookies - BOOLEAN
Only valid when the kernel was compiled with CONFIG_SYN_COOKIES
Send out syncookies when the syn backlog queue of a socket
overflows. This is to prevent against the common 'SYN flood attack'
Default: 1
Note, that syncookies is fallback facility.
It MUST NOT be used to help highly loaded servers to stand
against legal connection rate. If you see SYN flood warnings
in your logs, but investigation shows that they occur
because of overload with legal connections, you should tune
another parameters until this warning disappear.
See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.
syncookies seriously violate TCP protocol, do not allow
to use TCP extensions, can result in serious degradation
of some services (f.e. SMTP relaying), visible not by you,
but your clients and relays, contacting you. While you see
SYN flood warnings in logs not being really flooded, your server
is seriously misconfigured.
If you want to test which effects syncookies have to your
network connections you can set this knob to 2 to enable
unconditionally generation of syncookies.
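The 31-second and 63-second figures quoted for tcp_synack_retries follow from exponential backoff. The sketch below assumes the RTO simply doubles after every retransmission and ignores any upper bound the kernel places on the RTO, so it is only meant for small retry counts; both function names are hypothetical.

```c
/* Seconds from the first SYN-ACK until the last retransmission,
 * assuming the RTO starts at initial_rto and doubles each time. */
unsigned long last_retransmit_at(unsigned int retries,
                                 unsigned long initial_rto)
{
    unsigned long t = 0, rto = initial_rto;
    for (unsigned int i = 0; i < retries; i++) {
        t += rto;   /* wait one RTO, then retransmit */
        rto *= 2;   /* exponential backoff */
    }
    return t;
}

/* Seconds until the attempt is finally abandoned: one more
 * (doubled) RTO after the last retransmission. */
unsigned long final_timeout_at(unsigned int retries,
                               unsigned long initial_rto)
{
    unsigned long t = 0, rto = initial_rto;
    for (unsigned int i = 0; i <= retries; i++) {
        t += rto;
        rto *= 2;
    }
    return t;
}
```

For the default of 5 retries and an initial RTO of 1 second, the retransmissions fall at 1, 3, 7, 15 and 31 seconds, and the final timeout is 1 + 2 + 4 + 8 + 16 + 32 = 63 seconds, matching the quoted documentation.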