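Background: TCP connection establishment
The TCP three-way handshake proceeds as follows:
- The client calls connect() and sends a SYN; its socket state becomes TCP_SYN_SENT. While the client is in TCP_SYN_SENT, any TCP segment it receives, valid or not, is handled by tcp_rcv_synsent_state_process().
- The server receives the SYN and replies with SYN+ACK; the request block it creates for this connection is in state TCP_NEW_SYN_RECV.
- The client receives the SYN+ACK and sends an ACK; on the client side the connection is now established (state TCP_ESTABLISHED).
- The server receives the ACK; the socket state becomes TCP_ESTABLISHED.
The rest of this article analyzes the last step: the server receiving the ACK.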
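To make the four steps easier to follow, here is a minimal userspace sketch of the handshake as a state table. It is a toy model, not kernel code: the names toy_state, toy_event and toy_next are made up for illustration, the numeric values do not match the kernel's, and the server side is modeled per connection (the listening socket itself is left out).

```c
#include <assert.h>

/* Toy states: the names mirror the kernel's, the numeric values do not. */
enum toy_state {
    TOY_CLOSE,
    TOY_SYN_SENT,      /* client: SYN sent, waiting for SYN+ACK */
    TOY_NEW_SYN_RECV,  /* server: request block waiting for the final ACK */
    TOY_ESTABLISHED
};

enum toy_event { EV_CONNECT, EV_RX_SYN, EV_RX_SYNACK, EV_RX_ACK };

/* Return the state that follows `cur` when `ev` occurs.
 * Pairs that are not part of the handshake leave the state unchanged. */
enum toy_state toy_next(enum toy_state cur, enum toy_event ev)
{
    assert(cur <= TOY_ESTABLISHED);  /* guard against out-of-range values */

    switch (cur) {
    case TOY_CLOSE:
        if (ev == EV_CONNECT)
            return TOY_SYN_SENT;       /* client sends SYN */
        if (ev == EV_RX_SYN)
            return TOY_NEW_SYN_RECV;   /* server answers with SYN+ACK */
        break;
    case TOY_SYN_SENT:
        if (ev == EV_RX_SYNACK)
            return TOY_ESTABLISHED;    /* client sends the final ACK */
        break;
    case TOY_NEW_SYN_RECV:
        if (ev == EV_RX_ACK)
            return TOY_ESTABLISHED;    /* the step this article analyzes */
        break;
    case TOY_ESTABLISHED:
        break;
    }
    return cur;
}
```

Calling toy_next(TOY_NEW_SYN_RECV, EV_RX_ACK) corresponds to the server-side transition examined in the following sections.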
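Key data structures
- struct request_sock *req: the request block for a half-open connection, stored in the tcp_hashinfo hash table
- struct sock *nsk: the socket the server creates for accept() to return
- sk->sk_state: the current TCP state (here TCP_NEW_SYN_RECV)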
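One detail worth spelling out: a hash lookup can return a request block where a full socket is expected, and the code can still read the state field and then convert the pointer. This works because both structures start with the same common header. The sketch below illustrates that layout idea in userspace C. All toy_ names, the accept_backlog field and the numeric state values are hypothetical; only the idea of a shared leading header and of a listener back-pointer (rsk_listener in the kernel) is taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_LISTEN = 1, TOY_NEW_SYN_RECV = 2 };  /* toy values, not the kernel's */

/* Fields shared by every socket flavour; must be the first member. */
struct toy_sock_common {
    int state;
};

/* Full socket, here playing the role of the listener. */
struct toy_sock {
    struct toy_sock_common common;
    int accept_backlog;          /* hypothetical field, makes the struct larger */
};

/* Lightweight record for a half-open connection. */
struct toy_request_sock {
    struct toy_sock_common common;
    struct toy_sock *listener;   /* plays the role of rsk_listener */
};

/* Same idea as inet_reqsk(): a plain cast, valid only for request blocks. */
struct toy_request_sock *toy_reqsk(struct toy_sock_common *sk)
{
    assert(sk->state == TOY_NEW_SYN_RECV);
    return (struct toy_request_sock *)sk;
}

/* Given whatever the lookup returned, find the listening socket. */
struct toy_sock *toy_listener_of(struct toy_sock_common *sk)
{
    if (sk->state == TOY_NEW_SYN_RECV)
        return toy_reqsk(sk)->listener;
    return sk->state == TOY_LISTEN ? (struct toy_sock *)sk : NULL;
}
```

The cast is safe in C because a pointer to a struct and a pointer to its first member are interconvertible; the state check decides which of the two struct types the pointer really refers to.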
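Core logic in detail
- When tcp_v4_rcv[1] receives the segment, it first looks up the matching sk in the tcp_hashinfo table using the source and destination IP addresses and ports. The sk it finds is the req that was created when the SYN arrived. Because sk->sk_state == TCP_NEW_SYN_RECV, the server then calls tcp_check_req to create a new sock and add it to the listening socket's inet_csk(sk)->icsk_accept_queue list, and finally calls tcp_child_process to notify the application that a connection is ready for accept().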
    int tcp_v4_rcv(struct sk_buff *skb)
    {
        struct net *net = dev_net(skb->dev);
        struct sk_buff *skb_to_free;
        int sdif = inet_sdif(skb);
        int dif = inet_iif(skb);
        const struct iphdr *iph;
        const struct tcphdr *th;
        bool refcounted;
        struct sock *sk;
        int ret;
        …………
        th = (const struct tcphdr *)skb->data;
        iph = ip_hdr(skb);
    lookup:
        /* find the sock (or request block) that matches this segment */
        sk = __inet_lookup_skb(&tcp_hashinfo, skb, __tcp_hdrlen(th), th->source,
                               th->dest, sdif, &refcounted);
        …………
        if (sk->sk_state == TCP_NEW_SYN_RECV) {
            /* the lookup returned a request block, not a full sock */
            struct request_sock *req = inet_reqsk(sk);
            bool req_stolen = false;
            struct sock *nsk;

            sk = req->rsk_listener;  /* from here on, sk is the listener */
            …………
            nsk = NULL;
            if (!tcp_filter(sk, skb)) {
                th = (const struct tcphdr *)skb->data;
                iph = ip_hdr(skb);
                tcp_v4_fill_cb(skb, iph, th);
                nsk = tcp_check_req(sk, skb, req, false, &req_stolen);
            }
            /* hand the ACK to the newly created child sock */
            tcp_child_process(sk, nsk, skb);
            …………
        }
        …………
    }
- tcp_check_req first validates the segment. If the checks pass, it calls inet_csk(sk)->icsk_af_ops->syn_recv_sock to create a new sock whose sk_state is TCP_SYN_RECV, then calls inet_csk_complete_hashdance to link the new sock into the listener's inet_csk(sk)->icsk_accept_queue list.
    struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
                               struct request_sock *req,
                               bool fastopen, bool *req_stolen)
    {
        struct tcp_options_received tmp_opt;
        struct sock *child;
        const struct tcphdr *th = tcp_hdr(skb);
        __be32 flg = tcp_flag_word(th) & (TCP_FLAG_RST|TCP_FLAG_SYN|TCP_FLAG_ACK);
        bool paws_reject = false;
        bool own_req;

        tmp_opt.saw_tstamp = 0;
        …………
        /* OK, ACK is valid, create big socket and
         * feed this segment to it. It will repeat all
         * the tests. THIS SEGMENT MUST MOVE SOCKET TO
         * ESTABLISHED STATE. If it will be dropped after
         * socket is created, wait for troubles.
         */
        /* tcp_prot: ipv4_specific: tcp_v4_syn_recv_sock */
        child = inet_csk(sk)->icsk_af_ops->syn_recv_sock(sk, skb, req, NULL,
                                                         req, &own_req);
        …………
        return inet_csk_complete_hashdance(sk, child, req, own_req);
        …………
    }
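The effect of linking the new sock into the accept queue can be pictured with a small userspace model. This is only a sketch under the assumption that the queue behaves as a FIFO with head and tail pointers: toy_req, toy_accept_queue, toy_queue_add and toy_queue_remove are hypothetical names, and the model leaves out locking, reference counting and backlog limits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy request block: `child` stands in for the newly created sock. */
struct toy_req {
    int child;               /* hypothetical child identifier */
    struct toy_req *next;
};

/* Toy accept queue: a singly linked FIFO with head and tail pointers. */
struct toy_accept_queue {
    struct toy_req *head;
    struct toy_req *tail;
    int len;
};

/* Append a completed request at the tail (triggered by the final ACK). */
void toy_queue_add(struct toy_accept_queue *q, struct toy_req *req)
{
    assert(q != NULL && req != NULL);
    req->next = NULL;
    if (q->head == NULL)
        q->head = req;
    else
        q->tail->next = req;
    q->tail = req;
    q->len++;
}

/* Pop the oldest request from the head (what accept() would consume). */
struct toy_req *toy_queue_remove(struct toy_accept_queue *q)
{
    struct toy_req *req;

    assert(q != NULL);
    req = q->head;
    if (req == NULL)
        return NULL;         /* nothing to accept yet */
    q->head = req->next;
    if (q->head == NULL)
        q->tail = NULL;
    q->len--;
    return req;
}
```

In this model, connections come out in the order in which their handshakes completed.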
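- tcp_child_process passes the segment to tcp_rcv_state_process, which moves the new sock's sk->sk_state to TCP_ESTABLISHED, and then calls parent->sk_data_ready to notify the application that the new sock has been established.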
    int tcp_child_process(struct sock *parent, struct sock *child,
                          struct sk_buff *skb)
        __releases(&((child)->sk_lock.slock))
    {
        int ret = 0;
        int state = child->sk_state;

        /* record NAPI ID of child */
        sk_mark_napi_id(child, skb);

        tcp_segs_in(tcp_sk(child), skb);
        if (!sock_owned_by_user(child)) {
            ret = tcp_rcv_state_process(child, skb);
            /* Wakeup parent, send SIGIO */
            if (state == TCP_SYN_RECV && child->sk_state != state)
                parent->sk_data_ready(parent);
        } else {
            /* Alas, it is possible again, because we do lookup
             * in main socket hash table and lock on listening
             * socket does not protect us more.
             */
            __sk_add_backlog(child, skb);
        }

        bh_unlock_sock(child);
        sock_put(child);
        return ret;
    }
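The two branches of tcp_child_process can be modeled in a few lines of userspace C. This is a simplified sketch, not the kernel logic itself: the toy_ types and functions are hypothetical, toy_state_process stands in for tcp_rcv_state_process and assumes the ACK is always valid, and a counter replaces the real sk_data_ready callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_SYN_RECV = 1, TOY_ESTABLISHED = 2 };  /* toy values, not the kernel's */

struct toy_listener {
    int wakeups;          /* counts sk_data_ready-style notifications */
};

struct toy_child {
    int state;
    bool owned_by_user;   /* true while a process context holds the sock */
    int backlog_len;      /* segments deferred for later processing */
};

/* Stand-in for tcp_rcv_state_process(): assumes the ACK is valid. */
static void toy_state_process(struct toy_child *child)
{
    if (child->state == TOY_SYN_RECV)
        child->state = TOY_ESTABLISHED;
}

/* Returns true if the listener was notified. */
bool toy_child_process(struct toy_listener *parent, struct toy_child *child)
{
    int state;

    assert(parent != NULL && child != NULL);
    state = child->state;       /* snapshot before processing */

    if (child->owned_by_user) {
        child->backlog_len++;   /* defer the segment, leave the state alone */
        return false;
    }

    toy_state_process(child);
    if (state == TOY_SYN_RECV && child->state != state) {
        parent->wakeups++;      /* the handshake just completed */
        return true;
    }
    return false;
}
```

The state snapshot taken before processing is what lets the function tell "the handshake just completed" apart from "a later segment arrived on an already established connection", so the listener is notified only once per connection.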
References
[1] Linux kernel version: 5.7.8.