Detailed explanation of TCP's three-way handshake and four-way wave

TCP’s three-way handshake and four-way wave

The three-way handshake means that establishing a TCP connection requires the client and the server to exchange a total of three packets.

The purpose of the three-way handshake is to connect to a specified port on the server, establish the TCP connection, synchronize both parties' sequence numbers and acknowledgment numbers, and exchange TCP window size information. In socket programming, the three-way handshake is triggered when the client calls connect().
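As a minimal sketch of where connect() sits in a program, the following Python snippet opens a listening socket on the loopback interface and connects to it. The loopback address, the OS-chosen port, and the variable names are assumptions made for illustration. The handshake itself is carried out by the operating system and is not visible in the code.

```python
import socket

# Listening socket on the loopback interface; port 0 asks the OS for any free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
host, port = server.getsockname()

# In blocking mode, connect() returns once the kernel has completed
# the client side of the three-way handshake.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
client_addr = client.getsockname()
server_addr = client.getpeername()

# accept() hands the server the connection that the handshake established.
conn, peer = server.accept()
print("client", client_addr, "-> server", server_addr)
print("server sees peer", peer)

for s in (conn, client, server):
    s.close()
```

Capturing the traffic with a packet sniffer while this runs should show the SYN, SYN+ACK, and ACK segments described in the steps that follow.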

First handshake:

The client sets the SYN flag of the TCP packet to 1, randomly generates an initial sequence number seq=J, and stores it in the Sequence Number field of the TCP header. It specifies the server port it intends to connect to and sends the packet to the server. After sending, the client enters the SYN_SENT state and waits for the server's confirmation.

Second handshake:

When the server receives the packet, the flag SYN=1 tells it that the client is requesting a connection. The server sets both the SYN and ACK flags to 1, sets ack=J+1, randomly generates its own sequence number seq=K, and sends the packet to the client to confirm the connection request. The server then enters the SYN_RCVD state.

Third handshake:

After receiving the confirmation, the client checks that ack is J+1 and that the ACK flag is 1. If both are correct, it sets the ACK flag to 1 and ack=K+1 and sends the packet to the server. The server in turn checks that ack is K+1 and that the ACK flag is 1. If both are correct, the connection is established: the client and the server enter the ESTABLISHED state, the three-way handshake is complete, and the two sides can start transmitting data.

Note: The ack and ACK we wrote above are not the same concept:

The lowercase ack is the Acknowledgment Number field of the TCP header. It confirms the sequence number of the packet just received from the peer; during the handshake, ack=seq+1. The uppercase ACK is the flag bit in the TCP header mentioned above. It marks whether the packet acknowledges a previous packet; if it does, the ACK flag is set to 1.
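The numbering in the three steps can be checked with a small model. This is only a sketch: the function name `three_way_handshake` and the dictionary layout are invented for illustration, and no packets are sent. It also assumes the 32-bit sequence number space of the TCP header, so the arithmetic wraps around at 2^32.

```python
import random

SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(j=None, k=None):
    """Return the three handshake segments as dictionaries.

    j and k are the initial sequence numbers of the client and the server;
    they are picked at random when not supplied.
    """
    j = random.randrange(SEQ_SPACE) if j is None else j
    k = random.randrange(SEQ_SPACE) if k is None else k
    syn = {"flags": {"SYN"}, "seq": j}
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": k, "ack": (j + 1) % SEQ_SPACE}
    ack = {"flags": {"ACK"}, "seq": (j + 1) % SEQ_SPACE, "ack": (k + 1) % SEQ_SPACE}
    return [syn, syn_ack, ack]


for segment in three_way_handshake(j=100, k=300):
    numbers = {name: value for name, value in segment.items() if name != "flags"}
    print(sorted(segment["flags"]), numbers)
```

Here the uppercase flag names live in the `flags` set, while the lowercase `ack` key holds the acknowledgment number, which mirrors the distinction made in the note above.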

The schematic diagram of the three-way handshake process is as follows:

sequenceDiagram
    participant c as client
    participant s as server
    c->>s: 1.SYN=1, seq=J, request to establish a connection
    activate c
    c-->>c:SYN_SENT
    deactivate c
    activate s
    s-->>s: SYN_RCVD
    s->>c: 2.SYN=1,seq=K,ACK=1,ack=J+1
    deactivate s
    activate c
    c-->>c: ESTABLISHED
    c->>s: 3.ACK=1,seq=J+1,ack=K+1
    deactivate c
    activate s
    s-->>s: ESTABLISHED
    deactivate s
    Note over c,s: data transfer

Why is the three-way handshake needed?

Suppose the first connection request segment sent by the client is not lost but is held up at some network node for a long time, so that it reaches the server only after the connection has already been released.

By then the segment is long obsolete. But when the server receives this stale connection request, it mistakes it for a new connection request from the client, so it sends a confirmation segment to the client and agrees to establish a connection.

Without the three-way handshake, the new connection would be established as soon as the server sent its acknowledgment. Since the client has not actually requested a connection, it ignores the server's confirmation and sends no data. The server, however, believes that a new transport connection has been established and keeps waiting for the client to send data, so server resources are wasted.

The three-way handshake prevents this. In the case above, the client does not acknowledge the server's confirmation. Since the server never receives that acknowledgment, it knows that the client did not request a connection.
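The argument can be sketched as a toy model of the server side. The class `ToyServer`, its method names, and the client identifiers are hypothetical, and the model ignores sequence number wraparound. It only illustrates that a connection counts as established when the third segment arrives, so a stale SYN never gets that far.

```python
class ToyServer:
    """Toy server that tracks half-open and established connections."""

    def __init__(self):
        self.half_open = {}        # client id -> server sequence number awaiting ACK
        self.established = set()

    def on_syn(self, client_id, seq, server_seq):
        # Reply with SYN+ACK, but do not treat the connection as established yet.
        self.half_open[client_id] = server_seq
        return {"flags": {"SYN", "ACK"}, "seq": server_seq, "ack": seq + 1}

    def on_ack(self, client_id, ack):
        # Only a matching third segment promotes the connection.
        expected = self.half_open.get(client_id)
        if expected is not None and ack == expected + 1:
            del self.half_open[client_id]
            self.established.add(client_id)

    def on_timeout(self, client_id):
        # No ACK arrived: drop the half-open entry and free its resources.
        self.half_open.pop(client_id, None)


server = ToyServer()

# A stale SYN arrives. The client never sends the third segment,
# so the server eventually gives up on the half-open entry.
server.on_syn("stale", seq=100, server_seq=300)
server.on_timeout("stale")

# A genuine request: the client answers the SYN+ACK with ack=K+1.
reply = server.on_syn("fresh", seq=500, server_seq=700)
server.on_ack("fresh", ack=reply["seq"] + 1)

print(server.established, server.half_open)
```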

TCP's four-way wave to close the connection

The four-way wave terminates a TCP connection: when a TCP connection is torn down, the client and the server exchange a total of four packets to confirm the disconnection. In socket programming, this process is triggered when either the client or the server calls close(). Because a TCP connection is full-duplex, each direction must be closed separately. When one party has finished sending its data, it sends a FIN to terminate its direction of the connection. Receiving a FIN only means that no more data will arrive from that direction; the receiver can still send data on the connection until it sends a FIN of its own. The side that closes first performs an active close, and the other side performs a passive close.
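The per-direction close can be observed from Python. The sketch below assumes a loopback connection and a helper named `serve_once`, both invented for illustration. It uses shutdown(SHUT_WR) instead of close() on the client so that the client sends its FIN but can keep reading, which makes the half-closed state visible.

```python
import socket
import threading


def serve_once(listener, received):
    conn, _ = listener.accept()
    with conn:
        # recv() returns b"" once the client's FIN has arrived.
        while True:
            chunk = conn.recv(1024)
            if not chunk:
                break
            received.append(chunk)
        # The server-to-client direction is still open, so this send succeeds.
        conn.sendall(b"reply after client FIN")
    # Leaving the with-block closes conn, which sends the server's FIN.


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
received = []
worker = threading.Thread(target=serve_once, args=(listener, received))
worker.start()

client = socket.create_connection(listener.getsockname())
client.sendall(b"request")
client.shutdown(socket.SHUT_WR)   # send FIN: the client has no more data to send

reply = b""
while True:
    chunk = client.recv(1024)     # the client can still receive
    if not chunk:                 # b"" means the server's FIN has arrived
        break
    reply += chunk

worker.join()
client.close()
listener.close()
print(b"".join(received), reply)
```

The server only learns that the client is done when recv() returns an empty bytes object, and it is still able to send its reply afterwards.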

The wave request can be initiated by the client or the server. We assume that it is initiated by the client:

First wave:

The client initiates the wave by sending the server a segment with the FIN flag set and its sequence number seq. The client then enters the FIN_WAIT_1 state, which means that it has no more data to send to the server.

Second wave:

The server receives the FIN segment from the client and returns a segment with the ACK flag set, with ack set to the received seq plus 1. The server thereby tells the client that it has confirmed the close request, and it enters the CLOSE_WAIT state. When the client receives this ACK, it enters the FIN_WAIT_2 state.

Third wave:

The server sends the client a segment with the FIN flag set, requesting to close the connection, and enters the LAST_ACK state.

Fourth wave:

The client receives the FIN segment from the server, sends a segment with the ACK flag set back to the server, and enters the TIME_WAIT state. When the server receives the ACK segment, it closes the connection. If the client receives no further segment from the server during a wait of 2MSL, it concludes that the server has closed normally, and the client closes the connection as well.
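The state changes in the four steps can be written down as a transition table. This is a simplified sketch: the event names and the `run` helper are invented for illustration, and the table covers only the ordinary sequence described here, not cases such as both sides closing at the same time.

```python
# (current state, event) -> next state, for the closing sequence only.
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",    # active closer, first wave
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",    # passive closer, replies with ACK (second wave)
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",       # third wave
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",      # replies with ACK (fourth wave)
    ("LAST_ACK", "recv ACK"): "CLOSED",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}


def run(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    path = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        path.append(state)
    return path


client_path = run(["send FIN", "recv ACK", "recv FIN", "2MSL timeout"])
server_path = run(["recv FIN", "send FIN", "recv ACK"])
print(" -> ".join(client_path))
print(" -> ".join(server_path))
```

An event that is not valid in the current state raises a KeyError, which is a blunt but convenient way to catch an out-of-order step in this toy model.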

The schematic diagram of the four-way wave process is as follows:

sequenceDiagram
    participant c as client
    participant s as server
    c->>s: 1.FIN=1, seq=M, request to disconnect
    activate c
    c-->>c:FIN_WAIT_1
    deactivate c
    activate s
    s-->>s: CLOSE_WAIT
    s->>c: 2.ACK=1, ack=M+1
    deactivate s
    activate c
    c-->>c:FIN_WAIT_2
    deactivate c
    s->>c: 3.FIN=1, seq=N, request to disconnect
    activate s
    s-->>s: LAST_ACK
    deactivate s
    c->>s: 4.ACK=1,ack=N+1
    activate c
    activate s
    s-->>s: CLOSED
    deactivate s
    c-->>c:TIME_WAIT
    c-->>c: CLOSED
    deactivate c
    
Why is there a three-way handshake when connecting and a four-way wave when closing?

When establishing a connection, the server can respond to the client's SYN connection request by sending SYN+ACK in a single segment: the ACK acknowledges the request and the SYN synchronizes the server's sequence number. So only three segments are required to establish a connection.

TCP is a connection-oriented, reliable, byte-stream-based transport-layer protocol, and it is full-duplex. When the connection is being closed, a FIN segment from the client only tells the server that the client has finished sending data. When the server receives the FIN and returns an ACK segment, it acknowledges that the client has no more data to send, but the server can still send data to the client. The server therefore may not close its socket immediately; it first finishes sending its own data. Only then does the server send its own FIN segment, telling the client that it too has no more data to send, after which both sides tear down the TCP connection. Because the server's ACK and FIN are generally sent separately, closing takes four segments.

Why wait for 2MSL?

MSL (Maximum Segment Lifetime): the longest time a segment can exist in the network before it is discarded.

There are two reasons:

First, to ensure that the full-duplex TCP connection is closed reliably. Because IP is unreliable, or for other network reasons, the server may not receive the client's final ACK, in which case the server retransmits its FIN after a timeout. If the client had already closed the connection and were in the CLOSED state, the retransmitted FIN would find no matching connection, and the server could not finish closing cleanly. The client therefore does not enter CLOSED directly after sending the last ACK but stays in TIME_WAIT. If the FIN arrives again, the client can resend the ACK, which ensures that the other side receives it and the connection is finally closed properly.

Second, to ensure that duplicate segments from this connection disappear from the network. If the client entered CLOSED directly after sending the last ACK and then initiated a new connection to the server, there is no guarantee that the new connection uses a different port number from the one just closed; the two may be the same. That can cause a problem: if some data from the previous connection is still stuck in the network, the delayed data may arrive at the client after the new connection is established. Because the new and old connections have the same IP addresses and port numbers, TCP considers the delayed data to belong to the new connection, and the new connection receives stale data, which corrupts the stream. The connection therefore waits for 2 times the MSL in the TIME_WAIT state so that all segments of this connection have disappeared from the network.
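The first reason can be illustrated with a toy timer. The class `TimeWait`, the explicit `now` clock, and the MSL value of 60 seconds are assumptions chosen for the example; real implementations pick their own values. The sketch also assumes that a retransmitted FIN restarts the 2MSL wait, as the TCP specification describes for the TIME-WAIT state.

```python
class TimeWait:
    """Toy model of the client in TIME_WAIT, driven by an explicit clock."""

    def __init__(self, now, msl):
        self.msl = msl
        self.expires_at = now + 2 * msl
        self.acks_sent = 1                      # the ACK sent on entering TIME_WAIT

    def state(self, now):
        return "TIME_WAIT" if now < self.expires_at else "CLOSED"

    def on_fin(self, now):
        """Handle a retransmitted FIN; return True if it was acknowledged again."""
        if self.state(now) == "CLOSED":
            return False                        # no connection left to answer the FIN
        self.acks_sent += 1                     # resend the final ACK
        self.expires_at = now + 2 * self.msl    # restart the 2MSL wait
        return True


tw = TimeWait(now=0, msl=60)        # hypothetical MSL of 60 seconds
answered = tw.on_fin(now=30)        # the server's FIN was retransmitted
print(answered, tw.acks_sent, tw.expires_at)
print(tw.state(now=149), tw.state(now=150))
print(tw.on_fin(now=200))           # too late: the client is already CLOSED
```

A FIN that arrives while the timer is running is acknowledged again, whereas one that arrives after the client has reached CLOSED finds no connection to answer it, which is the situation the wait is meant to avoid.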