TCP / UDP

anchored to [[143.00_anchor]] | last exercise of this lab, focusing on the transport layer ( layer 4 ), especially the features of TCP and UDP


Overview

Both protocols reside in the transport layer ( layer 4 of the TCP/IP model ) and are necessary to send application data across networks - encapsulating the data of all layers above within their payload. Both support bidirectional communication, while TCP is additionally reliable, connection-oriented and comes with features like flow / congestion control, error detection with retransmission and some additional ones.

On the other hand, UDP is the unreliable counterpart: it primarily allows fast transmission of time-sensitive packets that may be lost on the way - so no connection-oriented setup - without any large setup, making it easy to send data quickly at the risk of losing packets and such. A trick to make this reliable again is to incorporate mechanisms that check the data within the UDP packet and to establish a routine to resend lost packets --> done by QUIC, for example.


UDP - User Datagram Protocol

Definition to be found at RFC 768

[!Definition] Definition by RFC

This User Datagram Protocol (UDP) is defined to make available a datagram mode of packet-switched computer communication in the environment of an interconnected set of computer networks.

This protocol assumes that the Internet Protocol (IP) [1] is used as the underlying protocol.

This protocol provides a procedure for application programs to send messages to other programs with a minimum of protocol mechanism. The protocol is transaction oriented, and delivery and duplicate protection are not guaranteed. Applications requiring ordered reliable delivery of streams of data should use the Transmission Control Protocol (TCP) [2].

Structure of UDP Packets

Below is a rough overview of the usual UDP packet. The most important fields that make up the header are:

  • Source-Port ( port used by the sending instance ) ( it's optional! )
  • Destination-Port ( port used at the receiving end )
  • Length - total length of this datagram in bytes, header plus payload
  • Checksum to possibly detect errors in the transmitted data ( not used to provide any sort of error correction! )

More information can be found in the linked RFC above, but those are the most prominent fields.
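The checksum is the 16-bit one's complement of the one's-complement sum of 16-bit words ( computed over a pseudo-header, the UDP header and the data ). A minimal sketch of just that folding arithmetic, fed with made-up bytes rather than a real packet:

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as used by UDP / TCP / IP."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold carry back in
    return ~total & 0xFFFF                        # final one's complement
```

A receiver verifies a packet by summing over the data including the checksum field; the result must come out as zero.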

[!Tip] What can we observe here - what makes UDP special / different from TCP? #card In its structure UDP is really barebones, with close to no features to enable / allow for error correction or similar. Because it's so small, this packet format is great for sending information fast, with the caveat of no reliable transport ( no guarantee of arrival, nor of the data arriving uncorrupted ) --> Hence it cannot be used to simply send large payloads like files around --> that would be awful due to all the possible transmission errors, missing packets ( invalid fragments ) etc.

However, because UDP does not follow any connection-oriented principle, it is fast to send out data without first having to establish the means to communicate reliably. --> VoIP / video streams ( or DNS with its small queries and time-sensitive operations ) make up some good examples for the usage of UDP ( or QUIC by Google ).
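How little setup this takes shows nicely with Python's standard socket API over loopback ( the port is chosen by the OS; there is no handshake, just one sendto ):

```python
import socket

# Receiver: bind to an OS-chosen port on loopback.
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
rx.settimeout(2.0)
port = rx.getsockname()[1]

# Sender: no connection establishment at all - one call and the datagram is out.
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tx.sendto(b"hello, datagram", ("127.0.0.1", port))

data, addr = rx.recvfrom(2048)   # whatever arrives, if it arrives
tx.close()
rx.close()
```

Over loopback this will practically always arrive, but nothing in the API promises it - there is no ACK to wait for.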

[!Warning] UDP does not care about fair transmission Meaning that UDP does not use any form of fairness control to share the available bandwidth with other participants sending data over the same link. --> TCP, on the other hand, tries to care about this by regulating its transmission rate based on implicit feedback from the network ( latency / congestion and such )

UDP is just blasting without caring much!

  0      7 8     15 16    23 24    31
 +--------+--------+--------+--------+
 |     Source      |   Destination   |
 |      Port       |      Port       |
 +--------+--------+--------+--------+
 |                 |                 |
 |     Length      |    Checksum     |
 +--------+--------+--------+--------+
 |
 |          data octets ...
 +---------------- ...

	  User Datagram Header Format

[!Information] Size of UDP packets What is the allowed size of a UDP packet? What makes up the header size? #card ( I had some trouble figuring this out; this message / answer gave some good information too, otherwise just take a look at the corresponding RFC! ) link to stackoverflow

The header is defined as 8 bytes. Within it, 2 bytes are allocated for the length of the whole UDP packet ( header + payload ). Hence the maximum packet size is $2^{16} - 1 = 65535$ bytes, leaving at most $65535 - 8 = 65527$ bytes of payload.
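Spelled out as arithmetic ( in practice the limit is lower still, since the UDP datagram must also fit into an IP packet ):

```python
HEADER_BYTES = 8           # 4 fields x 2 bytes: src port, dst port, length, checksum

max_total = 2**16 - 1      # the 16-bit length field counts header + payload
max_payload = max_total - HEADER_BYTES
print(max_total, max_payload)   # 65535 65527
```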

Limits of UDP

We may as well cover certain limitations of UDP ( some were mentioned previously already ). What are possible limitations of UDP? #card

As mentioned previously, we don't have much of an option to correct errors in transmission. Furthermore we also lack the ability to trace / find out whether a packet was lost --> we are not keeping track of segments and whether they arrived! --> reordering of segments also remains undetected here. There's also the risk of overflowing the receiver by sending them too much information and thus overrunning their buffer.

The maximum total size of a UDP segment depends on the IP packet that UDP is encapsulated in.

Benefits of UDP

We may as well have a short summary of possible benefits of UDP. Which can we denote? #card

  • we can detect transmission errors --> the checksum! ( although we are not able to resolve them without help from the user application )
  • Theres no setup/ management of connections hence we don't have to maintain this state for both sender and receiver
  • the header is small --> less overhead of data, especially if we are sending barely anything anyway ( like with easy DNS requests!)
  • because there's no congestion control we can send a lot of data at once --> throughput is not really limited by the protocol because we just send data

TCP - Transmission Control Protocol

TCP is pretty much the counterpart to UDP regarding many features and traits. The definition of TCP can be found in the following RFCs: first definition: RFC 793; updated version: RFC 9293. From the updated version ( section 2.2 ) the key concepts for TCP are:

  • TCP provides a reliable, in-order, byte-stream service to applications.
  • The application byte-stream is conveyed over the network via TCP segments, with each TCP segment sent as an Internet Protocol (IP) datagram.
  • TCP reliability consists of detecting packet losses (via sequence numbers) and errors (via per-segment checksums), as well as correction via retransmission.
  • TCP supports unicast delivery of data.
  • There are anycast applications that can successfully use TCP without modifications, though there is some risk of instability due to changes of lower-layer forwarding behavior [46].
  • TCP is connection oriented, though it does not inherently include a liveness detection capability.
  • Data flow is supported bidirectionally over TCP connections, though applications are free to send data only unidirectionally, if they so choose.
  • TCP uses port numbers to identify application services and to multiplex distinct flows between hosts. A more detailed description of TCP features compared to other transport protocols can be found in Section 3.1 of [52].
  • Further description of the motivations for developing TCP and its role in the Internet protocol stack can be found in Section 2 of [16] and earlier versions of the TCP specification.

Structure of TCP HEADER

As with UDP we have a header that defines important information about the conveyed payload and the packet itself - also giving context to other packets ( e.g. when data is split across multiple segments )

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |       Destination Port        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgment Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data |       |C|E|U|A|P|R|S|F|                               |
   | Offset| Rsrvd |W|C|R|C|S|S|Y|I|            Window             |
   |       |       |R|E|G|K|H|T|N|N|                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Checksum            |         Urgent Pointer        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           [Options]                           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               :
   :                             Data                              :
   :                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

          Note that one tick mark represents one bit position.

As seen, TCP's header is way larger than its UDP counterpart, primarily because it implements and supports more features which require additional information ( e.g. the sequence number to correctly order segments ). The most important entries might be the following:

  • Source Port / Destination Port - same principle as with UDP!
  • Sequence Number -> denotes the position of the first data octet of this segment within the overall byte stream ( so if we send something large we may have to split it, and with this information we can then reassemble the parts in the correct order ) ( also important if packets arrive at different times and thus out of order )
  • Acknowledgement Number -> ( only meaningful when the ACK control bit is set ) describes the next sequence number the sender of the ACK expects to receive --> if we sent some data with a sequence number ( plus the total amount of data sent in that segment ) we end up at a "new position within the whole byte stream" ( because more data has now been received, the acknowledgement number describes the next portion of data not yet received! )
  • Window (size) -> a field necessary for proper flow control; it denotes the amount of data the receiver is capable / willing to receive at a time ( e.g. whether it can handle a lot of data at once, or is heavily buffering and thus needs a smaller limit )
  • Control Bits / Flags -> Here we have plenty 1bit large flags that can denote different modes / properties:
    • ACK -> Whether the packet is an acknowledgement
    • RST -> can be used to reset the connection ( remember its a connection-oriented transmission )
    • SYN -> whether the packet is part of the connection setup ( the SYN handshake )
    • FIN -> if set it describes that no more data will be sent

Further information, the complete list of flags and more can be found in the corresponding RFC 9293.
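A sketch of unpacking the fixed 20-byte part of the header with Python's struct module ( only a few of the flags from the diagram are decoded; the packet bytes below are fabricated for illustration ):

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header ( options are ignored )."""
    (src, dst, seq, ack, offset_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    return {
        "src_port": src, "dst_port": dst,
        "seq": seq, "ack": ack,
        "data_offset": (offset_flags >> 12) * 4,  # header length in bytes
        "fin": bool(offset_flags & 0x001),
        "syn": bool(offset_flags & 0x002),
        "rst": bool(offset_flags & 0x004),
        "ack_flag": bool(offset_flags & 0x010),
        "window": window,
    }

# A fabricated SYN segment: port 12345 -> 80, seq 1000, data offset 5 ( = 20 bytes ).
raw = struct.pack("!HHIIHHHH", 12345, 80, 1000, 0, 0x5002, 65535, 0, 0)
info = parse_tcp_header(raw)
```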

Connection Set-Up with TCP

As mentioned before TCP is a connection-oriented protocol that establishes and maintains a connection between two hosts to share data.

For that we have to establish a certain setup to create and signal this connection -> both participants should be aware of whether the connection is actually established or not ( get an ack for their attempt to connect ).

[!Tip] 3-Way Handshake By creating a 3-way handshake / message exchange we can successfully set up a communication between two hosts.

Structure:

  1. The client ( the party that wants to establish the connection ) sends a SYN-flagged segment containing its initial sequence number.
  2. The receiving end now accepts the request and sends its own solicitation, so it can gather intel on the connection status later. For that it sends a SYN-flagged segment that contains its own initial sequence number ( it must be different! ) and also sets the ACK bit, acknowledging the client's sequence number incremented by 1.
  3. The client is now aware of its successful connection, yet ought to signal this to the server too. Hence it sends its current sequence number and an ACK containing the next expected sequence number for the server ( also incremented by 1 ). After this handshake the TCP connection is alive and data can be exchanged.

Setting sequence numbers: for security reasons the ( initial ) sequence numbers are random and should not be predictable, to prevent spoofing of messages by guessing the correct sequence numbers -> link to stackoverflow
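The sequence-number bookkeeping of the three steps can be sketched as a toy simulation ( dicts standing in for segments - purely illustrative, not a real TCP implementation ):

```python
import random

def three_way_handshake():
    # 1. client picks a random initial sequence number (ISN) and sends SYN
    client_isn = random.randrange(2**32)
    syn = {"flags": {"SYN"}, "seq": client_isn}

    # 2. server replies SYN+ACK: its own random ISN, ack = client ISN + 1
    server_isn = random.randrange(2**32)
    syn_ack = {"flags": {"SYN", "ACK"}, "seq": server_isn,
               "ack": (syn["seq"] + 1) % 2**32}

    # 3. client acknowledges: ack = server ISN + 1
    ack = {"flags": {"ACK"}, "seq": syn_ack["ack"],
           "ack": (syn_ack["seq"] + 1) % 2**32}
    return syn, syn_ack, ack

syn, syn_ack, ack = three_way_handshake()
```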

Now it's also necessary to close a connection - so that both parties are aware that the transmission has ended.

This is done in the following way:

  1. Client 1 sends a FIN-tagged Request
  2. Client 2 responds with an ACK-tagged Request
  3. Client 2 sends a FIN-tagged Request
  4. Client 1 responds with an ACK-tagged Request. After this procedure both parties know that their connection and the connection of their peer were closed - and have also signaled this to each other.

TCP Congestion Control

Alongside connection-oriented transmission, error correction and some other properties, we may also observe the idea of congestion control deployed by TCP.

What's the idea of congestion control; why can't we just send more at once? #card Congestion ( in networks ) describes the effect of reduced transmission quality within a network - reduced goodput - due to issues like packet loss, or some device not being able to handle a certain load. Now if the throughput decreases within a network - specifically in a transmission between hosts - we may want to compensate for this loss of performance. One naive idea to compensate would be to increase the number of sent packets, e.g. by retransmitting lost or corrupted packets more aggressively. In return this could further congest the whole system and degrade its performance even more, or collapse it entirely.

With TCP there's a mechanism that tries to maximize throughput while also trying to reduce the congestion caused by it. This is done by using mechanisms / algorithms that try to correlate the implicit signs of congestion / throughput with actions to take. --> We focus on implicit effects because it's unusual for a system to communicate its current congestion explicitly ( after all, that status report could be delayed or lost too, rendering it useless -> we can't really improve if the status report is missing or delayed ), thus it's better to implicitly listen for cues to sense the current congestion. TCP can do this by measuring delays in acknowledgements ( answers to sent packets ) and its linked retransmission timer --> ( this is primarily for sensing delays! )

And to detect packet loss it uses the sequence numbers --> after all, each acknowledgement denotes the currently expected sequence number. Because segments are ordered, if one is missing the receiving end keeps responding with the same ( now duplicate ) acknowledgement number; three such duplicate ACKs are taken as a signal of the lost packet.

[!Tip] Goal of Congestion Control

With those cues we can somewhat establish a congestion control that will try to maximize throughput while also responding to possible congestion caused by it ( or other connections on the same network ).

Furthermore, a certain fairness towards other connections is deployed, or at least attempted. We don't want to steal all the network capacity for ourselves if other transmissions are happening too.

Establishing a good transmission speed

As mentioned above, it's crucial to somehow find the best throughput for our TCP connection, to send data fast while utilizing the whole bandwidth available. We could just send the maximum we suspect works, yet this is not predictable at all, hence we ought to deploy the idea of probing.

[!Definition] Probing to find the best window size Why probe, and how does AIMD work? #card

By gradually probing a window size ( the amount of data allowed in flight at a time ) we slowly grow our bandwidth usage while not directly overwhelming the network with the maximum. Further, we can gather responses from the network - and its dynamics - by probing, to find the correct speed.

A famous example of this idea is AIMD - additive increase / multiplicative decrease. Here we grow our window size by a given addend and continue doing that until we encounter congestion. Once detected, we drop the window size by a given factor ( so divide it by X ) to reduce the congestion again. ( Observing this in a graph shows a sawtooth pattern, which is somewhat cool? )
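The sawtooth is easy to reproduce in a toy simulation, where "congestion" is faked as the window hitting a fixed capacity ( all numbers here are arbitrary, chosen only for illustration ):

```python
def aimd(rounds: int, capacity: int = 20) -> list:
    """Additive increase (+1 per round), multiplicative decrease (halve)."""
    cwnd, trace = 1, []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= capacity:           # pretend congestion was detected
            cwnd = max(1, cwnd // 2)   # multiplicative decrease
        else:
            cwnd += 1                  # additive increase
    return trace

trace = aimd(40)   # plotting this gives the sawtooth: 1..20, 10..20, 10..
```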

There are plenty of algorithms that try to realize this behavior while growing faster / reacting better to congestion.

One such example would be TCP CUBIC: what are traits of CUBIC? #card

  • successor of BIC-TCP
  • used for long fat networks
  • default in MacOS / Windows / Linux
  • fair because it grows independently of the RTT
  • window size depends on the previous congestion event only
  • CUBIC spends a lot of time at a plateau between the concave and convex growth region which allows the network to stabilize before CUBIC begins looking for more bandwidth.

Information can be found in the paper comparing / analyzing it: link to paper. I also logged it here locally: 211.14_TCP_CUBIC_analysis
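CUBIC's concave / convex growth comes from its window function $W(t) = C(t - K)^3 + W_{max}$, where $t$ is real time since the last congestion event ( hence the RTT independence ). A sketch using the RFC 8312 default constants, ignoring the TCP-friendly region and fast convergence:

```python
C = 0.4       # scaling constant (RFC 8312 default)
BETA = 0.7    # multiplicative decrease: window drops to BETA * w_max on loss

def cubic_window(t: float, w_max: float) -> float:
    """Window size t seconds after a congestion event at window w_max."""
    k = (w_max * (1 - BETA) / C) ** (1 / 3)   # time to climb back to w_max
    return C * (t - k) ** 3 + w_max

# Right after the loss the window sits at BETA * w_max, plateaus near w_max
# around t = K (the stabilization phase mentioned above), then probes beyond
# it in the convex region.
```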

Some resources regarding congestion:

  • https://en.wikipedia.org/wiki/Network_congestion#Mitigation
  • https://en.wikipedia.org/wiki/CUBIC_TCP
  • https://en.wikipedia.org/wiki/Round-trip_delay

Unfairness of TCP in large-RTT contexts

This was a question on stackoverflow asking about the reasons for TCP being unfair with high RTT:

Versions of TCP that use Van Jacobson algorithm are unfair in some contexts, such as satellite communication. I cannot understand why. Is this problem caused by asymmetric links, in which the receiver has more possibility to send acknowledgement packets than the sender?

Some answer provided:

After some research I have found an answer. It is not only the delay-bandwidth product as suggested in the comment, but several reasons:

  • The throughput of a sender could be written as Throughput=CongestionWindow/RoundTripTime, so if you have a bigger RTT you need a bigger CW to reach the same throughput;
  • The capacity of the channel could be written as Capacity=Delay*Bandwidth, so you could retrieve the bandwidth available in this way Bandwidth=Capacity/Delay (and Delay could be the half of the RTT or equal the RTT considering that for each packet the ACK is needed);
  • The CongestionWindow could be written as function of the RTT in that way:
    • CongestionWindow = 2 ^ (t/RTT) in slow start phase, where t is the time;
    • CongestionWindow = ss + (t - tss)/RTT in congestion avoidance phase, where ss is the slow start threshold, t is the time, and tss is the time when the slow start threshold is reached.
      Avoiding making the formulas more complicated with the possible errors that can occur and change the CongestionWindow and the slow start threshold, it can easily be seen that the CongestionWindow strongly depends on the RTT, and since the RTT appears in the denominator both in the slow start phase and in the congestion avoidance phase, the bigger the RTT is, the more disadvantaged the sender is.
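The first point of the answer ( Throughput = CongestionWindow / RTT ) is easy to make concrete; the window size and RTT values below are arbitrary examples:

```python
def throughput(cwnd_bytes: int, rtt_s: float) -> float:
    """Bytes per second when one window's worth of data is sent per round trip."""
    return cwnd_bytes / rtt_s

lan = throughput(65535, 0.010)   # 10 ms RTT
sat = throughput(65535, 0.600)   # 600 ms RTT, e.g. a satellite link
# Same congestion window, 60x less throughput on the high-RTT path -
# and the window also grows more slowly, since growth happens per RTT.
```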

P2P through NAT

NATs are primarily deployed to compensate for the dwindling number of public IPv4 addresses - well, and to further delay the rise of and replacement by IPv6?? - and basically deploy a private subnet from the perspective of an ISP towards a given area / region. This region is then only handled by the router spanning and maintaining this network ( plus some redundancy deployed too ), and these outgoing points have a real public address which all hosts within this net use to communicate with the world.

Traversing from "the internet" to the "internal network" requires modifying packets, because outside clients / systems are not aware of the IP address of the client they are talking to ( because the router translates from its outside address + a specific port to the internal address with a given port ).

This - as one might observe - kills the idea of P2P connections, because it's not possible for the clients to connect seamlessly without some machine in between ( translating the packets and such ).
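The translation the router performs can be sketched as a toy NAT binding table ( the addresses and port range are made up; real NATs differ in how they allocate and expire bindings ):

```python
import itertools

class Nat:
    """Maps (private ip, private port) to a fresh port on one public IP."""
    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self._ports = itertools.count(40000)   # next free public port
        self.out = {}    # (priv_ip, priv_port) -> public_port
        self.back = {}   # public_port -> (priv_ip, priv_port)

    def translate_out(self, priv_ip: str, priv_port: int):
        key = (priv_ip, priv_port)
        if key not in self.out:                # create binding on first use
            port = next(self._ports)
            self.out[key] = port
            self.back[port] = key
        return self.public_ip, self.out[key]

    def translate_in(self, public_port: int):
        return self.back.get(public_port)      # None -> no binding, drop

nat = Nat("203.0.113.7")
mapped = nat.translate_out("192.168.1.10", 5000)
```

An inbound packet for a port with no binding simply has nowhere to go - which is exactly what breaks unsolicited P2P connections.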

In come some ideas to traverse a NAT and allow P2P connections - with the necessary translations in mind! - like STUN or simple relaying servers.

STUN - Session Traversal Utilities for NAT

see the original RFC 5389 for more information.

As per definition by the linked RFC:

Session Traversal Utilities for NAT (STUN) is a protocol that serves as a tool for other protocols in dealing with Network Address Translator (NAT) traversal.

It can be used by an endpoint to determine the IP address and port allocated to it by a NAT.

It can also be used to check connectivity between two endpoints, and as a keep-alive protocol to maintain NAT bindings.

STUN works with many existing NATs, and does not require any special behavior from them.

STUN is not a NAT traversal solution by itself. Rather, it is a tool to be used in the context of a NAT traversal solution.

STUN - in its concept - is relatively simple: how is a connection through a NAT established with STUN? #card

We use the concept that we have two different hosts - different combinations are possible: one behind a NAT, the other not; both behind NATs; both behind multiple NATs ... - and further a server that is publicly reachable ( not behind a / any NAT! ).

We know that traversing from within the NAT to the internet will change the source IP of our sending host. The exchange then works roughly like this:

  1. Both hosts connect to the STUN server, which thereby learns each host's public address + port ( as seen in front of the NAT ) as well as its private address ( within the NAT ).
  2. We tell the STUN server who we want to talk to. The server sends the asking host the information needed to reach the desired host ( the private and the public address that were previously exchanged ), and likewise sends the desired client a message containing the host information of the requesting client.
  3. Both parties now try to establish a connection over both given addresses - the NAT'ed one and the public one ( leading to the NAT, basically ). One of the connections will come through - which one depends on the underlying structure - and thus both clients can communicate with each other directly, with one ( or multiple ) NATs in between translating IP packets.

Hole Punching ( with STUN)

There's a really good paper explaining all the ideas for achieving a good NAT-traversing P2P connection here: website link

I've also logged this website here to preserve it - and easily access it locally: 143.19_p2p_nat_udp_hole_punching

IP-Fragmentation:

The following section was mostly taken from the lab - hereby the origin is from the "Lehrstuhl Kommunikationsnetze Universität Tübingen":

Introduction:

Data link layers generally impose an upper bound on the length of a frame, and, thereby, on the length of an IP datagram that can be encapsulated in one frame. If the size of an IP datagram exceeds the maximum length a data link layer can transmit, the datagram has to be fragmented.

MTU

For each network interface, the Maximum Transmission Unit (MTU) specifies the maximum length of an IP datagram that can be transmitted over a given data link layer protocol.

For example Ethernet II and IEEE 802.3 networks have an MTU of 1500 bytes and 1492 bytes, respectively.

Some protocols allow the largest possible IP datagram size of 65535 bytes.

However, every IPv4 host must be able to accept datagrams of at least 576 bytes, which is why 576 bytes is commonly treated as the lower bound for the MTU.

If an IP datagram exceeds the MTU size, the IP datagram is fragmented into multiple IP datagrams, or, if the DF flag is set in the IP header, the IP datagram is discarded.

IP Fragmentation Basics

When an IP datagram is fragmented, its payload is split into multiple IP datagrams, each satisfying the limit imposed by the MTU. Each fragment is an independent IP datagram, and is routed in the network independently from the other fragments. Fragmentation can occur at the sending host or at an intermediate router. It is even possible that an IP datagram is fragmented multiple times, e.g., an IP datagram may be transmitted on a network with an MTU of 4000 bytes, then forwarded to a network with an MTU of 2000 bytes, and then to a network with an MTU of 1000 bytes.

Fragments are reassembled only at the destination hosts. If a host receives fragments of a larger IP datagram it holds the fragments until the original IP datagram has been fully restored.

Fragments do not have to be received in the correct order. The destination host can use the fragment offset field to place each fragment in the right position.

IP assumes that a fragment is lost if no new fragments have been received for a timeout period. If such a timeout occurs, all fragments of the original datagram that have been received so far are discarded.

Involved Header Fields

Fragmentation of IP datagrams involves the following fields in the IP header: total length, identification, DF and MF flags, and fragment offset. In the figure from the lab material, an IP datagram with a length of 2400 bytes is transmitted on a network with an MTU of 1000. We assume that the IP header of the datagram has the minimum size of 20 bytes. Since the DF flag is not set in the original IP datagram, the IP datagram is split into three fragments. All fragments are given the same identification as the original IP datagram. The destination host uses the identification field when reassembling the original IP datagram. The first and second fragments have the MF flag set, indicating to the destination host that there are more fragments to come. Without this flag, the receiver of fragments could not determine whether it has received the last fragment.

Fragment Size & Fragment Offset

To determine the size of the fragments we recall that, since there are only 13 bits available for the fragment offset, the offset is given as a multiple of eight bytes.

As a result, the first and second fragments have a total size of 996 bytes (and not 1000 bytes): a 20-byte header plus 976 bytes of payload.

The payload size of 976 bytes is chosen since it is the largest multiple of eight that fits within 1000 - 20 = 980 bytes. Bytes 0 through 975 of the original IP payload go into the first fragment, and bytes 976 through 1951 into the second fragment.

The payload of the third fragment has the remaining 428 bytes, from byte 1952 through 2379. With these considerations, we can determine the values of the fragment offset, which are $0$, $976/8 = 122$, and $1952/8 = 244$, respectively, for the first, second and third fragment.
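The worked example can be reproduced with a short helper ( written just for this example, not real kernel code ):

```python
def fragment(total_len: int, mtu: int, header: int = 20) -> list:
    """Split an IP datagram into fragments: (total length, offset, MF flag)."""
    payload = total_len - header
    step = (mtu - header) // 8 * 8   # per-fragment payload, multiple of 8
    frags, offset = [], 0
    while offset < payload:
        size = min(step, payload - offset)
        frags.append({"total_len": size + header,
                      "offset": offset // 8,           # in 8-byte units
                      "mf": offset + size < payload})  # more fragments?
        offset += size
    return frags

frags = fragment(2400, 1000)   # the example above: 996, 996 and 448 bytes total
```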

Drawbacks of Fragmentation

Even though IP fragmentation provides flexibility that can deal effectively with heterogeneity at the data link layer, and can hide this heterogeneity from the transport layer, it has considerable drawbacks.

For one, fragmentation involves significant processing overhead. Also, if a single fragment of an IP datagram is lost, the entire IP datagram needs to be retransmitted (by a transport protocol). To avoid fragmentation, TCP tries to set the maximum size of its segments to conform to the smallest MTU on the path.

Likewise, applications that send UDP datagrams often avoid fragmentation by limiting the size of UDP datagrams to 512 bytes, thereby ensuring that the IP datagrams are smaller than the minimum MTU of 576 bytes.

Note that in IPv6, routers never fragment datagrams; fragmentation may only be performed by the source host, and even that is generally discouraged.

Further Resources:

Things about NATs:

  • https://www.zerotier.com/blog/the-state-of-nat-traversal/