Okay. So now you've got your network size and you can route data between hosts via IP addresses. Now we need to move up the layer model from network host-to-host transport into the Session and Application layers. Now that we're moving data around, how does the computer know what to do with that data?
You've probably heard the term "packet" before and maybe you know that TCP/IP moves data around in the form of packets. So what is a packet and how does data get to be a packet?
TCP/IP is said to be a packet-switched networking method. The easiest way to understand this is to contrast it with the more traditional circuit-switched telephone network. In the phone network you pick up the phone and dial a number. When the phone on the other end is taken off-hook, that completes a circuit: a single point-to-point connection is established. If there is a break anywhere in that circuit, the call is lost. (I know this is an oversimplification.)
Packet-switched networks, however, first break the data into small chunks called packets. A header is attached to the packet containing routing information and the individual packets are sent out onto the network. By breaking up the data into packets, changes in routing can occur dynamically. Packets arrive on the other end of a communication and the data is reassembled by the receiving host computer. Packets lost during transmission will be retransmitted by the sender, possibly taking a different route to the receiver.
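The split-and-reassemble idea can be sketched in a few lines of Python. This is a toy model only: the function names `packetize` and `reassemble` are made up for illustration, and each chunk gets a simple counter as its sequence number, whereas real TCP numbers the bytes of the stream rather than the packets.

```python
import random


def packetize(data: bytes, size: int) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [
        (seq, data[offset:offset + size])
        for seq, offset in enumerate(range(0, len(data), size))
    ]


def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Put the chunks back in sequence order and join them."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _, chunk in ordered)


message = b"Packets may arrive in any order."
packets = packetize(message, 8)

# Simulate packets taking different routes and arriving out of order.
random.shuffle(packets)

restored = reassemble(packets)
```

Because every chunk carries its own sequence number, the receiver does not care which route a chunk took or when it arrived.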
TCP, Transmission Control Protocol, is responsible for keeping track of packet sequences in both the sending and receiving hosts. In other words, TCP maintains a connection session for the duration of a network transmission. This can consist of receiving an email, receiving a web page, etc. TCP is also responsible for handing those packets off to the appropriate application.
The information needed to route a packet is contained in the packet's IP header; TCP's session information (ports and sequence numbers) travels in a separate TCP header at the start of the data portion. Here's the basic construction of an IP packet:
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| IHL | TOS | Total Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identification |Flags| Fragment Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TTL | Protocol | Header Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options.... (Padding) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Data... +-+-+-+-+-+-+-+-+-+-+-+-+- |
As you can see from the diagram above, the header contains all the information needed to get the packet from its source to its destination, as well as details concerning what to do with the packet once it gets there. The numbers at the top are bit positions, so each row makes up a single 32-bit "word"; a header without options is five such words, or 20 bytes. The contents of each portion of the header are as follows: [1]
Version: Specifies the IP version of the packet. For the header shown here the value is 4 (IPv4); IPv6 packets use a different header layout.
Internet Header Length (IHL): Indicates the length of the datagram header in 32-bit (4 octet) words. A minimum-length header is 20 octets, which gives an IHL value of 5.
Type of Service (TOS): Allows an originating host to request different classes of service for packets it transmits. It's not generally supported today in IPv4.
Total Length: Indicates the length (in bytes, or octets) of the entire packet, including both header and data. Given the size of this field, the maximum size of an IP packet is 64 KB, or 65,535 bytes. In practice, packet sizes are limited to the maximum transmission unit (MTU).
Identification: Used when a packet is fragmented into smaller pieces while traversing the Internet, this identifier is assigned by the transmitting host so that different fragments arriving at the destination can be associated with each other for reassembly.
Flags: Also used for fragmentation and reassembly. The first bit is reserved (and always set to 0). The second bit is the Don't Fragment (DF) bit, which suppresses fragmentation. The third bit is the More Fragments (MF) bit: it is set on every fragment except the last, so a fragment with MF cleared tells the receiver where the packet ends so that it can be reassembled.
Fragment Offset: Indicates the position of this fragment in the original packet. In the first packet of a fragment stream, the offset will be 0; in subsequent fragments, this field indicates the offset in increments of 8 bytes.
Time-to-Live (TTL): A value from 0 to 255, indicating the number of hops that this packet is allowed to take before being discarded within the network. Every router that forwards this packet will decrement the TTL value by one; if it gets to 0, the packet will be discarded.
Protocol: Indicates the higher layer protocol contents of the data carried in the packet; options include ICMP (1), TCP (6), UDP (17), or OSPF (89).
Header Checksum: Carries information to ensure that the received IP header is error-free. Remember that IP provides an unreliable service and, therefore, this field only checks the header rather than the entire packet.
Source Address: IP address of the host sending the packet.
Destination Address: IP address of the host intended to receive the packet.
Options: A set of options which may be applied to any given packet, such as sender-specified source routing or security indication. The option list may use up to 40 bytes (10 words), and will be padded to a word boundary.
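To make the field list concrete, here is a minimal Python sketch that decodes the fixed 20-byte part of an IPv4 header with the standard `struct` module. The function names and the sample addresses are invented for illustration. The flag masks assume the RFC 791 bit order (reserved, then DF, then MF), and options are not decoded.

```python
import socket
import struct

HEADER_FORMAT = "!BBHHHBBH4s4s"  # network byte order, 20 bytes in total


def internet_checksum(header: bytes) -> int:
    """One's-complement of the one's-complement sum of the 16-bit words."""
    if len(header) % 2:
        header += b"\x00"
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold any carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(raw: bytes) -> dict:
    """Decode the first 20 bytes of an IPv4 header into named fields."""
    (ver_ihl, tos, total_length, ident, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack(HEADER_FORMAT, raw[:20])
    ihl = ver_ihl & 0x0F
    return {
        "version": ver_ihl >> 4,
        "header_bytes": ihl * 4,  # IHL counts 32-bit words
        "tos": tos,
        "total_length": total_length,
        "identification": ident,
        "dont_fragment": bool(flags_frag & 0x4000),
        "more_fragments": bool(flags_frag & 0x2000),
        "fragment_offset_bytes": (flags_frag & 0x1FFF) * 8,
        "ttl": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        "source": socket.inet_ntoa(src),
        "destination": socket.inet_ntoa(dst),
    }


def build_sample_header() -> bytes:
    """Build a made-up header: TCP, TTL 64, DF set, 40 bytes total length."""
    fields = [0x45, 0, 40, 0x1C46, 0x4000, 64, 6, 0,
              socket.inet_aton("192.168.0.1"),
              socket.inet_aton("192.168.0.199")]
    # The checksum is computed with the checksum field itself set to zero.
    fields[7] = internet_checksum(struct.pack(HEADER_FORMAT, *fields))
    return struct.pack(HEADER_FORMAT, *fields)


sample = build_sample_header()
parsed = parse_ipv4_header(sample)
```

A receiver can verify a header by running `internet_checksum` over all of its bytes, checksum field included; an undamaged header yields 0.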
The rules of TCP/IP also define a standard set of ports in use for differing types of connections. You may already know that webserver connections come into a machine on port 80. Here is a list of definitions of the most common ports.
Table 1-6. Common ports
Port # | Service | Description |
---|---|---|
20 | ftp-data | Data port for ftp connections |
21 | ftp | actual ftpd service runs on this port |
22 | ssh | Secure shell |
23 | telnet | Telnet |
25 | smtp | Mail servers (e.g. sendmail) run on this port |
53 | domain | Name server (e.g. bind) for DNS |
80 | www | Web server |
110 | pop3 | POP3 mail retrieval daemons |
119 | nntp | USENET news |
137 | netbios-ns | NETBIOS (Windows file sharing) name service |
138 | netbios-dgm | NETBIOS (Windows file sharing) datagram service |
139 | netbios-ssn | NETBIOS (Windows file sharing) session service |
143 | imap2 | IMAP mail retrieval |
443 | https | secure (SSL) web server |
Actually your Linux box provides even more information than the above list. The file /etc/services is a more comprehensive list of available services and the ports they utilize. And of course you can turn to RFC 1700, or the IANA port number registry that has since replaced it, for the canonical list of ports.
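Each non-comment line in /etc/services has the form `name port/protocol [aliases...]`, with `#` starting a comment. The following Python sketch reads text in that format into a lookup table. The function name `parse_services` and the sample lines are illustrative only, not a copy of any real system's file.

```python
def parse_services(text: str) -> dict[tuple[str, str], int]:
    """Map (service name or alias, protocol) to a port number."""
    table: dict[tuple[str, str], int] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        parts = line.split()
        if len(parts) < 2 or "/" not in parts[1]:
            continue  # not a "name port/protocol" entry
        port_text, protocol = parts[1].split("/", 1)
        if not port_text.isdigit():
            continue
        # The official name and every alias point at the same port.
        for name in [parts[0], *parts[2:]]:
            table[(name, protocol)] = int(port_text)
    return table


SAMPLE = """\
# name   port/protocol   aliases
ssh      22/tcp
smtp     25/tcp          mail
domain   53/udp
http     80/tcp          www   # WorldWideWeb HTTP
"""

services = parse_services(SAMPLE)
```

To try it against a real machine, pass in the contents of the file instead of `SAMPLE`, for example `parse_services(open("/etc/services").read())`. Python's standard library also offers `socket.getservbyname()` and `socket.getservbyport()` for the same lookups.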
As a Linux user you should also be aware of the concept of reserved or privileged ports. Ports below 1024 (that is, 1-1023) fall into this category. All this means is that a service listening on one of those ports must be started by root. Regular users can open ports from 1024 upward for their own use.
Your machine temporarily opens up ports all the time on your behalf. When you look at a web page on port 80 of a remote machine, your machine also opens up a port above 1024 temporarily. The number of this port is in your request packet for the web page. When the packets making up the page are returned to you, they are returned to the temporary port opened when you initially sent the request (i.e. clicked the link). After all, the page must come back to you via standard TCP/IP methods.
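You can watch a temporary port being assigned with a short Python sketch that connects a client to a server on the loopback interface. The names `is_privileged` and `ephemeral_port_demo` are made up for this example, and it treats ports below 1024 as privileged, which is the usual Linux default. So that it runs without root, the server also asks the operating system for a free port instead of using port 80.

```python
import socket


def is_privileged(port: int) -> bool:
    """True for ports that normally require root to listen on (1-1023)."""
    return 1 <= port <= 1023


def ephemeral_port_demo() -> tuple[int, int, int]:
    """Return (server port, client's temporary port, client port as seen by server)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0 asks the OS to choose a free port
        server.listen(1)
        server_port = server.getsockname()[1]

        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:
            # The client never picks a port itself; the OS assigns one here.
            client.connect(("127.0.0.1", server_port))
            connection, peer_address = server.accept()
            with connection:
                client_port = client.getsockname()[1]
                return server_port, client_port, peer_address[1]


server_port, client_port, port_seen_by_server = ephemeral_port_demo()
print(f"server listened on {server_port}, client used temporary port {client_port}")
```

The port the server sees on the incoming connection is the same temporary port the client was given, which is how replies find their way back to the right program.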
[1] | For a more detailed treatment (and the original source for this diagram and list) see this link. |