Application Layers

  • Network Applications

    • Client-Server Model
    • Peer-to-Peer (P2P) Model
  • FTP

  • P2P Applications

    • P2P File Distribution
    • DHT (Distributed Hash Table)
  • The application layer provides services to the user.

    • The application layer of a web browser enable a user surfing websites.
  • Communication is provided using a logical connection

    • which means that the two application layers assume that there is an imaginary direct connection through which they can send and receive messages.

Processes Communicating

  • Process is a program running within a host
    • Within same host, two processes communicate using inter-process communication (defined by OS)
    • Processes in different hosts communicate by exchanging messages
  • Client-Server model
    • Client process: process that initiates communication
    • Server process: process that waits to be contacted
  • Applications with P2P architectures have client processes & server processes

Application Layer Protocol

  • Rules for when and how processes send & respond to messages

What transport service does an application need?

  • Data integrity
    • some apps (e.g., file transfer, web transactions) require 100% reliable data transfer
      • TCP for reliable transfer
    • other apps (e.g., audio) can tolerate some loss
      • UDP
  • Timing
    • some apps (e.g., Internet telephony, interactive games) require low delay to be “effective”
      • UDP for real-time
  • Throughput
    • some apps (e.g., multimedia) require minimum amount of throughput to be “effective”
    • other apps (“elastic apps”) make use of whatever throughput they get
  • Security
    • encryption, data integrity, …

Some examples:

Requirements:

img

Application:

img

Internet Transport Protocols Services

  • TCP service:
    • reliable transport between sending and receiving process
    • flow control: sender won’t overwhelm receiver
    • congestion control: throttle sender when network overloaded
    • does not provide: timing, minimum throughput guarantee, security
    • connection-oriented: setup required between client and server processes
  • UDP service:
    • unreliable data transfer between sending and receiving process
    • does not provide: reliability, flow control, congestion control, timing, throughput guarantee, security, orconnection setup,

Application-Layer Model

  • To use the Internet we need two application programs to interact with each other.
  • The two programs need to send messages to each other through the Internet infrastructure.
  • Should both application programs be able to request services and provide services, or should the application programs just do one or the other?
    • Client-Server model
      • Consists of two different processes
        • Server process
          • Runs continuously
          • Waiting for the client process to make a connection through the Internet and ask for service
        • Client process
          • Started when the client needs to receive service
          • Ask for service from the server process
    • Peer-to-Peer model
      • No always-on server
      • Arbitrary end systems directly communicate (run as both client and server)
      • Peers request service from other peers, provide service in return to other peers
        • Self scalability – new peers bring new service capacity, as well as new service demands
      • Peers are intermittently connected and change IP addresses
        • Complex management

Client-Server Model

  • Client-server paradigm uses the direction of initiation to categorize whether a program is a client or server.
  • An application that initiates the communication is called a client
    • E.g., Web browser, FTP client, Email client
    • Usually, the client contacts a server, sends a request, and awaits a response
  • An application that waits for incoming communication requests from clients is called a server
    • E.g., Web server, FTP server, SMTP server
    • The server receives a client’s request, performs the necessary job, and returns the result to the client
    • Remark: A server is usually designed to provide service to multiple clients
      • E.g., a web server can be accessed by many web browsers simultaneously
  • 2-way handshake

img

API for Internet - Socket

Application Programming Interface (API) for Internet

  • A computer language has a set of instructions for mathematical operations, a set of instructions for string manipulation, a set of instructions for input/output access, and so on.
  • If we need a process to be able to communicate with another process, we need a new set of instructions to tell the lowest four layers of the TCP/IP suite to open the connection, send and receive data from the other end, and close the connection.
  • A set of instructions of this kind is normally referred to as Application Programming Interface (API).

The API for Internet, we call it Socket Interface.

  • The interface is a set of instructions between two entities:
    • The process at the application layer
    • and The operating system (OS)
      • Encapsulates the first four layers of the TCP/IP
    • Socket interface is the most common one.
img

Sockets

  • Process sends/receives messages to/from its socket
  • Socket analogous to door
    • sending process shoves message out door
    • sending process relies on transport infrastructure on other side of door to deliver message to socket at receiving process

img

  • “Socket” is an interface between applications and the network services provided by OS
  • An application can send and receive data through a local socket

Use of sockets in process-to-process communication

  • Communication between a client process and server process can be considered as communication between sockets.
    • The client thinks that the socket is the entity that receives the request and gives the response.
    • The server thinks the socket is the one that has a request and needs the response.

img

Socket address

  • In the two-way communication between the sockets, we need a pair of socket addresses
    • local socket address and remote socket address
  • To define the computer on which a client or a server is running, we need the IP address.
  • However, several client or server processes may be running at the same time on a computer. To specify a client or a server process, we need the port number.
  • Socket = IP Address + Port number
img

Client-Server Applications

  • WWW and HTTP
  • File Transfer Protocol (FTP)
  • Electronic Mail – SMTP, POP3 and IMAP4
  • TELNET
  • SSH
  • DNS

We will look at the File Transfer Protocol (FTP).

FTP: File Transfer Protocol

  • Transfer file to/from remote host
    • Client*:* side that initiates transfer (either to/from remote)
    • Server: remote host
  • ftp: RFC 959
  • ftp server: port 21 (Control)
  • ftp data: port 20

img

FTP Connections

There are two different connections for the FTP.

The two FTP connections, control and data, use different strategies and different port numbers.

  • Control connection
  • Data connection

img

Control Connections

  • The control connection is created in two steps:
    • The server issues a passive open on the well-known port 21 and waits for a client.
    • The client uses an ephemeral port and issues an active open.
img

Once setup, the data connection is established.

Data Connections

  • Exchange files through the data channel
img

Communication over Control Connection

  • FTP uses the NVT ASCII character set
  • Communication is achieved through commands and responses.
img

Communication over Data Connection

  • To transfer files through the data connection, the client must define the type of file to be transferred, the structure of the data, and the transmission mode.
  • Before sending the file through the data connection, we prepare for transmission through the control connection.
img

Examples of FTP: File Transfer Protocol

  • Example1: Using FTP for retrieving a list of items in a directory
img
  • Example2: Using FTP for storing an image (binary)
img

FTP Commands

img img

FTP Responses

img img

Peer-to-Peer Model

  • Abbreviated as P2P

  • No need for a server process running all the time and waiting for the client processes to connect

    • The responsibility is shared among peers.
  • A computer connected to the Internet can provide service at one time and receive service at another time

  • A computer provide and receive services at the same time

img

Peer-to-peer gained popularity with Napster (1999-2001), an online music file. Napster paved the way for peer-to-peer file- distribution models that came later.

Gnutella (2000): First decentralized peer-to-peer network

Then, Fast-Track, BitTorrent, WinMX, and GNUnet.

P2P Networks

Internet users that are ready to share their resources become peers and form a network.

  • When a peer (A) in the network has a file to share, it makes it available to the rest of the peers.
    • An interested peer (B) can connect itself to the computer (A) where the file is stored and download it.
    • After a peer (B) downloads a file, it can make it available for other peers (C & D) to download.
    • As more peers join and download that file, more copies of the file become available to the group.
img

Problem:

  • How to keep track of peers and the location of the files?
    • List of peers may grow and shrink

To keep track of peers, we have multiple solutions:

  • Centralized Networks
  • Decentralized Network
    • Unstructured Networks
    • Structured Networks

Centralized network

  • The directory – listing of the peers and what they offer – uses the client-server model.
  • Storing and downloading of the files are done using the peer-to- peer model.

Example: Napster (music sharing)

  • A Peer first registers itself with a central server
  • The Peer provides its IP address and a list of files it has to share.
  • A peer, looking for a particular file, send a query to the central server
  • The server responds with the IP addresses of nodes that have a copy of the file.
  • The peer contacts one of the nodes and downloads the file.
  • Peer Once join the network, registers itself with a central server
  • Server provides a list of files available
  • Then Peer directly connect to other peer who have the file to down the file

Directory System

  • Client-server model
img

File Sharing

  • Peer-to-peer model
img

Limitations of Centralized network

  • As the directory (location of files) is stored in a few center servers
  • Hugh traffic at the central servers
  • The central servers are vulnerable to attack
    • Once server down, whole network is not available
  • The server providing the directory may infringe the copyright of the shared files.
    • Napster lost the copyright lawsuit and closed in 2001.

So people switch to Decentralized Network.

Decentralized Network

  • For decentralized network, peers arrange themselves into an overlay network (a logical connection, not physical connection).
    • We have unstructured and structured decentralized networks.

Unstructured Decentralized Network

  • For unstructured decentralized network, the nodes are linked randomly.
    • To find a file, a query must be flooded through the network. It produces significant traffic.

Example: Gnutella

  • a P2P network
  • The directory is randomly distributed between nodes.
img
  • When node A wants to access a file, it contacts one of its neighbors W by sending a query message.
    • The neighbor is any node whose address is known to node A.
    • The query includes the identity of the object (e.g. file name)
  • If node W knows the address of node X, which has the file, it sends a response message back to node A.
    • The response message includes the address of node X.
  • Now A can use get the file from node X using standard protocol such as HTTP.
  • If node W does not know the address of X, it floods the request from A to all its neighbors. Eventually one of the nodes in the network responds to the query message, and node A can get access to node X

Not efficient enough. So people use Structured Decentralized Network.

Structured Decentralized Network

  • Uses a predefined set of rules to link nodes
    • A query can be effectively and efficiently resolved.
    • Most common technique is Distributed Hash Table (DHT).
    • Example: BitTorrent
img

Distributed Hash Table (DHT)

  • distributes data among a set of nodes according to some predefined rules.
  • Each peer in a DHT-based network becomes responsible for a range of data items.
  • To avoid the flooding overhead for unstructured P2P networks, DHT- based networks allow each peer to have a partial knowledge about the whole network.
    • This knowledge can be used to route the queries about the data items to the responsible nodes using effective and scalable procedures.

Quick view:

  • Address Space - Depends on how many bits (mm bits) of the Hash value ( Address Space = 2m2^m)
    • In practice, m=160m = 160
  • Each peer, it’s IP address will through a hash function to form an address (Hashing Peer Identifier)
  • All the address are arranaged in acsending order (0 to 2m12^m - 1)
  • The Content name also through a same hash function to a Hashing Object Identifier
    • We call it a key
img

Address Space

  • Size: 2m2^m
    • Most of the implementation use m = 160
    • Address space is designed using modular arithmetic, i.e., the next number of 2m12^m- 1 is 0.
  • Each data item and the peer is mapped to a point
img

Hash Peer Identifier

  • The peer identifier is hashed into the address space.
    • The result is called node ID
    • Normally, the peer identifier is its IP address
  • node ID = hash (Peer IP address)
    • e.g. : 5 = hash (110.24.56.20)

Hashing Object Identifier

  • The name of the object (e.g. a file) is hashed into the address space.
    • The result is called a key
  • key = hash (Object name)
    • e.g.: 14 = hash (“Liberty”)

Storing the Object

  • Two strategies
    • Direct method
      • The object is stored in the node whose ID is closest to the key in the ring
    • Indirect method
      • Most DHT system used.
      • The peer that owns the object keeps the object.
      • A reference is of the object stored in the node whose ID is closet to the key point.

Indirect method Example

  • Although the normal value of m is 160, for the purpose of demonstration, we use m = 5.
  • Peer and Object Identifier
    • The node N5 with IP address 110.34.56.20 has a file named Liberty that wants to share with its peers. The node makes a hash of the file name, “Liberty,” to get the key = 14.
  • Storing the object
    • Since the closest node to key 14 is node N17, N5 creates a reference to file name (key), its IP address, and the port number (and possibly some other information about the file) and sends this reference to be stored in node N17.
img

Routing

The main function of DHT is to route a query to the node which is responsible for storing the reference to the object.

  • Each node needs to have a partial knowledge about the ring
    • For example, N5 knows its successor node is N10 and its predecessor node is N2 in the previous example.

Arrival and Departure of Nodes

  • Each peer can be a desktop or laptop computer
    • When a computer launches the DHT software, it join the network; when the computer is turned off or the peer closes the software, it leaves the network.
  • A DHT implementation needs to have a clear and efficient strategy to handle arrival or departure of nodes and the effect of this on the rest of the peers.
  • Most DHT implementations treat the failure of a node as a departure.

Chord Protocol

  • There are several protocols that implement DHT systems. In this section, we introduce the Chord protocol for its simplicity and elegant approach to routing queries.

Chord was published by Stoica et al in 2001.

Other protocols: Pastry, Kademlia

Chord - Finger Table

  • Each node has a routing table
  • Each node knows about mm successor nodes and one predecessor node.

img

  • For example, for node N=5,
    • it knows the successor of target key 6, 7, 9, 13, 21
      • 5+20,5+21,5+22,5+23,5+245+2^0 , 5+2^1, 5+2^2, 5+2^3, 5+2^4
      • Select the closest successor that exist
      • Note Address space is designed using modular arithmetic, i.e., the next number of 2m12^m- 1 is 0.

Example: An example of a ring in Chord

  • Get the successor of target key of the Node
  • Note Address space is designed using modular arithmetic, i.e., the next number of 2m12^m- 1 is 0.
  • Get the key values
  • Then the entry equals to the successor node of that corrsponding key.

img

Lookup

  • Given a key, we need to determine the node that is responsible the object.
  • It is equivalent to find the successor of the key
img img img

Step 1-3

  • Node N5 is not the successor of key 14
  • N5 finds the closest predecessor of the key 14
  • From its own routing table, the node is N10

Step 4-7

  • N5 Contacts node N10
  • Node N10 is not the successor of key 14
  • N10 finds the closest predecessor of the key 14
  • Form its own routing table, the node is N12

Step 6-8

  • N5 contacts N12
  • N12 is not the successor of key 14
  • But N12 is the predecessor of the key 14
  • N12 returns its successor N20 to N5

Finally, the Node 5 can contact N20 and get the file.

Stablize the P2P network

  • In a P2P network, there are new nodes joining the network or nodes leaving the network.
  • To stabilize the network, each node periodically uses this operation “Stabilize” to validate about its successor and let the successor validate its information about its predecessor.
  • Suppose the successor shown in the finger table of Node N is Node S
  • However, there is a new Node P between Node N and Node S. Therefore, the finger tables need to be stabilized.

Original : N - - - - S

New: N - - - - P - - - - S

  • Node N uses the value of finger[1] (first row of the finger table), S, to ask node S to return its predecessor, P.
  • If the return value, P, from this query, is between N and S, this means that there is a node with ID equals P that lies between N and S. Then node N makes P its successor and notifies P to make node N its predecessor.
img

Maintain Table Update: Fix_Finger

  • Each node N in the ring must periodically call this function “Fix_Finger” to maintain its finger table update.
  • The procedure is equivalent to finding the successor of N+2i1N+2^{i-1}.
    • The procedure is similar to the “Look_up” function
  • To avoid heavy traffic, each node update of its finger in each call.

Join

  • When a peer join the ring,
  • it uses the “join” operation and known ID of another peer to find its successor and set its predecessor to null.
  • Calls the “stabilize” function to validate its successor.
  • Ask the successor to call the “move-key” function that transfer that the new peer is responsible for.

Example:

In this example, node 17 join the previous ring with the help of N5.

  • N17 set its predecessor to null and its successor (finger[1]) to N20.
    • From the finger table of N5, the successor of key 13 is N20 and the successor of key 21 is N25 so that the finger[1] of N17 is N20.
img
  • N17 then asks N20 to send k14 and k16 to N17 because N17 is now responsible for these keys.
  • In the next time-out, N17 uses “stabilize” operation to validate its own successor.
    • In this example, N17’s successor is N20. Then, N20 change its predecessor to N17.
img
  • When N12 uses “stabilize”, the predecessor of N17 updated to N12.
img
  • Finally, when some nodes use “fix-finger” function, the finger table of nodes N17, N10, N5 and N20 is changed.

img

To summarize:

  • At the very beginning, When a node join the network, it will locate the succssor first
  • Once it locate the succussor, and after stabilize, it will update the predecessor so So the succssor will know
  • After contacting the succussor to pass the key closer the 17, it use the Fix_Finger to find out the values of the remaining table entries
  • Fill up the finger table
  • Other node will also update

Leave or fail

  • If a peer leaves the rings or the peer fails, the operation of the ring will be disrupted unless the ring stabilizes itself.
  • In addition to the “stabilize” and “fix-finger” operations which are called periodically, each node exchanges “ping and pong” messages with neighbors to find out if they are alive.
    • Each node periodically contact the predecessor and succussor
    • If no response after sending a “ping” message to its neighbor, the node knows that the neighbor is dead.
  • As the data and references managed by the node is no longer available, Chord requires that data and references be duplicated on other nodes in this case

Example:

Assume N10 leaves the ring

img
  • Assume N5 finds out about N10’s departure when it does not receive a pong message to its ping message. N5 changes its successor ( finger[1] ) to N17.
  • After several “stabilize” and “fix-finger” calls, the ring is stabilized.

P2P File Distribution: BitTorrent

  • File divided into 256Kb chunks
  • Peers in torrent send/receive file chunks

img