Outline
- Exercise 1
- connections and sockets
- Sockets API
- HTTP and HTML
Exercise 1
- telnet is
- a program for logging into other computers
- a generic client program useful for debugging servers
- sometimes we wonder: are we connected to this host?
Ping tells us if we are
- if we are not, traceroute tells us where the problem is (usually,
local or remote, sometimes in the backbone)
- study the individual system calls or library
functions in the sockets API (using "man" or the web) --
you MUST understand what they do
- be sure to understand the flow of the entire program, and also
the corresponding programs in the book (Section 2.7), which are in Java
Connection-Oriented Service
- TCP is an internet protocol that provides Connection-Oriented Service:
a connection must be established (the client calls connect, the server
calls accept) before any data can be sent or received
- any connection consumes resources, and must be closed after
it is no longer needed
- as long as the connection is up, data can be sent on it and received
from it without having to explicitly specify the peer
- a connection-oriented communication is analogous to a telephone call
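The client side of these points can be sketched as follows. This is a minimal sketch, assuming IPv4 and a dotted-decimal address; the helper name tcp_connect is hypothetical and not part of the sockets API.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a TCP connection to a dotted-decimal IPv4
 * address and port. Returns a connected socket, or -1 on failure. */
int tcp_connect(const char *ip, unsigned short port)
{
    struct sockaddr_in peer;
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* 0: default protocol (TCP) */

    if (fd < 0)
        return -1;
    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);                /* network byte order */
    if (inet_pton(AF_INET, ip, &peer.sin_addr) != 1 ||
        connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
        close(fd);                              /* release the resources */
        return -1;
    }
    /* from here on, read/write need no peer address */
    return fd;
}
```

The caller is expected to close the returned descriptor once the connection is no longer needed.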
Connectionless Service
- UDP is an internet protocol that provides Connectionless Service:
data can be sent and received without establishing connections
- a destination address must be provided each time data is sent
- a single (unconnected) socket can be used to receive from multiple peers
- a connectionless communication is analogous to a letter (email
or snail-mail)
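A minimal sketch of connectionless sending, assuming IPv4 on the loopback interface. The helper names udp_bound_socket and udp_send_to are hypothetical. The destination is supplied on every send, and one bound socket can receive from any number of senders.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a UDP socket bound to 127.0.0.1 on a port
 * chosen by the OS, and store that port (host byte order) in *port. */
int udp_bound_socket(unsigned short *port)
{
    struct sockaddr_in me;
    socklen_t len = sizeof me;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return -1;
    memset(&me, 0, sizeof me);
    me.sin_family = AF_INET;
    me.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    me.sin_port = 0;                            /* let the OS choose */
    if (bind(fd, (struct sockaddr *)&me, sizeof me) < 0 ||
        getsockname(fd, (struct sockaddr *)&me, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(me.sin_port);
    return fd;
}

/* Hypothetical helper: send one datagram to 127.0.0.1:port.
 * No connection exists, so the address is given on each call. */
ssize_t udp_send_to(int fd, unsigned short port, const void *msg, size_t len)
{
    struct sockaddr_in to;

    memset(&to, 0, sizeof to);
    to.sin_family = AF_INET;
    to.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    to.sin_port = htons(port);
    return sendto(fd, msg, len, 0, (struct sockaddr *)&to, sizeof to);
}
```

A receiver would call recvfrom on its bound socket; the from argument then reports which peer sent each datagram.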
Sockets
- when we write to a file, we call open,
specifying the file name, and in return get a file descriptor.
Note that many students are only familiar with fopen,
which returns a FILE *. In contrast, open returns a
file descriptor, which is a small integer that tells the OS
what file you are accessing. For example, file descriptor 0 is
stdin, 1 is stdout, and 2 is stderr
- when we connect to a peer, we perform several system calls,
specifying the peer address and the protocol to use, and in return
we get a socket
- the socket is a special kind of file descriptor: all the operations
that can be performed on a file descriptor (especially read
and write) can be performed on a socket, but some special
operations can only be performed on sockets
Unix Sockets API -- principles
- a socket is an endpoint of communication -- we need two
sockets to communicate
- a socket pair is uniquely identified by:
- protocol: TCP or UDP
- two IP addresses (for example 128.171.10.123), one for each socket
- two port numbers (for example 1234), one for each socket
- when we first create a socket, it has only the first of these attributes
(protocol)
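One way to observe the last point is to ask a socket for its local address with getsockname. A minimal sketch for IPv4 sockets follows; the helper name local_port is hypothetical. On the systems I am aware of, a freshly created socket reports port 0 until bind or connect assigns one.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the local port (host byte order) of an
 * IPv4 socket, 0 if no port has been assigned yet, or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in me;
    socklen_t len = sizeof me;

    if (getsockname(fd, (struct sockaddr *)&me, &len) < 0)
        return -1;
    return ntohs(me.sin_port);
}
```

The peer's half of the pair can be queried the same way with getpeername once the socket is connected.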
Unix Sockets API -- Managing Connections
/* create a socket of a given type/protocol */
int socket(int domain, int type, int protocol);
/* bind a socket to a given local address (usually default) and port number */
/* mostly used by servers to specify the port on which to accept connections */
int bind(int sockfd, struct sockaddr *my_addr, socklen_t addrlen);
/* specify this is a server and how many connections may be pending */
int listen(int s, int backlog);
/* wait for an incoming connection, and return a new socket descriptor */
/* after the connection is established */
int accept(int s, struct sockaddr *addr, socklen_t *addrlen);
/* create a (client) connection to the given IP address and port number */
/* the local IP address and port number are selected automatically, */
/* unless bind was called first (very unusual) */
int connect(int sockfd, struct sockaddr *serv_addr, socklen_t addrlen);
/* close (or half-close, with shutdown) a connection */
int shutdown(int s, int how);
int close(int fd);
/* get the name of the local host */
int gethostname(char *name, int len);
/* get one or more IP addresses for the given host name */
struct hostent *gethostbyname(const char *n);
struct hostent {
char *h_name; /* official name of host */
char **h_aliases; /* alias list */
int h_addrtype; /* host address type */
int h_length; /* length of address */
char **h_addr_list; /* list of addresses */
};
/* get the protocol number for the given protocol */
/* returns 6 for TCP, 17 (0x11) for UDP */
struct protoent *getprotobyname(const char *n);
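The server-side calls above are typically combined in the order socket, bind, listen, accept. A minimal sketch, assuming IPv4; the helper name tcp_listen is hypothetical, and the SO_REUSEADDR option is a common convenience rather than something these notes require.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on the given port
 * (0 lets the OS pick one) on all local addresses.
 * Returns the listening socket, or -1 on failure. */
int tcp_listen(unsigned short port, int backlog)
{
    struct sockaddr_in me;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* allow the server to be restarted quickly on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    memset(&me, 0, sizeof me);
    me.sin_family = AF_INET;
    me.sin_addr.s_addr = htonl(INADDR_ANY);     /* the default local address */
    me.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&me, sizeof me) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server would then call accept(fd, NULL, NULL) in a loop; each call returns a new descriptor for one established connection, while fd keeps listening.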
Unix Sockets API -- Sending and Receiving data
/* send the data in the buffer pointed to by msg with len bytes */
int send(int s, const void *msg, int len, int flags);
/* send the data on an un-connected socket */
int sendto(int s, const void *msg, int len, unsigned int flags,
const struct sockaddr *to, socklen_t tolen);
/* send the data on any file descriptor, including connected sockets */
int write(int fd, const void *buf, int count);
/* receive up to len bytes of data from the connected socket */
int recv(int s, void *buf, int len, int flags);
/* receive up to len bytes of data from the un-connected socket, */
/* recording the address of the sender. */
int recvfrom(int s, void *buf, int len, unsigned int flags,
struct sockaddr *from, socklen_t *fromlen);
/* read or receive the data on any file descriptor, including connected sockets */
int read(int fd, void *buf, int count);
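The return value of send and write is the number of bytes actually accepted, which may be less than the number requested. A common pattern is to loop until everything has been sent. The sketch below assumes a blocking descriptor; the helper name write_all is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call write repeatedly until all len bytes have
 * been sent. Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        p += n;                     /* skip what has been sent */
        len -= (size_t)n;
    }
    return 0;
}
```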
Windows Sockets API
/* call before making any socket-related call */
int WSAStartup(int version,
WSADATA *implementation);
/* call after the last socket-related call */
int WSACleanup();
- call before and after using sockets API
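- version is a 16-bit value with the major version in the low byte and the
minor version in the high byte, e.g. 0x0202 (MAKEWORD(2, 2)) for version 2.2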
- cannot use read or write, use send and
recv instead
- use closesocket instead of close
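One way to hide these differences is a small portability layer. The sketch below is an assumption about how such a layer might look, not something these notes prescribe; the names socket_t, net_init, net_cleanup, and net_close are hypothetical.

```c
#ifdef _WIN32
#include <winsock2.h>
typedef SOCKET socket_t;
#else
#include <sys/socket.h>
#include <unistd.h>
typedef int socket_t;
#endif

/* Call once before any socket-related call. Returns 0 on success. */
int net_init(void)
{
#ifdef _WIN32
    WSADATA implementation;
    return WSAStartup(MAKEWORD(2, 2), &implementation);  /* request 2.2 */
#else
    return 0;                       /* nothing to do on Unix */
#endif
}

/* Call once after the last socket-related call. */
void net_cleanup(void)
{
#ifdef _WIN32
    WSACleanup();
#endif
}

/* Close a socket with the call appropriate for the platform. */
int net_close(socket_t s)
{
#ifdef _WIN32
    return closesocket(s);
#else
    return close(s);
#endif
}
```

Code that restricts itself to send and recv, rather than read and write, can then be compiled on both platforms.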
Socket types
- stream sockets:
- always TCP (for Internet-domain sockets)
- conceptually send individual bytes (grouped into buffers, but grouping
is not preserved), in order
- always reliable: all bytes sent are received, or the
connection is lost
- grouping/chunking may be different at sender and receiver
- datagram sockets:
- always UDP (for Internet-domain sockets)
- conceptually send individual packets -- grouping/chunking is the same at
the two endpoints
- may not be reliable: packets may be lost, or reordered (not corrupted)
- packets will be truncated if the receive buffer is too small
- raw sockets, Unix sockets, etc.
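The difference in grouping can be observed locally. The sketch below uses socketpair with AF_UNIX sockets as a stand-in for TCP and UDP so that it needs no network; that substitution is an assumption made for illustration, and the function name first_recv_size is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send "abc" and then "def" through a local socket pair of the given
 * type (SOCK_STREAM or SOCK_DGRAM) and return how many bytes the first
 * recv call delivers when given room bytes of buffer (at most 16).
 * Returns -1 on error. */
ssize_t first_recv_size(int type, size_t room)
{
    int sv[2];
    char buf[16];
    ssize_t n = -1;

    if (room > sizeof buf)
        room = sizeof buf;
    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;
    if (send(sv[0], "abc", 3, 0) == 3 && send(sv[0], "def", 3, 0) == 3)
        n = recv(sv[1], buf, room, 0);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_DGRAM the first recv delivers exactly one 3-byte message, or only 2 bytes if room is 2, with the third byte discarded. With SOCK_STREAM the first recv may deliver anywhere from 1 to 6 bytes, since the grouping of the two sends is not preserved.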
Reading a STREAM Socket
/* create a buffer and declare some variables */
char buffer [BUFFER_SIZE];
int num_bytes, new_bytes;
/* the connected socket; not named "socket", which would hide the socket() call */
int sockfd;
/* we have not yet received anything */
num_bytes = 0;
do {
    /* receive as much as the socket can give us now */
    /* note the buffer argument begins AFTER all the already-received data */
    /* and the buffer size is correspondingly smaller */
    new_bytes = read (sockfd, buffer + num_bytes, BUFFER_SIZE - num_bytes);
    /* stop on error (-1) or when the peer has closed the connection (0) */
    if (new_bytes <= 0)
        break;
    /* the byte count reflects the new bytes we have received */
    num_bytes += new_bytes;
    /* receive more if we have room, AND if what we received is not complete */
    /* deciding if what we received is complete is application-dependent */
} while ((num_bytes < BUFFER_SIZE) && expecting_more (buffer, num_bytes));
- cannot expect all our data to be received with
one read call
- even if it was sent in a single send call!
- if we receive more data than we expected, we must save it for the
next iteration (not shown in the code)
- in-class exercise (work with your neighbors): put this code
into a generic function that can be called to read data into a given buffer
- in-class exercise (work with your neighbors): write the
expecting_more function to detect the sequence CRLF CRLF
(\r\n\r\n)
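One possible answer to the two exercises above is sketched here; other designs are equally valid. The function name read_until_complete is hypothetical, and the sketch does not save any bytes that arrive after the terminator.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 while the sequence CRLF CRLF has not yet appeared in the
 * first num_bytes bytes of buffer, 0 once it has. */
int expecting_more(const char *buffer, int num_bytes)
{
    int i;

    for (i = 0; i + 4 <= num_bytes; i++)
        if (memcmp(buffer + i, "\r\n\r\n", 4) == 0)
            return 0;
    return 1;
}

/* Hypothetical generic function: read from fd into buffer until the
 * data is complete, the buffer is full, or the peer closes.
 * Returns the number of bytes stored, or -1 on a read error. */
int read_until_complete(int fd, char *buffer, int size)
{
    int num_bytes = 0;

    while (num_bytes < size && expecting_more(buffer, num_bytes)) {
        ssize_t new_bytes = read(fd, buffer + num_bytes,
                                 (size_t)(size - num_bytes));
        if (new_bytes < 0)
            return -1;
        if (new_bytes == 0)
            break;                  /* the peer closed the connection */
        num_bytes += (int)new_bytes;
    }
    return num_bytes;
}
```

Rescanning the whole buffer on every call is simple but quadratic in the worst case; remembering where the previous scan stopped would avoid that.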
HTTP
- HyperText Transfer Protocol
- All the HTTP header fields are encoded using ASCII:
- a header is a sequence of lines, each terminated by \r\n
- the header is terminated by an empty line (\r\n\r\n)
- every line but the first has the form: field: value
- the first line is special:
- GET /a/b/c/d HTTP/1.0: the request method, followed by the
identification of the requested object and the protocol version
- HTTP/1.1 200 OK: the protocol used in the reply,
followed by the result code and the result text
- in-class exercise (work with your neighbors): use the code
above to write a function that reads an HTTP header.
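The two forms of the first line can be produced and parsed with the standard formatted I/O functions. A minimal sketch follows; the helper names build_get_request and parse_status_code are hypothetical, and the Host field is included only as an example of a "field: value" line.

```c
#include <stdio.h>

/* Hypothetical helper: format a minimal HTTP/1.0 GET request header
 * into out. Returns its length, or -1 if it does not fit in size bytes. */
int build_get_request(char *out, size_t size,
                      const char *host, const char *path)
{
    int n = snprintf(out, size,
                     "GET %s HTTP/1.0\r\n"      /* the special first line */
                     "Host: %s\r\n"             /* a field: value line */
                     "\r\n",                    /* empty line ends the header */
                     path, host);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Hypothetical helper: extract the result code from the first line of a
 * reply such as "HTTP/1.1 200 OK". Returns -1 if the line does not have
 * that form. */
int parse_status_code(const char *line)
{
    int major, minor, code;

    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return -1;
    return code;
}
```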
HTTP Client
- parse the URL/URI
- if it is an HTTP URL, connect to the corresponding server
- send a request header, possibly with a body (in case of POST)
- read the reply, in a loop, until the connection is closed
by the server (read returns 0 bytes) -- or until we have
finished reading the number of bytes specified by the header (if the
header specified a byte count)
- parse and display the data (perhaps while receiving additional data),
checking the status code
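Two of the client steps above, parsing the URL and finding the byte count in the reply header, can be sketched as follows. This is a simplified sketch: it assumes plain "http://host[:port][/path]" URLs (no IPv6 literals or user info) and a NUL-terminated header. The names struct http_url, parse_http_url, and content_length are hypothetical.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>

struct http_url {
    char host[256];
    int  port;
    char path[1024];
};

/* Split "http://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 if the URL is not of that form. */
int parse_http_url(const char *url, struct http_url *out)
{
    const char *p, *slash, *colon, *q;
    size_t hostlen;

    if (strncmp(url, "http://", 7) != 0)
        return -1;
    p = url + 7;
    slash = strchr(p, '/');
    if (slash == NULL)
        slash = p + strlen(p);              /* no path given */
    colon = memchr(p, ':', (size_t)(slash - p));
    hostlen = (size_t)((colon ? colon : slash) - p);
    if (hostlen == 0 || hostlen >= sizeof out->host)
        return -1;
    memcpy(out->host, p, hostlen);
    out->host[hostlen] = '\0';

    out->port = 80;                         /* default HTTP port */
    if (colon != NULL) {
        int port = 0;
        if (colon + 1 == slash)
            return -1;                      /* "host:" with no digits */
        for (q = colon + 1; q < slash; q++) {
            if (*q < '0' || *q > '9')
                return -1;
            port = port * 10 + (*q - '0');
            if (port > 65535)
                return -1;
        }
        out->port = port;
    }
    if (strlen(slash) >= sizeof out->path)
        return -1;
    strcpy(out->path, *slash ? slash : "/");
    return 0;
}

/* Return the value of the Content-Length field in a NUL-terminated
 * header, or -1 if the header does not specify a byte count. */
long content_length(const char *header)
{
    const char *line = header;

    while (line != NULL && *line != '\0') {
        if (strncasecmp(line, "Content-Length:", 15) == 0)
            return strtol(line + 15, NULL, 10);
        line = strstr(line, "\r\n");        /* advance to the next line */
        if (line != NULL)
            line += 2;
    }
    return -1;
}
```

When content_length returns -1, the client falls back to reading until the server closes the connection.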
HTTP server
- listen for a connection
- read a request header, possibly followed by a request body
(the request header must uniquely determine whether there is a
request body, and if so how long it is)
- generate the data (if any) for the response, perhaps by reading
a file from disk
- send the data to the client
- close the connection
HTTP timing diagram
HTML