Fundamental Insights
- Digital Representation of Data
- Source Coding
- Channel Coding
- Packet Switching
- Project 1: Web server
Analog Representation of Data
- The world as we experience it is continuous
- Records
- Audio Tapes
- Telephones
- Radio
- TV
- Analog amplification
- Voltages, physical position
Analog Representation of Data: Problems
- Amplification cannot restore what is no longer there
- Noise is amplified with the signal
- Compare with handwriting (digital):
  - a limited number of symbols aids recognition
  - redundancy in shapes, words, and prose also aids recognition
  - a signal reconstructed from a degraded (barely readable) copy is just as good as the original
Digital representation of data
- represent analog values (quantities) as numbers
- represent numbers as sequences of bits, each with value 0 or 1
- can sample a continuous analog signal into a stream of
digital values
- even in the presence of noise, can often correctly distinguish
a 0 from a 1
- if circuitry can work in the presence of noise, can be cheaper
- circuitry can be general-purpose (e.g. DSPs), leading to even
cheaper implementation
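
The points above can be sketched in a few lines of Python. This is a minimal illustration, not a reference design: the 5 Hz sine wave, the 40 Hz sampling rate, the 4-bit resolution, the 0 V / 1 V signalling levels, and the bounded noise model are all assumptions chosen for the example.

```python
import math
import random

def sample(signal, rate_hz, duration_s):
    """Sample a continuous-time function at evenly spaced instants."""
    n = int(rate_hz * duration_s)
    return [signal(i / rate_hz) for i in range(n)]

def quantize(x, bits, lo=-1.0, hi=1.0):
    """Map an analog value in [lo, hi] to an integer code of `bits` bits."""
    levels = 2 ** bits
    x = min(max(x, lo), hi)  # clip out-of-range values
    return min(int((x - lo) / (hi - lo) * levels), levels - 1)

def to_bits(code, bits):
    """Write an integer code as a list of 0/1 values, most significant first."""
    return [(code >> i) & 1 for i in reversed(range(bits))]

def send_over_noisy_wire(bit_values, noise_amplitude, rng):
    """Send each bit as 0 V or 1 V plus bounded noise; threshold at 0.5 V."""
    received = []
    for b in bit_values:
        voltage = b + rng.uniform(-noise_amplitude, noise_amplitude)
        received.append(1 if voltage > 0.5 else 0)
    return received

# Assumed example signal: a 5 Hz sine wave sampled at 40 Hz for 0.2 s.
samples = sample(lambda t: math.sin(2 * math.pi * 5 * t), rate_hz=40, duration_s=0.2)
codes = [quantize(s, bits=4) for s in samples]
sent = [b for c in codes for b in to_bits(c, 4)]

# Noise stays below half the gap between the two levels, so every bit survives.
got = send_over_noisy_wire(sent, noise_amplitude=0.4, rng=random.Random(0))
print(got == sent)
```

As long as the noise stays below half the distance between the two levels, the receiver recovers every bit exactly, which is the sense in which a 0 can be distinguished from a 1 despite noise.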
Source Coding
- Shannon (Bell Labs, 1940s)
- Meaning and quantification of "information"
- How do you measure information?
Shannon Information Theory
- information means we are learning something we didn't know before
- information = uncertainty
- uncertainty modeled using random variables
- the minimum average number of bits per symbol required to encode
the output of a source is the entropy rate of the source
- no lossless compression algorithm can do better than the Shannon limit
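
As a small worked example of measuring information, the sketch below computes the Shannon entropy $H = -\sum_i p_i \log_2 p_i$ of a source. It assumes the source emits independent, identically distributed symbols, in which case the entropy rate equals this per-symbol entropy; the three probability distributions are made up for illustration.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol; zero-probability symbols add nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

fair_coin = entropy_bits([0.5, 0.5])                    # maximally uncertain
biased_coin = entropy_bits([0.9, 0.1])                  # mostly predictable
four_symbols = entropy_bits([0.5, 0.25, 0.125, 0.125])  # skewed 4-symbol source

print(fair_coin, round(biased_coin, 3), four_symbols)
```

The biased coin needs fewer than half a bit per flip on average, because most flips tell us little we did not already expect.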
Distortion
- use fewer bits than the limit
- "lose" $\epsilon$ of the data
- average number of bits $H(\epsilon)$ is less than the limit
- if $\epsilon$ is small, $H(\epsilon)$ is close to the
entropy rate
- this is lossy compression
- try to keep the losses to where they are not easily detected (by humans)
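
One simple way to see the trade between bits and loss is uniform quantization, used here as a stand-in for a real lossy coder (it is not the scheme the slide has in mind, only an illustration). The sketch assumes sample values in $[-1, 1]$ and measures loss as mean squared error; the sine-wave test data is made up.

```python
import math

def quantize_midpoint(x, bits, lo=-1.0, hi=1.0):
    """Round x to the midpoint of one of 2**bits equal-width bins."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    index = min(int((x - lo) / step), levels - 1)
    return lo + (index + 0.5) * step

def mean_squared_error(values, bits):
    """Average squared difference between original and reconstructed values."""
    return sum((v - quantize_midpoint(v, bits)) ** 2 for v in values) / len(values)

# Assumed test data: one period of a sine wave.
values = [math.sin(2 * math.pi * i / 100) for i in range(100)]
errors = {bits: mean_squared_error(values, bits) for bits in (2, 4, 8)}

for bits, err in errors.items():
    print(bits, "bits per sample -> mean squared error", err)
```

Spending fewer bits per sample raises the error; a practical lossy coder tries to put that error where a human is least likely to notice it.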
Channel Coding
- Send bits over a real channel, e.g. telephone line
- In the real world there is no guarantee -- an error is always
possible (noise is due to quantum randomness)
- How fast can we go?
- Rate $R_0$ gives us bit error rate (BER) $\epsilon_0$
- Rate $R_1 > R_0$ gives us BER $\epsilon_1 > \epsilon_0$
Shannon: Channel Coding
- what is the maximum rate $R$ at which bits can be sent with
arbitrarily low error rate $\epsilon$?
- call that maximum rate $C$
- for any $R < C$ and any given $0 < \epsilon \le 1$, we
can transmit at $R$ with an error rate of at most $\epsilon$.
- actually getting such a rate can be complicated!
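
To make the rate versus error trade-off concrete, the sketch below uses a binary symmetric channel (each bit flipped independently with probability $p$), which is an assumed model and not one named in these notes. For that channel the capacity is $C = 1 - H(p)$, with $H$ the binary entropy function. The repetition code shown is a deliberately naive scheme: it lowers the error rate only by driving the rate toward zero, whereas Shannon's result says cleverer codes can reach low error at any rate below $C$.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin that lands heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity, in bits per channel use, of a binary symmetric channel."""
    return 1.0 - binary_entropy(p)

def repetition_error(p, n):
    """Probability that a majority vote over n (odd) copies picks the wrong bit."""
    return sum(
        math.comb(n, k) * p ** k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

p = 0.1  # assumed bit-flip probability
for n in (1, 3, 5):
    print("rate", 1 / n, "error", repetition_error(p, n))
print("capacity", bsc_capacity(p))
```

With $p = 0.1$ the capacity is a little over 0.53 bits per channel use, so a rate of 1/2 is achievable in principle with arbitrarily low error, while the repetition code at rate 1/3 still leaves a residual error of 2.8%.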
Packet Switching
- different from telephone communication (circuit switching)
- package some data with control information and send it out
- analogous to post office
- circuit switching implies resource reservation, which may
be wasteful
- circuit switching requires more overhead to set up, maintain,
and tear down the connection
Statistical Multiplexing
- how many bits per second do you need?
  - averaged over a minute
  - averaged over an hour
  - averaged over a day (40bps at Berkeley in 1995)
- with packet switching, idle resources can be used by others
- examples:
  - email (up to $10^3$ bytes)
  - files (up to $10^6$ bytes)
- overhead of connection setup might be worth it for files, might
not for email
Cost estimation
Circuit Switching
- connection setup time $T_C = 0.1s$
- data rate $R = 64Kb/s$
- message size $M = 8Kb$
- time to transmit message $T_M = M/R = 0.125s$
- total time $T = T_C + T_M = 0.225s$
- effective rate $R_E = M / T \approx 36Kb/s$
- this is a waste of bandwidth!
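
The circuit-switching arithmetic above can be checked with a short script. It assumes 1 Kb = 1000 bits, which is what makes $M/R = 0.125s$ come out exactly; the function name is illustrative. The same function is applied to the email-sized ($10^3$ bytes) and file-sized ($10^6$ bytes) messages mentioned earlier.

```python
def circuit_switched_rate(message_bits, rate_bps, setup_s):
    """Effective rate M / (T_C + M/R), in bits per second."""
    total_s = setup_s + message_bits / rate_bps
    return message_bits / total_s

RATE_BPS = 64_000   # R = 64 Kb/s, taking 1 Kb = 1000 bits (assumption)
SETUP_S = 0.1       # T_C

email_bits = 8 * 10**3   # 10^3 bytes, the same 8 Kb message as above
file_bits = 8 * 10**6    # 10^6 bytes

email_rate = circuit_switched_rate(email_bits, RATE_BPS, SETUP_S)
file_rate = circuit_switched_rate(file_bits, RATE_BPS, SETUP_S)

print("email:", round(email_rate), "b/s")
print("file: ", round(file_rate), "b/s")
```

For the short message, setup takes almost as long as the transfer itself, so nearly half the line rate is lost; for the large file the same 0.1 s setup is negligible.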
Packet switching
- data rate $R = 64Kb/s$
- message size $M = 8Kb$
- header size $H = 200b$
- time to transmit message $T_M = M/R = 0.125s$
- time to transmit header $T_H = H/R = 0.003125s$
- total time $T = T_H + T_M = 0.128125s$
- effective rate $R_E = M / T \approx 62Kb/s$
- statistical multiplexing gain: e.g. internet phone might only
use 30% of the bandwidth of a regular phone call
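
The packet-switching figures above can be reproduced the same way. This sketch assumes the whole message travels in a single packet with one header and, as before, that 1 Kb = 1000 bits; the function name is illustrative.

```python
def packet_switched_rate(message_bits, rate_bps, header_bits):
    """Effective rate M / (T_H + T_M) for one packet, in bits per second."""
    total_s = (header_bits + message_bits) / rate_bps
    return message_bits / total_s

MESSAGE_BITS = 8_000   # M = 8 Kb
HEADER_BITS = 200      # H = 200 b
RATE_BPS = 64_000      # R = 64 Kb/s

packet_rate = packet_switched_rate(MESSAGE_BITS, RATE_BPS, HEADER_BITS)
header_overhead = HEADER_BITS / (HEADER_BITS + MESSAGE_BITS)

print("effective rate:", round(packet_rate), "b/s")
print("header overhead:", round(100 * header_overhead, 1), "%")
```

Here the only cost is the header, about 2.4% of the bits sent, so the effective rate stays close to the line rate even for a short message.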
Project 1: Web Servers
- accept a connection
- read a request from the connection
  - single-line request, or
  - multi-line request with "HTTP/1.0" in first line
- parse the request
- determine correctness of request
- write header to connection
- if request is correct, read data from file, write to connection
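
The steps above might be sketched as follows, using Python's standard `socket` module. This is an illustrative outline under stated assumptions, not the project's required design: the function names, the default port 8080, the GET-only check, the 400/404 status lines, and the `Content-Length` header are all choices made for this example.

```python
import os
import socket

def parse_request(request_text):
    """Return the requested path, or None if the request line is malformed."""
    lines = request_text.splitlines()
    if not lines:
        return None
    parts = lines[0].split()
    # Single-line request: "GET /path"; multi-line: "GET /path HTTP/1.0".
    if len(parts) == 2 and parts[0] == "GET":
        return parts[1]
    if len(parts) == 3 and parts[0] == "GET" and parts[2].startswith("HTTP/1."):
        return parts[1]
    return None

def build_response(path, root):
    """Build header plus body for a parsed path (None means a bad request)."""
    if path is None:
        return b"HTTP/1.0 400 Bad Request\r\n\r\n"
    # Resolve the path under root and refuse anything that escapes it.
    root_real = os.path.realpath(root)
    full = os.path.realpath(os.path.join(root_real, path.lstrip("/")))
    if os.path.commonpath([full, root_real]) != root_real or not os.path.isfile(full):
        return b"HTTP/1.0 404 Not Found\r\n\r\n"
    with open(full, "rb") as f:
        body = f.read()
    header = "HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
    return header.encode("ascii") + body

def read_request(conn):
    """Read until a blank line (multi-line) or the end of a single-line request."""
    data = b""
    while b"\r\n\r\n" not in data and b"\n\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
        first_line = data.split(b"\n", 1)[0]
        if b"\n" in data and b"HTTP/" not in first_line:
            break  # a single-line request is complete after its first line
    return data.decode("latin-1")

def serve(port=8080, root="."):
    """Accept connections one at a time and answer each request."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("", port))
        server.listen(5)
        while True:
            conn, _addr = server.accept()
            with conn:
                path = parse_request(read_request(conn))
                conn.sendall(build_response(path, root))
```

Calling `serve(8080, "./www")` would serve files from a hypothetical `./www` directory until interrupted. The sketch handles one connection at a time and always writes a status-line header, matching the steps listed above.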