Operating System Design
- OS Structure
- implementation issues
- operating system trends
- operating system review
Kernel Structure
- a monolithic system has no abstract internal structure
- a layered system has a well defined internal structure,
which should aid programming
- a microkernel has a fairly minimal kernel, with most
functions implemented in user-space processes
- an exokernel has a very minimal kernel, with all
functions other than security implemented in user-space processes
- a client-server system provides services accessed through
system calls. The services may be provided by the local kernel
or by an external processor, e.g. across the network
Implementation Issues
- Structure:
- how many pieces, and what does each do
- how organized: layers, threads, modules -- same context (faster)
or different context (safer)
- how much to give to the user (more flexible) or retain in
the kernel (single copy, so generally cheaper and faster)
- binding time: are decisions made
- when compiling (early binding, high overhead to change but
very efficient)
- at boot/initialization time (medium)
- at runtime (late binding, easy to change but higher overhead to use)
- checking for errors: if any error is found, must exit
cleanly, so if possible check for errors before acquiring resources.
- problem: acquiring a resource may result in an error...
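A minimal sketch of the error-checking point above: validate everything that can be validated before taking a resource, then treat the acquisition itself as one more operation that can fail. The function name zero_prefix, its parameters, and the one-second timeout are hypothetical, chosen only for illustration.

```python
import threading


def zero_prefix(buffers, index, size, lock):
    """Zero the first `size` bytes of buffers[index] while holding `lock`."""
    # Check everything that can be checked before holding any resource,
    # so these errors need no cleanup at all.
    if not 0 <= index < len(buffers):
        raise IndexError("no such buffer")
    if not 0 <= size <= len(buffers[index]):
        raise ValueError("size out of range")

    # Acquiring the resource may itself result in an error.
    if not lock.acquire(timeout=1.0):
        raise TimeoutError("could not acquire lock")
    try:
        buffers[index][:size] = bytes(size)
    finally:
        lock.release()  # exit cleanly on every path once the lock is held
    return size
```

With this ordering, a bad index or size is rejected while the lock is still free, and only the code after a successful acquire needs a release.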
Performance
- can try to optimize everything, but doing so is likely to lead
to missing the primary goals
- performance depends on design as well as implementation:
plug-and-play booting is inherently slower than booting a
preconfigured system
- performance also depends on what must be implemented: fewer
features usually mean a faster system
- it is also not uncommon to select a certain design just
because it should allow for faster implementation
- technology changes, and what used to be fast sometimes
turns out not to be fast in later technology
- for example, it used to be that the fastest programs were the
ones with the smallest number of instructions executed
- now, the fastest programs are the ones with the smallest number of
non-local memory accesses
Optimization Strategies
- use a better algorithm -- better asymptotic performance,
but also better constants
- space-time tradeoffs, e.g. for a function on characters, can
precompute a table or can use comparisons
- caching (another space-time tradeoff) helps when values are
reused. Examples: lazy programming languages, directory trees
- hints are like caches, but must be checked. Example:
the likely() and unlikely() macros in the Linux kernel
- use locality, e.g. working set for processes, files within a
directory
- optimize the common case: most of the exception handling
can be slow, but the normal, correct path should be fast
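The space-time tradeoff for a function on characters can be sketched as follows. This is only an illustration: the choice of "is this byte a hexadecimal digit" and the names below are assumptions, not something from these notes, and in Python the table is not necessarily faster than the comparisons.

```python
def is_hex_digit_compare(c: int) -> bool:
    """Classify a byte value using comparisons: no extra space, more work per call."""
    return (
        ord("0") <= c <= ord("9")
        or ord("A") <= c <= ord("F")
        or ord("a") <= c <= ord("f")
    )


# Precompute the answer once for all 256 byte values:
# 256 bytes of space buy a single indexed lookup per call.
HEX_TABLE = bytes(1 if is_hex_digit_compare(c) else 0 for c in range(256))


def is_hex_digit_table(c: int) -> bool:
    """Classify a byte value with one table lookup."""
    return HEX_TABLE[c] == 1
```

Both functions give the same answer for every byte value; which one is faster depends on the machine and the language, which is the point of measuring before optimizing.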
Operating System Trends: Machine size and speed
- in 1997, I had a laptop with 4MB RAM that ran Linux just fine, but
would not run Windows or the X Window System
- in 2006, 32-bit addressing is already limiting for high-end machines, and
64-bit addressing is becoming common
- single-level page tables cannot be used for virtual memory with
64-bit addressing, unless the pages are very large
- with 2^64 bytes of virtual memory, objects can also
be hidden in the address space -- available, but virtually
inaccessible until a pointer is provided (just as on the web)
- the entire disk (for the foreseeable future) can be mapped to the 64-bit
virtual address space -- could we have named storage paged in?
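The claim about single-level page tables can be checked with a little arithmetic. The sketch below assumes one fixed-size entry per virtual page; the entry sizes (4 and 8 bytes) and the function name are illustrative assumptions.

```python
def single_level_table_bytes(address_bits, page_bytes, entry_bytes):
    """Size in bytes of a flat page table with one entry per virtual page."""
    pages = 2 ** address_bits // page_bytes
    return pages * entry_bytes


# 32-bit addresses, 4 KiB pages, 4-byte entries: 2^20 entries, 4 MiB per table
small = single_level_table_bytes(32, 4096, 4)

# 64-bit addresses, 4 KiB pages, 8-byte entries: 2^52 entries, 2^55 bytes (32 PiB)
huge = single_level_table_bytes(64, 4096, 8)

# Even 1 GiB pages leave 2^34 entries, i.e. 2^37 bytes (128 GiB) per table
still_large = single_level_table_bytes(64, 2 ** 30, 8)
```

Under these assumptions a flat table is practical for 32-bit addressing and hopeless for 64-bit addressing, which is why multi-level or hashed tables are used instead.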
Operating System Trends: Distributed Operation
- while a kernel is always needed on each machine, most of the
data that a machine uses may be somewhere else across a network
- to the user, this looks like the machine is just a terminal
accessing data elsewhere, e.g. on the web
- making the data the focus may mean changing how we refer to named
data (e.g. URLs or similar machine+localname combinations), how we
design the OS (maximizing disk throughput may be less important than
maximizing data/network access speed), refocusing on security
issues, and making it natural to run code on other machines
- multiprocessors can be loosely coupled, e.g. Beowulf
clusters or SETI@home, or tightly coupled, e.g.
4-way multiprocessors
- these two designs lead to very different OS issues and goals
Operating System Trends: Applications
- multimedia:
- an OS for a home entertainment center may have very different
needs than an OS for a desktop
- or, the desktop OS might need to process applications efficiently,
but also do multimedia well
- soft real-time components for multimedia I/O
- large variety of I/O devices
- perhaps different strategies for disk storage, e.g. really
emphasize real-time sequential access
- low-power computing
- design for battery-powered computers
- good power management needed, probably better than currently available
(better battery modeling, smarter batteries and charging)
- power management may extend to using different devices or
different device modes, e.g. lower-power radio transmissions or
slower CPU and disk operations, as needed
- slow down the CPU or put it to sleep whenever it is not needed (already
happening)
- embedded computing
- design for low cost systems
- may not have to be general purpose
- example: TinyOS runs on computers with 8K of RAM and has components
that can be selected according to the needs of the application
Operating System Trends: Ownership
- software can be copied for very low cost
- in an ideally efficient economy, people would be paid to develop
software, and anybody could use their product
- such a scheme is hard to make practical: how do we know
which software is worth developing, and how much to pay the developers?
- proprietary software: effort spent developing the OS goes to
waste because fewer people use the software than would use it if
it were free
- proprietary software: the developer's goals are to maximize
profit (e.g. by establishing a monopoly), and these goals do not
match (though they may overlap) the user's goals of maximizing their benefit
- free software: anybody can use it, which maximizes the social return
from the development, but minimizes the developer's return (e.g. BSD
license, X license)
- open-source software: generally also free, guarantees that the user
will have the option of modifying and fixing their software (e.g. GPL)
- some current models which give the developers some return on their
work developing free and open-source software:
- funded like research: government or a big company sponsors the
work because of the benefit to themselves (company) or to the
public (government)
- shareware: please donate if you like this product
- bragging rights
- sell, but without depriving others of rights to redistribute:
Red Hat, Mandrake
- developers can do maintenance or special-purpose enhancements
in exchange for consulting fees (amount of fees may be related
to bragging rights)
- software started out as almost all open source, then became
productized, and is now seeing a resurgence of open source, at
least in some areas
Operating System Review
- resource management on computers
- scheduling: managing the CPU time
- processes and threads, inter-process communication, deadlock
- protection: managing access to memory, disk, other devices
- kernel mode and kernel entry points: system calls, context switches,
interrupts, delayed processing
- input/output: managing devices
- virtual memory: managing physical memory and backups to disk (overlays)
- disks and file systems: managing persistent storage, caching
Operating Systems as Resource Managers for Computers
- lots of resources on a computer: installed software, configuration,
data, ability to access a network, hardware including CPU and memory
- managing the resources means using them in controlled ways
for maximum benefit
- for example, resources should be protected against accidents and
against malicious programs (malware)
- also, limited resources must be shared among different uses:
real-time programs, supervisory programs, user applications, responding
to outside devices such as networks
Scheduling
- the resource managed in scheduling is the CPU time
- ``slices'' of CPU time are assigned to each thread
- a thread loses its time slice and stops executing when:
- the timeslice ends, or
- the thread blocks, e.g. waiting for I/O, synchronization, or a timer, or
- an interrupt causes the scheduling of a higher priority thread
- at such time, the scheduler must choose the next thread to
execute, usually based on priority
- long timeslices give high throughput for CPU-intensive processes,
short time slices give better response time
- Linux schedulers give higher priority to I/O intensive
processes where response time is more important
- a thread joins the ready list when:
- the thread is rescheduled for a new timeslice, or
- the thread is awakened after blocking, e.g. after an I/O completes
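The effect of timeslice length can be seen in a toy simulation. This is a sketch under strong assumptions: plain round-robin, no priorities, no blocking, and all threads ready at time 0. The function name and the thread names are hypothetical.

```python
from collections import deque


def round_robin(cpu_needed, timeslice):
    """Return (thread, finish_time) pairs in the order the threads finish.

    cpu_needed maps a thread name to the CPU time it still needs.
    """
    ready = deque(cpu_needed.items())  # the ready list, in arrival order
    finished = []
    clock = 0
    while ready:
        name, remaining = ready.popleft()  # the scheduler picks the next thread
        used = min(timeslice, remaining)
        clock += used
        remaining -= used
        if remaining > 0:
            ready.append((name, remaining))  # timeslice ended: rejoin the ready list
        else:
            finished.append((name, clock))  # the thread is done
    return finished


short_slices = round_robin({"A": 3, "B": 1, "C": 2}, timeslice=1)
long_slices = round_robin({"A": 3, "B": 1, "C": 2}, timeslice=10)
```

With a timeslice of 1 the short thread B finishes at time 2; with a timeslice of 10 it waits behind A and finishes at time 4. A real scheduler also pays a context-switch cost per slice, which this sketch ignores and which is what favors long timeslices for throughput.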
Processes, Threads, Contexts, IPC, deadlock
- a thread is simply the execution of a program
- a process is associated with all the resources used by one or more
threads (processes with zero threads are terminated), especially virtual
memory
- the registers of the CPU and MMU hold the context of a
computation: stack pointer, virtual memory tables
- on a thread switch, only need to save and restore
the CPU computation registers and the stack pointer
- on a process switch, also need to save and restore
the MMU state and the process descriptor, and may have to flush the cache
- both process switch and thread switch are referred to as
context switch, but a thread switch is much more lightweight
- thread switches can be done in user space or in kernel space,
process switches must be done in kernel space
- thread switches done in user space must face the issue of
what happens if a thread blocks -- does the entire set of
threads block, or is there a mechanism (usually, requiring that I/O
be done through the threading system) to allow other threads to be
started?
- inter-process communication uses mutexes, semaphores, and other
mechanisms to ensure at most one thread at a time is accessing a given
resource
- semaphores can also be used to safely count resources used by
different threads, e.g. the producer and the consumer
- pipes provide a simple mechanism to connect a producer and
a consumer, even in different processes
- different threads trying to acquire each other's resources
may lead to deadlock. The simplest solution for avoiding deadlock is
to always acquire multiple resources in the same order
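Acquiring multiple resources in the same order can be sketched with two locks. Ordering by id() is just one convenient global order, assumed here for illustration; the account dictionaries and the function names are hypothetical.

```python
import threading


def acquire_in_order(*locks):
    """Acquire the given locks in one global order (by id) and return them."""
    unique = {id(lock): lock for lock in locks}  # tolerate the same lock twice
    ordered = [unique[key] for key in sorted(unique)]
    for lock in ordered:
        lock.acquire()
    return ordered


def transfer(src, dst, amount):
    """Move `amount` between two accounts while holding both of their locks."""
    held = acquire_in_order(src["lock"], dst["lock"])
    try:
        src["balance"] -= amount
        dst["balance"] += amount
    finally:
        for lock in reversed(held):
            lock.release()


def demo(rounds=1000):
    """Two threads transfer in opposite directions without deadlocking."""
    a = {"lock": threading.Lock(), "balance": 100}
    b = {"lock": threading.Lock(), "balance": 100}

    def run(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    t1 = threading.Thread(target=run, args=(a, b))
    t2 = threading.Thread(target=run, args=(b, a))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return a["balance"], b["balance"]
```

If each thread instead locked its own source account first, one thread could hold a's lock while the other holds b's lock, and each would wait forever for the other.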
Protection for Memory, Disks, and Devices
- most operating systems prevent one process
from affecting other processes, except in controlled ways
- to do this, different processes run in different virtual address spaces,
so a pointer error in one program cannot modify memory in another
- the kernel has access to all the memory spaces
- disk access is also usually reserved for the kernel or the root
user
- as a result, data on disk has an "owner" (UID) and "permissions"
- devices usually also are protected, e.g. to prevent most processes
from writing directly to the frame buffer for the display
Kernel Mode, Kernel Entry Points
- the kernel is the part of the system that operates in protected
mode after the boot process is complete
- leaving protected mode is relatively easy (in assembly) for the kernel
- entering protected mode is only allowed at specific kernel entry
points, specifically:
- system calls
- interrupt handlers
- if the kernel data structures are mapped to the process's virtual
memory, the context switch on entering the system call does not need to
change the MMU; otherwise, the context switch may have to do a lot of work
- delayed processing is also a kernel entry point, but usually from
inside the kernel itself
Input and Output
- the performance of many systems is judged more on I/O speed than
on CPU speed
- it would be good to do high-priority I/O before low-priority I/O, but:
- that information is not usually available to the kernel
- even low-priority I/O should not starve
- I/O priority may vary dynamically, e.g. writing back dirty pages
may become more important if free memory is all allocated
- external considerations (e.g. disk geometry, packet sequencing) may
suggest I/O sequencing that does not directly reflect external priority
- optimizing disk access is important because disk access is many orders
of magnitude slower than memory access
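As an illustration of how disk geometry can suggest an I/O order unrelated to priority, here is a sketch of an elevator-style (LOOK) ordering. This policy is not named in these notes and is brought in only as an example; the block numbers and function names are hypothetical.

```python
def look_order(head, requests):
    """Order block requests as one upward sweep from `head`, then a downward sweep."""
    upward = sorted(r for r in requests if r >= head)
    downward = sorted((r for r in requests if r < head), reverse=True)
    return upward + downward


def seek_distance(head, order):
    """Total head movement needed to serve the requests in the given order."""
    total = 0
    for block in order:
        total += abs(block - head)
        head = block
    return total


arrival_order = [10, 95, 52, 40, 70]
swept = look_order(50, arrival_order)
```

Starting with the head at block 50, serving the requests in arrival order moves the head 210 blocks, while the swept order [52, 70, 95, 40, 10] moves it 130. The request for block 10 arrived first but is served last, which shows how this kind of sequencing can conflict with priority.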
Virtual Memory
- an address (pointer) in a program (virtual address)
does not reflect the address in memory (physical address)
- the low-order bits of the virtual address are the same as the
low-order bits of the physical address
- the high-order bits of the virtual address are translated through
an arbitrary page table maintained by the operating system, and
cached within the memory management unit (MMU)'s translation
lookaside buffer (TLB)
- one of the main jobs of the OS is to assign physical pages
to processes for their virtual pages
- another one of the main jobs of the OS is to assign physical pages
to files to cache their disk blocks
- when physical pages are in short supply, some of the virtual
memory for one or more processes may be saved to swap space
(or swap files) on disk
- this means very large processes can run on machines with small
physical memory, as long as they don't need all their address space
at the same time: the different virtual address pages form
overlays on the same physical pages
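The translation described above can be sketched in a few lines. This assumes 4 KiB pages and uses a plain dictionary as the page table; a real MMU walks hardware-defined tables and caches the results in the TLB. All names here are hypothetical.

```python
PAGE_BITS = 12  # assumed page size: 4 KiB
PAGE_SIZE = 1 << PAGE_BITS


class PageFault(Exception):
    """Raised when a virtual page has no physical page assigned."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address.

    page_table maps virtual page numbers to physical page numbers.
    """
    page_number = virtual_address >> PAGE_BITS   # high-order bits: translated
    offset = virtual_address & (PAGE_SIZE - 1)   # low-order bits: unchanged
    if page_number not in page_table:
        raise PageFault(page_number)             # the OS would now supply a page
    return (page_table[page_number] << PAGE_BITS) | offset
```

For example, with virtual page 1 mapped to physical page 7, translate(0x1234, {1: 7}) gives 0x7234: the offset 0x234 is carried over and only the page number changes.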
Disks and File Systems
- current practice is to store file systems on disks, though flash
is beginning to make inroads, and storing a file on a computer across
a network is well established (NFS, SMB, etc)
- the most common file access pattern is sequential, but random
access must also be supported, and most file systems also support
file ``holes'' where nothing has been written
- files are organized hierarchically (using directories or folders)
with optional cross-links or back links, including both hard links and
soft links
- opening a file is used to tell the OS that it is advisable to cache
information for the file, closing the file suggests writing back or
discarding the cache
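A file hole can be produced by seeking past the end of a file before writing. The sketch below uses a temporary directory and a hypothetical file name; whether the hole actually occupies disk blocks depends on the file system, but reads from it return zero bytes either way.

```python
import os
import tempfile


def write_with_hole(path, hole_bytes, payload):
    """Create a file whose first `hole_bytes` bytes are never written."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # random access: skip ahead without writing
        f.write(payload)


def read_range(path, start, length):
    """Read `length` bytes starting at byte offset `start`."""
    with open(path, "rb") as f:
        f.seek(start)
        return f.read(length)


def demo():
    """Return (file size, bytes read from the hole, bytes read after it)."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "sparse.bin")
        write_with_hole(path, 4096, b"end")
        return (
            os.path.getsize(path),
            read_range(path, 0, 8),
            read_range(path, 4096, 3),
        )
```

The reported size includes the hole (4096 + 3 bytes here), even though only the last three bytes were ever written.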
Operating Systems
- interesting, concurrent, influential, useful programs