Memory Management: allocation, swapping, and paging
AT hard disk control flow
at_winchester_task called
init_params checks the boot parameters and the BIOS,
to figure out which disks are present
eventually, a DEV_OPEN message leads to calling
w_do_open, which calls w_prepare (almost every
message leads to calling w_prepare), w_identify, and
finally, if this is the first open for the device, partition
from drivers/libdriver/drvlib.c. partition calls
transfer (line 11566, in get_part_table) to get
the boot sector
w_identify does the hard work of requesting and decoding
the device information
the identification request is built by com_simple (p. 803),
and sent to the device by com_out (p. 801-802)
com_out waits for the controller to not be busy (line 12960),
then selects the drive, then waits again
the drive controller registers are shown in Figure 3-23 on p. 296
the driver will use cylinder/head/sector addressing if necessary,
but LBA (Logical Block Addressing) if possible, with up to 48 bits for
a block number
the final step in w_identify enables the interrupt from
the disk device (line 12701), specifying that interrupts should not
remain blocked while the device driver executes
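The relation between the two addressing schemes mentioned above can be sketched as follows. This is only an illustration of the standard CHS/LBA arithmetic, not the driver's code; the function names and the example geometry (16 heads, 63 sectors per track) are assumptions.

```python
def chs_to_lba(cylinder, head, sector, heads, sectors_per_track):
    """Convert a cylinder/head/sector address to a logical block number."""
    # Sectors are numbered from 1 within a track, hence the (sector - 1).
    return (cylinder * heads + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads, sectors_per_track):
    """Convert a logical block number back to (cylinder, head, sector)."""
    if not 0 <= lba < 2 ** 48:
        raise ValueError("block number does not fit in 48 bits")
    cylinder, rest = divmod(lba, heads * sectors_per_track)
    head, sector0 = divmod(rest, sectors_per_track)
    return cylinder, head, sector0 + 1
```

With the assumed geometry, block 0 is cylinder 0, head 0, sector 1, and the first block of cylinder 1 is block 16 * 63 = 1008.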
AT hard disk read and write
on a w_transfer (p. 800), adjacent requests are done
in a single operation, up to wn->max_count.
the transfer request is built by do_transfer (p. 799-800),
and sent to the device by com_out (described above)
after do_transfer, there is a loop that is executed
once per sector, beginning on line 12890
read calls at_intr_wait (line 12909)
which calls w_intr_wait which calls receive, as previously
discussed in class
on a read or a write, w_waitfor reads the status
from the disk (line 13190) until the status is STATUS_DRQ
(data transfer request, line 12190)
bytes are finally copied with sys_insw or sys_outsw,
which is programmed I/O rather than DMA
a write now waits for an interrupt (line 12920), to check whether
everything was written
data is always transferred one sector at a time
for either reads or writes, the next I/O descriptor is then
selected if this I/O descriptor has been satisfied
in case of timeout (w_need_reset, p. 802, called from
line 13198), set a bit and request a re-initialization of the device
in case of timeout on the interrupt, w_timeout, p. 803,
(called from line 13134) decreases the maximum size given to
do_transfer from n to 8 sectors, and from 8 sectors to 1,
in case that was what tripped up the drive
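A minimal sketch of the two ideas above: a request is cut into operations of at most max_count sectors, and a timeout shrinks that limit from n to 8 and then to 1. The function names are hypothetical and the logic is a simplification of what the notes describe, not the code of w_transfer or w_timeout.

```python
def split_request(first_sector, nr_sectors, max_count):
    """Cut a request into (start, count) operations of at most max_count sectors."""
    chunks = []
    while nr_sectors > 0:
        count = min(nr_sectors, max_count)
        chunks.append((first_sector, count))
        first_sector += count
        nr_sectors -= count
    return chunks


def reduce_max_count(max_count):
    """After a timeout, shrink the limit: n -> 8 sectors, then 8 -> 1."""
    return 8 if max_count > 8 else 1
```

For example, a 20-sector request starting at sector 100 with a limit of 8 becomes three operations of 8, 8, and 4 sectors.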
Minix Terminal Driver
very complex: supports memory-mapped keyboard and displays, RS-232
serial terminals, and network-based logins (Pseudo-ttys, or PTYs, on other
systems)
character devices, but with two-dimensional
positioning capabilities (screens)
screens can be character-based or pixel-based (Minix only supports
character-based)
buffering can be reserved per terminal (as is the case in Minix)
or shared among all terminals (central buffer pool)
terminal driver must perform line editing functions to
honor erase characters and possibly change newlines into CR-LF or
vice versa (or other combinations)
Terminal Modes
editors will do their own screen redrawing, can handle erase
characters, so should be given the raw stream of characters
the user enters
most programs take line input and would prefer to have the
operating system take care of editing: canonical or cooked
mode
special characters can control freezing (^S) or restarting (^Q)
the output, mark end of file (^D), end of line, etc
in non-canonical mode, Minix allows the specification of a
minimum number of characters to read and of a timeout for terminal
reads -- if either is satisfied, the read call completes
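On POSIX systems the minimum count and the timeout are the VMIN and VTIME entries of the termios control-character array. The sketch below uses Python's termios module on a POSIX host to build such a setting; the helper name make_noncanonical is an assumption, and the exact interaction of the two values is defined by POSIX and is more detailed than the summary above.

```python
import copy
import termios


def make_noncanonical(attrs, vmin, vtime_tenths):
    """Return a copy of a tcgetattr() result switched to non-canonical mode.

    attrs is [iflag, oflag, cflag, lflag, ispeed, ospeed, cc].
    vtime_tenths is the read timeout in tenths of a second.
    """
    new = copy.deepcopy(attrs)
    new[3] &= ~termios.ICANON          # turn off line editing by the driver
    new[6][termios.VMIN] = vmin        # minimum number of characters
    new[6][termios.VTIME] = vtime_tenths
    return new
```

On a real terminal with file descriptor fd, this would be applied with termios.tcsetattr(fd, termios.TCSANOW, make_noncanonical(termios.tcgetattr(fd), 1, 5)).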
Reading from the terminal
book, figure 3-33, p. 319
1. user process sends message to file system
2. file system sends message to TTY task, which may directly go to (6),
but most likely
3. replies asking the file system to suspend the process
4. when a key is pressed, the generic interrupt handler notifies
the TTY task
5. the TTY task reads the I/O ports to determine which key was
pressed, and adds the character to a queue
6. at the top of the TTY task, the task copies available
data directly to the user space using physical
memory copying
7. the TTY task tells the file server (whenever it is ready to
receive the message) that it may wake up the user process
8. the FS server wakes up the user process
this complex technique gives acceptable performance for large bursts
of characters from the serial port on slow hardware, since the user process
and the file system are only involved when enough characters (or a timeout)
have been received
serial line controllers may be configured to only interrupt after
receiving several characters, rather than once per character
Bitmapped displays
each pixel (usually 1, 8, 16, 24, or 32 bits) represents one dot
on the display: one scan sweep position or one pixel on an LCD display
more resolution requires more memory (1280x1024x24 requires about
3.75MB, i.e. nearly 4MB, for each display, without counting virtual
desktop space)
data can be arranged in various fashions in memory, but usually
such that adjacent elements of a row are adjacent in memory
basic display operations include moving a block (bitblt),
drawing a point or a line, or filling in a rectangle (can also be done
with bitblt)
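A small sketch of the row-major layout and of a rectangle fill, assuming one byte per pixel and a bytearray standing in for display memory. The function names are hypothetical.

```python
def framebuffer_bytes(width, height, bits_per_pixel):
    """Memory needed for one display of the given resolution and depth."""
    return width * height * bits_per_pixel // 8


def fill_rect(fb, width, x, y, w, h, value):
    """Fill a w-by-h rectangle whose top-left corner is (x, y).

    fb is a row-major buffer with one byte per pixel: the pixel at
    (x, y) lives at index y * width + x, so a row of the rectangle
    is one contiguous slice.
    """
    for row in range(y, y + h):
        start = row * width + x
        fb[start:start + w] = bytes([value]) * w
```

For a 1280x1024 display at 24 bits per pixel, framebuffer_bytes gives 3,932,160 bytes.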
the device controller may include a fast data-parallel (SIMD)
computer to operate on many bits at once -- the GPU
a window manager, which may be part of the OS, must create
windows, associate them with processes, and support opening, closing,
moving, iconifying, etc
the X-window system is a user-level program (an X
server) that supports basic window operations for (possibly
remote) client programs
one of these client programs is the X window manager
other client programs show the time, check for mail, allow for
user command-line input (by running shells, perhaps remote shells),
support surfing the web, etc.
Memory Management
a single program in memory can occupy all the space not used
for the OS or the ROM
memory can be shared among a number of tasks using fixed partitions
each executable must be either location-independent code, or
relocatable code relocated at load time
each area of memory must also be protected by matching bits for
the memory area to bits in the CPU that user programs cannot modify
relocation can also be achieved (in hardware) by using two
segment registers, base and limit: on each access,
the CPU checks the address against the limit, then adds the base
for protection, only the OS can change the base and limit registers
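The base and limit check can be sketched in a few lines. This models what the hardware does on each access; the names relocate and ProtectionFault are made up for the illustration.

```python
class ProtectionFault(Exception):
    """Raised when a program address falls outside its partition."""


def relocate(address, base, limit):
    """Check a program address against the limit, then add the base."""
    if not 0 <= address < limit:
        raise ProtectionFault(address)
    return base + address
```

A program loaded at base 0x40000 with limit 0x1000 that accesses address 0x10 reaches physical address 0x40010; an access to 0x1000 or beyond is refused.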
Keeping track of memory
memory is divided into fixed-sized units
each unit is allocated or free
two ways to keep track of the allocated/free memory:
bitmap: each bit is 0 if the corresponding unit is free, or 1 if
it is taken. For units of n bits, the bitmap takes up 1/n
of the memory, e.g. for n = 2^15 bits
and 1GB = 2^33 bits of memory, the bitmap takes
2^18 bits or
32KB = 2^15 bytes.
linked list: each allocated or free segment is stored in a list.
When a segment is freed, adjacent list entries must be merged, when a
segment is allocated, an existing entry must be changed or split
first fit: first free block that fits is split and used. Fast,
fairly good with regard to fragmentation, may waste memory
next fit: same as first fit, but start from the end of the last
search, and wrap around. Almost the same performance as first fit
best fit: search the entire list to find the smallest hole
that will fit. Works great if exact matches are found (i.e. if allocations
are a few different sizes), otherwise leaves unusably small holes.
worst fit: use the biggest free block. Makes it hard to ever
allocate large blocks.
quick fit: keep free lists for commonly requested sizes --
but merging after deallocation is expensive
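The list-based policies can be compared with a small sketch. It assumes the free list is a Python list of (start, length) holes sorted by address; the function names are hypothetical, and next fit and quick fit are left out.

```python
def allocate(holes, size, policy="first"):
    """Take size units from the free list; return the start, or None."""
    # Collecting every hole that fits keeps the sketch short; a real
    # first fit would stop scanning at the first match.
    fits = [i for i, (_, length) in enumerate(holes) if length >= size]
    if not fits:
        return None
    if policy == "first":
        i = fits[0]
    elif policy == "best":
        i = min(fits, key=lambda j: holes[j][1])
    elif policy == "worst":
        i = max(fits, key=lambda j: holes[j][1])
    else:
        raise ValueError(policy)
    start, length = holes[i]
    if length == size:
        del holes[i]                                # exact match: entry disappears
    else:
        holes[i] = (start + size, length - size)    # split the hole
    return start


def release(holes, start, size):
    """Return a block to the free list and merge adjacent holes."""
    holes.append((start, size))
    holes.sort()
    merged = []
    for s, n in holes:
        if merged and merged[-1][0] + merged[-1][1] == s:
            merged[-1] = (merged[-1][0], merged[-1][1] + n)
        else:
            merged.append((s, n))
    holes[:] = merged
```

With holes of 5, 3, and 8 units, a request for 3 units is taken from the 5-unit hole by first fit, from the 3-unit hole by best fit, and from the 8-unit hole by worst fit.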
Memory on Disk
if the OS needs more memory and doesn't have it available, it can
copy something to disk
any process trying to access the memory that is copied to disk
will have to wait for it to be brought back from disk
a process can be swapped to disk in its entirety, and then
of course its execution is suspended, only to be resumed when the
process is swapped in
or, a process's memory can be divided into fixed-sized blocks
(pages), each of which can be written to disk while not in use
when memory is copied back from disk, it may be placed in a different
location:
this is easy if an entire segment is copied back in and
segment registers are in use -- only the base register needs to be
updated
if pages are used, essentially a segment base register
is needed for each page (since the size is fixed, no limit register is
needed) -- the collection of these "segment base registers" is called
the page table
page tables (one for each process) are kept in memory while any
part of the process is in memory
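The view of a page table as one base register per page can be sketched as follows. The 4KB page size, the list representation, and the names are assumptions; None marks a page that is currently on disk.

```python
PAGE_SIZE = 4096  # assumed page size for the example


class PageFault(Exception):
    """Raised when the page is not in memory and must be brought back."""


def page_translate(vaddr, page_table):
    """Translate a virtual address using a list of per-page base addresses."""
    if vaddr < 0:
        raise PageFault(vaddr)
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page >= len(page_table) or page_table[page] is None:
        raise PageFault(page)
    # No limit check is needed: the offset is always below PAGE_SIZE.
    return page_table[page] + offset
```

When a page is written to disk its entry becomes None, and when it is brought back the entry is set to wherever the page now lives, which is all the relocation that is required.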
Fast Paging
the program computes an address and requests the corresponding
virtual memory location
the Memory Management Unit (MMU) is the hardware that
translates the virtual address to a physical address by
reading the page table from memory as needed
the page table entry must contain the physical address of the
page frame corresponding to the given virtual page, and additional
bits to record:
whether the translation is valid (mapped)
whether the page may be written (and/or read or executed)
whether the page has been read (referenced) since this bit was cleared
whether the page has been written (modified, made dirty) since this bit
was cleared
whether the page may be cached
dirty pages will have to be written back to disk, whereas
pages that already have a copy on disk but are not dirty can be
discarded
a small associative memory, the Translation Lookaside
Buffer (TLB), caches translations (and other bits) for
recently used pages
some RISC computers require the OS to manage the TLB
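A rough software model of the translation path: the TLB is tried first, the page table is read on a miss, and the referenced and dirty bits are updated on each access. The bit values, the 12-bit page offset, and the class name are assumptions. The TLB here is unbounded and shares entries with the page table, which real hardware does not do.

```python
VALID, WRITABLE, REFERENCED, DIRTY = 1, 2, 4, 8   # assumed bit assignments
PAGE_SHIFT = 12
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class Fault(Exception):
    """Page fault or protection fault, to be handled by the OS."""


class MMU:
    def __init__(self, page_table):
        self.page_table = page_table   # page number -> [frame, flags]
        self.tlb = {}                  # recently used translations
        self.misses = 0

    def access(self, vaddr, write=False):
        page, offset = vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK
        entry = self.tlb.get(page)
        if entry is None:              # TLB miss: read the page table
            self.misses += 1
            entry = self.page_table.get(page)
            if entry is None or not entry[1] & VALID:
                raise Fault("page fault")
            self.tlb[page] = entry
        if write and not entry[1] & WRITABLE:
            raise Fault("protection fault")
        entry[1] |= REFERENCED | (DIRTY if write else 0)
        return (entry[0] << PAGE_SHIFT) | offset
```

A page whose DIRTY bit is still clear when it is evicted can simply be discarded if a copy already exists on disk.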
Efficient Paging
keeping a page table entry for every page of a process's
virtual address space can be expensive
one answer is to keep a top-level page table, and have pointers
to second-level page tables for only those areas that are allocated
another answer is to keep a table of physical-to-virtual address
translations instead: this table takes up a fixed fraction of the
memory (one entry per physical page), and is called an inverted
page table
this efficiency is important as virtual address spaces grow,
e.g. to 64 bits
inverted page tables must be searched (perhaps using hashing)
when a new TLB translation is needed
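An inverted page table with a hash index can be sketched like this. There is one entry per physical frame, and a dictionary plays the role of the hash table used to avoid a linear search. The class and method names are hypothetical.

```python
class InvertedPageTable:
    def __init__(self, nframes):
        self.frames = [None] * nframes   # frame -> (pid, vpage), or None if free
        self.index = {}                  # hash: (pid, vpage) -> frame

    def map(self, pid, vpage, frame):
        """Record that frame now holds page vpage of process pid."""
        key = (pid, vpage)
        if key in self.index:                 # the page moves to a new frame
            self.frames[self.index[key]] = None
        if self.frames[frame] is not None:    # the frame is reused
            del self.index[self.frames[frame]]
        self.frames[frame] = key
        self.index[key] = frame

    def lookup(self, pid, vpage):
        """Return the frame for a TLB refill, or None (page fault)."""
        return self.index.get((pid, vpage))
```

The table size depends only on the number of physical frames, so it stays the same whether the virtual address space is 32 or 64 bits wide.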
Effective Paging
paging only works well if most of the memory references are
cache hits not requiring disk access or page table access
most programs exhibit locality: most of the accesses
are in a small group of pages (the working set of the program),
and the most-recently used pages are the most likely to be accessed again
it is easy to design programs that have no locality, and context
switches also tend to destroy locality, but in general locality
is common
when in need of space, one or more pages must be selected for
eviction from memory:
ideally, the page that will not be referenced for the longest time
should be evicted. This is optimal, but usually hard to compute
as an approximation, the page that was accessed least recently
should be evicted (assuming past predicts future): Least Recently Used, LRU,
also somewhat hard to compute
as an approximation to LRU, periodically mark all pages unused,
then see if they are used. If not, they are candidates for eviction:
Not Recently Used, NRU
NRU only works if the hardware keeps track of page accesses.
If not, pages can be unmapped (but retained in memory), causing a
page fault when accessed. The OS software can then mark the pages
and load the mapping for the page again
to select a page in NRU, select the one that was loaded the
longest ago: FIFO or second chance (because an accessed
page gets a second chance to remain in memory)
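Second chance can be sketched as a FIFO queue plus one referenced bit per page. In this illustration the bit is assumed to start clear when a page is loaded and to be set on later accesses; the class name and interface are made up.

```python
from collections import deque


class SecondChance:
    def __init__(self, nframes):
        self.nframes = nframes
        self.queue = deque()     # resident pages, oldest load first
        self.referenced = {}     # page -> referenced bit

    def access(self, page):
        """Touch a page; return the page evicted to make room, or None."""
        if page in self.referenced:          # already in memory
            self.referenced[page] = True
            return None
        victim = None
        if len(self.queue) == self.nframes:
            while victim is None:
                candidate = self.queue.popleft()
                if self.referenced[candidate]:
                    # Accessed since it was last examined: clear the bit
                    # and move it to the back, as if newly loaded.
                    self.referenced[candidate] = False
                    self.queue.append(candidate)
                else:
                    victim = candidate
                    del self.referenced[candidate]
        self.queue.append(page)
        self.referenced[page] = False
        return victim
```

With three frames and the access sequence 1, 2, 3, 1, 4, plain FIFO would evict page 1, but here page 1 was accessed again and is spared, so page 2 is evicted instead.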