SPARC Architecture:
Registers, Pipelining
- 32-bit load-store machine
- 32 general-purpose 32-bit registers
- Example assembly code
- Pipelined architecture
- Example program
- Questions and Answers
SPARC Registers -- global, out
- 8 {\bf global} registers
- For data that has meaning for the entire program,
not just a few functions
- %g0-%g7 = %r0-%r7.
- %r0 is always zero, discards values written to it.
- 8 {\bf out} registers
- For temporaries, passing arguments to functions,
return values from functions.
- %o0-%o7 = %r8-%r15.
- %sp = %r14 = %o6.
- %r15 = %o7 = return address.
SPARC Registers -- local, in
- 8 {\bf local} registers
- For use as temporaries
- %l0-%l7 = %r16-%r23.
- 8 {\bf in} registers
- Input parameters when a function is called (not used for now)
- %i0-%i7 = %r24-%r31
- %fp = %r30 = %i6.
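The correspondence between the group names and the flat %r numbering is mechanical, so it can be tabulated. The following is a minimal Python sketch; the helper name `reg_number` is an illustrative assumption, not part of any assembler.

```python
# Map SPARC register names to their flat %r numbers (0-31).
# reg_number is a hypothetical helper written for illustration only.
GROUP_BASE = {"g": 0, "o": 8, "l": 16, "i": 24}
ALIASES = {"%sp": "%o6", "%fp": "%i6"}

def reg_number(name):
    """Return the %r index for names such as %g0, %o7, %l3, %i6, %sp, %fp, %r14."""
    name = ALIASES.get(name, name)
    if len(name) < 3 or name[0] != "%" or not name[2:].isdigit():
        raise ValueError("not a register name: %r" % (name,))
    kind, n = name[1], int(name[2:])
    if kind == "r" and n <= 31:
        return n
    if kind in GROUP_BASE and n <= 7:
        return GROUP_BASE[kind] + n
    raise ValueError("not a register name: %r" % (name,))
```

For example, `reg_number("%sp")` and `reg_number("%o6")` both give 14, and `reg_number("%fp")` gives 30.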
32-bit registers
- store a signed integer i, -2^{31} <= i < 2^{31}
- store an unsigned integer i, 0 <= i < 2^{32}
- store an address
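The same 32 bits can be read as a signed or an unsigned integer; only the interpretation differs. A small Python sketch of the two readings, assuming two's complement representation (the function names are illustrative):

```python
MASK32 = 0xFFFFFFFF  # keep only the low 32 bits

def as_unsigned(value):
    """Interpret the low 32 bits as an unsigned integer, 0 <= i < 2**32."""
    return value & MASK32

def as_signed(value):
    """Interpret the low 32 bits as a two's complement integer, -2**31 <= i < 2**31."""
    bits = value & MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits
```

The bit pattern 0xFFFFFFFF reads as -1 when signed and as 4294967295 when unsigned.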
using the assembler
Note: this differs from book
- on uhunix
- assembly-level source in program.m
- \verb|
- \verb|
- \verb|
as: the assembler
- line based
- each line has:
- label:
- tab
- instruction or data
- tab
- operands
- tab
- ! comment
example
.global main
main: save
add
sub
.global and .word are pseudo-ops
instructions
- literal constants c, -4096 <= c < 4096
- three-operand instructions: register, register-or-immediate,
destination register (e.g. add)
- two-operand instructions: register-or-immediate, register (e.g. mov)
- one-operand instructions: register (e.g. clr)
- RISC: no multiplication or division instructions. To multiply or divide,
- place arguments into \verb|
- call .mul or call .div
- a called function may modify \verb|
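The literal range -4096 <= c < 4096 is exactly what fits in a 13-bit two's complement field. A hedged Python sketch of the range check and the resulting 13-bit pattern (both function names are illustrative assumptions):

```python
def fits_simm13(c):
    """True if c can be written as a literal constant, -4096 <= c < 4096."""
    return -4096 <= c < 4096

def encode_simm13(c):
    """Return the 13-bit two's complement pattern for c, or raise if out of range."""
    if not fits_simm13(c):
        raise ValueError("constant %d does not fit in 13 signed bits" % c)
    return c & 0x1FFF  # keep the low 13 bits
```

A constant such as 4096 fails the check, so it cannot appear directly as an immediate operand.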
instruction execution: pipelining
- von Neumann cycle: instruction fetch and decode, operand fetch,
instruction execution, store result.
- can fetch the next instruction while the current one is executing.
- each individual instruction still takes the same time to execute, but
overall program execution is faster because instructions overlap.
- if an instruction takes 5 basic operations, and these are overlapped,
we execute on average 1 instruction per basic operation time; if they are
not overlapped, we execute five times slower.
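The factor of five can be checked with a little arithmetic. The sketch below assumes an idealized pipeline with no stalls, counting time in units of one basic operation; the function names are illustrative.

```python
def time_unpipelined(n_instructions, stages=5):
    """Each instruction runs all of its stages before the next one starts."""
    return n_instructions * stages

def time_pipelined(n_instructions, stages=5):
    """The first instruction takes `stages` steps; each later one finishes one step after."""
    if n_instructions == 0:
        return 0
    return stages + (n_instructions - 1)
```

For 1000 instructions this gives 5000 basic operation times without overlap and 1004 with overlap, so the speedup approaches 5 as the program gets longer.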
pipelining problem: branches
- pipeline depends on predicting the order of execution of instructions
- in a branch, the next instruction fetch depends on completing execution
of the prior instruction
- branches cannot be easily pipelined
- instead of discarding the instruction after a branch (the delay slot), SPARC executes it!
- branch modifies "next-PC" instead of "PC"
- delay slot instruction can be useful
- if no useful work to do, use nop
- use after call or any other branch
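The effect of a branch changing "next-PC" rather than "PC" can be seen in a toy model. This Python sketch is a simplification under stated assumptions: instructions are indexed 0, 1, 2, ... rather than by byte address, only an unconditional branch is modelled, and the tuple format is invented for illustration.

```python
def run(program, max_steps=100):
    """Execute a toy program and return the labels of the ordinary instructions run.

    Each entry is ("op", label), ("ba", target_index) or ("halt",).
    """
    pc, npc = 0, 1
    trace = []
    for _ in range(max_steps):
        if pc >= len(program) or program[pc][0] == "halt":
            break
        instr = program[pc]
        next_npc = npc + 1
        if instr[0] == "ba":
            next_npc = instr[1]      # the branch changes next-PC, not PC
        else:
            trace.append(instr[1])
        pc, npc = npc, next_npc      # the instruction at the old next-PC runs next
    return trace

program = [
    ("op", "before"),
    ("ba", 4),
    ("op", "delay slot"),   # executed even though it follows the branch
    ("op", "skipped"),
    ("op", "target"),
    ("halt",),
]
```

Running `run(program)` yields `["before", "delay slot", "target"]`: the instruction after the branch executes before control reaches the target.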
example program
x = 9, y = ((x - 1)(x - 7) / (x - 11))
...
.global main
main: save
mov 9,
sub
sub
call .mul
nop
sub
call .div
nop
mov
mov 1,
ta 0
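The arithmetic the program performs can be checked by hand or with a short model. In this Python sketch, `dot_mul` and `dot_div` are stand-ins for the .mul and .div routines; rounding the quotient toward zero is an assumption here, and 32-bit overflow is ignored.

```python
def dot_mul(a, b):
    """Stand-in for .mul: product of the two arguments."""
    return a * b

def dot_div(a, b):
    """Stand-in for .div: quotient, assumed to round toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def example(x):
    """y = (x - 1)(x - 7) / (x - 11)"""
    return dot_div(dot_mul(x - 1, x - 7), x - 11)
```

With x = 9 the factors are 8, 2 and -2, so y = 16 / -2 = -8.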