Note on Special Relativity
Prof. Pui Lam
Date: 11/5/2003

Terms and Concepts:


Relativity Theory - One aspect of a relativity theory is to relate the measurements by one “observer” to the corresponding measurements by another observer who is in relative motion with respect to the first observer.

“Observer” - The term “observer” does not refer to an individual but rather to an entire coordinate system (also called a reference frame). You can visualize a coordinate system as a 3-dimensional grid that fills all space, with a clock at each grid site to measure the local time.
Note: It is very important to have a mental picture of many clocks, one at every location. Many misconceptions (paradoxes) in relativity arise from using one clock for the entire reference frame.

Coordinate “transformation” and other transformations - Example: an event (say, a firecracker exploding) took place at (x,y,z) at time t according to one observer, but according to a second observer the same event occurred at (x’,y’,z’) at time t’. If the relative motion between these two observers is known, then one should be able to deduce the values of (t’,x’,y’,z’) given (t,x,y,z), and vice versa. The formula relating these quantities is called the coordinate “transformation”.
(Note: The word “transformation” may be a bad choice because it may cause someone to think of metamorphosis. The event is unchanged! It is the same event; only its representations are different.)

Another example: The first observer measures the E and B fields at a space-time point (t,x,y,z), i.e. E(t,x,y,z) and B(t,x,y,z). The second observer measuring the E and B fields at the same space-time point may obtain different values for the fields, i.e. E’(t’,x’,y’,z’) and B’(t’,x’,y’,z’). The formula relating (E’(t’,x’,y’,z’), B’(t’,x’,y’,z’)) to (E(t,x,y,z),B(t,x,y,z)) is called the E&B-field transformation.

Relativity theory seeks to discover all formulae relating all measurements of one observer to another observer in relative motion.

“Special” Relativity - Einstein’s first relativity theory, which specializes to a subset of observers called inertial observers, who move at constant velocity with respect to each other.

Inertial observer (or inertial reference frame) - The definition of an inertial reference frame is a frame where Newton’s First Law is valid.

An example of an inertial reference frame is a coordinate system attached to a spaceship that is located in outer space, far away from any large masses (so that gravitational forces are negligible), and has its engine off. That is, nothing acts on this inertial reference frame. Another spaceship in the same gravity-free region with its engine off is another inertial frame, but it could be moving at a constant velocity with respect to the first inertial frame. All possible inertial frames in this gravity-free region form a group. Special relativity provides a coordinate transformation formula (called the Lorentz transformation) which can relate the measurements of any of these inertial observers to those of any other inertial observer in this group.
Question: According to the definition of an inertial reference frame, can you think of another example of an inertial reference frame?

If you can’t think of one, don’t worry: there is only one more example. In a region of constant gravitational field, any free-falling observer is an inertial frame (for example, an elevator in free fall after its cable has broken). All free-falling observers in this region form a group. The Lorentz transformation can relate the measurements of any of these inertial observers to those of any other inertial observer in this group.
** However, the Lorentz transformation cannot relate the measurements of the inertial observers in the first group to those of the second group.

** Inertial frames can only be defined “locally”.

General Relativity: Refers to Einstein’s second relativity theory which relates measurements of any observers. General Relativity is also a theory about gravity.

(Aside: According to “General” relativity, gravity can affect space-time measurements; we say space-time is “curved” and Euclidean geometry is not valid. In an inertial frame, space-time is “flat” and Euclidean geometry is valid for spatial measurements. The combined space-time is pseudo-Euclidean.)


Deducing the Coordinate Transformation in Special Relativity:


Statement of the problem: Consider two inertial reference frames, S and S’. S’ is moving at a constant velocity V wrt S or, equivalently, S is moving at a constant velocity -V wrt S’. For simplicity, we let the origins of both frames coincide at t=0=t’. For an event labeled as (t,x,y,z) by the S-frame and the same event labeled as (t’,x’,y’,z’) by the S’-frame, what is the formula that gives (t’,x’,y’,z’) in terms of (t,x,y,z) or vice versa? (That is, find the functions t’=f1(x,y,z,t), x’=f2(x,y,z,t), y’=f3(x,y,z,t), z’=f4(x,y,z,t), and their inverse functions.)

Solution 1: Perform experiments, observe them in two reference frames, and deduce the formula.
Hmm... although this is the ultimate verification, history has shown us that experimental accuracy was not good enough to invalidate the Galilean Transformation from the time of Newton to that of Einstein. Present experiments are accurate enough to show that the Galilean Transformation is incorrect. Therefore we should also seek a theoretical basis for deducing the coordinate transformation.

Solution 2: Theoretical Basis.
How should we proceed?
(1) Can we reason what kind of functions f1, f2, f3, and f4 are? Are they linear functions of the independent variables, or quadratic functions, or something else?
It turns out that we can reason that the functions “must” be linear, based on the definition of inertial frames (i.e. Newton’s First Law must be obeyed in an inertial frame). Newton’s First Law basically says that if a mass is not acted upon by a force (i.e. it has no interaction with any other object), then its velocity vector as viewed in an inertial frame must be a constant vector; hence its position as a function of time must be linear, i.e. x(t), y(t), and z(t) are linear in t. For example, $x(t) = x_0 + v_x t$. The same object as viewed in S’ also has constant velocity, i.e. $x'(t') = x'_0 + v'_x t'$, etc. In order for the object to have constant velocity in both frames, the coordinate transformation must be linear also, i.e.

\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix}
= M \begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix},
\]

where M is a 4x4 matrix whose entries are constants (independent of t, x, y, z). There is no additive constant because the origins of the two frames coincide at t=0=t’.

(Note: If we want to relate the coordinate measurements of an event in a non-inertial frame with those in an inertial frame, the transformation will not be linear.)
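
As a quick numerical illustration of the linearity argument (a sketch, not part of the original derivation): the snippet below applies an arbitrary constant linear map to events along a constant-velocity worldline and checks that the velocity seen in the new coordinates is again constant. The matrix entries and the helper names `transform` and `segment_velocities` are made up for illustration, and only the (t, x) part is considered.

```python
def transform(M, t, x):
    """Apply a 2x2 linear map M to the event (t, x); returns (t', x')."""
    return (M[0][0] * t + M[0][1] * x,
            M[1][0] * t + M[1][1] * x)


def segment_velocities(M, x0, v, times):
    """Velocity dx'/dt' between successive events on the worldline x = x0 + v*t."""
    events = [transform(M, t, x0 + v * t) for t in times]
    return [(x2 - x1) / (t2 - t1)
            for (t1, x1), (t2, x2) in zip(events, events[1:])]


# Arbitrary illustrative entries; any constant matrix works as long as dt' != 0.
M = [[1.25, -0.75],
     [-0.75, 1.25]]
velocities = segment_velocities(M, x0=2.0, v=0.3, times=[0.0, 1.0, 2.5, 4.0])
print(velocities)  # all entries agree up to rounding
```

A map with, say, a term quadratic in t would make these segment velocities differ from one another, which is why non-linear transformations are ruled out for inertial frames.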

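We can simplify the matrix M by assuming that the relative velocity V is along one of the axes, say, the x-direction. We assume that the y and z coordinates are unaffected by the motion of the frame in the x-direction (can we prove this?), and we also take t’ and x’ to be independent of y and z. Labeling the remaining unknown entries $M_{00}$, $M_{01}$, $M_{10}$, $M_{11}$, we hence have

\[
\begin{pmatrix} t' \\ x' \\ y' \\ z' \end{pmatrix}
=
\begin{pmatrix}
M_{00} & M_{01} & 0 & 0 \\
M_{10} & M_{11} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\begin{pmatrix} t \\ x \\ y \\ z \end{pmatrix}.
\]

Furthermore, the equation

\[
x' = M_{10}\, t + M_{11}\, x,
\]

which is true for all x’, must be true for x’=0 (the origin of the S’-frame). This point (x’=0) moves with velocity V as seen in the S-frame, and at t=0 it coincides with x=0; hence this point is labeled by (t, x=Vt), so that

\[
0 = M_{10}\, t + M_{11}\, V t
\quad\Rightarrow\quad
M_{10} = -V M_{11},
\qquad
x' = M_{11}\,(x - V t).
\]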

This is as far as we can go in deducing the form of the transformation matrix based on the definition of an inertial frame and the relative velocity. We need another piece of information.
We need a quantity whose value is the same in both frames (we say that quantity is invariant).

Pre-Einstein: it was assumed that the time interval between two events is an invariant, i.e. $\Delta t' = \Delta t$. In fact, in our everyday life, we still assume this is true when we compare clock readings with our friends. This assumption turns out to be false!

During Einstein’s time, it was verified experimentally that the speed of light is invariant, i.e. the speed of light (c) is the same in both the S and S’ frames.

Using this additional information, one can derive the coordinate transformation (Lorentz transformation) valid in special relativity. The derivation is left as an exercise for the reader. Hint: Consider a point source of light located at the origin of both frames at t=t’=0. Observe the spherical wavefront in both frames, noting that the wavefront speed is c in both frames.
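
For readers who want to check their work on the exercise, here is one possible route, sketched under stated assumptions rather than given as the intended solution. It assumes the transformation has already been reduced to the form x’ = A(x - Vt), t’ = Bt + Cx, y’ = y, z’ = z, where the labels A, B, C are introduced here for illustration only, and it follows three points on the wavefront.

```latex
% Assumed starting point (A, B, C are illustrative labels, not from the notes):
%   x' = A (x - V t),   t' = B t + C x,   y' = y,   z' = z.
\begin{align*}
\text{Light along } +x:\quad & x = ct \;\Rightarrow\; x' = ct'
  \;\Rightarrow\; A\,(c - V) = c\,(B + C c), \\
\text{Light along } -x:\quad & x = -ct \;\Rightarrow\; x' = -ct'
  \;\Rightarrow\; A\,(c + V) = c\,(B - C c), \\
\text{Adding and subtracting:}\quad & B = A, \qquad C = -\frac{A V}{c^2}, \\
\text{Light along } +y:\quad & x = 0,\; y = ct
  \;\Rightarrow\; x' = -A V t,\; y' = ct,\; t' = A t, \\
& x'^2 + y'^2 = c^2 t'^2
  \;\Rightarrow\; A^2 V^2 + c^2 = c^2 A^2
  \;\Rightarrow\; A = \frac{1}{\sqrt{1 - V^2/c^2}}.
\end{align*}
```

The positive root is chosen so that the transformation reduces to the identity when V = 0.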

To summarize:

Einstein showed that the (Lorentz) coordinate transformation can be deduced based on:

(1) Postulate: The speed of light in vacuum is the same for all inertial observers, regardless of the motion of the source. (This postulate has been verified by experiments to within present-day experimental uncertainty.)
(2) The coordinate transformation is a linear transformation. (This can be deduced from the definition of an inertial reference frame.)

There are a couple more implicit assumptions:

(3) Postulate that space is isotropic. (This postulate has been verified by experiments to within present-day experimental uncertainty.)
(4) Euclidean geometry is assumed to be valid in an inertial reference frame.


Universal time and universal light speed are not compatible with each other.
(This can be illustrated by using light signals to synchronize clocks.)



Difference between Galilean Transformation (pre-Einstein) and Lorentz Transformation (post-Einstein)


(1) Lorentz Transformation is derived based on “universal light speed in vacuum”.
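Resulting equations:

\[
t' = \gamma\left(t - \frac{V x}{c^2}\right), \qquad
x' = \gamma\,(x - V t), \qquad
y' = y, \qquad
z' = z, \qquad
\text{where } \gamma = \frac{1}{\sqrt{1 - V^2/c^2}}.
\]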

One important consequence is that c is also the maximum speed possible.
In fact, “universal light speed” <=> “c is the maximum speed possible”.
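
The transformation and the special role of c can be explored numerically. The following is a minimal sketch, assuming the standard form t’ = γ(t - Vx/c²), x’ = γ(x - Vt), y’ = y, z’ = z with γ = 1/sqrt(1 - V²/c²), and units in which c = 1 by default; the function names `gamma` and `lorentz` are chosen here for illustration.

```python
import math


def gamma(V, c=1.0):
    """Lorentz factor; math.sqrt raises ValueError if |V| > c."""
    return 1.0 / math.sqrt(1.0 - (V / c) ** 2)


def lorentz(t, x, y, z, V, c=1.0):
    """Coordinates (t', x', y', z') in S' of the event (t, x, y, z) in S."""
    g = gamma(V, c)
    return (g * (t - V * x / c**2), g * (x - V * t), y, z)


# A light pulse emitted from the origin along +x reaches x = c*t at time t.
t_p, x_p, _, _ = lorentz(2.0, 2.0, 0.0, 0.0, V=0.6)
print(x_p / t_p)  # speed of the same pulse as measured in S'

# For |V| > c the factor gamma is not a real number.
try:
    gamma(1.5)
    superluminal_ok = True
except ValueError:
    superluminal_ok = False
print(superluminal_ok)
```

The pulse has the same speed in both frames, and a relative velocity larger than c makes the square root undefined, which is one way to see c acting as a limiting speed.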

The finiteness of the maximum speed => impossible to synchronize all the clocks in two inertial frames => disagreement in clock readings
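
The disagreement in clock readings can be made concrete with a small sketch. It assumes the Lorentz formula for time intervals, Δt’ = γ(Δt - VΔx/c²), and units with c = 1; the function name `delta_t_prime` is illustrative. Two events that are simultaneous in S (Δt = 0) but at different places are fed in, and the time interval seen from S’ is printed for several relative velocities.

```python
import math


def delta_t_prime(dt, dx, V, c=1.0):
    """Time interval in S' between two events separated by (dt, dx) in S."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (dt - V * dx / c**2)


# Two clocks of S, one unit of length apart, both showing the same reading (dt = 0).
for V in (0.0, 0.1, 0.6):
    print(V, delta_t_prime(dt=0.0, dx=1.0, V=V))
```

Only for V = 0 does S’ agree that the two readings are simultaneous; the mismatch grows with both V and the separation Δx.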

(2) Galilean Transformation is derived based on “universal time or clock readings”
Resulting equations:

\[
t' = t, \qquad
x' = x - V t, \qquad
y' = y, \qquad
z' = z.
\]

Universal time can be achieved (i.e. all the clocks in two inertial frames can be synchronized) if the “maximum” speed is infinite.

In fact, if you let $c \to \infty$, you can recover the Galilean Transform from the Lorentz Transform.
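
This limit can be checked numerically. The sketch below assumes the standard forms of both transformations (Lorentz: t’ = γ(t - Vx/c²), x’ = γ(x - Vt); Galilean: t’ = t, x’ = x - Vt) and treats c as an adjustable parameter; the function names and the sample event are illustrative only.

```python
import math


def lorentz_tx(t, x, V, c):
    """(t', x') from the Lorentz transformation with light speed c."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (t - V * x / c**2), g * (x - V * t)


def galilean_tx(t, x, V):
    """(t', x') from the Galilean transformation."""
    return t, x - V * t


def max_difference(t, x, V, c):
    """Largest coordinate disagreement between the two transformations."""
    tl, xl = lorentz_tx(t, x, V, c)
    tg, xg = galilean_tx(t, x, V)
    return max(abs(tl - tg), abs(xl - xg))


# Same event and same V, with an increasingly large "speed of light".
differences = [max_difference(1.0, 2.0, 0.5, c) for c in (1.0, 10.0, 100.0, 1000.0)]
print(differences)
```

The disagreement shrinks roughly like 1/c², which is why the Galilean transformation works so well in everyday life, where V is tiny compared with c.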

Note: At present, experimental evidence and theory => Lorentz Transform is correct and Galilean Transform is incorrect.


“How to use Lorentz Transformation formula and/or the invariance interval formula to do your homework problems”

In mechanics, all quantities can be expressed in terms of mass, length and time; hence, knowing the Lorentz coordinate transformation and the additional postulate that mass is invariant (m=m’), one can relate all mechanics measurements in one inertial frame with those in another inertial frame.

In most problems, you are asked to find either the spatial interval or the temporal interval between two events. Try not to use popular sayings such as “moving clock runs slow” or “moving meter stick contracts” to do your homework. Both of these statements emphasize either $\Delta t$ alone or $\Delta x$ alone, i.e., they don’t tell the whole story. The interval between two events in one frame depends on both $\Delta t$ and $\Delta x$ in the other frame.

In any problem,
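(1) Identify the two inertial frames. Use S for the “stationary” or “lab” frame; this is usually you! The object moving wrt you defines the “moving” frame; let it be S’.
With this convention, the Lorentz formulae are

\[
\Delta t' = \gamma\left(\Delta t - \frac{V \Delta x}{c^2}\right), \qquad
\Delta x' = \gamma\,(\Delta x - V \Delta t), \qquad
\gamma = \frac{1}{\sqrt{1 - V^2/c^2}},
\]

where $\Delta t$ and $\Delta x$ are the time and space intervals between two events. The corresponding invariance interval formula (for $\Delta y = \Delta z = 0$) is

\[
c^2 \Delta t'^2 - \Delta x'^2 = c^2 \Delta t^2 - \Delta x^2 .
\]

(2) Identify the two events. Use the given information to find the known intervals (for example $\Delta t$ and $\Delta x$), then use the Lorentz transformation to find the unknown ones (for example $\Delta t'$ and $\Delta x'$).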
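
The two steps above can be sketched as a small helper. This is an illustration only: it assumes the interval form of the Lorentz transformation, Δt’ = γ(Δt - VΔx/c²) and Δx’ = γ(Δx - VΔt), and the invariant combination c²Δt² - Δx²; the function names and the sample numbers are made up, and units with c = 1 are used by default.

```python
import math


def transform_interval(dt, dx, V, c=1.0):
    """Intervals (dt', dx') in S' for two events separated by (dt, dx) in S."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (dt - V * dx / c**2), g * (dx - V * dt)


def interval_squared(dt, dx, c=1.0):
    """The combination c^2 dt^2 - dx^2, which takes the same value in S and S'."""
    return (c * dt) ** 2 - dx**2


dt, dx = 3.0, 1.0                      # step (2): intervals found in S
dt_p, dx_p = transform_interval(dt, dx, V=0.6)
print(dt_p, dx_p)
print(interval_squared(dt, dx), interval_squared(dt_p, dx_p))
```

Both Δt’ and Δx’ depend on Δt and Δx together, yet the last line prints the same number twice, which is the content of the invariance interval formula.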

Example 1: “Time dilation” or “Moving clock runs slow”

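(1) You are the “stationary” frame (S). A clock moving wrt you at velocity V is the “moving” frame (S’).

(2) The two events (as observed by you, the S-frame) are:
Event 1: At t=0, the clock’s position is x=0 => (t=0, x=0)
Event 2: At some later time t, the clock’s position is x=Vt => (t, x=Vt)
hence $\Delta t = t$ and $\Delta x = V t$.
Using the Lorentz transformation (with $\gamma = 1/\sqrt{1 - V^2/c^2}$), we get

\[
\Delta t' = \gamma\left(\Delta t - \frac{V \Delta x}{c^2}\right)
          = \gamma\, t \left(1 - \frac{V^2}{c^2}\right)
          = t \sqrt{1 - V^2/c^2},
\qquad
\Delta x' = \gamma\,(\Delta x - V \Delta t) = 0 .
\]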

Instead of saying “moving clock runs slow”, it is better to say that the time interval between two events is the shortest in the frame where the spatial interval is zero.
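
As a worked numerical instance of this example (a sketch with made-up numbers, in units where c = 1): take V = 0.6 and let t = 5 elapse in S. The snippet evaluates Δt’ both from the Lorentz formula Δt’ = γ(Δt - VΔx/c²) and from the invariant combination c²Δt² - Δx² with Δx’ = 0.

```python
import math

c = 1.0   # units with c = 1 (an illustrative choice)
V = 0.6   # speed of the clock relative to S
t = 5.0   # time elapsed in S between the two events

dt, dx = t, V * t                      # intervals in S: the clock moved to x = V*t
g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)

dt_prime = g * (dt - V * dx / c**2)    # Lorentz transformation
dx_prime = g * (dx - V * dt)           # zero: the clock is at rest in S'

# Same answer from the invariant interval, using dx' = 0.
dt_prime_from_interval = math.sqrt((c * dt) ** 2 - dx**2) / c

print(dt_prime, dx_prime, dt_prime_from_interval)
```

The time interval in S’, where both events happen at the same place, comes out smaller than the interval in S.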

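Note: In this problem, we knew 3 quantities right off the bat: $\Delta t$, $\Delta x$, and $\Delta x' = 0$ (the clock is at rest in S’). There is only one unknown, $\Delta t'$; hence we need only one equation, the invariance interval equation.
Furthermore, sometimes it is easier to use the inverse transformation, depending on what is known and what is being asked. In this case, since $\Delta x' = 0$, we could use the inverse transformation

\[
\Delta t = \gamma\left(\Delta t' + \frac{V \Delta x'}{c^2}\right) = \gamma\, \Delta t' ,
\qquad \gamma = \frac{1}{\sqrt{1 - V^2/c^2}} .
\]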

Example 2: Length Contraction

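(1) You are the “stationary” frame (S). A “meter” stick moving wrt you at velocity V is the “moving” frame (S’).

(2) The key here is that in order for you to measure the “length” of this moving stick, you must measure both ends at the same time; hence for you (S-frame), $\Delta t = 0$.
(The two events are the space-time coordinates of the two ends of the stick.) In the S’ frame (the moving frame as far as you are concerned, but the “rest” frame of the meter stick),

\[
\Delta x' = \gamma\,(\Delta x - V \Delta t) = \gamma\, \Delta x
\quad\Rightarrow\quad
\Delta x = \Delta x' \sqrt{1 - V^2/c^2} ,
\]

where $\Delta x'$ is the length of the stick in its rest frame and $\gamma = 1/\sqrt{1 - V^2/c^2}$.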

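We can also use the invariance interval equation to draw some conclusions. With $\Delta t = 0$,

\[
c^2 \Delta t'^2 - \Delta x'^2 = c^2 \Delta t^2 - \Delta x^2 = -\Delta x^2
\quad\Rightarrow\quad
\Delta x^2 = \Delta x'^2 - c^2 \Delta t'^2 \le \Delta x'^2 .
\]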

Instead of saying “moving meter stick contracts”, it is better to say that the spatial interval between two events is the shortest in the frame where the time interval is zero.
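
A numerical instance of this example, again as a sketch with made-up numbers and c = 1: a stick of rest length 1 moves at V = 0.6. The snippet assumes the Lorentz formulae Δt’ = γ(Δt - VΔx/c²) and Δx’ = γ(Δx - VΔt); the function names are illustrative.

```python
import math


def length_in_lab(rest_length, V, c=1.0):
    """Length dx measured in S (both ends at the same t) of a stick at rest in S'."""
    return rest_length * math.sqrt(1.0 - (V / c) ** 2)


def interval_in_stick_frame(dx, V, c=1.0):
    """(dt', dx') for two events with dt = 0 and spatial separation dx in S."""
    g = 1.0 / math.sqrt(1.0 - (V / c) ** 2)
    return g * (0.0 - V * dx / c**2), g * dx


dx = length_in_lab(1.0, V=0.6)
dt_p, dx_p = interval_in_stick_frame(dx, V=0.6)
print(dx, dt_p, dx_p)
```

Note that Δt’ is not zero: the two measurements that are simultaneous for you are not simultaneous in the stick’s rest frame, which is the part of the story that “moving meter stick contracts” leaves out.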

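Velocity Addition formula:

For an object moving with velocity $(u_x, u_y, u_z)$ in S, dividing the Lorentz formulae for $\Delta x'$, $\Delta y'$, $\Delta z'$ by the one for $\Delta t'$ gives its velocity in S’:

\[
u'_x = \frac{u_x - V}{1 - V u_x/c^2}, \qquad
u'_y = \frac{u_y}{\gamma\,(1 - V u_x/c^2)}, \qquad
u'_z = \frac{u_z}{\gamma\,(1 - V u_x/c^2)}, \qquad
\gamma = \frac{1}{\sqrt{1 - V^2/c^2}} .
\]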
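
The x-component of the velocity addition formula can be checked numerically. The sketch below assumes the standard form u’ = (u - V)/(1 - Vu/c²) for the velocity in S’ of an object moving at u along x in S, together with its inverse u = (u’ + V)/(1 + Vu’/c²); the function names are illustrative and c = 1 by default.

```python
def velocity_in_s_prime(u_x, V, c=1.0):
    """x-velocity in S' of an object moving at u_x in S (S' moves at V wrt S)."""
    return (u_x - V) / (1.0 - V * u_x / c**2)


def velocity_in_s(u_x_prime, V, c=1.0):
    """Inverse relation: x-velocity in S of an object moving at u_x_prime in S'."""
    return (u_x_prime + V) / (1.0 + V * u_x_prime / c**2)


# An object moving at 0.5c in S', with S' itself moving at 0.5c wrt S.
print(velocity_in_s(0.5, V=0.5))

# A light signal (u_x = c) as seen from S'.
print(velocity_in_s_prime(1.0, V=0.5))
```

Two sub-light velocities never combine to exceed c, and light moves at c in both frames, consistent with the postulate used to derive the transformation.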




