A rigid body is a system of particles with the constraint that the distance between any pair of particles is fixed. No physical body is entirely rigid, but rigid-body models are good enough for many purposes. There are situations where rigid-body models do not work as we would like; for example, during collisions, the forces generated depend on deformations of the bodies.
Displacements
A displacement is a transformation of a system that maintains the distance between every pair of points in the system. Rigid bodies are therefore systems for which we only allow displacements.
There are two special types of displacements:
A rotation is a displacement that leaves at least one point (called the center of rotation) fixed. Notice that there's no reason that the point has to be a particle in the system; the fixed point could be anywhere. Another thing to notice is that this definition of rotation applies in any number of dimensions.
A translation is a displacement where all points move along parallel lines.
There are displacements in 3D that are neither pure translations nor pure rotations, but it turns out that they are just combinations of translations and rotations. In fact, given a displacement, you can choose an arbitrary point in space, and get that displacement by a rotation that leaves that point fixed, followed by a translation. (Intuition: think about a rigid body in two configurations. Choose a point O. Rotate the body however you like, but keep every point on the rigid body at the same distance from O. You can eventually get the body to be in the same orientation as the goal configuration. Then you'll have to do a translation to get the body into the right place. A formal proof can be found in the Mason text.)
All displacements in the plane are pure rotations
A displacement can be viewed as a translation composed with a rotation about any point. However, if you choose that point carefully in the plane, you can get rid of the translation component.
Here's an experiment to try for yourself. Cut out a "rigid body" from a piece of paper; any shape you like. Mark two different points on the body, A and B. Outline the body on a piece of "background" paper, and mark the two points also. Move the body from one configuration to another — this motion is a displacement. Outline the body in its new configuration on the background paper, and mark the two points in their new configurations, A' and B'. Notice that the center of rotation, if it exists, must be on the perpendicular bisector of the segment AA'; draw this bisector. The center of rotation must also be on the perpendicular bisector of BB'; draw this bisector. The center of rotation must be at the intersection of the two bisectors you just drew.
But what if the two bisectors are parallel, and don't intersect? That's a pure translation. It doesn't appear to leave any point fixed. However, if the motion were almost a pure translation, the lines would intersect, just very far away. As we get closer and closer to translation, that intersection moves towards infinity. In projective geometry, we are allowed to have points at infinity — a translation is therefore a rotation about a point at infinity, in a particular direction.
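The paper experiment above can also be sketched in code. This is only an illustration: the function names are made up for this example, and it assumes that both marked points actually move and that A and B are not collinear with the rotation center (otherwise the two bisectors coincide and the intersection is not unique).

```python
def perpendicular_bisector(p, q):
    """Return (point, direction) describing the perpendicular bisector of segment pq."""
    midpoint = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    dx, dy = q[0] - p[0], q[1] - p[1]
    return midpoint, (-dy, dx)  # the segment direction rotated by 90 degrees


def rotation_center(a, a_new, b, b_new, eps=1e-12):
    """Intersect the bisectors of AA' and BB'.

    Returns None when the bisectors are parallel, which is the
    pure-translation case (rotation center at infinity).
    """
    (mx, my), (ux, uy) = perpendicular_bisector(a, a_new)
    (nx, ny), (vx, vy) = perpendicular_bisector(b, b_new)
    # Solve m + s*u = n + t*v for s using Cramer's rule.
    det = vx * uy - ux * vy
    if abs(det) < eps:
        return None
    rx, ry = nx - mx, ny - my
    s = (vx * ry - rx * vy) / det
    return (mx + s * ux, my + s * uy)


# A quarter turn about (1, 1): A = (2, 1) goes to (1, 2), B = (1, 3) goes to (-1, 1).
center = rotation_center((2, 1), (1, 2), (1, 3), (-1, 1))
```

For a pure translation, such as moving both points one unit to the right, the function returns None, which matches the "rotation about a point at infinity" reading.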
The fact that displacements in the plane are pure rotations is a special case of the fact that
All displacements in space are screws (Chasles' theorem)
A "screw" is a rotation about an axis, combined with a translation parallel to that axis. There is a proof of Chasles' theorem in the Mason book.
Reuleaux's method for analyzing planar rigid-body grasps
The fact that all displacements in the plane are pure rotations allows graphical methods for analysis of rigid body motion. For example, consider grasping a rigid body with n point fingers. Assume that the fingers are frictionless, and apply unilateral forces (they push, but do not pull). How many fingers need to be placed around the boundary of a planar rigid body to immobilize it?
Here is a triangular rigid body being grasped by three fingers (one red, one green, and one blue).

Is the body immobilized? What if we add a finger at the location shown by the dotted circle? First, consider the way a single finger, for example the red finger, restricts the motion of the body. Any displacement can be described by a rotation center, together with a direction and an angle of rotation about that center. So pick a rotation center. Notice that the finger prohibits either clockwise (negative) rotations or counterclockwise (positive) rotations, unless the rotation center is on the line through the finger parallel to the normal.
So we can label each rotation center as allowing positive or negative rotations. The red + and - signs show the allowed rotation for rotation centers at various locations.
Now consider the blue finger. Each rotation center may be similarly positive, or negative. What if a rotation center is positive for the red finger, and negative for the blue? That means that rotation in either direction will cause a collision with at least one finger, so these rotation centers are not possible.
We add each additional finger, labeling rotation centers, and removing regions where the signs conflict. If, at the end, the entire plane of rotation centers has been removed, the object is in an immobilizing grasp.
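Here is a rough sketch of the labeling step for a single candidate rotation center. It is an illustration under assumptions that are not in the notes: each finger is given as a pair (contact point, inward normal), where the inward normal points from the finger into the body, and the test only looks at the instantaneous (first-order) velocity of the contact point. The function names are hypothetical, and rotation centers at infinity (pure translations) are not handled.

```python
def allowed_sign(center, contact, inward_normal, eps=1e-12):
    """Rotation sign about `center` that one frictionless finger permits.

    Returns +1 (counterclockwise only), -1 (clockwise only), or 0 when the
    center lies on the line through the contact along the normal, where
    this first-order test permits both directions.
    """
    rx, ry = contact[0] - center[0], contact[1] - center[1]
    # Velocity of the contact point for a unit counterclockwise rotation.
    vx, vy = -ry, rx
    # The finger pushes but does not pull, so the contact point on the body
    # may move along the inward normal (away from the finger) but not against it.
    s = vx * inward_normal[0] + vy * inward_normal[1]
    if abs(s) < eps:
        return 0
    return 1 if s > 0 else -1


def permitted_rotations(center, fingers):
    """Set of rotation signs about `center` that no finger prohibits."""
    permitted = {1, -1}
    for contact, inward_normal in fingers:
        sign = allowed_sign(center, contact, inward_normal)
        if sign != 0:
            permitted &= {sign}
    return permitted


# Two fingers squeezing a body from below and from above.
fingers = [((0.0, 0.0), (0.0, 1.0)), ((0.0, 2.0), (0.0, -1.0))]
conflict = permitted_rotations((1.0, 1.0), fingers)  # empty set: signs conflict here
```

An empty result means the center has conflicting labels and would be removed from the plane of candidate rotation centers. Deciding whether the whole plane has been removed takes a geometric argument over regions, not just point samples.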
Vectors
Wikipedia has a useful page on vectors. A vector is a direction in space, with an associated magnitude. Vectors have an initial point and a terminal point, but often we do not write down the initial point, and instead just write the vector as the difference between the two points. For example, we would say that the vector from the initial point (2, 3, 1) to the terminal point (10, 10, 10) is (8, 7, 9). We should not forget that the vector has an initial point.
Vectors can be written vertically (a column vector) or horizontally (a row vector). Typically we will write vectors as column vectors, i.e.
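(1) $v = \left[ \begin{array}{c} v_1 \\ v_2 \\ v_3 \end{array} \right]$
Transposition changes a row vector to a column vector, and vice versa. Transposition is denoted by a superscript T:
$v^T = \left[ \begin{array}{ccc} v_1 & v_2 & v_3 \end{array} \right]$.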
We say that a set of vectors is independent if none of the vectors can be expressed as a linear combination of the others.
We will use vectors in many ways; one use is to express position. We label some point in the world the origin, and any other location can be expressed by a vector giving the position of that location with respect to the origin.
Dot products
The dot product of two vectors returns a scalar, and can be computed in two ways. First, you can multiply the corresponding elements of the vectors and add them:
$a \cdot b = a_1 b_1 + a_2 b_2 + a_3 b_3$. The length of a vector is the square root of the dot product of that vector with itself:
$\|a\| = \sqrt{a \cdot a}$.
The second way to compute a dot product is as the cosine of the angle between the two vectors, multiplied by the lengths of the vectors:
$a \cdot b = \|a\| \, \|b\| \cos \theta$. The equivalence of these two formulas defines cosine in higher dimensions. If you want to compute the angle between two vectors, scale the vectors to be unit vectors, and take the arccosine of the dot product.
Dot products are discussed at more length on Wikipedia. You should make sure you understand the geometric interpretation of dot products as a scaled projection of one vector onto another.
There are several different types of notation for the dot product. Sometimes you'll see a dot between the vectors:
$a \cdot b$. Sometimes you'll simply see a row vector multiplied by a column vector:
$a^T b$. Sometimes you'll see angle brackets:
$\langle a, b \rangle$.
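As a small sketch of the two ways of computing a dot product, the following uses plain Python tuples as vectors. The function names are just for this example.

```python
import math


def dot(a, b):
    """Multiply corresponding elements and add them up."""
    return sum(x * y for x, y in zip(a, b))


def length(a):
    """Length is the square root of the dot product of a vector with itself."""
    return math.sqrt(dot(a, a))


def angle_between(a, b):
    """Angle in radians, from dot(a, b) = |a| |b| cos(theta)."""
    cos_theta = dot(a, b) / (length(a) * length(b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


right_angle = angle_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))  # approximately pi / 2
```

Note that `angle_between` divides by the two lengths, so it assumes neither vector is the zero vector.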
Frames
A frame in three dimensions is a collection of three independent vectors with the same initial point. Typically we will consider orthogonal frames, where the dot product of any pair of vectors is zero. We will also use only vectors of unit length.
We can attach a frame to the world, to a rigid body, or to any other portion of the world. For example, to describe a coordinate system on the world, we choose some point and call it the origin, then attach a frame to that point so that the vectors point in convenient directions, which we might call x, y, and z.
We can describe the orientation of a frame using a matrix. For example,
(2) ${}_1F = \left[ \begin{array}{ccc} \sqrt{3}/2 & -0.5 & 0 \\ 0.5 & \sqrt{3}/2 & 0 \\ 0 & 0 & 1 \end{array} \right]$
The columns of the matrix are the three vectors that make up the frame. We would also need to know where the origin of the frame was. Let's say that the origin of the frame ${}_1F$ is (0, 2, 0).
We can attach a frame rigidly to a rigid body; this is called a body-fixed frame. Locations of points on the body can then be expressed relative to a frame. For example, we might talk about a point $p$ with location ${}^1p$ in the frame ${}_1F$. The superscript before the variable indicates that the vector is expressed relative to that frame. The coordinates of that same point would be very different if expressed relative to the world frame: in this case, they are ${}_1F \, {}^1p + (0, 2, 0)^T$.
Sometimes we leave the superscript off, if the position is relative to a world frame, or we have been very careful to say which frame we intend the position to be relative to.
If we attach a frame to a rigid body, the location of the frame, as well as the matrix associated with the frame, determine where every point in the rigid body is. Therefore, when we think about moving rigid bodies, we tend to attach a frame to them, and then perform all computations on that frame, rather than worrying about where the individual particles in the body are. This is one reason rigid-body models are so convenient.
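To make the bookkeeping concrete, here is a sketch that converts a point between the frame from equation (2), with origin (0, 2, 0), and the world frame. The sample point (1, 0, 0) and the function names are made up for illustration, and the inverse conversion assumes the frame is orthogonal with unit-length vectors.

```python
import math

# Columns are the frame's three vectors, written in world coordinates.
F1 = [
    [math.sqrt(3) / 2, -0.5, 0.0],
    [0.5, math.sqrt(3) / 2, 0.0],
    [0.0, 0.0, 1.0],
]
ORIGIN_1 = [0.0, 2.0, 0.0]


def frame_to_world(frame, origin, p_local):
    """World coordinates = weighted sum of the frame vectors, plus the frame origin."""
    return [
        sum(frame[i][j] * p_local[j] for j in range(3)) + origin[i]
        for i in range(3)
    ]


def world_to_frame(frame, origin, p_world):
    """Inverse conversion: subtract the origin, then take dot products with the frame vectors."""
    d = [p_world[i] - origin[i] for i in range(3)]
    return [sum(frame[i][j] * d[i] for i in range(3)) for j in range(3)]


p_world = frame_to_world(F1, ORIGIN_1, [1.0, 0.0, 0.0])  # about (0.866, 2.5, 0)
```

Only the frame (matrix plus origin) has to be updated when the body moves; every body-fixed point can then be recovered with `frame_to_world`.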
Cross products and Handedness
We'll be using frames as coordinate systems, and there is one more convention to know about: handedness. If you point the fingers of your right hand along the first vector, and curl your fingers towards the second vector, your thumb will point either directly along the third vector of an orthogonal frame, or directly opposite to the third vector. We say the frame, or coordinate system, is "right-handed" if your thumb points along the third vector. Unless otherwise specified, we will use right-handed frames.
If you've taken linear algebra or a basic physics class, you've seen the cross product. The cross product is an operation that takes two vectors in $\mathbb{R}^3$ and returns a third vector. The direction of the new vector is perpendicular to each of the first two vectors, and the magnitude of the new vector is equal to the product of the magnitudes of the first two vectors, times the sine of the angle between those vectors. But there are two directions perpendicular to the first two vectors. The cross product is defined so that if you curl the fingers of your right hand from the first vector to the second, your thumb points in the direction of the computed vector.
Notice that the handedness of cross products means that $a \times b = -(b \times a)$; a minus sign is introduced if you swap the order of the vectors. Formally, this property is called "skew-symmetry".
Given the first two vectors of a right-handed orthogonal frame, you can construct the third vector by taking the cross product. Maybe you don't remember how to compute the cross product. One option is to look up the formula on Wikipedia. But there's also a clever trick to remember how. In physics the coordinate vectors are sometimes called i, j, and k. So the sum $a_1 \mathbf{i} + a_2 \mathbf{j} + a_3 \mathbf{k}$ is an alternate way of writing the vector $(a_1, a_2, a_3)$. It turns out that the cross product of two vectors can be written as the determinant of a matrix in which i, j, and k are the 'elements' of the first row. (It may seem odd to have coordinate directions be elements of a matrix, and to compute a determinant that is not a scalar. This is called an "abuse of notation".)
$a \times b = \mathrm{Det} \left[ \begin{array}{ccc} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{array} \right]$
Of course, you might not remember how to compute the determinant of a 3x3 matrix. Fortunately, Wikipedia does.
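If you expand that determinant along its first row, you get one component per coordinate direction. A minimal sketch, with a function name chosen for this example:

```python
def cross(a, b):
    """Cross product of two 3-vectors, from expanding the determinant along the i, j, k row."""
    return (
        a[1] * b[2] - a[2] * b[1],  # coefficient of i
        a[2] * b[0] - a[0] * b[2],  # coefficient of j
        a[0] * b[1] - a[1] * b[0],  # coefficient of k
    )


x_axis, y_axis = (1, 0, 0), (0, 1, 0)
z_axis = cross(x_axis, y_axis)   # (0, 0, 1): third vector of a right-handed frame
flipped = cross(y_axis, x_axis)  # (0, 0, -1): swapping the order introduces a minus sign
```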
Geometry of matrix multiplication
You remember how to multiply matrices by vectors and matrices by matrices. There are useful geometric interpretations of matrix multiplication. A matrix is a rectangle containing numbers.
Matrix multiplication as a weighted sum of column vectors
We could also think of a matrix as being a row vector whose elements are column vectors. That is,
(4) $Ax = \left[ \begin{array}{ccc} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{array} \right] x = a_1 x_1 + a_2 x_2 + a_3 x_3$
where $a_1$, $a_2$, and $a_3$ are the columns of A. The matrix-vector product scales each of these vectors by an element of $x$ and takes the sum. So, matrix multiplication can be seen as taking a weighted sum of the column vectors of the matrix.
Matrix multiplication as a projection onto row vectors
A matrix could also be viewed as a column vector whose elements are row vectors. If $b_1$, $b_2$, and $b_3$ are the rows of the matrix B, then
(5) $Bx = \left[ \begin{array}{c} b_1 x \\ b_2 x \\ b_3 x \end{array} \right]$
That is, the elements of the resulting vector are the projection of the original vector onto each of the rows of the matrix. We'll come back to this idea in the future when we study robot grasping.
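The column view and the row view give the same product. Here is a small sketch that computes a matrix-vector product both ways, with matrices stored as lists of rows. The function names and the sample numbers are made up for illustration.

```python
def matvec_by_columns(A, x):
    """Weighted sum of the columns of A, with weights taken from x."""
    result = [0.0] * len(A)
    for j, x_j in enumerate(x):
        for i in range(len(A)):
            result[i] += A[i][j] * x_j  # add x_j times column j
    return result


def matvec_by_rows(A, x):
    """Each output element is the dot product of one row of A with x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
x = [1, 0, -1]
by_columns = matvec_by_columns(A, x)
by_rows = matvec_by_rows(A, x)
```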
Representing displacements
Displacements are combinations of rotations and translations, pure rotations, or pure translations. We would like to represent displacements mathematically, so that we can build algorithms that generate or interpret rigid body motions.
There are three things we want to do:
- Represent (describe) displacements
- Transform a rigid body system using a displacement
- Given a point in one coordinate system, find its coordinates relative to a different frame
How we do this depends somewhat on how we represent rigid bodies. For now, assume the rigid body system is described as a collection of points in the plane or in space, with a 2-vector or 3-vector giving the Cartesian location of each point relative to some frame.
Translation is easy to represent: We just use a vector describing the magnitude of the components of the translation along the axes of some frame.
Transformation by a translation is then implemented by vector addition of each point in the system and the translation vector. The inverse of a translation is represented by the vector multiplied by negative one.
Rotation matrices
Rotations leave one point fixed. Since any displacement can be viewed as a rotation about some arbitrary point, combined with a translation, it follows that a rotation about any point can be viewed as a translation that moves that point to the origin, followed by a rotation about the origin, followed by the inverse of the first translation. For this reason, we will only consider rotations about the origin.
Two dimensions
Consider a rotation in two dimensions about the origin by the angle $\theta$. We could represent this rotation just by a single number, the value of $\theta$.
How can we then implement transformation by rotation? Probably you've seen a formula that looks like this:
(6) $p' = Rp$
where
(7) $R = \left[ \begin{array}{cc} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{array} \right].$
Why should this be the right formula? Let's try a few points. First, take the point $(1, 0)$. We multiply through and get $(\cos \theta, \sin \theta)$, which is geometrically what we expect. Then, take the point $(0, 1)$. Multiplication gives you $(-\sin \theta, \cos \theta)$, also what we expect.
This is for me the easiest way to remember the correct elements in a rotation matrix — otherwise I get very confused about which signs, sines, and cosines go where.
This interpretation of the rotation matrix (the columns are the coordinates of the transformed world frame) also lets us see a few important properties of rotation matrices:
- The columns are of unit length (take the dot product of a column with itself. It's one.)
- The columns are orthogonal (take the dot product of two columns. It's zero.)
- The determinant is 1.
There is a geometric interpretation of the determinant of a matrix — it describes how much the matrix "stretches" points away from each other. We won't go into the details, but a rotation preserves areas and volumes of rigid bodies (not surprising!), and thus has determinant one. There is another type of matrix that has orthogonal columns of unit length but determinant -1 — these are called reflections.
The set of all possible 2D rotation matrices has a name: the "special orthogonal group 2", abbreviated SO(2). The "special" means determinant 1 (not -1), and the orthogonal means the columns are orthogonal. You just have to remember that these matrices also have columns of unit length.
Orthogonal matrices have an interesting property: their transpose and their inverse are the same. So to invert a rotation, just transpose the matrix!
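These properties are easy to check numerically. The sketch below builds a 2D rotation matrix, rotates a point, and undoes the rotation with the transpose. The helper names are made up for this example.

```python
import math


def rot2(theta):
    """2D rotation matrix; its columns are the rotated x and y axes."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply2(R, p):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [R[0][0] * p[0] + R[0][1] * p[1], R[1][0] * p[0] + R[1][1] * p[1]]


def transpose2(R):
    return [[R[0][0], R[1][0]], [R[0][1], R[1][1]]]


def det2(R):
    return R[0][0] * R[1][1] - R[0][1] * R[1][0]


R = rot2(0.7)
p = [2.0, -1.0]
rotated = apply2(R, p)
restored = apply2(transpose2(R), rotated)  # transpose acts as the inverse
determinant = det2(R)                      # 1, up to rounding
```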
Three dimensions
Who cares about special orthogonal matrices? Why not just use the angle $\theta$ to represent rotations? In 2D, that's perfectly fine, but in 3D, which $\theta$ do you use?
Euler angles
You could pick three angles, describing, for example, the rotation about the X axis by angle $\alpha$, followed by a rotation about the Z axis by angle $\beta$, followed by a new rotation about the X axis by angle $\gamma$, as shown on this Wikipedia page.
This representation is called an "XZX" Euler angle representation. It's ok in some cases, but imagine a simple rotation about the X axis by $\theta$. The rotation about the Z axis is zero. There are infinitely many choices of $\alpha$ and $\gamma$ that work; they just have to sum to $\theta$!
This can be problematic. For example, if you just sample the three angles uniformly, you might hope to get a "uniform" distribution of rotations. What does a uniform distribution of rotations mean? I'm not sure, but this isn't it — a lot of the rotations you get will tend to be rotations about, or nearly about, the X axis.
The choice of XZX axes was arbitrary. There are also ZYZ angles, XYZ angles, and other conventions.
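The ambiguity is easy to see numerically. The sketch below composes three axis rotations into a matrix and shows that two different XZX angle triples give the same rotation. The order of multiplication (later rotations multiply on the left) is an assumption made for this example; conventions differ, but with a zero Z angle the result is the same either way. The function names are made up.

```python
import math


def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_xzx(alpha, beta, gamma):
    """X by alpha, then Z by beta, then X by gamma (assumed order: later rotations on the left)."""
    return matmul(rot_x(gamma), matmul(rot_z(beta), rot_x(alpha)))


# Different angle triples, same rotation about X by 0.7 radians.
R_a = euler_xzx(0.3, 0.0, 0.4)
R_b = euler_xzx(0.5, 0.0, 0.2)
```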
SO(3) rotation matrices
Some things are simplified if we use a matrix (with nine numbers) instead of the three Euler angles to describe rotations. The columns of the matrix represent the location of each of the axes of the base frame after the rotation, so the geometric interpretation seems to me to be much much simpler than for Euler angles. There are constraints on these nine numbers: the columns are orthogonal, of unit length, and the determinant is one.
The name for the space of rotation matrices is SO(3): the special orthogonal group 3. If we remember the interpretation of columns as transformations of the vectors along the axes of the base frame, it's easy to write down several special cases of rotation matrices. For example,
$R = \left[ \begin{array}{ccc} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{array} \right]$
is the rotation matrix describing rotation around the $z$ axis by the angle $\theta$, and the columns are similar to those for the 2D rotation matrix. (2D rotations are always around the z-axis.)
There's also a convention that we need to keep track of. Given an axis, do we rotate around it clockwise or counterclockwise? We will use right-handed rotation matrices. Point your right thumb along the axis you want to rotate around; your fingers curl in the positive direction.
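As a quick check of the right-handed convention, the sketch below rotates about the z axis by a quarter turn and reads off where the x axis ends up. It also confirms that the rotated frame is still right-handed, in the sense that the third column is the cross product of the first two. The helper names are made up for this example.

```python
import math


def rot_z(theta):
    """Right-handed rotation about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def column(R, j):
    """Column j of R: where base axis j ends up after the rotation."""
    return [R[i][j] for i in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


R = rot_z(math.pi / 2)
x_after = column(R, 0)  # approximately (0, 1, 0): x swings toward y
third = cross(column(R, 0), column(R, 1))  # matches column(R, 2)
```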
Homogeneous coordinates
Every displacement is a composition of a rotation about any point you choose, and a translation. Let's say that you use a rotation matrix R to represent rotations, and a vector v to represent translations. Then a displacement transformation can be implemented with:
(9) $p' = Rp + v$
The inverse is given by
(10) $p = R^T (p' - v)$
There's nothing wrong with that, but it would be nice to represent a displacement by just a matrix, and implement the transformation by a simple multiplication. There's a trick to this. We represent the points in the rigid body using *four* numbers, p = (x, y, z, w). For now let w = 1. Then we can write a 4x4 displacement matrix that looks like:
(11) $T = \left[ \begin{array}{cccc} & R & & v \\ 0 & 0 & 0 & 1 \end{array} \right].$
This is called a "homogeneous coordinate system". For displacements, w is always 1, but it can take on other values. The physical location of p is interpreted as (x/w, y/w, z/w). Why do we care? Well, if you set w = 0, then you can describe a point at infinity, in a particular direction. Why would you want to? Well, a translation can be viewed as a rotation about a point at infinity.
Homogeneous coordinates also show up in computer graphics, and are useful in studying the geometry of projections (since you can scale along a vector by just changing the parameter w.) For now we will not worry too much about projective geometry, but we will use homogeneous transform matrices to represent displacements.
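Here is a sketch of how a 4x4 homogeneous transform might be built, applied, and inverted, using lists of rows. The function names are made up for this example, and the inverse relies on R being a rotation matrix, so that its transpose is its inverse.

```python
def make_transform(R, v):
    """Stack a 3x3 rotation R and a translation v into a 4x4 matrix."""
    return [list(R[i]) + [v[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def apply_transform(T, p):
    """Multiply a 4x4 matrix by a homogeneous point (x, y, z, w)."""
    return [sum(T[i][j] * p[j] for j in range(4)) for i in range(4)]


def invert_transform(T):
    """Inverse displacement: rotation R^T, translation -R^T v."""
    Rt = [[T[j][i] for j in range(3)] for i in range(3)]
    v = [T[i][3] for i in range(3)]
    w = [-sum(Rt[i][j] * v[j] for j in range(3)) for i in range(3)]
    return make_transform(Rt, w)


# Quarter turn about z, then translate by (1, 2, 3).
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
T = make_transform(R, [1, 2, 3])
moved = apply_transform(T, [1, 0, 0, 1])      # a point: rotated, then translated
direction = apply_transform(T, [1, 0, 0, 0])  # w = 0: rotated only, translation ignored
back = apply_transform(invert_transform(T), moved)
```

The w = 0 case is a convenient way to transform directions: the translation column is multiplied by zero and drops out.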
Other ways to represent rotations: Axis-angle, Quaternions
You could describe a rotation by three numbers that describe a rotation axis, together with an angle by which to rotate. A variation of this is to use the length of the rotation axis vector to give the magnitude of the rotation.
Quaternions are another way of representing rotations using four numbers. They have some nice properties, particularly in terms of uniform sampling and stability of numerical computations, and are used very widely for implementations of robotics and graphics algorithms. I'd recommend you read up on them on your own.
Converting between rotation representations
Sometimes it's necessary to convert between rotation representations. For example, Rodrigues' formula lets you convert from axis-angle to rotation matrix. It's easy to convert from an Euler-angle representation to a rotation matrix: just write down three matrices, one for each coordinate axis rotation, and multiply.
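As an illustration of the axis-angle conversion, here is a sketch of Rodrigues' formula in its matrix form, $R = I + \sin\theta \, K + (1 - \cos\theta) K^2$, where $K$ is the skew-symmetric cross-product matrix of the unit rotation axis. The function names are made up for this example, and the axis is assumed to be nonzero.

```python
import math


def matmul(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rodrigues(axis, theta):
    """Rotation matrix for a right-handed rotation by theta about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)  # normalize; assumes a nonzero axis
    # K times a vector equals the cross product of the unit axis with that vector.
    K = [[0.0, -kz, ky], [kz, 0.0, -kx], [-ky, kx, 0.0]]
    K2 = matmul(K, K)
    s, one_minus_c = math.sin(theta), 1.0 - math.cos(theta)
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + one_minus_c * K2[i][j]
             for j in range(3)] for i in range(3)]


R = rodrigues((0.0, 0.0, 2.0), 0.5)  # same as a rotation about z by 0.5 radians
```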





