15.1 General Procedure

Figure 15.1: Illustration of a change of basis in two dimensions. (a) The standard basis ${\hat\imath},{\hat\jmath}$, a new basis $\vec p_1,\vec p_2$, and an arbitrary vector $\vec v$. (b) Any vector $\vec v$ can be written as $\vec v = v_1{\hat\imath}+v_2{\hat\jmath}$. (c) However, the same vector can also be written as $\vec v = v_1'\vec p_1+v_2'\vec p_2$, which uses the new basis.
\begin{figure}
\centering
\setlength{\unitlength}{1pt}
\begin{picture}(...
...t(92.9,118.2){\makebox(0,0)[l]{$v_2'\vec p_2$}}
\end{picture}
\end{figure}

Normally, you use Cartesian coordinates when dealing with vectors. That is based on “Cartesian basis vectors” ${\hat\imath},{\hat\jmath},{\hat k},\ldots$. For example, in two dimensions,

\begin{displaymath}
{\hat\imath}= \left( \begin{array}{c} 1 \\ 0 \end{array} \right)
\qquad
{\hat\jmath}= \left( \begin{array}{c} 0 \\ 1 \end{array} \right)
\end{displaymath}

These Cartesian basis vectors are shown in red in figure 15.1(a). Using these two basis vectors, you can write any vector $\vec v$ in the form

\begin{displaymath}
\vec v = v_1 {\hat\imath}+ v_2 {\hat\jmath}
\end{displaymath}

This is illustrated graphically in figure 15.1(b). Note how, starting from the origin, if you first move over $v_1{\hat\imath}$ and then over $v_2{\hat\jmath}$, you reach the end point of vector $\vec v$. Now $v_1$ and $v_2$ are ordinary numbers that are called the “components,” or “coefficients,” or “coordinates” of vector $\vec v$.

Therefore, in this way, you never have to deal with more vectors than ${\hat\imath}$ and ${\hat\jmath}$. All the rest is just ordinary numbers.

But sometimes it is convenient to use a different basis than the obvious ${\hat\imath},{\hat\jmath}$ one. For example, you might know that in dealing with plane stresses, it is often convenient to rotate the coordinate system to the principal axes. In principal axes, there are no shear stresses, just normal stresses. But if you rotate the coordinate system, ${\hat\imath}$ and ${\hat\jmath}$ become different vectors, call them ${\hat\imath}'$ and ${\hat\jmath}'$. The point however is that in using these new basis vectors ${\hat\imath}'$ and ${\hat\jmath}'$, your physical problem has simplified.

As we will see later, the new basis vectors that simplify a problem are not always orthonormal (orthogonal and of length 1) like ${\hat\imath}'$ and ${\hat\jmath}'$ in the example above. In general, the new basis vectors, which we will call $\vec p_1$ and $\vec p_2$, can be anything, as long as they are linearly independent. As long as that is true, you can still write any arbitrary vector $\vec v$ as

\begin{displaymath}
\vec v = v_1' \vec p_1 + v_2' \vec p_2
\end{displaymath}

where $v_1'$ and $v_2'$ are still ordinary numbers. They are called the coordinates of $\vec v$ in the new basis $\vec p_1,\vec p_2$. This is illustrated graphically in figure 15.1(c).

However, in general the coordinates $v_1'$ and $v_2'$ in the new basis are not the same as the coordinates $v_1$ and $v_2$ in the old basis ${\hat\imath},{\hat\jmath}$. So, if you want to use the new basis to your advantage, you will normally have to know how to compute $v_1'$ and $v_2'$ if you know $v_1$ and $v_2$, and/or vice versa. That is the problem that this section will address.

First, to find the old coordinates $v_1$ and $v_2$ given the new ones $v_1'$ and $v_2'$ is easy. Just write the above equation as a row-column multiplication:

\begin{displaymath}
\vec v
= \Big(  \vec p_1 \; \vec p_2  \Big)
\left( \begin{array}{c} v_1' \\ v_2' \end{array} \right)
\end{displaymath}

Then write that out in terms of the old Cartesian coordinates:

\begin{displaymath}
\left( \begin{array}{c} v_1 \\ v_2 \end{array} \right)
= \Big(  \vec p_1 \; \vec p_2  \Big)
\left( \begin{array}{c} v_1' \\ v_2' \end{array} \right)
\end{displaymath}

or more concisely,

\begin{displaymath}
\vec v
= \Big(  \vec p_1 \; \vec p_2  \Big)
\vec v^{ \prime}
\end{displaymath}

where a prime on a vector means that its coordinates are written out in terms of the new basis, and no prime means they are written out in terms of the old basis. According to the above, to get the old coordinates from the new ones, just put $\vec p_1$ and $\vec p_2$ in a matrix, call it $P$,
\begin{displaymath}
\fbox{$\displaystyle
P = \Big(  \vec p_1 \; \vec p_2  \Big)
$}
\end{displaymath} (15.1)

and multiply by that matrix. Do note that the $\vec p_1$ and $\vec p_2$ that you put in $P$ must be written out in terms of the old Cartesian coordinates. But what else could you do?
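
As a quick numerical check, take an illustrative basis whose vectors have the Cartesian components $(2,1)$ for $\vec p_1$ and $(1,1)$ for $\vec p_2$. These numbers are made up for this example and are not read off from figure 15.1. Suppose further that the new coordinates are $v_1'=1$ and $v_2'=2$. Putting the basis vectors in as columns,

\begin{displaymath}
P = \left( \begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array} \right)
\qquad
\left( \begin{array}{c} v_1 \\ v_2 \end{array} \right)
= \left( \begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array} \right)
\left( \begin{array}{c} 1 \\ 2 \end{array} \right)
= \left( \begin{array}{c} 4 \\ 3 \end{array} \right)
\end{displaymath}

That is the same result you get by adding $1\,\vec p_1 + 2\,\vec p_2$ directly.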

So you get the following transformation formulae between coordinates:

\begin{displaymath}
\fbox{$\displaystyle
\vec v = P \vec v^{ \prime} \qquad \vec v^{ \prime} = P^{-1} \vec v
$}
\end{displaymath} (15.2)

with $P$ as above. Note that while $P$ computes the old coordinates from the new ones, it is called “the transformation matrix from old to new”. It does not make any sense, but that is what mathematicians call it. Just remember, old to new really means old from new.
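
To see both formulae at work, assume for illustration that $\vec p_1$ has Cartesian components $(2,1)$ and $\vec p_2$ has Cartesian components $(1,1)$. These values are chosen only for this example. Then

\begin{displaymath}
P = \left( \begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array} \right)
\qquad
P^{-1} = \left( \begin{array}{cc} 1 & -1 \\ -1 & 2 \end{array} \right)
\end{displaymath}

and a vector with old coordinates $v_1=4$, $v_2=3$ has new coordinates

\begin{displaymath}
\vec v^{ \prime} = P^{-1} \vec v
= \left( \begin{array}{cc} 1 & -1 \\ -1 & 2 \end{array} \right)
\left( \begin{array}{c} 4 \\ 3 \end{array} \right)
= \left( \begin{array}{c} 1 \\ 2 \end{array} \right)
\end{displaymath}

Multiplying back, $P\vec v^{ \prime}$ indeed returns the old coordinates $(4,3)$, since $1\,\vec p_1 + 2\,\vec p_2$ has components $(2+2,\,1+2)$.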

The final thing you need to know is what happens to matrices. If a matrix $A$ converts a vector $\vec v$ to a vector $\vec w$ in terms of Cartesian coordinates, then $A'$ should convert $\vec v^{ \prime}$ to $\vec w^{ \prime}$ in terms of the new coordinates. Since

\begin{displaymath}
A \vec v = \vec w
\quad \Longrightarrow \quad
A P \vec v^{ \prime} = P \vec w^{ \prime}
\quad \Longrightarrow \quad
P^{-1} A P \vec v^{ \prime} = \vec w^{ \prime}
\end{displaymath}

the desired matrix $A'$ is seen to equal $P^{-1}AP$.

So the transformation rules for matrices are

\begin{displaymath}
\fbox{$\displaystyle
A = P A' P^{-1} \qquad A' = P^{-1} A P
$}
\end{displaymath} (15.3)

These are much like the ones for vectors, except that there is an additional trailing $P^{-1}$, respectively $P$.
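
A small worked example may help here. The matrix $A$ and the basis below are assumptions picked for illustration, so that the arithmetic stays in whole numbers. Take $\vec p_1$ with Cartesian components $(2,1)$, $\vec p_2$ with Cartesian components $(1,1)$, and

\begin{displaymath}
A = \left( \begin{array}{cc} 5 & -4 \\ 2 & -1 \end{array} \right)
\qquad
P = \left( \begin{array}{cc} 2 & 1 \\ 1 & 1 \end{array} \right)
\qquad
P^{-1} = \left( \begin{array}{cc} 1 & -1 \\ -1 & 2 \end{array} \right)
\end{displaymath}

Then

\begin{displaymath}
A P = \left( \begin{array}{cc} 6 & 1 \\ 3 & 1 \end{array} \right)
\qquad
A' = P^{-1} A P
= \left( \begin{array}{cc} 1 & -1 \\ -1 & 2 \end{array} \right)
\left( \begin{array}{cc} 6 & 1 \\ 3 & 1 \end{array} \right)
= \left( \begin{array}{cc} 3 & 0 \\ 0 & 1 \end{array} \right)
\end{displaymath}

For this particular choice, $A'$ comes out diagonal, which is the kind of simplification a well-chosen basis can give. As a check, the vector with old coordinates $(4,3)$ has new coordinates $(1,2)$; $A$ maps it to $(8,5)$ in old coordinates, which is $(3,2)$ in new coordinates, and that is exactly what $A'$ applied to $(1,2)$ gives.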

While we used a two-dimensional example, you can do all of the above in any number of dimensions. You just add more basis vectors to the transformation matrix $P$.
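
As a sketch of what that looks like, assume $n$ linearly independent basis vectors $\vec p_1,\vec p_2,\ldots,\vec p_n$, each written out in the old Cartesian coordinates. The symbol $n$ and the numbering of the vectors are notation introduced here. Then

\begin{displaymath}
P = \Big(  \vec p_1 \; \vec p_2 \; \ldots \; \vec p_n  \Big)
\qquad
\vec v = P \vec v^{ \prime}
= v_1' \vec p_1 + v_2' \vec p_2 + \ldots + v_n' \vec p_n
\end{displaymath}

where $P$ is now an $n\times n$ matrix. The formulae $\vec v^{ \prime} = P^{-1} \vec v$ and $A' = P^{-1} A P$ keep the same form.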