Matrices, Scalars, Vectors and Vector Calculus 1

Let us imagine that we have a system of coordinates {S} and a system of coordinates {S'} that is rotated relative to {S}. Let us consider a point {P} that has coordinates {(x_1,x_2,x_3)} in {S} and coordinates {(x'_1,x'_2,x'_3)} in {S'}.

In general each primed coordinate is a function of the three unprimed ones: {x'_1=x'_1(x_1,x_2,x_3)}, {x'_2=x'_2(x_1,x_2,x_3)} and {x'_3=x'_3(x_1,x_2,x_3)}.

Since the transformation from {S} to {S'} is just a rotation we can assume that the transformation is linear. Hence we can write explicitly

{\begin{aligned} x'_1 &= \lambda _{11}x_1+ \lambda _{12}x_2 +\lambda _{13}x_3 \\ x'_2 &= \lambda _{21}x_1+ \lambda _{22}x_2 +\lambda _{23}x_3 \\ x'_3 &= \lambda _{31}x_1+ \lambda _{32}x_2 +\lambda _{33}x_3 \end{aligned}}

A more compact way to write the three previous equations is:

\displaystyle x'_i=\sum_{j=1}^3 \lambda_{ij}x_j

In case you don’t see how the previous equation is a more compact way of writing the first equations I’ll just lay out the {i=1} case.

\displaystyle x'_1=\sum_{j=1}^3 \lambda_{1j}x_j

Now all that we have to do is to sum from {j=1} to {j=3} and we get the first equation. For the other two a similar reasoning applies.

If we want to make a transformation from {S'} to {S} the inverse transformation is

\displaystyle x_i=\sum_{j=1}^3 \lambda_{ji}x'_j

The previous notation suggests that the coefficients {\lambda_{ij}} can be arranged in the form of a matrix:

\displaystyle \lambda= \left(\begin{array}{ccc} \lambda_{11} & \lambda_{12} & \lambda_{13} \\ \lambda_{21} & \lambda_{22} & \lambda_{23} \\ \lambda_{31} & \lambda_{32} & \lambda_{33} \end{array} \right)

In the literature the previous matrix has the name of rotation matrix or transformation matrix.
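To make the direct and inverse transformations concrete, here is a minimal Python sketch. It assumes a specific rotation that the text above does not single out: the axes of {S'} are turned by an angle `theta` about the {x_3} axis. The function names are made up for this illustration.

```python
import math

def rotation_about_x3(theta):
    """lambda_ij for axes rotated by the angle theta about the x_3 axis (assumed example)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def to_primed(lam, x):
    """Direct transformation: x'_i = sum_j lambda_ij x_j."""
    return [sum(lam[i][j] * x[j] for j in range(3)) for i in range(3)]

def to_unprimed(lam, x_primed):
    """Inverse transformation: x_i = sum_j lambda_ji x'_j."""
    return [sum(lam[j][i] * x_primed[j] for j in range(3)) for i in range(3)]

lam = rotation_about_x3(math.pi / 6)
x = [1.0, 2.0, 3.0]
x_primed = to_primed(lam, x)          # coordinates of P in S'
x_back = to_unprimed(lam, x_primed)   # should reproduce x up to rounding
```

Applying the inverse transformation to `x_primed` gives back the original coordinates, and the {x_3} coordinate is untouched because the rotation is about that axis.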

— 1. Properties of the rotation matrix —

For the transformation {x'_i=x'_i(x_j)} the coefficients satisfy

\displaystyle  \sum_j \lambda_{ij}\lambda_{kj}=\delta_{ik}

Where {\delta_{ik}} is a symbol known as the Kronecker delta and its definition is

\displaystyle  \delta_{ik}=\begin{cases} 0 \quad i\neq k\\ 1 \quad i=k \end{cases}

For the inverse transformation {x_i=x_i(x'_j)} the corresponding relation is

\displaystyle  \sum_i \lambda_{ij}\lambda_{ik}=\delta_{jk}

The previous relationships are called orthogonality relationships.
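Both orthogonality relationships can be checked numerically. The sketch below assumes, purely as an example, a rotation by `theta = 0.7` radians about the {x_3} axis; any other rotation matrix should pass the same checks.

```python
import math

def kronecker_delta(i, k):
    """1 when the two indices are equal, 0 otherwise."""
    return 1 if i == k else 0

def rows_orthonormal(lam, tol=1e-12):
    """Checks sum_j lambda_ij lambda_kj = delta_ik for every i, k."""
    return all(
        abs(sum(lam[i][j] * lam[k][j] for j in range(3)) - kronecker_delta(i, k)) < tol
        for i in range(3) for k in range(3)
    )

def columns_orthonormal(lam, tol=1e-12):
    """Checks sum_i lambda_ij lambda_ik = delta_jk for every j, k."""
    return all(
        abs(sum(lam[i][j] * lam[i][k] for i in range(3)) - kronecker_delta(j, k)) < tol
        for j in range(3) for k in range(3)
    )

theta = 0.7  # assumed angle, in radians
c, s = math.cos(theta), math.sin(theta)
lam = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

rows_ok = rows_orthonormal(lam)
cols_ok = columns_orthonormal(lam)
```

A matrix that is not a rotation, for example one that stretches an axis, fails these checks.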

— 2. Matrix operations, definitions and properties —

Let us represent the coordinates of a point {P} by a column vector

\displaystyle  x = \left(\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array}\right)

Using the usual notation of linear algebra we can write the transformation equations as {x'=\mathbf{\lambda} x}.

Here the matrix product {\mathbf{AB}=\mathbf{C}} is defined only when the number of columns of {\mathbf{A}} is equal to the number of rows of {\mathbf{B}}.

A specific element of the matrix {\mathbf{C}}, which we denote by {\mathbf{C}_{ij}}, is calculated as

\displaystyle  \mathbf{C}_{ij}=[\mathbf{AB}]_{ij}=\sum_k A_{ik}B_{kj}

Given the definition of a matrix product it should be clear that in general one has {\mathbf{AB} \neq \mathbf{BA}}

As an example let us look at:

\displaystyle \mathbf{A}=\left( \begin{array}{cc} 2 & 1\\ -1 & 3 \end{array}\right) ;\quad \mathbf{B}=\left( \begin{array}{cc} -1 & 2\\ 4 & -2 \end{array}\right)


\displaystyle  \mathbf{AB}=\left( \begin{array}{cc} 2\times (-1)+1\times 4 & 2\times 2+1\times (-2)\\ -1\times (-1)+3\times 4 & -1\times 2+3\times (-2) \end{array}\right)=\left( \begin{array}{cc} 2 & 2\\ 13 & -8 \end{array}\right)


\displaystyle  \mathbf{BA}=\left( \begin{array}{cc} -4 & 5\\ 10 & -2 \end{array}\right)
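The same example can be reproduced with a few lines of Python. This is only a sketch of the element formula {C_{ij}=\sum_k A_{ik}B_{kj}} using nested lists; the helper name `matmul` is ours.

```python
def matmul(a, b):
    """Matrix product C_ij = sum_k A_ik B_kj for matrices stored as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("number of columns of a must equal number of rows of b")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[2, 1], [-1, 3]]
B = [[-1, 2], [4, -2]]

AB = matmul(A, B)  # [[2, 2], [13, -8]]
BA = matmul(B, A)  # [[-4, 5], [10, -2]]
```

The two products differ, which is the non-commutativity stated above.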

We’ll say that {\lambda^T} is the transpose of {\lambda} and calculate the matrix elements of the transposed matrix by {\lambda_{ij}^T=\lambda_{ji}}. In a more pedestrian way one can say that in order to obtain the transpose of a given matrix one needs only to exchange its rows and columns.

For a given square matrix {\mathbf{A}} there exists another matrix {\mathbf{U}} such that {\mathbf{AU}=\mathbf{UA}=\mathbf{A}}. The matrix {\mathbf{U}} is said to be the unit matrix and it is usually represented by {\mathbf{U}=\mathbf{1}}.

If {\mathbf{AB}=\mathbf{BA}=\mathbf{1}}, then {\mathbf{A}} and {\mathbf{B}} are said to be the inverse of each other and {\mathbf{B}=\mathbf{A}^{-1}}, {\mathbf{A}=\mathbf{B}^{-1}}.

Now for the rotation matrices we have (writing out the two-dimensional case for brevity)

{\begin{aligned} \lambda \lambda ^T &= \left( \begin{array}{cc} \lambda_{11} & \lambda_{12}\\ \lambda_{21} & \lambda_{22} \end{array}\right)\left( \begin{array}{cc} \lambda_{11} & \lambda_{21}\\ \lambda_{12} & \lambda_{22} \end{array}\right) \\ &= \left( \begin{array}{cc} \lambda_{11}^2+\lambda_{12}^2 & \lambda_{11}\lambda_{21}+\lambda_{12}\lambda_{22}\\ \lambda_{21}\lambda_{11}+\lambda_{22}\lambda_{12} & \lambda_{21}^2+\lambda_{22}^2 \end{array}\right)\\ &=\left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right)\\ &= \mathbf{1} \end{aligned}}

Where the second-to-last equality follows from the orthogonality relationships that we’ve seen in section 1.

Thus {\lambda ^T=\lambda ^{-1}}.

Just to finish up this section let me just mention that even though, in general, matrix multiplication isn’t commutative it still is associative. Thus {(\mathbf{AB})\mathbf{C}=\mathbf{A}(\mathbf{BC})}. Also matrix addition has just the definition one would expect. Namely {C_{ij}=A_{ij}+B_{ij}}.

If one inverts all three axes at the same time the matrix that we get is the so-called inversion matrix and it is

\displaystyle  \left( \begin{array}{ccc} -1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1 \end{array}\right)

Since it can be shown that rotation matrices always have their determinant equal to {1} and that the inversion matrix has a {-1} determinant we know that there isn’t any continuous transformation that maps a rotation into an inversion.
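The two determinants can be computed directly. The sketch below uses a cofactor expansion along the first row and, as an assumed example, a rotation by `theta = 1.2` radians about the {x_3} axis.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

theta = 1.2  # assumed angle, in radians
c, s = math.cos(theta), math.sin(theta)
rotation = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]

det_rotation = det3(rotation)    # cos^2 + sin^2, i.e. 1 up to rounding
det_inversion = det3(inversion)  # -1
```

Since the determinant of a rotation stays at {1} for every value of `theta`, varying the angle continuously can never reach the value {-1} of the inversion.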

— 3. Vectors and Scalars —

In Physics quantities are either scalars or vectors (they can also be tensors but since they aren’t needed right away I’ll just pretend that they don’t exist for the time being). These two entities are defined according to their transformation properties.

Let {\lambda} be a coordinate transformation satisfying {\displaystyle\sum_j\lambda_{ij}\lambda_{kj}=\delta_{ik}}. If under this transformation:

  • {\varphi'=\varphi} then {\varphi} is said to be a scalar.
  • {A'_i=\displaystyle\sum_j\lambda_{ij}A_j} for {A_1}, {A_2} and {A_3} then {(A_1,A_2,A_3)} is said to be a vector.

— 3.1. Operations between scalars and vectors —

I think that most people in here already know this but in the interest of a modicum of self containment I’ll just enumerate some properties of scalars and vectors.

  1. {\vec{A}+\vec{B}=\vec{B}+\vec{A}}
  2. {\vec{A}+(\vec{B}+\vec{C})=(\vec{A}+\vec{B})+\vec{C}}
  3. {\varphi+\psi=\psi+\varphi}
  4. {\varphi+(\psi+\xi)=(\varphi+\psi)+\xi}
  5. {\xi \vec{A}= \vec{B}} is a vector.
  6. {\xi \varphi=\psi} is a scalar.

As an example we will show proposition 5, and the reader is left to show the veracity of the last proposition.

In order to show that {\xi \vec{A}= \vec{B}} is a vector we have to show that it transforms like a vector.

{\begin{aligned} B'_i &= \displaystyle\sum_j \lambda_{ij}B_j\\ &= \displaystyle\sum_j \lambda_{ij}\xi A_j\\ &= \xi\displaystyle\sum_j \lambda_{ij} A_j\\ &= \xi A'_i \end{aligned}}

Hence {\xi A} transforms like a vector.

— 4. Vector “products” —

The operations between scalars are well known by everybody, hence we won’t take a look at them, but it is best for us to take a look at two operations between vectors that are crucial for our future development.

— 4.1. Scalar product —

We can construct a scalar by using two vectors. This scalar is a measure of the projection of one vector onto the other. Its definition is

\displaystyle  \vec{A}\cdot\vec{B}=\sum_i A_i B_i = AB\cos (A,B)

For this operation to deserve its name, one still has to prove that the result indeed is a scalar.

First one writes {A'_i=\displaystyle \sum_j\lambda_{ij}A_j} and {B'_i=\displaystyle \sum_k\lambda_{ik}B_k}, where one changes the index of the second summation because we’ll have to multiply the two quantities and that way the final result can be achieved much more easily.

Now it is

{\begin{aligned} \vec{A}'\cdot \vec{B}' &= \displaystyle\sum_i A'_i B'_i \\ &= \displaystyle \sum_i \left(\sum_j\lambda_{ij}A_j\right)\left( \sum_k\lambda_{ik}B_k \right)\\ &= \displaystyle \sum_j \sum_k \left( \sum_i \lambda_{ij}\lambda_{ik} \right)A_j B_k\\ &= \displaystyle \sum_j \left(\sum_k \delta_{jk}A_jB_k \right)\\ &= \displaystyle \sum_j A_j B_j \\ &= \vec{A}\cdot \vec{B} \end{aligned}}

Hence {\vec{A}\cdot \vec{B}} is a scalar.
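A numerical check of this invariance is easy to write. The sketch assumes a rotation by `theta = 0.4` radians about the {x_3} axis and two arbitrary vectors chosen for the illustration.

```python
import math

def rotate(lam, v):
    """Vector transformation: v'_i = sum_j lambda_ij v_j."""
    return [sum(lam[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Scalar product: sum_i a_i b_i."""
    return sum(a[i] * b[i] for i in range(3))

theta = 0.4  # assumed angle, in radians
c, s = math.cos(theta), math.sin(theta)
lam = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

A = [1.0, -2.0, 0.5]
B = [3.0, 0.0, 4.0]

before = dot(A, B)                           # 1*3 + (-2)*0 + 0.5*4 = 5
after = dot(rotate(lam, A), rotate(lam, B))  # same value in the rotated frame
```

The components of both vectors change under the rotation, but their scalar product does not.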

— 4.2. Vector product —

First we have to introduce the permutation symbol {\varepsilon_{ijk}}. Its definition is {\varepsilon_{ijk}=0} if two or three of its indices are equal; {\varepsilon_{ijk}=1} if {i\,j\,k} is an even permutation of {123} (the even permutations are {123}, {231} and {312}); {\varepsilon_{ijk}=-1} if {i\,j\,k} is an odd permutation of {123} (the odd permutations are {132}, {321} and {213}).

The vector product, {\vec{C}}, of two vectors {\vec{A}} and {\vec{B}} is denoted by {\vec{C}=\vec{A}\times \vec{B}}.

To calculate the components of the vector {\vec{C}} the following equation is to be used:

\displaystyle  C_i=\sum_{j,k}\varepsilon_{ijk}A_j B_k

Where {\displaystyle\sum_{j,k}} is shorthand notation for {\displaystyle\sum_j\sum_k}.

As an example let us look into {C_1}

{\begin{aligned} C_1 &= \sum_{j,k}\varepsilon_{1jk}A_j B_k\\ &= \varepsilon_{123}A_2 B_3+\varepsilon_{132}A_3 B_2\\ &= A_2B_3-A_3B_2 \end{aligned}}

where we have used the definition of {\varepsilon_{ijk}} throughout the reasoning.

One can also see that (this is another exercise for the reader) {C_2=A_3B_1-A_1B_3} and that {C_3=A_1B_2-A_2B_1}.

If one only wants to know the magnitude of {\vec{C}} the following equation should be used {C=AB\sin (A,B)}.
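The component formula translates almost literally into Python. In this sketch the indices of {\varepsilon_{ijk}} run over 1, 2, 3 as in the text, while the lists are 0-based, hence the `- 1` offsets. The vectors are arbitrary examples.

```python
def levi_civita(i, j, k):
    """Permutation symbol for indices taken from {1, 2, 3}."""
    if (i, j, k) in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
        return 1
    if (i, j, k) in ((1, 3, 2), (3, 2, 1), (2, 1, 3)):
        return -1
    return 0

def cross(a, b):
    """Vector product: C_i = sum_{j,k} epsilon_ijk A_j B_k."""
    return [sum(levi_civita(i, j, k) * a[j - 1] * b[k - 1]
                for j in (1, 2, 3) for k in (1, 2, 3))
            for i in (1, 2, 3)]

A = [1, 2, 3]
B = [4, 5, 6]
C = cross(A, B)  # [A2*B3 - A3*B2, A3*B1 - A1*B3, A1*B2 - A2*B1] = [-3, 6, -3]
```

One can check by hand that the result is perpendicular to both {\vec{A}} and {\vec{B}}, since both scalar products vanish.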

After choosing the three axes that define our frame of reference one can choose as the basis of this space a set of three mutually orthogonal vectors that have unit norm, one along each axis. These vectors are called unit vectors.

If we denote these vectors by {\vec{e}_i} any vector {\vec{A}} can be written as {\vec{A}=\displaystyle \sum _i \vec{e}_i A_i}. We also have that {\vec{e}_i\cdot \vec{e}_j=\delta_{ij}} and, when {i\,j\,k} is an even permutation of {123}, {\vec{e}_i\times \vec{e}_j=\vec{e}_k}. A way to write the last equation that holds for any pair of indices is {\vec{e}_i\times \vec{e}_j=\displaystyle\sum_k \vec{e}_k\varepsilon_{ijk}}.
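The relations between the unit vectors can be verified in a few lines. The sketch assumes the usual Cartesian unit vectors along the three axes and writes the vector product out in components.

```python
def dot(a, b):
    """Scalar product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product written out component by component."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Cartesian unit vectors e_1, e_2, e_3 (assumed basis).
e = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Table of e_i . e_j, which should reproduce the Kronecker delta.
gram = [[dot(e[i], e[j]) for j in range(3)] for i in range(3)]

e1_x_e2 = cross(e[0], e[1])  # even permutation 123: gives e_3
e2_x_e1 = cross(e[1], e[0])  # odd permutation 213: gives -e_3
e1_x_e1 = cross(e[0], e[0])  # repeated index: gives the zero vector
```

The three cases correspond to {\varepsilon_{123}=1}, {\varepsilon_{213}=-1} and {\varepsilon_{11k}=0}.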

— 5. Vector differentiation with respect to a scalar —

Let {\varphi} be a scalar function of {s}: {\varphi=\varphi(s)}. Since both {\varphi} and {s} are scalars we know that their transformation equations are {\varphi=\varphi '} and {s=s'}. Hence it also is {d\varphi=d\varphi '} and {ds=ds'}

Thus it follows that for differentiation it is {d\varphi/ds=d\varphi'/ds'=(d\varphi/ds)'}.

In order to define the derivative of a vector with respect to a scalar we will follow an analogous road.

We already know that it is {A'_i=\displaystyle \sum_j \lambda _{ij}A_j} hence

{\begin{aligned} \dfrac{dA'_i}{ds'} &= \dfrac{d}{ds'}\left( \displaystyle \sum_j \lambda _{ij}A_j \right)\\ &= \displaystyle \sum_j \lambda _{ij}\dfrac{d A_j}{ds'}\\ &= \displaystyle \sum_j \lambda _{ij}\dfrac{d A_j}{ds} \end{aligned}}

where the second equality holds because the {\lambda_{ij}} are constants that do not depend on {s}, and the last equality follows from the fact that {s} is a scalar.

From what we saw we can write

\displaystyle  \frac{d A'_i}{ds'}= \left( \frac{d A_i}{ds} \right)'=\sum_j \lambda _{ij}\frac{d A_j}{ds}

Hence {dA_j/ds} transforms like the coordinates of a vector which is the same as saying that {d\vec{A}/ds} is a vector.
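This result can be illustrated numerically with a central finite difference. Everything specific in the sketch is an assumption made for the example: the vector function `A(s)`, the rotation angle and the step `h`.

```python
import math

def rotate(lam, v):
    """Vector transformation: v'_i = sum_j lambda_ij v_j."""
    return [sum(lam[i][j] * v[j] for j in range(3)) for i in range(3)]

def A(s):
    """Hypothetical vector function of the scalar s."""
    return [s, s * s, math.sin(s)]

def dA_ds(s):
    """Exact derivative of A with respect to s."""
    return [1.0, 2.0 * s, math.cos(s)]

theta = 0.3  # assumed angle, in radians
c, sn = math.cos(theta), math.sin(theta)
lam = [[c, sn, 0.0], [-sn, c, 0.0], [0.0, 0.0, 1.0]]

s, h = 0.8, 1e-5

# Differentiate the transformed components A'_i(s) numerically.
plus = rotate(lam, A(s + h))
minus = rotate(lam, A(s - h))
numeric = [(p - m) / (2 * h) for p, m in zip(plus, minus)]

# Transform the derivative of the original components.
expected = rotate(lam, dA_ds(s))

max_error = max(abs(n - x) for n, x in zip(numeric, expected))
```

The two results agree up to the small error of the finite difference, as expected for constant {\lambda_{ij}}.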

The rules for differentiating vectors are:

  • {\dfrac{d}{ds}(\vec{A}+\vec{B})= \dfrac{d\vec{A}}{ds}+\dfrac{d\vec{B}}{ds}}
  • {\dfrac{d}{ds}(\vec{A}\cdot\vec{B})= \vec{A}\cdot\dfrac{d\vec{B}}{ds}+\dfrac{d\vec{A}}{ds}\cdot \vec{B}}
  • {\dfrac{d}{ds}(\vec{A}\times\vec{B})= \vec{A}\times\dfrac{d\vec{B}}{ds}+\dfrac{d\vec{A}}{ds}\times \vec{B}}
  • {\dfrac{d}{ds}(\varphi\vec{A})= \varphi\dfrac{d\vec{A}}{ds}+\dfrac{d\varphi}{ds}\vec{A}}
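As a sanity check of the rule for the vector product, one can compare a numerical derivative of {\vec{A}\times\vec{B}} with the right-hand side of the rule. The vector functions below are hypothetical examples with known exact derivatives, and the step `h` is an arbitrary small number.

```python
import math

def cross(a, b):
    """Vector product written out component by component."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def add(a, b):
    """Component-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]

# Hypothetical test vectors and their exact derivatives.
def A(s):
    return [s, s * s, 1.0]

def dA(s):
    return [1.0, 2.0 * s, 0.0]

def B(s):
    return [math.sin(s), math.cos(s), s]

def dB(s):
    return [math.cos(s), -math.sin(s), 1.0]

s, h = 0.5, 1e-5

# Left-hand side: central difference of A x B.
plus = cross(A(s + h), B(s + h))
minus = cross(A(s - h), B(s - h))
lhs = [(p - m) / (2 * h) for p, m in zip(plus, minus)]

# Right-hand side: A x dB/ds + dA/ds x B, keeping the order of the factors.
rhs = add(cross(A(s), dB(s)), cross(dA(s), B(s)))

max_error = max(abs(l - r) for l, r in zip(lhs, rhs))
```

Note that the order of the factors in each vector product has to be preserved, because the vector product is not commutative.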

The proofs of these rules don’t require any special skill, but readers who aren’t used to this kind of manipulation would do well to work through them just to see how things happen.


34 comments on “Matrices, Scalars, Vectors and Vector Calculus 1”

  1. soothseeker says:

    That first set of equations doesn’t make sense. How can

    x'_1=x'_1(x_1,x_2,x_3)? A scalar is being set equal to an ordered triple.

  2. ateixeira says:

    This is standard function notation. It means that the x_1' scalar is a function of three variables: x_1, x_2 and x_3.

    • soothseeker says:

      Ah, then it makes sense. Not sure why I didn’t see it that way, but perhaps I was too tired.

      The definition of scalar that is given seems bizarre, but I see from researching it a little that it’s a definition unique to physicists.

      • ateixeira says:

        In fact one also distinguishes scalars from pseudoscalars (just like one can distinguish vectors from axial vectors) but this isn’t needed for the time being. Anyway, what’s the definition of scalar you’re used to?

  3. soothseeker says:

    The definition of scalar you’re using is ultimately equivalent, I think, to the definition I’m used to. But, as you may already know, if you have a field \mathcal{F}, a set of objects V is called a vector space over \mathcal{F} if there is a function + : V \times V \to V and a function \cdot : \mathcal{F} \times V \to V that satisfy a collection of axioms:

    1) For all x,y \in V, x+y=y+x.
    2) For all x,y,z \in V, (x+y)+z=x+(y+z).
    3) There exists some \theta \in V such that x+\theta=x for all x \in V.
    4) For all \alpha \in \mathcal{F} and x,y \in V, \alpha(x+y) = \alpha x + \alpha y.
    5) For all \alpha,\beta \in \mathcal{F} and x \in V, (\alpha+\beta)x=\alpha x + \beta x.
    6) For all \alpha,\beta \in \mathcal{F} and x \in V, \alpha(\beta x)=(\alpha\beta)x.
    7) For all x \in V, 0 \cdot x = \theta and 1 \cdot x = x.

    Anyway, long story short, the elements of the field \mathcal{F} are called “scalars”. Your definition says a scalar is a physical quantity that is invariant under the operations of coordinate system rotations and translations. And that is interesting, because then it certainly makes sense to say that speed is a scalar quantity and velocity is a vector quantity. I probably knew these things once, when I was a physics student.

    • soothseeker says:

      Oh dear. It doesn’t seem like I can edit my comment to fix Axiom 4:

      4) For all \alpha \in \mathcal{F} and x,y \in V, \alpha(x+y)=\alpha x+\alpha y.

      Anyway you get the idea. 😉

      • ateixeira says:

        Yes I’m used to the linear algebra definition of a scalar (let me just put it this way…).

        These series posts on classical Physics will serve mainly to fix some terminology and get people used to physical reasoning.

        If you want you can also post the solution of some (all) of the problems I left as an exercise.

        Next week I’ll follow through with one more post of more of a mathematical content and then we’ll enter real Physics.

        Ps: I’ll edit your post for you.

      • soothseeker says:

        And of course, the function \cdot : \mathcal{F}\times V \to V is the “scalar multiplication function,” which takes in a scalar \alpha and a vector x, and returns an output \alpha x:

        (\alpha,x) \mapsto \alpha \cdot x.

        Normally \alpha \cdot x is written as \alpha x

        In physics, naturally, the field \mathcal{F} is usually the set of real numbers \mathcal{R}.

        I guess what I’m doing here is laying down some of the basic properties of a vector space for anyone out there who needs a refresher. Quite possibly everyone in the Quantum Gang is already on top of this stuff.

        • ateixeira says:

          Some of the members may not be, so if you want to make a post about it go ahead. It may be a little bit too mathematical but I don’t see anything wrong with that.

          When we get to Quantum Mechanics we’ll also see some revisions of Linear Algebra but it’ll be under a Physicist’s optics and we’ll use Dirac notation. Hence a more natural introduction to the subject might be a good idea.

          Ps: Don’t forget to send me your introduction text.

  4. soothseeker says:

    Your write-up above is quite nice, by the way.

  5. thehappyhexagon says:

    Yeah I see the equality more as a declaration, for the sake of clarity, that the point x'_1 sitting in its own little coordinate system can be expressed in terms of the three variables x_1, x_2, x_3 of some other coordinate system. Trivial perhaps but we could also have said something like, say, x_1= x_1(x_1,0,0) for example.

    I think what has been presented is a good start, I had a slight issue with the fact that I am used to transforms including shears and dilations yet here we consider only rotation matrices (which seem to be synonymous with transformation matrices) but overall good stuff.

    • ateixeira says:

      I’ve edited your comment because you should have written $1atex (l instead of 1) instead of $itex.

      Anyway guys plenty of exercises for you to solve. 😉 😛

      Edit: Isn’t it ironic (I’m a positivist not a normativist (actually I’m not, but bear with me)) that my own post also needs an edit?

  6. thehappyhexagon says:

    Apologies…I shall make a better effort at latex typing next time!

  7. palynka says:

    Sorry, I will look into this later, although at a glance it looks pretty straightforward as a refresher.

  8. palynka says:

    Looking forward to the next installment… 🙂

  9. Amolv says:

    A couple of questions:

    1.) In the first section when you’re talking about coordinate transformations, your summation notation seems to imply that if lambda is the transformation matrix from one coordinate system to another then the transpose of lambda is the inverse transformation. Unless I am misinterpreting or misunderstanding what you are saying, this does not seem obvious to me.

    2.) In section 2 you say that if two matrices commute then they are inverses of each other. This again is not obvious to me. Can you clarify?

    • ateixeira says:

      Even though you have more or less answered your questions (and no they don’t make you an idiot they show that you have interest in what you’re reading) let me just add a few more comments:

      1.) This has to do with the fact that the \lambda_{ij} are actually the cosines of the angles between the axes of the two frames. A lot more can be said about this but frankly I didn’t want this post to be too long and I’m hoping that any major gaps will be filled by the other members and/or interested readers.

      2.) In this I really don’t have much to say, but that part of the post really wasn’t that clear. I mixed in bits about matrices in general and bits about matrices that represent rotations. Thanks for pointing that out.

  10. Amolv says:

    I’m an idiot, and should have read the whole post before commenting. This is a post about rotations, not general transformations. Answered my own question. Sorry.

  11. joeshmo26 says:

    I’m afraid I don’t understand the significance of the Kronecker delta matrix…and I bet once I look more deeply at some other things similar knowledge gaps will appear, but for now I’m hoping you can just roll with the punches.

    • joeshmo26 says:

      After doing a bit more reading, I also feel that understanding it is crucial to completing the exercise in section 3.1? Obviously however I could be wrong.

      • ateixeira says:

        Sorry for the time but I have my hands full in the moment. I think that in this weekend I can give you a decent answer and I’ll also post the follow up post to this so that we can get our blog moving.

      • ateixeira says:

        The solution of this exercise is very similar to the solution presented in the text. You just have to remember what a scalar is in our context.

    • ateixeira says:

      joeshmo26 :

      I’m afraid I don’t understand the significance of the Kronecker delta matrix…and I bet once I look more deeply at some other things similar knowledge gaps will appear, but for now I’m hoping you can just roll with the punches.

      The significance of Kronecker delta is mainly of simplification. In this case it expresses the fact that the direct transformation and the inverse one are orthogonal. This means that if you apply the direct transformation to a vector and then apply the inverse transformation to the resulting vector you just get the vector you started with. Which is just as it should be.

  15. Timothy says:

    I am struggling to understand the Kronecker delta. Is it possible to give an example?

    • ateixeira says:

      Hi Timothy,

      The Kronecker delta is a function of two variables which is equal to 1 when the two variables are equal and equal to 0 when the variables are different.

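      So, for example, \delta_{11}=1 and \delta_{22}=1 because the two indices are equal, while \delta_{12}=0 and \delta_{21}=0 because they are different. Arranging these four values in a table with i as the row and k as the column gives the unit matrix:

      \begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}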
