Matrices, Scalars, Vectors and Vector Calculus 1

Let us imagine that we have a system of coordinates {S} and a system of coordinates {S'} that is rotated relative to {S}. Let us consider a point {P} that has coordinates {(x_1,x_2,x_3)} in {S} and coordinates {(x'_1,x'_2,x'_3)} in {S'}.

In general each primed coordinate depends on all three unprimed ones: {x'_1=x'_1(x_1,x_2,x_3)}, {x'_2=x'_2(x_1,x_2,x_3)} and {x'_3=x'_3(x_1,x_2,x_3)}.

Since the transformation from {S} to {S'} is just a rotation we can assume that the transformation is linear. Hence we can write explicitly

{\begin{aligned} x'_1 &= \lambda _{11}x_1+ \lambda _{12}x_2 +\lambda _{13}x_3 \\ x'_2 &= \lambda _{21}x_1+ \lambda _{22}x_2 +\lambda _{23}x_3 \\ x'_3 &= \lambda _{31}x_1+ \lambda _{32}x_2 +\lambda _{33}x_3 \end{aligned}}

A more compact way to write the three previous equations is:

\displaystyle x'_i=\sum_{j=1}^3 \lambda_{ij}x_j

In case you don’t see how the previous equation is a more compact way of writing the first equations I’ll just lay out the {i=1} case.

\displaystyle x'_1=\sum_{j=1}^3 \lambda_{1j}x_j

Now all that we have to do is to sum from {j=1} to {j=3} and we get the first equation. For the other two a similar reasoning applies.

If we want to make a transformation from {S'} to {S} the inverse transformation is

\displaystyle x_i=\sum_{j=1}^3 \lambda_{ji}x'_j

The previous notation suggests that the coefficients {\lambda_{ij}} can be arranged in the form of a matrix:

\displaystyle \lambda= \left(\begin{array}{ccc} \lambda_{11} & \lambda_{12} & \lambda_{13} \\ \lambda_{21} & \lambda_{22} & \lambda_{23} \\ \lambda_{31} & \lambda_{32} & \lambda_{33} \end{array} \right)

In the literature this matrix is called the rotation matrix or the transformation matrix.
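To make the index notation concrete, here is a minimal Python sketch of the direct and inverse transformations. The particular matrix (axes rotated by an angle theta about the {x_3} axis) and the function names are assumptions chosen for illustration; they do not appear in the text above.

```python
import math

def rotation_about_x3(theta):
    """Illustrative lambda: axes rotated by theta (radians) about x_3."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(lam, x):
    """x'_i = sum_j lambda_ij x_j"""
    return [sum(lam[i][j] * x[j] for j in range(3)) for i in range(3)]

def inverse_transform(lam, x_prime):
    """x_i = sum_j lambda_ji x'_j (note the swapped indices)"""
    return [sum(lam[j][i] * x_prime[j] for j in range(3)) for i in range(3)]

lam = rotation_about_x3(math.pi / 6)
x = [1.0, 2.0, 3.0]
x_prime = transform(lam, x)
x_back = inverse_transform(lam, x_prime)

print(x_prime)  # coordinates of P in S'
print(x_back)   # should reproduce x up to rounding errors
```

Going from {S} to {S'} and back recovers the original coordinates, which is what the swapped indices in the inverse transformation are for.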

— 1. Properties of the rotation matrix —

For the transformation {x'_i=x'_i(x_i)} the elements of the rotation matrix satisfy

\displaystyle  \sum_j \lambda_{ij}\lambda_{kj}=\delta_{ik}

where {\delta_{ik}} is a symbol known as the Kronecker delta, defined by

\displaystyle  \delta_{ik}=\begin{cases} 0 \quad i\neq k\\ 1 \quad i=k \end{cases}

For the inverse transformation {x_i=x_i(x'_i)} the corresponding relation is

\displaystyle  \sum_i \lambda_{ij}\lambda_{ik}=\delta_{jk}

The previous relationships are called orthogonality relationships.
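The two orthogonality relationships can be checked numerically. The sketch below assumes, purely for illustration, a rotation of the axes by an arbitrary angle about the {x_3} axis; any other rotation matrix should pass the same checks.

```python
import math

theta = 0.7  # arbitrary angle, chosen only for the example
c, s = math.cos(theta), math.sin(theta)
lam = [[c, s, 0.0],
       [-s, c, 0.0],
       [0.0, 0.0, 1.0]]

def delta(i, k):
    """Kronecker delta."""
    return 1.0 if i == k else 0.0

# sum_j lambda_ij lambda_kj = delta_ik (sum over the second index)
rows_ok = all(
    abs(sum(lam[i][j] * lam[k][j] for j in range(3)) - delta(i, k)) < 1e-12
    for i in range(3) for k in range(3)
)

# sum_i lambda_ij lambda_ik = delta_jk (sum over the first index)
cols_ok = all(
    abs(sum(lam[i][j] * lam[i][k] for i in range(3)) - delta(j, k)) < 1e-12
    for j in range(3) for k in range(3)
)

print(rows_ok, cols_ok)
```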

— 2. Matrix operations, definitions and properties —

Let us represent the coordinates of a point {P} by a column vector

\displaystyle  x = \left(\begin{array}{c} x_1 \\ x_2 \\ x_3 \end{array}\right)

Using the usual notation of linear algebra we can write the transformation equations as {x'=\mathbf{\lambda} x}.

Here the matrix product {\mathbf{AB}=\mathbf{C}} is defined only when the number of columns of {\mathbf{A}} is equal to the number of rows of {\mathbf{B}}.

A specific element of the matrix {\mathbf{C}}, which we denote by {\mathbf{C}_{ij}}, is calculated as

\displaystyle  \mathbf{C}_{ij}=[\mathbf{AB}]_{ij}=\sum_k A_{ik}B_{kj}

Given the definition of a matrix product it should be clear that in general one has {\mathbf{AB} \neq \mathbf{BA}}.

As an example let us look at:

\displaystyle \mathbf{A}=\left( \begin{array}{cc} 2 & 1\\ -1 & 3 \end{array}\right) ;\quad \mathbf{B}=\left( \begin{array}{cc} -1 & 2\\ 4 & -2 \end{array}\right)


\displaystyle  \mathbf{AB}=\left( \begin{array}{cc} 2\times (-1)+1\times 4 & 2\times 2+1\times (-2)\\ -1\times (-1)+3\times 4 & -1\times 2+3\times (-2) \end{array}\right)=\left( \begin{array}{cc} 2 & 2\\ 13 & -8 \end{array}\right)


\displaystyle  \mathbf{BA}=\left( \begin{array}{cc} -4 & 5\\ 10 & -2 \end{array}\right)
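The same example can be reproduced with a few lines of Python. This is only a sketch of the definition {C_{ij}=\sum_k A_{ik}B_{kj}}; the helper name matmul is an assumption made for the example.

```python
def matmul(A, B):
    """C_ij = sum_k A_ik B_kj, defined only if columns(A) == rows(B)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 1], [-1, 3]]
B = [[-1, 2], [4, -2]]

AB = matmul(A, B)
BA = matmul(B, A)

print(AB)  # [[2, 2], [13, -8]]
print(BA)  # [[-4, 5], [10, -2]]
```

The two products differ, in agreement with the hand calculation above.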

We’ll say that {\lambda^T} is the transpose of {\lambda} and calculate the matrix elements of the transposed matrix by {\lambda_{ij}^T=\lambda_{ji}}. In a more pedestrian way one can say that in order to obtain the transpose of a given matrix one needs only to exchange its rows and columns.

For a given square matrix {\mathbf{A}} there exists a matrix {\mathbf{U}} such that {\mathbf{AU}=\mathbf{UA}=\mathbf{A}}. The matrix {\mathbf{U}} is called the unit matrix (or identity matrix) and is usually represented by {\mathbf{U}=\mathbf{1}}.

If {\mathbf{AB}=\mathbf{BA}=\mathbf{1}}, then {\mathbf{A}} and {\mathbf{B}} are said to be the inverse of each other and {\mathbf{B}=\mathbf{A}^{-1}}, {\mathbf{A}=\mathbf{B}^{-1}}.

Now for the rotation matrices we have (writing out the two-dimensional case for brevity)

{\begin{aligned} \lambda \lambda ^T &= \left( \begin{array}{cc} \lambda_{11} & \lambda_{12}\\ \lambda_{21} & \lambda_{22} \end{array}\right)\left( \begin{array}{cc} \lambda_{11} & \lambda_{21}\\ \lambda_{12} & \lambda_{22} \end{array}\right) \\ &= \left( \begin{array}{cc} \lambda_{11}^2+\lambda_{12}^2 & \lambda_{11}\lambda_{21}+\lambda_{12}\lambda_{22}\\ \lambda_{21}\lambda_{11}+\lambda_{22}\lambda_{12} & \lambda_{21}^2+\lambda_{22}^2 \end{array}\right)\\ &=\left( \begin{array}{cc} 1 & 0\\ 0 & 1 \end{array}\right)\\ &= \mathbf{1} \end{aligned}}

where the second-to-last equality follows from the orthogonality relationships that we’ve seen in section 1.

Thus {\lambda ^T=\lambda ^{-1}}.

Let me also mention that even though, in general, matrix multiplication isn’t commutative, it still is associative. Thus {(\mathbf{AB})\mathbf{C}=\mathbf{A}(\mathbf{BC})}. Also, matrix addition has just the definition one would expect: if {\mathbf{C}=\mathbf{A}+\mathbf{B}} then {C_{ij}=A_{ij}+B_{ij}}.

If one inverts all three axes at the same time the matrix that we get is the so-called inversion matrix, which is

\displaystyle  \left( \begin{array}{ccc} -1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1 \end{array}\right)

Since it can be shown that rotation matrices always have their determinant equal to {1} and that the inversion matrix has a {-1} determinant we know that there isn’t any continuous transformation that maps a rotation into an inversion.
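As a quick numerical illustration of the determinant statement, the sketch below computes the determinant of one particular rotation matrix and of the inversion matrix. The rotation angle, the choice of the {x_3} axis and the helper det3 are assumptions made for the example; this checks one case and is not a proof.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

c, s = math.cos(1.2), math.sin(1.2)  # arbitrary angle
rotation = [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]
inversion = [[-1.0, 0.0, 0.0],
             [0.0, -1.0, 0.0],
             [0.0, 0.0, -1.0]]

det_rotation = det3(rotation)
det_inversion = det3(inversion)

print(det_rotation)   # close to 1
print(det_inversion)  # -1.0
```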

— 3. Vectors and Scalars —

In Physics quantities are either scalars or vectors (they can also be tensors but since they aren’t needed right away I’ll just pretend that they don’t exist for the time being). These two entities are defined according to their transformation properties.

Let {\lambda} be a coordinate transformation satisfying {\displaystyle\sum_j\lambda_{ij}\lambda_{kj}=\delta_{ik}}. Then:

  • if {\varphi'=\varphi}, that is, if the quantity has the same value in both coordinate systems, then {\varphi} is said to be a scalar.
  • if {A'_i=\displaystyle\sum_j\lambda_{ij}A_j} for the three quantities {A_1}, {A_2} and {A_3}, then {(A_1,A_2,A_3)} is said to be a vector.

— 3.1. Operations between scalars and vectors —

I think that most people here already know this, but in the interest of a modicum of self-containment I’ll just enumerate some properties of scalars and vectors.

  1. {\vec{A}+\vec{B}=\vec{B}+\vec{A}}
  2. {\vec{A}+(\vec{B}+\vec{C})=(\vec{A}+\vec{B})+\vec{C}}
  3. {\varphi+\psi=\psi+\varphi}
  4. {\varphi+(\psi+\xi)=(\varphi+\psi)+\xi}
  5. {\xi \vec{A}= \vec{B}} is a vector.
  6. {\xi \varphi=\psi} is a scalar.

As an example we will prove proposition 5; proving the last proposition is left to the reader.

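In order to show that {\xi \vec{A}= \vec{B}} is a vector we have to show that it transforms like a vector. In {S'} its components are {B'_i=\xi' A'_i}, where {\xi'=\xi} because {\xi} is a scalar and {A'_i=\sum_j \lambda_{ij}A_j} because {\vec{A}} is a vector. Thus

{\begin{aligned} B'_i &= \xi' A'_i\\ &= \xi\displaystyle\sum_j \lambda_{ij} A_j\\ &= \displaystyle\sum_j \lambda_{ij}\xi A_j\\ &= \displaystyle\sum_j \lambda_{ij}B_j \end{aligned}}

Hence {\xi \vec{A}} transforms like a vector.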

— 4. Vector “products” —

The operations between scalars are well known to everybody, hence we won’t take a look at them, but it is best for us to take a look at two operations between vectors that are crucial for our future development.

— 4.1. Scalar product —

We can construct a scalar by using two vectors. This scalar is a measure of the projection of one vector onto the other. Its definition is

\displaystyle  \vec{A}\cdot\vec{B}=\sum_i A_i B_i = AB\cos (A,B)

where {(A,B)} denotes the angle between the two vectors. For this operation to deserve its name, one still has to prove that the result indeed is a scalar.

First one writes {A'_i=\displaystyle \sum_j\lambda_{ij}A_j} and {B'_i=\displaystyle \sum_k\lambda_{ik}B_k}, where we use a different summation index in the second sum so that the two sums can be multiplied without mixing up their indices.

Now it is

{\begin{aligned} \vec{A}'\cdot \vec{B}' &= \displaystyle\sum_i A'_i B'_i \\ &= \displaystyle \sum_i \left(\sum_j\lambda_{ij}A_j\right)\left( \sum_k\lambda_{ik}B_k \right)\\ &= \displaystyle \sum_j \sum_k \left( \sum_i \lambda_{ij}\lambda_{ik} \right)A_j B_k\\ &= \displaystyle \sum_j \left(\sum_k \delta_{jk}A_jB_k \right)\\ &= \displaystyle \sum_j A_j B_j \\ &= \vec{A}\cdot \vec{B} \end{aligned}}

Hence {\vec{A}\cdot \vec{B}} is a scalar.
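The invariance of the scalar product can also be seen numerically. In the sketch below the rotation (about the {x_1} axis, by an arbitrary angle) and the two vectors are assumptions picked only for the example.

```python
import math

def transform(lam, v):
    """v'_i = sum_j lambda_ij v_j"""
    return [sum(lam[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(a, b):
    """Scalar product: sum_i a_i b_i"""
    return sum(p * q for p, q in zip(a, b))

theta = 0.4  # arbitrary angle
c, s = math.cos(theta), math.sin(theta)
lam = [[1.0, 0.0, 0.0],
       [0.0, c, s],
       [0.0, -s, c]]

A = [1.0, -2.0, 0.5]
B = [3.0, 0.0, 4.0]

dot_unprimed = dot(A, B)
dot_primed = dot(transform(lam, A), transform(lam, B))

print(dot_unprimed)  # 5.0
print(dot_primed)    # same value up to rounding errors
```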

— 4.2. Vector product —

First we have to introduce the permutation symbol {\varepsilon_{ijk}}. Its definition is {\varepsilon_{ijk}=0} if two or three of its indices are equal; {\varepsilon_{ijk}=1} if {i\,j\,k} is an even permutation of {123} (the even permutations are {123}, {231} and {312}); {\varepsilon_{ijk}=-1} if {i\,j\,k} is an odd permutation of {123} (the odd permutations are {132}, {321} and {213}).

The vector product, {\vec{C}}, of two vectors {\vec{A}} and {\vec{B}} is denoted by {\vec{C}=\vec{A}\times \vec{B}}.

To calculate the components of the vector {\vec{C}} the following equation is to be used:

\displaystyle  C_i=\sum_{j,k}\varepsilon_{ijk}A_j B_k

Where {\displaystyle\sum_{j,k}} is shorthand notation for {\displaystyle\sum_j\sum_k}.

As an example let us look into {C_1}

{\begin{aligned} C_1 &= \sum_{j,k}\varepsilon_{1jk}A_j B_k\\ &= \varepsilon_{123}A_2 B_3+\varepsilon_{132}A_3 B_2\\ &= A_2B_3-A_3B_2 \end{aligned}}

where we have used the definition of {\varepsilon_{ijk}} throughout the reasoning.

One can also see that (this is another exercise for the reader) {C_2=A_3B_1-A_1B_3} and that {C_3=A_1B_2-A_2B_1}.

If one only wants to know the magnitude of {\vec{C}}, the following equation should be used: {C=AB\sin (A,B)}, where {(A,B)} is the angle between {\vec{A}} and {\vec{B}}.
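The component formula and the magnitude formula can both be checked with a short Python sketch. The lookup sets, the function names and the sample vectors are assumptions made for this illustration.

```python
import math

EVEN = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}
ODD = {(1, 3, 2), (3, 2, 1), (2, 1, 3)}

def epsilon(i, j, k):
    """Permutation symbol for indices taking the values 1, 2, 3."""
    if (i, j, k) in EVEN:
        return 1
    if (i, j, k) in ODD:
        return -1
    return 0  # two or three indices are equal

def cross(A, B):
    """C_i = sum_{j,k} epsilon_ijk A_j B_k (component i is stored at position i-1)."""
    return [sum(epsilon(i, j, k) * A[j - 1] * B[k - 1]
                for j in (1, 2, 3) for k in (1, 2, 3))
            for i in (1, 2, 3)]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

A = [1, 2, 3]
B = [4, 5, 6]
C = cross(A, B)
print(C)  # [-3, 6, -3]

# Magnitude check: C = A B sin(A, B), with the angle taken from the scalar product
cos_angle = sum(a * b for a, b in zip(A, B)) / (norm(A) * norm(B))
sin_angle = math.sqrt(1.0 - cos_angle ** 2)
print(norm(C), norm(A) * norm(B) * sin_angle)  # the two numbers should agree
```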

After choosing the three axes that define our frame of reference one can choose as the basis of this space a set of three mutually orthogonal vectors of unit norm, one along each axis. These vectors are called unit vectors.

If we denote these vectors by {\vec{e}_i} any vector {\vec{A}} can be written as {\vec{A}=\displaystyle \sum _i \vec{e}_i A_i}. We also have that {\vec{e}_i\cdot \vec{e}_j=\delta_{ij}} and, when {i\,j\,k} is an even permutation of {123}, {\vec{e}_i\times \vec{e}_j=\vec{e}_k}. A way to write the last equation that is valid for all values of the indices is {\vec{e}_i\times \vec{e}_j=\displaystyle\sum_k\varepsilon_{ijk}\vec{e}_k}.

— 5. Vector differentiation with respect to a scalar —

Let {\varphi} be a scalar function of {s}: {\varphi=\varphi(s)}. Since both {\varphi} and {s} are scalars we know that their transformation equations are {\varphi=\varphi '} and {s=s'}. Hence we also have {d\varphi=d\varphi '} and {ds=ds'}.

Thus it follows that the derivative satisfies {d\varphi/ds=d\varphi'/ds'=(d\varphi/ds)'}, which means that it is itself a scalar.

In order to define the derivative of a vector with respect to a scalar we will follow an analogous road.

We already know that {A'_i=\displaystyle \sum_j \lambda _{ij}A_j}. Since the {\lambda_{ij}} do not depend on {s}, it follows that

{\begin{aligned} \dfrac{dA'_i}{ds'} &= \dfrac{d}{ds'}\left( \displaystyle \sum_j \lambda _{ij}A_j \right)\\ &= \displaystyle \sum_j \lambda _{ij}\dfrac{d A_j}{ds'}\\ &= \displaystyle \sum_j \lambda _{ij}\dfrac{d A_j}{ds} \end{aligned}}

where the last equality follows from the fact that {s} is a scalar.

From what we saw we can write

\displaystyle  \frac{d A'_i}{ds'}= \left( \frac{d A_i}{ds} \right)'=\sum_j \lambda _{ij}\frac{d A_j}{ds}

Hence {dA_j/ds} transforms like the coordinates of a vector which is the same as saying that {d\vec{A}/ds} is a vector.

The rules for differentiating vectors are:

  • {\dfrac{d}{ds}(\vec{A}+\vec{B})= \dfrac{d\vec{A}}{ds}+\dfrac{d\vec{B}}{ds}}
  • {\dfrac{d}{ds}(\vec{A}\cdot\vec{B})= \vec{A}\cdot\dfrac{d\vec{B}}{ds}+\dfrac{d\vec{A}}{ds}\cdot \vec{B}}
  • {\dfrac{d}{ds}(\vec{A}\times\vec{B})= \vec{A}\times\dfrac{d\vec{B}}{ds}+\dfrac{d\vec{A}}{ds}\times \vec{B}}
  • {\dfrac{d}{ds}(\varphi\vec{A})= \varphi\dfrac{d\vec{A}}{ds}+\dfrac{d\varphi}{ds}\vec{A}}

Proving these rules doesn’t require any special skills, but readers who aren’t very used to this kind of manipulation are advised to work through the proofs just to see how things happen.
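For readers who would like a sanity check before attempting the proofs, the rules for the scalar and vector products can be tested numerically with finite differences. The vector functions A(s) and B(s) below, the point s0 and the step size are arbitrary assumptions made for this sketch; agreement at one point is an illustration, not a proof.

```python
import math

def A(s):
    return [math.cos(s), math.sin(s), s]

def B(s):
    return [s * s, 1.0, math.exp(-s)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def d_scalar(f, s, h=1e-5):
    """Central-difference derivative of a scalar function of s."""
    return (f(s + h) - f(s - h)) / (2 * h)

def d_vector(f, s, h=1e-5):
    """Central-difference derivative of a vector function, component by component."""
    return [(p - m) / (2 * h) for p, m in zip(f(s + h), f(s - h))]

s0 = 0.8

# d/ds (A . B) = A . dB/ds + dA/ds . B
dot_lhs = d_scalar(lambda s: dot(A(s), B(s)), s0)
dot_rhs = dot(A(s0), d_vector(B, s0)) + dot(d_vector(A, s0), B(s0))

# d/ds (A x B) = A x dB/ds + dA/ds x B
cross_lhs = d_vector(lambda s: cross(A(s), B(s)), s0)
cross_rhs = [p + q for p, q in zip(cross(A(s0), d_vector(B, s0)),
                                   cross(d_vector(A, s0), B(s0)))]

print(dot_lhs, dot_rhs)
print(cross_lhs, cross_rhs)
```

Note that the order of the factors is kept in the vector product rule, since {\vec{A}\times\vec{B}} is not commutative.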


It starts tomorrow

I guess that by now you have had enough time to learn about LaTeX and how to blog properly in our joint. Thus tomorrow we’ll get this show on the road with Classical Physics.

Actually it’ll be a light revision of the basic math that we need in order to do Classical Physics, but you get my drift.

New avenues into Quantum Mechanics

This was previously published elsewhere; I’m reposting it on this blog so that its contributors may see what we will be discussing at the end of Griffiths’s book:

In recent times two articles that can do the seemingly impossible in Quantum Mechanics have been published and they generated some buzz on the interweb.

The first article, Observing the Average Trajectories of Single Photons in a Two-Slit Interferometer, is notable because the authors show how they were able to observe the trajectories of photons in a double-slit experiment and still observe a clear interference pattern, something that is impossible according to the complementarity principle.

In the second article, Direct measurement of the quantum wavefunction, the authors computed the transverse spatial wavefunction of a single photon by means that they consider to be direct. To me, not having read the article yet, the method doesn’t seem to be as direct as claimed, but nevertheless the level of experimental expertise needed even for an indirect computation of the wavefunction is certainly worthy of respect.

These spectacular achievements were possible because these two teams used weak measurement techniques, together with statistical ensembles of photons and non-simultaneous measurements of the complementary variables they set out to determine.

The two abstracts are here (the bold isn’t in the original):

  • Direct measurement of the quantum wavefunction

    The wavefunction is the complex distribution used to completely describe a quantum system, and is central to quantum theory. But despite its fundamental role, it is typically introduced as an abstract element of the theory with no explicit definition. Rather, physicists come to a working understanding of the wavefunction through its use to calculate measurement outcome probabilities by way of the Born rule. At present, the wavefunction is determined through tomographic methods which estimate the wavefunction most consistent with a diverse collection of measurements. The indirectness of these methods compounds the problem of defining the wavefunction. Here we show that the wavefunction can be measured directly by the sequential measurement of two complementary variables of the system. The crux of our method is that the first measurement is performed in a gentle way through weak measurement so as not to invalidate the second. The result is that the real and imaginary components of the wavefunction appear directly on our measurement apparatus. We give an experimental example by directly measuring the transverse spatial wavefunction of a single photon, a task not previously realized by any method. We show that the concept is universal, being applicable to other degrees of freedom of the photon, such as polarization or frequency, and to other quantum systems - for example, electron spins, SQUIDs (superconducting quantum interference devices) and trapped ions. Consequently, this method gives the wavefunction a straightforward and general definition in terms of a specific set of experimental operations. We expect it to expand the range of quantum systems that can be characterized and to initiate new avenues in fundamental quantum theory.

  • Observing the Average Trajectories of Single Photons in a Two-Slit Interferometer

    A consequence of the quantum mechanical uncertainty principle is that one may not discuss the path or “trajectory” that a quantum particle takes, because any measurement of position irrevocably disturbs the momentum, and vice versa. Using weak measurements, however, it is possible to operationally define a set of trajectories for an ensemble of quantum particles. We sent single photons emitted by a quantum dot through a double-slit interferometer and reconstructed these trajectories by performing a weak measurement of the photon momentum, postselected according to the result of a strong measurement of photon position in a series of planes. The results provide an observationally grounded description of the propagation of subensembles of quantum particles in a two-slit interferometer.

An excellent explanation of why this was possible can be found in the blog post Watching Photons Interfere: “Observing the Average Trajectories of Single Photons in a Two-Slit Interferometer”.

The near future

Let me start by saying that this is possibly the last announcement on my part for the near future.

After thinking about it for a while, and taking into account that most people on this blog don’t have a Physics degree and that the target audience for this blog is people who don’t have a Physics degree but would like to know more about Quantum Mechanics, I’ve decided to throw in a few posts about Classical Physics.

The idea behind this slight change of plans is that I want to introduce people to sophisticated (not that sophisticated) mathematical machinery in settings where people’s physical intuition works, so that when we are in territory where physical intuition doesn’t work we at least have the fallback of knowing mathematically what’s going on.

Thus at one point we understand the Physics and get used to the Math and at the other point we understand the Math and get used to the Physics.

We’ll start with Classical Physics taking a look at Newton’s formalism, Lagrange’s formalism and Hamilton’s formalism, then we’ll take a look at some Electrodynamics and finally some Thermodynamics. In all of this we won’t just be doing theoretical work but we will also be solving exercises so that we can check if we really are understanding the subject matter.

After this is through, and this will take something like two or three months, we’ll take a look at some historically pivotal moments in Quantum Mechanics, just because it is customary, and then we’ll start our study of Quantum Mechanics.

A few more thoughts and advice about LaTeX

— 1. Introduction —

In the page LaTeX and Equations I’ve already said some things about LaTeX and how to use it on this blog, but in this post I’ll talk about a few more issues with LaTeX and why I think that the contributors of this blog should write their posts in LaTeX and then use Luca Trevisan’s script to convert the .tex file into .html.

As you can see on the page LaTeX support, you can easily insert equations into your post. For instance if you want to insert Pythagoras’ theorem you just type

$atex a^2+b^2=c^2$

With latex instead of atex the result that one gets is {a^2+b^2=c^2}. Notice that if you want to know what code was used to get one particular equation all that you have to do is to hover over the equation.

— 2. LaTeX —

I won’t bother you with the history of TeX and LaTeX. All you have to know is that if you want to produce a document with typographic quality right out of your computer, LaTeX is the way to go.

For people who aren’t used to a text editor in the style of LaTeX the learning curve might be a little steep, but I believe that if you’re not doing anything too fancy LaTeX can be as easy as using Microsoft-type products. Besides, if you think that you can learn Quantum Mechanics then you surely can learn how to use LaTeX in a rudimentary way. For this blog you surely won’t need anything very fancy. The only things that I use for blog posts are basically sections and other basic commands. Not much, I know, but it helps me give my posts a clear organization, which makes the job of writing them easier, and I like to think that it also makes the job of reading them easier.

A nice primer can be A Simplified Introduction to LaTeX or even The Not So Short Introduction to LaTeX.

In order for you to use LaTeX on your computer you’ll also need a distribution and a text editor. The distribution is what makes LaTeX work on your computer and the text editor is what you’ll use to write your .tex files.

If you’re using Windows my advice is for you to use MiKTeX; if you’re using Linux then you probably already have LaTeX installed on your computer; and if you’re using a Mac I can’t really help you, but I’ve heard some good things about MacTeX.

As for editors, I’ll only recommend Windows editors. I think that the best ones for people who are new to LaTeX are Texmaker and TeXnicCenter. I personally use Texmaker but maybe TeXnicCenter is best for neophytes.

— 2.1. LaTeX to WordPress —

As was already said Luca Trevisan has created a wonderful script that converts a .tex file into an .html file. For me this solution is just perfect:

  1. I can write a blog post in LaTeX, which takes a lot less time and is more organized.
  2. It allows me to save my files in a format that is easily printable with nice formatting.
  3. Writing long mathematical deductions is simpler and less error prone in a natural LaTeX environment than in the blog editor.

So what you do is use your favourite text editor to write your .tex file, then go to your command line and type python followed by the name of the script and the name of your file, for example: python latex2wp.py myfilename.tex, and voilà! You get an .html file that is ready to be copied and pasted into the blog’s text editor in html mode.

— 2.2. LaTeX seems to be overkill —

If you’re not ready to use LaTeX right now you can use the usual editor, but bear in mind that writing your posts will be harder and writing equations in particular will be nerve-racking, since you’ll be writing a lot of long equations.

Anyway it is your call, and to make the job a little easier on yourself I’d advise you to use the Online LaTeX Equation Editor.

— 3. Some example files —

Here are the files that I usually use to write blog posts:

— 4. Final advice —

At some point you’ll have to write very long derivations on this blog, and my advice is that you shouldn’t use the \begin{array} environment. You have a much better choice in the \begin{aligned} environment, whose syntax is very similar to that of \begin{array} but whose end result isn’t buggy and looks a lot better.
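As a minimal sketch of what this looks like, here is a short derivation written with aligned. The formula itself is just a placeholder chosen for the example; in a .tex file the environment has to sit inside math mode and, as far as I know, it requires the amsmath package.

```latex
\begin{aligned}
(a+b)^2 &= (a+b)(a+b) \\
        &= a^2 + ab + ba + b^2 \\
        &= a^2 + 2ab + b^2
\end{aligned}
```

Each & marks the alignment point (here the equals sign) and each \\ starts a new line of the derivation.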