What you learned so far:

  • a state $\sigma$ is a complex vector $(s_1, s_2,\dots)$ of probability amplitudes with $\sum_i |s_i|^2=1$. The state is a superposition and $|s_i|^2$ is the probability of getting $i$ as an answer when the state is measured.
  • if you measure the state, it collapses to the basis state corresponding to the outcome
  • a classic random process is described by a stochastic matrix (Markov chain); a quantum process is described by a unitary matrix
  • there is entanglement: you cannot see the dynamics of the whole system as the sum of the dynamics of the parts. Technically, there are states (and operators) which cannot be factorized as tensor products of the parts.

All of this is valid for any quantum system.
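
The points above can be sketched numerically. The following is a minimal illustration using NumPy, not a definitive implementation: the 3-level state, the stochastic matrix and the choice of the discrete Fourier matrix as the unitary are all arbitrary values picked for the example.

```python
import numpy as np

# A hypothetical 3-level state; the amplitudes are arbitrary illustration values.
amplitudes = np.array([1 + 1j, 2, -1j], dtype=complex)
state = amplitudes / np.linalg.norm(amplitudes)   # enforce sum_i |s_i|^2 = 1

# The probability of measuring outcome i is |s_i|^2.
probabilities = np.abs(state) ** 2

# Classic process: a (column-)stochastic matrix maps probabilities to probabilities.
stochastic = np.array([[0.9, 0.2, 0.0],
                       [0.1, 0.7, 0.5],
                       [0.0, 0.1, 0.5]])
next_probabilities = stochastic @ probabilities

# Quantum process: a unitary matrix maps amplitudes to amplitudes.
# The normalized discrete Fourier matrix serves as an example of a unitary.
unitary = np.fft.fft(np.eye(3)) / np.sqrt(3)
next_state = unitary @ state

print(probabilities.sum())           # total probability before the step
print(next_probabilities.sum())      # the stochastic step preserves the sum
print(np.linalg.norm(next_state))    # the unitary step preserves the norm
```

The stochastic matrix preserves the sum of the probabilities, while the unitary preserves the sum of the squared magnitudes of the amplitudes.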

The qubit

Let’s focus on two-state systems, e.g. an atom which can be in two different energy states. Such a system is called a qubit. A general state is a superposition of $|0\rangle$ and $|1\rangle$ with complex coefficients $\alpha$ and $\beta$:

$$|\psi\rangle = \alpha\,|0\rangle + \beta|1\rangle$$

and usually one thinks of $|0\rangle$ as the up-state and $|1\rangle$ as the down-state. If you represent things as a tuple:

$$|\psi\rangle = \begin{pmatrix} \alpha \\ \beta \end{pmatrix}.$$
This de facto representation is what you see as a software developer, but the quantum world has a twist. It’s related to group representations and using objects in different dimensions.


The ket-vector $|\psi\rangle$ is a pair of complex numbers satisfying

$$|\alpha|^2 + |\beta|^2 = 1$$

and you can multiply both values by an arbitrary phase to obtain the same state. Any complex number can be written in polar form, $\alpha=\rho\,e^{i\phi}, \beta =\xi\,e^{i\theta}$, so you can make the first value real by multiplying both with $e^{-i\phi}$. The phase of the second value becomes $\theta-\phi$, which we simply call $\theta$ again. Hence

$$|\psi\rangle = \rho\,|0\rangle + \xi\,e^{i\theta}|1\rangle$$

with real values $\xi, \rho$ on a circle: $\rho^2+\xi^2=1.$ This means that you can use an angle $\mu$ on the circle and represent the state as

$|\psi\rangle = \cos(\mu)\,|0\rangle + \sin(\mu)\,e^{i\theta}|1\rangle$

and if you collect the real first component together with the real and imaginary parts of the second component into a 3-dimensional real vector you get

$|\psi\rangle = \begin{pmatrix} \cos(\mu)\\ \sin(\mu)\cos(\theta) \\ \sin(\mu)\sin(\theta) \end{pmatrix} $.
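
The change of variables above can be traced step by step in code. This is a minimal NumPy sketch; the helper names `to_angles` and `to_real_vector` and the sample values of $\alpha$ and $\beta$ are hypothetical and only serve the illustration.

```python
import numpy as np

def to_angles(alpha, beta):
    """Remove the global phase of alpha and return the angles (mu, theta)."""
    norm = np.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    alpha, beta = alpha / norm, beta / norm
    phase = np.exp(-1j * np.angle(alpha))      # the factor e^{-i phi}
    alpha, beta = alpha * phase, beta * phase  # alpha is now real and non-negative
    mu = np.arctan2(abs(beta), alpha.real)     # cos(mu) = rho, sin(mu) = xi
    theta = np.angle(beta)                     # remaining relative phase
    return mu, theta

def to_real_vector(mu, theta):
    """The 3-dimensional real vector used in the text."""
    return np.array([np.cos(mu),
                     np.sin(mu) * np.cos(theta),
                     np.sin(mu) * np.sin(theta)])

# Hypothetical sample state, already normalized.
alpha, beta = (1 + 1j) / 2, 1j / np.sqrt(2)
mu, theta = to_angles(alpha, beta)
vector = to_real_vector(mu, theta)

# Rebuilding the ket from the angles gives the original state up to a global phase.
rebuilt = np.array([np.cos(mu), np.sin(mu) * np.exp(1j * theta)])
overlap = abs(np.vdot(rebuilt, np.array([alpha, beta])))

print(mu, theta)
print(vector, np.linalg.norm(vector))
print(overlap)
```

An overlap of magnitude one confirms that the two kets differ only by a global phase, which is exactly the freedom used to make the first component real.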

This little exercise is more than just a smart change of variables. It shows that an object like a ket-vector can be represented in different ways in different dimensions. The default representation is in a 2-dimensional complex space and the second one is in a 3-dimensional real space. These are called representations of the abstract object $|\psi\rangle$. This has consequences for the way you represent observables and pretty much everything else. Clearly, a Hermitian observable acting on the complex space would not work on the 3-dimensional space; you cannot apply a 2×2 matrix to a 3×1 vector.
Things are also not restricted to vectors; you can represent states using matrices, functions, anything you like.

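The change of variables above, which ends up on a sphere in 3-dimensional real space, leads to what is called the Bloch sphere. The standard convention uses the half-angle, $\mu=\vartheta/2$, so that the polar angle $\vartheta$ covers the whole sphere. You should also check the representation using Pauli spin matrices for an additional representation.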
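
As a pointer towards the Pauli-matrix representation, here is a minimal NumPy sketch which computes the Bloch vector as the three expectation values $\langle\sigma_x\rangle, \langle\sigma_y\rangle, \langle\sigma_z\rangle$. The function name `bloch_vector` and the sample angles are hypothetical. Note that this vector contains the doubled angle $2\mu$, so it is not identical to the 3-vector written earlier.

```python
import numpy as np

# The three Pauli spin matrices.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def bloch_vector(psi):
    """Return (<sigma_x>, <sigma_y>, <sigma_z>) for a qubit state psi."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)
    # np.vdot conjugates its first argument, giving <psi| sigma |psi>.
    return np.array([np.vdot(psi, s @ psi).real
                     for s in (sigma_x, sigma_y, sigma_z)])

# Hypothetical sample angles for the state cos(mu)|0> + sin(mu) e^{i theta}|1>.
mu, theta = 0.3, 1.1
psi = np.array([np.cos(mu), np.sin(mu) * np.exp(1j * theta)])
r = bloch_vector(psi)

# Closed form of the same expectation values in terms of the angles.
expected = np.array([np.sin(2 * mu) * np.cos(theta),
                     np.sin(2 * mu) * np.sin(theta),
                     np.cos(2 * mu)])

print(r)
print(expected)
print(np.linalg.norm(r))   # a pure state sits on the surface of the sphere
```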

Representation theory is related to group theory and how symmetry is implemented in physics. It’s key in pretty much any physical theory.

Quantum entropy and density operator

By now you probably wonder what all this has to do with quantum machine learning, right? The shift from classic ML to quantum ML really is an extension of things you have done before but in different dimensions and spaces. It stretches your mind in the beginning but you get used to it.

The concept of entropy is a good example of this ‘stretching’.

In a classic machine learning context you end up with probability vectors

$p=\begin{pmatrix}p_1 \\ p_2 \\ \dots\end{pmatrix}$

representing the prediction for a given input vector. The value $p_1$ is the probability that the input is classified with label 1, and so on. If you use neural networks with a softmax activation you automatically get this type of vector. The (classic) entropy is defined as

$$S = -\sum_i\,p_i\ln\,p_i$$

It indicates how mixed the whole thing is. If there is absolute certainty for a particular value you get zero entropy, meaning you have a pure unmixed situation.
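
A minimal NumPy sketch of the classic entropy, written with the usual minus sign so that the result is non-negative. The helper names `softmax` and `shannon_entropy` and the sample logits are hypothetical, chosen only for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn arbitrary real scores into a probability vector."""
    shifted = logits - np.max(logits)   # shift for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def shannon_entropy(p):
    """Classic entropy -sum_i p_i ln p_i, using the convention 0 ln 0 = 0."""
    p = np.asarray(p, dtype=float)
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log(nonzero)))

certain = [1.0, 0.0, 0.0]                        # absolute certainty
uniform = [1 / 3, 1 / 3, 1 / 3]                  # maximally mixed over 3 labels
predicted = softmax(np.array([2.0, 1.0, 0.1]))   # hypothetical network output

print(shannon_entropy(certain))
print(shannon_entropy(uniform), np.log(3))
print(shannon_entropy(predicted))
```

The certain prediction gives zero entropy, the uniform one gives the maximum $\ln 3$, and a typical softmax output lies in between.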

Now, imagine you put the probabilities on the diagonal of a matrix rather than in a vector:

$p=\begin{pmatrix}p_1 & 0 & 0 & \dots\\ 0 & p_2 & 0 & \dots \\ 0 & 0 & p_3 & \dots \\ \vdots & \vdots & \vdots & \ddots\end{pmatrix}$

the entropy is now

$S = -Tr(p\,\ln p)$

with the trace of the matrix corresponding to the previous sum. This formula is however valid for any operator $p$, not just diagonal ones, and is called the von Neumann entropy.
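
To see how the matrix formula reduces to the classic sum, here is a minimal NumPy sketch which evaluates the entropy through the eigenvalues of the matrix. It assumes the usual sign convention $S=-Tr(p\,\ln p)$; the function name `von_neumann_entropy` and the sample matrices are hypothetical.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """Entropy -Tr(rho ln rho) computed from the eigenvalues of rho."""
    eigenvalues = np.linalg.eigvalsh(rho)        # rho is assumed Hermitian
    nonzero = eigenvalues[eigenvalues > tol]     # convention 0 ln 0 = 0
    return float(-np.sum(nonzero * np.log(nonzero)))

# Diagonal case: identical to the classic entropy of (0.5, 0.3, 0.2).
probabilities = np.array([0.5, 0.3, 0.2])
diagonal = np.diag(probabilities)
classic = float(-np.sum(probabilities * np.log(probabilities)))

# Non-diagonal case: the projector on (|0> + |1>)/sqrt(2) is a pure state.
plus = np.array([1, 1]) / np.sqrt(2)
pure = np.outer(plus, plus.conj())

# Maximally mixed qubit for comparison.
mixed = np.eye(2) / 2

print(von_neumann_entropy(diagonal), classic)
print(von_neumann_entropy(pure))
print(von_neumann_entropy(mixed), np.log(2))
```

The pure state has off-diagonal entries everywhere and still gives zero entropy, which is why the eigenvalues, and not the matrix entries, are what matter.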

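Using QuTiP you can compute the von Neumann entropy like so

from qutip import basis, ket2dm, entropy_vn
import numpy as np
ket = (basis(4,0) + basis(4,1) + basis(4,3))/np.sqrt(3)  # equal superposition of three basis states
dm = ket2dm(ket)  # density matrix |ket><ket|
entropy_vn(dm)  # approximately zero, since dm describes a pure state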

The diagonal matrix $p$ above can also be written as

$p = \sum_i p_i|i\rangle\langle i|$

which is a special case of what is known as the density operator $\rho$

$$\rho = \sum_{i,j} \rho_{ij}|i\rangle\langle j|$$ with $Tr(\rho)=1$.

The density operator replaces the wave function of the system. For example, instead of computing the expectation value of an observable $A$ on the state $\psi$ you can use the density operator with

$\langle A\rangle = Tr(\rho\,A)$.
If you know the state $\psi$ you can compute the density matrix from

$\rho_{ij} = \langle i|\psi\rangle\langle\psi|j\rangle$.
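
Both formulas can be checked with a few lines of NumPy. This is a minimal sketch; the sample state and the choice of $\sigma_z$ as the observable $A$ are hypothetical.

```python
import numpy as np

# Hypothetical normalized qubit state.
psi = np.array([3, 4j], dtype=complex) / 5

# rho_ij = <i|psi><psi|j> = psi_i * conj(psi_j), i.e. an outer product.
rho = np.outer(psi, psi.conj())

# Hypothetical observable: the Pauli matrix sigma_z.
A = np.array([[1, 0], [0, -1]], dtype=complex)

via_state = np.vdot(psi, A @ psi).real    # <psi|A|psi>
via_density = np.trace(rho @ A).real      # Tr(rho A)

print(np.trace(rho).real)    # the trace of a density matrix is 1
print(via_state, via_density)
```

Both routes give the same expectation value, here $0.36 - 0.64 = -0.28$.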
For example, if you have a 2-qubit system it can be represented as a 4-vector corresponding to the four basis states:
$|00\rangle = \begin{pmatrix}1 \\ 0 \\ 0\\0\end{pmatrix}, |01\rangle = \begin{pmatrix}0 \\ 1 \\ 0\\0\end{pmatrix}, |10\rangle = \begin{pmatrix}0 \\ 0 \\ 1\\0\end{pmatrix},|11\rangle = \begin{pmatrix}0 \\ 0 \\ 0\\1\end{pmatrix}.$

The entangled state

$\left|tangle\right\rangle = \frac{1}{\sqrt{2}}(\left|11\right\rangle + \left|00\right\rangle)$

then has the density matrix

$\left|tangle\right\rangle\langle tangle| = \frac{1}{2}(|11\rangle + |00\rangle)(\langle 11| + \langle 00|)$

or in matrix form

$\frac{1}{2}\begin{pmatrix}1 & 0 & 0 & 1\\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1\end{pmatrix}.$

Using QuTiP you would do something like this

from qutip import basis, ket2dm
import numpy as np
epr = (basis(4,0) + basis(4,3))/np.sqrt(2)  # (|00> + |11>)/sqrt(2)
ket2dm(epr)  # density matrix |epr><epr|

and this returns the matrix shown above.
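
The statement that an entangled state cannot be factorized can also be tested numerically. The sketch below uses plain NumPy rather than QuTiP and relies on one standard criterion, stated here as an assumption of the example: a 2-qubit state with coefficients $c_{ij}$ is a product state exactly when the 2×2 matrix $c$ has a single nonzero singular value. The helper name `schmidt_rank` is hypothetical.

```python
import numpy as np

def schmidt_rank(state, tol=1e-12):
    """Number of nonzero singular values of the 2x2 coefficient matrix."""
    coefficients = np.asarray(state, dtype=complex).reshape(2, 2)
    singular_values = np.linalg.svd(coefficients, compute_uv=False)
    return int(np.sum(singular_values > tol))

zero = np.array([1, 0])
one = np.array([0, 1])
plus = (zero + one) / np.sqrt(2)

# A product state: first qubit in (|0> + |1>)/sqrt(2), second qubit in |0>.
product = np.kron(plus, zero)

# The entangled state (|00> + |11>)/sqrt(2) from the text.
bell = (np.kron(zero, zero) + np.kron(one, one)) / np.sqrt(2)
rho = np.outer(bell, bell.conj())   # its density matrix

print(schmidt_rank(product))   # 1: factorizes into single-qubit states
print(schmidt_rank(bell))      # 2: cannot be factorized
print(rho.real)
```

A rank of 1 means the state factorizes into two single-qubit states; the entangled state has rank 2, and its density matrix is the one shown in matrix form above.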