This is a straightforward implementation of an autoencoder for MNIST digits. Instead of elaborating in lengthy sentences on the whole deep learning universe, I'll summarize things in telegram style:

  • MNIST is a collection of images representing handwritten digits
  • autoencoder means that you feed in the images and expect them to come out as close to identical as possible, with a reduction of the information in between. This can be compared to zipping and unzipping in one go and looking at the quality of the round trip. A perfect autoencoder would return the identical input.
  • Keras is a popular deep learning framework which internally uses TensorFlow or Theano for the actual computation on CPU/GPU
  • adadelta: an adaptive learning rate optimization algorithm, one of the many optimizers you can pick when using Keras
  • binary cross-entropy: a way to measure how close the output is to the input (see the tiny sketch after this list)
  • input/output are vectors: a flattened (linearized) representation of the square images, with pixel values scaled to the [0, 1] range
  • loss: in general, a measure of how large the (collective) error between input and output is
  • accuracy refers to correctness. If you wish, accuracy says how often the prediction was correct and loss tells you how big the mistakes are.
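
To make the cross-entropy and loss bullets concrete, here is a tiny sketch in plain NumPy; the four pixel values are made up for illustration and have nothing to do with the model below:

import numpy as np

# hypothetical toy "image" of four pixels and an imperfect reconstruction of it
x = np.array([0.0, 1.0, 1.0, 0.0])       # original pixel values in [0, 1]
x_hat = np.array([0.1, 0.9, 0.8, 0.2])   # reconstructed pixel values

eps = 1e-7  # keeps log() away from zero
bce = -np.mean(x * np.log(x_hat + eps) + (1.0 - x) * np.log(1.0 - x_hat + eps))
print("binary cross-entropy: %.4f" % bce)  # about 0.16 here; 0 would mean a perfect reconstruction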

Note that everything in this list is as much hard science as it is an art or craft. It's OK if you discover that another optimization algorithm suits your needs better, and it's fine to use any other deep learning framework as well.

from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
import numpy as np
import time

encoding_dim = 32  # from 784 to 32; serious compression
ae = Sequential()

inputLayer = Dense(784, input_shape=(784,))
ae.add(inputLayer)

middle = Dense(encoding_dim, activation='relu')
ae.add(middle)

output = Dense(784, activation='sigmoid')
ae.add(output)
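# the full stack is now: 784 inputs -> Dense(784) -> Dense(32) bottleneck -> Dense(784) reconstruction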

ae.compile(optimizer='adadelta', loss='binary_crossentropy', metrics=['accuracy'])

(x_train, _), (x_test, _) = mnist.load_data()

# x_train holds 28x28 matrices with grayscale values in [0, 255];
# dividing by 255 remaps them to the [0, 1] range
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# flattening the 28x28 matrix to a vector
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

start = time.time()
print("> Training the model...")
ae.fit(x_train, x_train,
       epochs=5,
       batch_size=256,
       verbose=0,
       shuffle=True,  # whether to shuffle the training data before each epoch
       validation_data=(x_test, x_test))

print("> Training is done in %.2f seconds." % (time.time() - start))

# how well does it work?
# (each metric is reported as 100 - value*100, so a low loss shows up as a high percentage)
print("> Scoring:")
scoring = ae.evaluate(x_test, x_test, verbose=0)
for name, value in zip(ae.metrics_names, scoring):
    print("   %s: %.2f%%" % (name, 100 - value * 100))

# loss: 90.34%
# acc: 98.62%
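
If you want to look at the round trip with your own eyes, a minimal sketch along these lines will do; it assumes matplotlib is installed and is not part of the original timing/scoring:

import matplotlib.pyplot as plt

decoded = ae.predict(x_test[:5])              # reconstructions of the first five test digits
for i in range(5):
    plt.subplot(2, 5, i + 1)                  # top row: the originals
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    plt.axis('off')
    plt.subplot(2, 5, i + 6)                  # bottom row: the reconstructions
    plt.imshow(decoded[i].reshape(28, 28), cmap='gray')
    plt.axis('off')
plt.show()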