# Image recognition using MXNet

MXNet is an open source library for deep learning with the following salient features:

1. it supports both imperative and symbolic programming
2. it runs on CPUs or GPUs, on clusters, servers, desktops, or mobile phones
3. it supports over 7 programming languages, including C++, Python, R, Scala, Julia, MATLAB, and JavaScript.
4. it supports distributed training on multiple CPU/GPU machines, including AWS, GCE, Azure, and Yarn clusters.
5. it has an optimized C++ backend engine that parallelizes both I/O and computation.

What triggered my interest was the R package wrapping the whole library. While there are good neural network packages available, like nnet, this one has LSTM, RNN, GRU cells and whatnot. The nnet library only supports feed-forward nets and multinomial log-linear models.

To install, use something like:
install.packages("drat", repos="https://cran.rstudio.com")
drat:::addRepo("dmlc")
install.packages("mxnet")

and for our example you need the imager package as well.

You can train an image recognition net but to get immediate results you can download a pretrained model. This model can be loaded as follows:

library(mxnet)
library(imager)

model <- mx.model.load("Inception_BN", iteration = 39)
mean.img <- as.array(mx.nd.load("mean_224.nd")[["mean_img"]])
synsets <- readLines("synset.txt")


assuming the files are in the working directory.
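
If loading fails, the first thing to rule out is a missing file. Here is a minimal sketch of such a check in base R. It assumes that `mx.model.load` looks for `<prefix>-symbol.json` and `<prefix>-<iteration padded to four digits>.params`, which is my understanding of its naming scheme and not something stated above; `missing_files` is a hypothetical helper, not part of mxnet.

```r
# Files assumed to be read by mx.model.load("Inception_BN", iteration = 39),
# plus the two extra files used in this post.
prefix <- "Inception_BN"
iteration <- 39
needed <- c(
  paste0(prefix, "-symbol.json"),                # network definition (assumed name)
  sprintf("%s-%04d.params", prefix, iteration),  # trained weights (assumed name)
  "mean_224.nd",
  "synset.txt"
)

# Hypothetical helper: return the entries of `files` that do not exist in `dir`.
missing_files <- function(files, dir = getwd()) {
  files[!file.exists(file.path(dir, files))]
}

absent <- missing_files(needed)
if (length(absent) > 0) {
  message("Missing from ", getwd(), ": ", paste(absent, collapse = ", "))
}
```

An empty result means all four files were found in the working directory.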

The image recognition works only with jpg images, but that should not be a problem:

im <- load.image("Parrot.jpg")
plot(im)

Because the net expects an input of a fixed dimension, you cannot throw just any sized image at it. You need to crop it to a square, resize it to the predefined dimension (here 224×224), and subtract the mean image:

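preproc.image <- function(im, mean.image) {
  # Crop the central square so the aspect ratio is preserved
  shape <- dim(im)
  short.edge <- min(shape[1:2])
  xx <- floor((shape[1] - short.edge) / 2)
  yy <- floor((shape[2] - short.edge) / 2)
  cropped <- crop.borders(im, xx, yy)
  # Resize to the 224 x 224 input the network expects
  resized <- resize(cropped, 224, 224)
  # Rescale the pixel values to the 0-255 range and drop the depth dimension
  arr <- as.array(resized) * 255
  dim(arr) <- c(224, 224, 3)
  # Subtract the mean image passed in as an argument
  normed <- arr - mean.image
  # Add a fourth dimension for the batch size (a single image)
  dim(normed) <- c(224, 224, 3, 1)
  return(normed)
}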
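
To see what the cropping arithmetic and the final reshape do, here is a small sketch in base R that needs neither imager nor mxnet. `center_crop_borders` is a hypothetical helper that repeats the border computation from `preproc.image` on plain numbers, and the random pixels and the constant mean value of 117 are made-up stand-ins, not the real data.

```r
# Hypothetical helper: pixels to remove from each side along x and y
# so that only the central square remains.
center_crop_borders <- function(width, height) {
  short.edge <- min(width, height)
  c(x = floor((width - short.edge) / 2),
    y = floor((height - short.edge) / 2))
}

borders <- center_crop_borders(640, 480)
print(borders)  # x = 80, y = 0: 80 pixels come off the left and the right

# Stand-in data: random "pixels" in [0, 1] and a constant mean image.
arr <- array(runif(224 * 224 * 3), dim = c(224, 224, 3)) * 255
mean.image <- array(117, dim = c(224, 224, 3))  # 117 is an assumed value

normed <- arr - mean.image
dim(normed) <- c(224, 224, 3, 1)  # trailing 1 = a batch of one image
print(dim(normed))
```

A 640×480 image therefore loses 80 columns on each side before it is resized, and the array handed to the network has four dimensions: width, height, channels, and batch size.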

From here on it’s all fun and games:

normed <- preproc.image(im, mean.img)
prob <- predict(model, X=normed)

max.idx <- order(prob[,1], decreasing = TRUE)[1:5]
synsets[max.idx]
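
The `order(...)[1:5]` idiom is easier to follow on a toy example. The sketch below uses six invented labels and one column of invented probabilities in place of the real `synsets` and `prob`, and also shows how you could print the probability next to each label.

```r
# Toy stand-ins: made-up labels and one column of made-up probabilities.
labels <- c("cat", "dog", "macaw", "toucan", "lorikeet", "goldfinch")
prob <- matrix(c(0.01, 0.02, 0.70, 0.10, 0.14, 0.03), ncol = 1)

# Indices of the five largest probabilities, largest first.
top5 <- order(prob[, 1], decreasing = TRUE)[1:5]

# Pair each label with its probability.
print(data.frame(label = labels[top5], probability = prob[top5, 1]))
```

Here `top5` comes out as 3, 5, 4, 6, 2, so "macaw" is listed first and "cat", the least likely class, is dropped.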

With the parrots above, you get something like the following:

[1] "n01818515 macaw"
[2] "n01820546 lorikeet"
[3] "n01843383 toucan"
[4] "n01819313 sulphur-crested cockatoo, Kakatoe galerita, Cacatua galerita"
[5] "n01531178 goldfinch, Carduelis carduelis"

which, considering the amount of code needed and the small size of the model, is absolutely stunning.

The Python binding of MXNet on macOS is an issue. Installing the R wrapper is a breeze, but the dependency on OpenBLAS is a deal-breaker if you want to install the Python module. In fact, I was not able to get it working. My overall impression of MXNet is that it contains a lot of power but is not very reliable, probably because it is in a constant state of flux.
