# Serving TensorFlow models

A SavedModel contains a complete TensorFlow program, including weights and computation. It does not require the original model-building code to run, which makes it useful for deploying (with TFLite, TensorFlow.js, or TensorFlow Serving) and for sharing models (with TFHub).

For a quick introduction, this section exports a pre-trained Keras model and serves image classification requests with it.

We’ll use an image of Grace Hopper as a running example, and a Keras pre-trained image classification model since it’s easy to use. Custom models work too, and are covered in detail later.

The top prediction for this image is “military uniform”.

The last directory component (/1/ here) is a version number for this export of your model; it allows tools like TensorFlow Serving to reason about the relative freshness of different versions of the same model.
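As a sketch of the export step (using a tiny `tf.Module` here in place of the pretrained Keras MobileNet so the example stays self-contained; the `tf.saved_model.save` call is the same either way):

```python
import os
import tempfile

import tensorflow as tf


class Scaler(tf.Module):
    """A tiny stand-in for the pretrained Keras model used in the text."""

    def __init__(self):
        super().__init__()
        self.scale = tf.Variable(2.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return self.scale * x


module = Scaler()

# The trailing /1/ is a version number; TensorFlow Serving uses it to
# reason about the relative freshness of different model versions.
export_dir = os.path.join(tempfile.mkdtemp(), "my_model", "1")
tf.saved_model.save(module, export_dir)

# The directory now contains saved_model.pb plus variables/ (and
# possibly assets/).
print(sorted(os.listdir(export_dir)))
```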

SavedModels have named functions called signatures. Keras models export their forward pass under the serving_default signature key. The SavedModel command line interface, covered in detail later in this guide, is useful for inspecting SavedModels on disk.

We can load the SavedModel back into Python with tf.saved_model.load and see how Admiral Hopper’s image is classified.

Imported signatures always return dictionaries.
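A minimal sketch of loading a SavedModel and calling a signature (a toy module here; the guide itself does this with the MobileNet export):

```python
import tempfile

import tensorflow as tf


class Doubler(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return 2.0 * x


module = Doubler()
path = tempfile.mkdtemp()
# Explicitly export __call__ under the default serving signature key.
tf.saved_model.save(
    module, path,
    signatures={"serving_default": module.__call__.get_concrete_function()})

loaded = tf.saved_model.load(path)
infer = loaded.signatures["serving_default"]
result = infer(tf.constant([1.0, 2.0]))
print(result)  # a dict mapping output names to tensors
```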

Running inference from the SavedModel gives the same result as the original model.

## Serving the model

SavedModels are usable from Python, but production environments will typically want a dedicated service for inference. This is easy to set up from a SavedModel using TensorFlow Serving.

See the TensorFlow Serving REST tutorial for more details about serving, including instructions for installing tensorflow_model_server in a notebook or on your local machine. As a quick sketch, to serve the mobilenet model exported above just point the model server at the SavedModel directory:
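As a sketch (the port, model name, and paths are illustrative), launching the model server against the export directory might look like:

```shell
# Serve the model exported under /tmp/mobilenet/1/.
# --model_base_path points at the directory *containing* the version dirs.
nohup tensorflow_model_server \
  --rest_api_port=8501 \
  --model_name=mobilenet \
  --model_base_path="/tmp/mobilenet" >server.log 2>&1
```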

The resulting predictions are identical to the results from Python.
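TensorFlow Serving's REST predict endpoint accepts a JSON body with an `instances` list. A sketch of building such a request with the standard library (the URL and model name are illustrative):

```python
import json

# Illustrative input: a batch of one 2-element feature vector.
instances = [[1.0, 2.0]]
body = json.dumps({"instances": instances})

# The request body would be POSTed to something like:
# http://localhost:8501/v1/models/mobilenet:predict
print(body)
```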

# SavedModel format

A SavedModel is a directory containing serialized signatures and the state needed to run them, such as variable values and vocabularies.

The saved_model.pb file contains a set of named signatures, each identifying a function.

SavedModels may contain multiple sets of signatures (multiple MetaGraphs, identified with the tag_set argument to saved_model_cli), but this is rare. APIs which create multiple sets of signatures include tf.Estimator.experimental_export_all_saved_models and, in TensorFlow 1.x, tf.saved_model.Builder.

The variables directory contains a standard training checkpoint (see the guide to training checkpoints).

The assets directory contains files used by the TensorFlow graph, for example text files used to initialize vocabulary tables. It is unused in this example.

SavedModels may have an assets.extra directory for any files not used by the TensorFlow graph, for example information for consumers about what to do with the SavedModel. TensorFlow itself does not use this directory.

# Exporting custom models

In the first section, tf.saved_model.save automatically determined a signature for the tf.keras.Model object. This worked because Keras Model objects have an unambiguous method to export and known input shapes. tf.saved_model.save works just as well with low-level model building APIs, but you will need to indicate which function to use as a signature if you’re planning to serve a model.

This module has two methods decorated with tf.function. While these functions will be included in the SavedModel and will be available if it is reloaded into a Python program via tf.saved_model.load, tools like TensorFlow Serving and saved_model_cli cannot access them unless a serving signature is explicitly declared.

module.mutate has an input_signature, so there is already enough information to save its computation graph in the SavedModel. __call__ has no input_signature, so it needs to be called before saving in order to trace a graph.
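A minimal sketch of such a module (the class and method names here are illustrative, following the shape described above):

```python
import tempfile

import tensorflow as tf


class CustomModule(tf.Module):
    def __init__(self):
        super().__init__()
        self.v = tf.Variable(1.0)

    @tf.function
    def __call__(self, x):
        return x * self.v

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def mutate(self, new_v):
        self.v.assign(new_v)


module = CustomModule()
# __call__ has no input_signature, so trace it by calling it once;
# the traced (scalar) shape is what gets saved.
module(tf.constant(3.0))

path = tempfile.mkdtemp()
tf.saved_model.save(module, path)
```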

For functions without an input_signature, any input shapes used before saving will be available after loading. Since we called __call__ with just a scalar, it will accept only scalar values.

The function will not accept new shapes like vectors.
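A sketch of that shape restriction (toy module; the exception message varies by TensorFlow version, so the example just checks that the call is rejected):

```python
import tempfile

import tensorflow as tf


class Times2(tf.Module):
    @tf.function
    def __call__(self, x):
        return 2.0 * x


module = Times2()
module(tf.constant(3.0))  # traced with a scalar only

path = tempfile.mkdtemp()
tf.saved_model.save(module, path)
loaded = tf.saved_model.load(path)

print(loaded(tf.constant(5.0)))  # scalar input: matches the saved trace

rejected = False
try:
    loaded(tf.constant([1.0, 2.0]))  # vector: no matching traced shape
except (ValueError, TypeError):
    rejected = True
```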

get_concrete_function lets you add input shapes to a function without calling it. It takes tf.TensorSpec objects in place of Tensor arguments, indicating the shapes and dtypes of inputs. Shapes can either be None, indicating that any shape is acceptable, or a list of axis sizes. If an axis size is None then any size is acceptable for that axis. tf.TensorSpecs can also have names, which default to the function’s argument keywords (“x” here).
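For example, tracing a function for any input shape without calling it:

```python
import tensorflow as tf


@tf.function
def double(x):
    return 2.0 * x


# Trace without calling: shape=None means any shape is acceptable.
concrete = double.get_concrete_function(
    tf.TensorSpec(shape=None, dtype=tf.float32, name="x"))

print(concrete(tf.constant([1.0, 2.0, 3.0])))
```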

Functions and variables attached to objects like tf.keras.Model and tf.Module are available on import, but many Python types and attributes are lost. The Python program itself is not saved in the SavedModel.

We didn’t identify any of the functions we exported as a signature, so the SavedModel has none.

## Identifying a signature to export

To indicate that a function should be a signature, specify the signatures argument when saving.

Notice that we first converted the tf.function to a ConcreteFunction with get_concrete_function. This is necessary because the function was created without a fixed input_signature, and so did not have a definite set of Tensor inputs associated with it.

We exported a single signature, and its key defaulted to “serving_default”. To export multiple signatures, pass a dictionary.
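A sketch of exporting multiple signatures with a dictionary (toy module; the signature keys here are illustrative):

```python
import tempfile

import tensorflow as tf


class Arith(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def double(self, x):
        return {"doubled": 2.0 * x}

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def square(self, x):
        return {"squared": x * x}


module = Arith()
path = tempfile.mkdtemp()
# A dictionary mapping signature keys to functions exports
# multiple signatures in one SavedModel.
tf.saved_model.save(module, path, signatures={
    "serving_default": module.double.get_concrete_function(),
    "square": module.square.get_concrete_function(),
})

loaded = tf.saved_model.load(path)
print(sorted(loaded.signatures.keys()))
```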

saved_model_cli can also run SavedModels directly from the command line.

## Fine-tuning imported models

Variable objects are available, and we can backprop through imported functions.
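A minimal sketch of fine-tuning through an imported function (toy model; a plain gradient step stands in for an optimizer):

```python
import tempfile

import tensorflow as tf


class Linear(tf.Module):
    def __init__(self):
        super().__init__()
        self.w = tf.Variable(1.0)

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        return self.w * x


path = tempfile.mkdtemp()
tf.saved_model.save(Linear(), path)
loaded = tf.saved_model.load(path)

# Backprop through the imported function and update the restored variable.
with tf.GradientTape() as tape:
    loss = (loaded(tf.constant(2.0)) - 10.0) ** 2
grads = tape.gradient(loss, [loaded.w])
loaded.w.assign_sub(0.05 * grads[0])  # manual gradient step
print(loaded.w.numpy())
```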

## Control flow in SavedModels

Anything that can go in a tf.function can go in a SavedModel. With AutoGraph this includes conditional logic which depends on Tensors, specified with regular Python control flow.
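For instance, a Tensor-dependent Python `if` survives saving and loading (toy module for illustration):

```python
import tempfile

import tensorflow as tf


class Sign(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def __call__(self, x):
        # AutoGraph converts this Python `if` on a Tensor into graph ops.
        if x > 0:
            return tf.constant(1.0)
        else:
            return tf.constant(-1.0)


path = tempfile.mkdtemp()
tf.saved_model.save(Sign(), path)
loaded = tf.saved_model.load(path)
print(loaded(tf.constant(3.0)).numpy(), loaded(tf.constant(-3.0)).numpy())
```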

# SavedModels from Estimators

Estimators export SavedModels through tf.Estimator.export_saved_model. See the guide to Estimators for details.

This SavedModel accepts serialized tf.Example protocol buffers, which are useful for serving. But we can also load it with tf.saved_model.load and run it from Python.

tf.estimator.export.build_raw_serving_input_receiver_fn allows you to create input functions which take raw tensors rather than tf.train.Examples.

# Load a SavedModel in C++

The C++ version of the SavedModel loader provides an API to load a SavedModel from a path, while allowing SessionOptions and RunOptions. You have to specify the tags associated with the graph to be loaded. The loaded version of SavedModel is referred to as SavedModelBundle and contains the MetaGraphDef and the session within which it is loaded.

```cpp
const string export_dir = ...;
SavedModelBundle bundle;
...
LoadSavedModel(session_options, run_options, export_dir,
               {kSavedModelTagTrain}, &bundle);
```

<a id="saved_model_cli"></a>

# Details of the SavedModel command line interface

You can use the SavedModel Command Line Interface (CLI) to inspect and execute a SavedModel. For example, you can use the CLI to inspect the model's SignatureDefs. The CLI enables you to quickly confirm that the input Tensor dtype and shape match the model. Moreover, if you want to test your model, you can use the CLI to do a sanity check by passing in sample inputs in various formats (for example, Python expressions) and then fetching the output.

### Install the SavedModel CLI

Broadly speaking, you can install TensorFlow in either of the following two ways:

* By installing a pre-built TensorFlow binary.
* By building TensorFlow from source code.

If you installed TensorFlow through a pre-built TensorFlow binary, then the SavedModel CLI is already installed on your system at pathname `bin/saved_model_cli`.

If you built TensorFlow from source code, you must run the following additional command to build saved_model_cli:

```
$ bazel build tensorflow/python/tools:saved_model_cli
```

### Overview of commands

The SavedModel CLI supports the following two commands on a MetaGraphDef in a SavedModel:

* show, which shows a computation on a MetaGraphDef in a SavedModel.
* run, which runs a computation on a MetaGraphDef.

### show command

A SavedModel contains one or more MetaGraphDefs, identified by their tag-sets. To serve a model, you might wonder what kind of SignatureDefs are in each model, and what their inputs and outputs are. The show command lets you examine the contents of the SavedModel in hierarchical order. Here's the syntax:

```
usage: saved_model_cli show [-h] --dir DIR [--all]
                            [--tag_set TAG_SET]
                            [--signature_def SIGNATURE_DEF_KEY]
```

For example, the following command shows all available MetaGraphDef tag-sets in the SavedModel:

```
$ saved_model_cli show --dir /tmp/saved_model_dir
The given SavedModel contains the following tag-sets:
serve
serve, gpu
```

The following command shows all available SignatureDef keys in a MetaGraphDef:

```
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
The given SavedModel MetaGraphDef contains SignatureDefs with the
following keys:
SignatureDef key: "classify_x2_to_y3"
SignatureDef key: "classify_x_to_y"
SignatureDef key: "regress_x2_to_y3"
SignatureDef key: "regress_x_to_y"
SignatureDef key: "regress_x_to_y2"
SignatureDef key: "serving_default"
```

If a MetaGraphDef has *multiple* tags in the tag-set, you must specify all tags, each tag separated by a comma. For example:

```
$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu
```

To show all inputs and outputs TensorInfo for a specific SignatureDef, pass the SignatureDef key to the --signature_def option. This is very useful when you want to know the tensor key value, dtype and shape of the input tensors for executing the computation graph later. For example:

```
$ saved_model_cli show --dir /tmp/saved_model_dir \
    --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
inputs['x'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: x:0
The given SavedModel SignatureDef contains the following output(s):
outputs['y'] tensor_info:
    dtype: DT_FLOAT
    shape: (-1, 1)
    name: y:0
Method name is: tensorflow/serving/predict
```

### run command

Invoke the run command to run a graph computation, passing inputs and then displaying (and optionally saving) the outputs. Here's the syntax:

```
usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
                           SIGNATURE_DEF_KEY [--inputs INPUTS]
                           [--input_exprs INPUT_EXPRS]
                           [--input_examples INPUT_EXAMPLES] [--outdir OUTDIR]
                           [--overwrite] [--tf_debug]
```

where INPUTS is either of the following formats:

* `input_key=<filename>`
* `input_key=<filename>[<variable_name>]`

You may pass multiple INPUTS. If you do pass multiple inputs, use a semicolon to separate each of the INPUTS.

saved_model_cli uses numpy.load to load the filename. The filename may be in any of the following formats:

* `.npy`
* `.npz`
* pickle format

A .npy file always contains a numpy ndarray. Therefore, when loading from
a .npy file, the content will be directly assigned to the specified input
tensor. If you specify a variable_name with that .npy file, the
variable_name will be ignored and a warning will be issued.

When loading from a .npz (zip) file, you may optionally specify a
variable_name to identify the variable within the zip file to load for
the input tensor key. If you don’t specify a variable_name, the SavedModel
CLI will check that only one file is included in the zip file and load it
for the specified input tensor key.

When loading from a pickle file, if no variable_name is specified in the square brackets, whatever is inside the pickle file will be passed to the specified input tensor key. Otherwise, the SavedModel CLI will assume a dictionary is stored in the pickle file and use the value corresponding to the variable_name.
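A sketch of preparing such input files with numpy (paths and key names are illustrative):

```python
import os
import tempfile

import numpy as np

d = tempfile.mkdtemp()

# .npy: the array is assigned directly to the input tensor key,
# e.g. --inputs 'x=x.npy' on the saved_model_cli command line.
np.save(os.path.join(d, "x.npy"), np.ones((3, 1), dtype=np.float32))

# .npz: [variable_name] selects an array inside the archive,
# e.g. --inputs 'x=inputs.npz[x]'.
np.savez(os.path.join(d, "inputs.npz"), x=np.zeros((2, 1)))

data = np.load(os.path.join(d, "inputs.npz"))
print(data["x"].shape)
```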

#### --input_exprs

To pass inputs through Python expressions, specify the --input_exprs option. This can be useful when you don’t have data files on hand but still want to sanity-check the model with some simple inputs that match the dtype and shape of the model’s SignatureDefs. For example:

In addition to Python expressions, you may also pass numpy functions. For
example:

(Note that the numpy module is already available to you as np.)
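Roughly speaking, each expression is evaluated as Python with numpy in scope as np. A simplified sketch of that behavior:

```python
import numpy as np

# What the CLI does with --input_exprs, roughly: each value string is a
# Python expression evaluated with numpy available as `np`, for example
# --input_exprs 'x=np.ones((2,3))'.
expr = "np.ones((2, 3))"
value = eval(expr, {"np": np})  # simplified sketch of the CLI's evaluation
print(value.shape)
```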

#### --input_examples

To pass tf.train.Example as inputs, specify the --input_examples option. For each input key, it takes a list of dictionaries, where each dictionary is an instance of tf.train.Example. The dictionary keys are the features, and the values are the value lists for each feature. For example:
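A sketch of the tf.train.Example that such a dictionary describes (the feature names are illustrative, mirroring e.g. `--input_examples 'inputs=[{"x":[1.0],"y":[2.0]}]'`):

```python
import tensorflow as tf

# One dictionary per tf.train.Example: feature name -> list of values.
features = {"x": [1.0], "y": [2.0]}
example = tf.train.Example(features=tf.train.Features(feature={
    name: tf.train.Feature(float_list=tf.train.FloatList(value=vals))
    for name, vals in features.items()
}))
serialized = example.SerializeToString()
print(len(serialized))
```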

#### Save output

By default, the SavedModel CLI writes output to stdout. If a directory is passed to the --outdir option, the outputs will be saved as .npy files named after output tensor keys under the given directory.

Use --overwrite to overwrite existing output files.
