Deployment

Step 4: Deployment Instance Installations

If you are using the DLAMI, activate the pre-installed TensorFlow-Neuron environment (with the command source activate aws_neuron_tensorflow_p36) and skip this step.

On the instance you will use for inference, install TensorFlow-Neuron and the Neuron Runtime:

4.1. Follow Step 2 to install TensorFlow-Neuron.

  • Install neuron-cc if you want to compile on the inference instance (see the notes above on recommended Inf1 sizes for compilation).
  • Skip neuron-cc if you will not compile on the inference instance.

4.2. To install Neuron Runtime, see Getting started: Installing and Configuring Neuron-RTD.

Step 5: Deploy

In this step we run inference on Inf1 using the model compiled in Step 3.

5.1. Unzip the compiled model package from Step 3, download the example image, and install the pillow module, which is needed to load images:

unzip -o resnet50_neuron.zip
curl -O https://raw.githubusercontent.com/awslabs/mxnet-model-server/master/docs/images/kitten_small.jpg
pip install pillow # Necessary for loading images
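
Before moving on, it can help to confirm that the pieces from Step 4 and Step 5.1 are in place. The following is a minimal sketch, not part of the original tutorial: the helper name preflight and the two lists are hypothetical, and the expected paths are taken from the commands above and from the script in Step 5.2.

```python
import importlib.util
import os

# Module names the inference script imports (PIL is provided by the pillow package).
REQUIRED_MODULES = ["numpy", "tensorflow", "PIL"]
# Files and directories Step 5.1 is expected to leave in the working directory.
REQUIRED_PATHS = ["kitten_small.jpg", "resnet50_neuron"]


def preflight(modules=REQUIRED_MODULES, paths=REQUIRED_PATHS):
    """Return a dict mapping each requirement to True (found) or False (missing)."""
    status = {}
    for name in modules:
        # find_spec locates a top-level module without importing it.
        status["module:" + name] = importlib.util.find_spec(name) is not None
    for path in paths:
        status["path:" + path] = os.path.exists(path)
    return status


if __name__ == "__main__":
    for item, ok in preflight().items():
        print("{:<28} {}".format(item, "ok" if ok else "MISSING"))
```

Any line reported as MISSING points back to the step that should have produced it. Note that this only checks that the packages can be found, not that the Neuron Runtime is running.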

5.2. On the Inf1 instance, create an inference Python script named infer_resnet50.py with the following content:

import os
import time
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications import resnet50

tf.keras.backend.set_image_data_format('channels_last')

# Create input from image
img_sgl = image.load_img('kitten_small.jpg', target_size=(224, 224))
img_arr = image.img_to_array(img_sgl)
img_arr2 = np.expand_dims(img_arr, axis=0)
img_arr3 = resnet50.preprocess_input(img_arr2)

# Load model
COMPILED_MODEL_DIR = './resnet50_neuron/'
predictor_inferentia = tf.contrib.predictor.from_saved_model(COMPILED_MODEL_DIR)

# Run inference
model_feed_dict = {'input': img_arr3}
infa_rslts = predictor_inferentia(model_feed_dict)

# Display results
print(resnet50.decode_predictions(infa_rslts["output"], top=5)[0])

Here we provide an input image that is resized and converted into an array. The call to tf.contrib.predictor.from_saved_model loads the saved Neuron-compiled model and returns predictor_inferentia, a callable that consumes the image array. The inference results are stored in infa_rslts.
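
To make the input preparation concrete, here is a NumPy-only sketch of what the array looks like by the time it reaches the model. It assumes that resnet50.preprocess_input applies the "caffe"-style transform used by Keras for ResNet50 (RGB to BGR, then subtraction of the ImageNet per-channel means, with no scaling). The function name preprocess_like_resnet50 and the synthetic image are hypothetical stand-ins, not part of the tutorial script.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by "caffe"-style preprocessing.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def preprocess_like_resnet50(img_rgb):
    """Approximate the expand_dims + preprocess_input steps for one HxWx3 RGB array."""
    batch = np.expand_dims(img_rgb.astype(np.float32), axis=0)  # add batch axis: (1, H, W, 3)
    batch = batch[..., ::-1]                                    # reorder channels RGB -> BGR
    return batch - IMAGENET_BGR_MEAN                            # zero-center each channel


# Stand-in for the decoded 224x224 kitten image (uniform gray).
fake_img = np.full((224, 224, 3), 128, dtype=np.uint8)
batch = preprocess_like_resnet50(fake_img)
print(batch.shape, batch.dtype)
```

The leading batch dimension of 1 is what np.expand_dims adds in the script, and the 'input' key of the feed dictionary receives an array of this shape.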

5.3. Run the inference:

python infer_resnet50.py
[('n02123045', 'tabby', 0.6956522), ('n02127052', 'lynx', 0.120923914), ('n02123159', 'tiger_cat', 0.08831522), ('n02124075', 'Egyptian_cat', 0.06453805), ('n02128757', 'snow_leopard', 0.0087466035)]
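
Each entry in the printed list is a (WordNet ID, label, score) tuple. As a small illustration of how the list could be consumed, the sketch below reuses the output shown above as literal data; the THRESHOLD value is an arbitrary assumption chosen for the example.

```python
# Output copied from the run above: (WordNet ID, label, score) tuples.
results = [
    ('n02123045', 'tabby', 0.6956522),
    ('n02127052', 'lynx', 0.120923914),
    ('n02123159', 'tiger_cat', 0.08831522),
    ('n02124075', 'Egyptian_cat', 0.06453805),
    ('n02128757', 'snow_leopard', 0.0087466035),
]

# Pick the highest-scoring prediction.
best_id, best_label, best_score = max(results, key=lambda r: r[2])
print("top-1: {} ({:.1%})".format(best_label, best_score))

# Keep only the labels whose score reaches a hypothetical confidence threshold.
THRESHOLD = 0.10
confident = [label for _, label, score in results if score >= THRESHOLD]
print(confident)
```

Exact scores may differ slightly between runs and compiler versions, so comparing labels rather than raw floating-point values is the safer check.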