Step 2: Compilation Instance Installations

If you are using Conda DLAMI version 26 or later, activate the pre-installed TensorFlow-Neuron environment with the command source activate aws_neuron_tensorflow_p36. Then update Neuron by following the update steps in the DLAMI release notes.

To install on your own AMI, see the Neuron Install Guide to set up a virtual environment and install the TensorFlow-Neuron (tensorflow-neuron) and Neuron Compiler (neuron-cc) packages.
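
After installation, a quick sanity check can confirm that the environment is usable before moving on. The following is a minimal sketch, not part of the official install steps. It assumes that the tensorflow-neuron package provides the tensorflow.neuron module (as imported by the script in Step 3) and that the neuron-cc package places a neuron-cc executable on PATH. The helper names module_available and check_environment are illustrative.

```python
import importlib.util
import shutil


def module_available(name):
    """Return True if the module `name` can be located.

    For dotted names, find_spec imports the parent package first, so a
    missing parent is reported as "not available" instead of raising.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def check_environment():
    """Collect a small report on the pieces Step 3 relies on."""
    return {
        # Assumption: tensorflow-neuron provides the tensorflow.neuron module.
        "tensorflow.neuron": module_available("tensorflow.neuron"),
        # Assumption: neuron-cc installs a command-line tool of the same name.
        "neuron-cc": shutil.which("neuron-cc") is not None,
    }


if __name__ == "__main__":
    for component, found in check_environment().items():
        print(f"{component}: {'found' if found else 'MISSING'}")
```

If either entry is reported as MISSING, revisit the install steps above before attempting compilation.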

Step 3: Compile on Compilation Instance

A trained model must be compiled for the Inferentia target before it can be deployed on Inferentia instances. In this step we compile the Keras ResNet50 model and export it as a SavedModel, which is an interchange format for TensorFlow models.

3.1. Create a Python script with the following content:

import os
import time
import shutil
import tensorflow as tf
import tensorflow.neuron as tfn
import tensorflow.compat.v1.keras as keras
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input

# Create a workspace
WORKSPACE = './ws_resnet50'
os.makedirs(WORKSPACE, exist_ok=True)

# Prepare export directory (old one removed)
model_dir = os.path.join(WORKSPACE, 'resnet50')
compiled_model_dir = os.path.join(WORKSPACE, 'resnet50_neuron')
shutil.rmtree(model_dir, ignore_errors=True)
shutil.rmtree(compiled_model_dir, ignore_errors=True)

# Instantiate Keras ResNet50 model

model = ResNet50(weights='imagenet')

# Export SavedModel from the current Keras session (TensorFlow 1.x API)
tf.saved_model.simple_save(
    session            = keras.backend.get_session(),
    export_dir         = model_dir,
    inputs             = {'input': model.inputs[0]},
    outputs            = {'output': model.outputs[0]})

# Compile using Neuron
tfn.saved_model.compile(model_dir, compiled_model_dir)    

# Prepare SavedModel for uploading to Inf1 instance
shutil.make_archive('./resnet50_neuron', 'zip', WORKSPACE, 'resnet50_neuron')

3.2. Run the compilation script, which takes a few minutes on a c5.4xlarge instance. At the end of script execution, the compiled SavedModel is zipped as resnet50_neuron.zip in the current directory. The script prints output similar to the following:

INFO:tensorflow:fusing subgraph neuron_op_d6f098c01c780733 with neuron-cc
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 556
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
INFO:tensorflow:Successfully converted ./ws_resnet50/resnet50 to ./ws_resnet50/
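
The counters in this log give a rough indication of how much of the graph ends up on Inferentia: here 554 of the 556 operations that remain after optimization are placed on the Neuron runtime. As a minimal sketch, the ratio can be extracted from captured log text as shown below. The function name neuron_placement is illustrative, and the patterns assume the exact log wording shown above, which may differ between Neuron releases.

```python
import re

# Patterns for the two counters reported in the sample log above.
OPS_AFTER_OPT = re.compile(
    r"Number of operations after tf\.neuron optimizations: (\d+)")
OPS_ON_NEURON = re.compile(
    r"Number of operations placed on Neuron runtime: (\d+)")


def neuron_placement(log_text):
    """Return (ops_after_optimization, ops_on_neuron, fraction) or None."""
    total = OPS_AFTER_OPT.search(log_text)
    placed = OPS_ON_NEURON.search(log_text)
    if total is None or placed is None:
        return None  # the log did not contain both counters
    total_ops, placed_ops = int(total.group(1)), int(placed.group(1))
    fraction = placed_ops / total_ops if total_ops else 0.0
    return total_ops, placed_ops, fraction


sample_log = """\
INFO:tensorflow:Number of operations in TensorFlow session: 4638
INFO:tensorflow:Number of operations after tf.neuron optimizations: 556
INFO:tensorflow:Number of operations placed on Neuron runtime: 554
"""

total_ops, placed_ops, fraction = neuron_placement(sample_log)
# prints: 554/556 operations on Neuron (99.6%)
print(f"{placed_ops}/{total_ops} operations on Neuron ({fraction:.1%})")
```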

3.3. If you compile and run inference on different instances, copy the artifact to the inference instance:

scp -i <PEM key file> ./resnet50_neuron.zip ubuntu@<instance DNS>:~/  # if using Ubuntu-based AMI
scp -i <PEM key file> ./resnet50_neuron.zip ec2-user@<instance DNS>:~/  # if using AML2-based AMI
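
Once the archive is on the inference instance, it can be checked and unpacked before use. The sketch below is one possible way to do this with the Python standard library. It assumes the archive is named resnet50_neuron.zip (the name produced by the shutil.make_archive call in step 3.1) and relies on the fact that a SavedModel directory contains a file called saved_model.pb. The helper names find_saved_models and unpack_model are illustrative.

```python
import os
import posixpath
import shutil
import zipfile


def find_saved_models(archive_path):
    """Return the directories inside the zip that contain a saved_model.pb."""
    with zipfile.ZipFile(archive_path) as archive:
        # Zip member names always use forward slashes, hence posixpath.
        return sorted(
            posixpath.dirname(name)
            for name in archive.namelist()
            if posixpath.basename(name) == "saved_model.pb"
        )


def unpack_model(archive_path, dest_dir="."):
    """Unpack the archive into dest_dir after checking it holds a SavedModel."""
    models = find_saved_models(archive_path)
    if not models:
        raise ValueError(f"no saved_model.pb found in {archive_path}")
    shutil.unpack_archive(archive_path, dest_dir, "zip")
    return [os.path.join(dest_dir, model) for model in models]


if __name__ == "__main__":
    # Assumption: the archive was copied to the current directory under the
    # name produced by the make_archive call in step 3.1.
    archive = "./resnet50_neuron.zip"
    if os.path.exists(archive):
        print(unpack_model(archive))
```

Checking for saved_model.pb before unpacking catches the case where the wrong file, or an incomplete archive, was copied.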