Jetson Nano and TensorFlow

I got a Jetson Nano for Christmas from my brother-in-law. This is a Single Board Computer (SBC), similar to a Raspberry Pi, but more oriented towards machine learning as it has a powerful nVidia GPU on it. Neural network frameworks like TensorFlow run most efficiently on GPUs.

The components I’m using are:

  • Jetson Nano 2GB
  • Raspberry Pi USB-C power supply
  • Logitech wireless USB keyboard
  • 64GB SD card
  • PiHut USB switch (so you can switch the Nano's power on and off without having to use the plug socket)

Setup instructions from nVidia are pretty good:

https://developer.nvidia.com/embedded/learn/get-started-jetson-nano-devkit

Note that the Nano comes in two versions – 4GB and 2GB – and the first download link for the operating system is the 4GB version, so if you are using the 2GB version like me, make sure you click the second download link. It is a 6GB zip which took around an hour to download. After that it needs to be unzipped, which produces a 14GB disc image file, and written to your SD card. I’m using an Apple Mac, so as per the nVidia instructions, I used Etcher to copy the image to the SD card, which took around 10 minutes to write and another 3 minutes to validate. The operating system is an Ubuntu variant called Linux4Tegra, which runs the LXDE desktop environment (http://www.lxde.org/).

Note that the Jetson Nano does not include built-in Wi-Fi. In my case, I have a wireless range extender plugged in, and I’m using a wired connection to that. Alternatively, the Nano supports Wi-Fi via a USB adapter. For example:

https://learn.sparkfun.com/tutorials/adding-wifi-to-the-nvidia-jetson/all

After getting the Nano up and running, I installed Python 3 and TensorFlow by following these instructions:

https://docs.nvidia.com/deeplearning/frameworks/install-tf-jetson-platform/index.html
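Once TensorFlow is installed, it is worth checking that it can actually see the Nano’s GPU. This quick sanity test is my own addition, not part of the nVidia instructions, and assumes a TensorFlow 2 install:

#!/usr/bin/python3

import tensorflow as tf

# list the GPU devices TensorFlow can see - on the Nano this should report
# a single GPU (the Tegra X1)
print(tf.config.list_physical_devices('GPU'))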

I added an alias to .bash_aliases so that “python” runs “python3”. (If you add an alias like this, it is recommended that you add it for your user only, not system-wide, as a system-wide alias could break operating system functionality that still expects Python 2.)
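For reference, the alias itself is a single line in ~/.bash_aliases, something like:

# make "python" run python3 for the current user only
alias python=python3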

Then I ran through the TensorFlow quickstart tutorial, which builds an image classification neural network using the MNIST dataset of handwritten digits 0-9:

https://www.tensorflow.org/tutorials/quickstart/beginner

Although the code is all included in the above link, I want to inline it here to show just how short it is:

#!/usr/bin/python3

import tensorflow as tf

# load the standard MNIST sample dataset
# this is images of the handwritten digits 0-9
mnist = tf.keras.datasets.mnist

# load both training and test data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# the images have greyscale pixel values from 0-255. these need to be scaled to be from 0-1
x_train, x_test = x_train / 255.0, x_test / 255.0

# now define our model: flatten each 28x28 image into a 784-element vector,
# pass it through a 128-unit hidden layer with dropout, and output ten
# scores (logits), one per digit
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10)
])

# run the untrained model on the first training image - the output is a vector of ten logits, one per digit
predictions = model(x_train[:1]).numpy()

# softmax converts the logits into probabilities (the result is simply discarded here)
tf.nn.softmax(predictions).numpy()

# the loss function compares logits against the true labels
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

# loss of the untrained model on that first image
loss_fn(y_train[:1], predictions).numpy()

# compile with the Adam optimiser, the loss function above, and accuracy as the reported metric
model.compile(optimizer='adam',
              loss=loss_fn,
              metrics=['accuracy'])

# train for five passes over the training data
model.fit(x_train, y_train, epochs=5)

# measure accuracy on the test data the model has never seen
model.evaluate(x_test, y_test, verbose=2)

# run the trained model on the first five test images (the output is logits)
model(x_test[:5])

I think it is amazing that in around 20 lines of Python, we can load a dataset, define a neural network, train it and test it! Of course, the Python is orchestrating the process, not doing the heavy lifting. The hard work is done by compiled C++ and CUDA code running on the GPU. Running this on the Nano took a couple of minutes. The output shows the work being sent to the GPU, as we expect. Here is an edited excerpt from the output:

2020-12-30 11:46:56.368109: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1884] Adding visible gpu devices: 0
2020-12-30 11:48:03.136928: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1428] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
Epoch 1/5
1875/1875 [==============================] - 17s 9ms/step - loss: 0.2918 - accuracy: 0.9158
Epoch 2/5
1875/1875 [==============================] - 17s 9ms/step - loss: 0.1385 - accuracy: 0.9584
Epoch 3/5
1875/1875 [==============================] - 16s 9ms/step - loss: 0.1049 - accuracy: 0.9678
Epoch 4/5
1875/1875 [==============================] - 17s 9ms/step - loss: 0.0864 - accuracy: 0.9733
Epoch 5/5
1875/1875 [==============================] - 16s 9ms/step - loss: 0.0738 - accuracy: 0.9774
2020-12-30 11:51:36.564930: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 31360000 exceeds 10% of free system memory.
313/313 - 2s - loss: 0.0733 - accuracy: 0.9774
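The last line of the script returns raw logits for the first five test images. As a follow-on (this step is also covered at the end of the quickstart tutorial), you can wrap the trained model in a Softmax layer so that it outputs probabilities, and then pick the most likely digit for each image. A minimal sketch, assuming the script above has just run so that model, x_test and y_test are still in scope:

import numpy as np

# wrap the trained model so it outputs probabilities rather than logits
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

# probabilities for the first five test images - shape (5, 10)
probs = probability_model(x_test[:5]).numpy()

# the predicted digit is the class with the highest probability
print("predicted:", np.argmax(probs, axis=1))
print("actual:   ", y_test[:5])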