Data Science, Kubernetes, Machine Learning, MLOps, Tensorflow

Kubernetes Minikube #3 – Jupyter, Tensorflow with (external) GPU

Kubernetes scheduling a pod to a node with GPU resources to run a Jupyter notebook container demonstrating a Python/tensorflow-gpu based application. All from a local Intel NUC machine using an eGPU. Yeah, we can do that. In this blog we take the NUC with eGPU – a Big Little ML Rig, which we built earlier, to the next level by adding Kubernetes support.

Objective

The steps below represent what is possible today, and some of it is not so pretty; expect a bunch of improvements to come. To build our local workstation ML eGPU Kubernetes rig, proceed as follows:

  1. Build the base Machine Learning Rig as described at NUC with eGPU – a Big Little ML Rig
  2. Verify basic Docker and nvidia-docker use cases
  3. Become familiar with using Kubernetes’ Minikube as described at Messing with Kubernetes Minikube #1 – Configmaps, Storage, Ingress
  4. Create an mlops project namespace and install a GPU-aware tensorflow-gpu Jupyter notebook pod
  5. Run a few machine learning test cases to verify the environment

[Image: kube-egpu]

Setup

0. Machine Learning Rig Setup

Prepare your machine as per the instructions at NUC with eGPU – a Big Little ML Rig. Verify your rig is ready with a few tests such as those below.

$ nvidia-smi
...
$ cd ~/MLOps/NVIDIA_CUDA-9.0_Samples/bin/x86_64/linux/release
$ ./deviceQuery
...
PASS

1. Basic Docker Tests

Now verify Docker/NVIDIA/GPU integration by running a tensorflow image, first via the nvidia-docker wrapper and then via straight Docker. If successful, the eGPU should be visible as per below.

$ minikube stop
$ systemctl restart nvidia-docker
$ curl -s localhost:3476/docker/cli
$ systemctl restart docker.service
$ docker pull gcr.io/tensorflow/tensorflow:latest-gpu
$ nvidia-docker run -it gcr.io/tensorflow/tensorflow:latest-gpu /bin/bash 
root@123567890:/notebooks# python 
>>> import tensorflow as tf 
>>> tf.test.gpu_device_name()
...
u'/gpu:0'
Ctrl-D
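
The curl to localhost:3476 queries the nvidia-docker-plugin REST API for the device and volume arguments that plain Docker needs. On this rig the output looks something like the following (device nodes and driver version will vary):

$ curl -s localhost:3476/docker/cli
--volume-driver=nvidia-docker --volume=nvidia_driver_384.90:/usr/local/nvidia:ro --device=/dev/nvidiactl --device=/dev/nvidia-uvm --device=/dev/nvidia0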

$ docker pull gcr.io/tensorflow/tensorflow:latest-gpu
$ sudo docker run -it -p 8888:8888 `curl -s localhost:3476/docker/cli` gcr.io/tensorflow/tensorflow:latest-gpu
root@123567890:/notebooks# python 
>>> import tensorflow as tf 
>>> tf.test.gpu_device_name() 
... 
u'/gpu:0'
Ctrl-D

You can also do this from Jupyter. Launch a browser with the Jupyter notebook URL displayed in the container stdout. From the notebook, create a new Python 2 script, run it, and verify the results as per below.
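
A minimal cell for that check (assuming the stock Python 2 kernel in the tensorflow image) is:

import tensorflow as tf

# Should print the eGPU device, e.g. u'/gpu:0'
print(tf.test.gpu_device_name())

# Log device placement to confirm ops actually land on the GPU
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(tf.constant('Hello from the eGPU')))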

[Image: notebook-test]

2. Launch Minikube

Good so far! Launch Minikube with the Accelerators feature gate. It may be worthwhile restarting Minikube from a clean/reset environment as described at Messing with Kubernetes Minikube #1 – Configmaps, Storage, Ingress.

$ minikube stop 

$ sudo minikube start \ 
  --vm-driver=none \ 
  --feature-gates=Accelerators=true
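
Once Minikube is up, a quick health check; the node should report Ready:

$ minikube status
$ kubectl get nodes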

3. Inspect Kubernetes System

From a separate terminal window launch the dashboard and tail the logfiles.

$ export NODE=$(kubectl get nodes --no-headers | awk '{print $1}')
$ kubectl describe node $NODE
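
With the Accelerators gate enabled and the NVIDIA driver in place, the node capacity should advertise the GPU; look for a line something like:

Capacity:
 alpha.kubernetes.io/nvidia-gpu:  1
 ...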

$ minikube dashboard 
$ journalctl -f -u docker.service

Verify

0. Set up Environment

Create a project namespace for the pods you are about to create. We will need some artefacts from inside the pods later, so copy them across to a volume that will be mounted into the pods. Your cudnn_samples and nvidia-driver location/version may vary.

# Create a new namespace 
$ kubectl create -f - << EOF! 
apiVersion: v1 
kind: Namespace 
metadata: 
  name: mlops 
EOF!
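
(kubectl create namespace mlops is the one-line equivalent.) Confirm the namespace exists:

$ kubectl get ns mlops
NAME      STATUS    AGE
mlops     Active    5s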

$ cd /var/lib/nvidia-docker/volumes/nvidia_driver/384.90
$ sudo cp -r /usr/local/cuda-8.0/samples .
$ sudo cp -r $HOME/MLOps/cudnn_samples_v7 .
$ sudo mkdir cudnn
$ sudo cp ~/Downloads/libcudnn6_6.0.21-1+cuda8.0_amd64.deb cudnn
$ sudo cp ~/Downloads/libcudnn6-doc_6.0.21-1+cuda8.0_amd64.deb cudnn
$ sudo cp ~/Downloads/libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb cudnn
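
As a sanity check, the driver volume should now contain the copied artefacts alongside the driver files; the listing below is indicative only:

$ ls /var/lib/nvidia-docker/volumes/nvidia_driver/384.90
bin  cudnn  cudnn_samples_v7  lib  lib64  samples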

1. Tensorflow with Jupyter Pod

Now for our first pod. Inspect the nvidia, gpu and volume related configuration in https://bitbucket.org/emergile/MLOps/src/master/tensorflow/tf-jupyter.yaml, in particular the alpha.kubernetes.io/nvidia-gpu setting and the lib path. You may need to download the .yaml file and edit the lib path to reflect your NVIDIA driver version. Create the pod and verify as follows; results should match the Setup tests.

$ export YAML=https://bitbucket.org/emergile/MLOps/src/master/tensorflow/tf-jupyter.yaml

$ curl -s $YAML | grep nvidia
 path: /var/lib/nvidia-docker/volumes/nvidia_driver/384.90
 alpha.kubernetes.io/nvidia-gpu: 1
 alpha.kubernetes.io/nvidia-gpu: 1
 - mountPath: /usr/local/nvidia
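
In context, those lines sit in the pod spec roughly as follows (a sketch only; the tf-jupyter and nvidia-driver names here are illustrative, and the file on Bitbucket is authoritative):

apiVersion: v1
kind: Pod
metadata:
  name: tf-jupyter
spec:
  containers:
  - name: tf-jupyter
    image: gcr.io/tensorflow/tensorflow:latest-gpu
    resources:
      requests:
        alpha.kubernetes.io/nvidia-gpu: 1
      limits:
        alpha.kubernetes.io/nvidia-gpu: 1
    volumeMounts:
    - name: nvidia-driver
      mountPath: /usr/local/nvidia
  volumes:
  - name: nvidia-driver
    hostPath:
      path: /var/lib/nvidia-docker/volumes/nvidia_driver/384.90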

$ kubectl create -n=mlops -f $YAML
$ kubectl get pods -n=mlops
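
The pod should reach Running once the (large) image has been pulled; the pod name comes from the yaml metadata, so tf-jupyter below is indicative only:

NAME         READY     STATUS    RESTARTS   AGE
tf-jupyter   1/1       Running   0          1m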

$ export POD=$(kubectl get pods -n=mlops --no-headers | awk '{print $1}')
$ kubectl exec -it -n=mlops $POD -- /bin/bash
root@$POD:/notebooks# cd /usr/local/nvidia/samples/1_Utilities/deviceQuery
root@$POD:/notebooks# ./deviceQuery
...
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 9.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1070
Result = PASS
root@$POD:/notebooks# python
...
>>> import tensorflow as tf 
>>> tf.test.gpu_device_name()
...
u'/gpu:0'
Ctrl-D

2. Jupyter Samples

Try some of the other notebooks by copying/pasting the Jupyter token from the pod logfile and appending it to the URL http://127.0.0.1:30888/?token= in your browser. All should complete successfully.
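
The token can be pulled straight from the pod logfile, e.g.:

$ kubectl logs -n=mlops $POD | grep token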

[Image: jupyter-samples]

3. cuDNN mnistCUDNN Sample

Install the cuDNN packages that were copied to your volume mount, copy the cuDNN samples to a writable location (e.g. /home), then verify as follows.

$ export POD=$(kubectl get pods -n=mlops --no-headers | awk '{print $1}') 
$ kubectl exec -it -n=mlops $POD -- /bin/bash 

root@$POD:/notebooks# cd /usr/local/nvidia/cudnn
root@$POD:/notebooks# dpkg -i libcudnn6_6.0.21-1+cuda8.0_amd64.deb
root@$POD:/notebooks# dpkg -i libcudnn6-dev_6.0.21-1+cuda8.0_amd64.deb
root@$POD:/notebooks# dpkg -i libcudnn6-doc_6.0.21-1+cuda8.0_amd64.deb

root@$POD:/notebooks# cp -r /usr/local/nvidia/cudnn_samples_v7 /home
root@$POD:/notebooks# cd /home/cudnn_samples_v7/mnistCUDNN
root@$POD:/notebooks# ./mnistCUDNN
cudnnGetVersion() : 6021 , CUDNN_VERSION from cudnn.h : 6021 (6.0.21)
Host compiler version : GCC 5.4.0
There are 1 CUDA capable devices on your machine :
device 0 : sms 15 Capabilities 6.1, SmClock 1721.0 Mhz, MemSize (Mb) 8114, MemClock 4004.0 Mhz, Ecc=0, boardGroupID=0
Using device 0
...
Result of classification: 1 3 5

Test passed!

4. Mandelbrot

Try the Mandelbrot example again by reading the sample at https://bitbucket.org/emergile/MLOps/src/master/gpu/Mandelbrot.pl into a Python 2 script in your Jupyter notebook.
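
The core of that sample follows the classic TensorFlow Mandelbrot tutorial; a condensed sketch (which may differ in detail from the Bitbucket file) looks like this:

import numpy as np
import tensorflow as tf

# Complex grid covering the Mandelbrot region
Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
xs = tf.constant((X + 1j * Y).astype(np.complex64))
zs = tf.Variable(xs)
ns = tf.Variable(tf.zeros_like(xs, tf.float32))

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Iterate z = z*z + c, counting steps until divergence
zs_ = zs * zs + xs
not_diverged = tf.abs(zs_) < 4
step = tf.group(zs.assign(zs_),
                ns.assign_add(tf.cast(not_diverged, tf.float32)))

for i in range(200):
    step.run()

# ns now holds the escape counts; visualise with matplotlib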

[Image: Mandelbroot-kube]

5. Not Hot Dog

And of course, repeat our homage to the “Silicon Valley” Not Hotdog episode.

$ export POD=$(kubectl get pods -n=mlops --no-headers | awk '{print $1}') 
$ kubectl exec -it -n=mlops $POD -- /bin/bash 
root@$POD:/notebooks# apt-get update
root@$POD:/notebooks# apt-get install -y git wget
root@$POD:/notebooks# git clone https://github.com/tensorflow/models.git
root@$POD:/notebooks# cd models/tutorials/image/imagenet
root@$POD:/notebooks# python classify_image.py
...
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89632)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00766)
...
root@$POD:/notebooks# wget https://upload.wikimedia.org/wikipedia/commons/3/3a/NCI_Visuals_Food_Hot_Dog.jpg
root@$POD:/notebooks# python classify_image.py --image_file=NCI_Visuals_Food_Hot_Dog.jpg
...
hotdog, hot dog, red hot (score = 0.96892)
...

Trivia

Troubleshooting

Yes, you will run into problems. Review the various troubleshooting tips in earlier blogs under Kubernetes, as these still apply.
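
When a test fails, the usual suspects can be interrogated with:

$ kubectl describe pod -n=mlops $POD   # scheduling and GPU resource errors
$ kubectl logs -n=mlops $POD           # container stdout, Jupyter token
$ minikube logs
$ journalctl -f -u docker.service
$ nvidia-smi                           # is the eGPU still visible to the host?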

Futures

This is a smoking hot new capability and, for sure, there are many improvements to come. There are a few things not to like about the current alpha release, but there are promising things in the pipeline, such as device plugins (https://kubernetes.io/docs/concepts/cluster-administration/device-plugins/):

“Starting in version 1.8, Kubernetes provides a device plugin framework for vendors to advertise their resources to the kubelet without changing Kubernetes core code. Instead of writing custom Kubernetes code, vendors can implement a device plugin that can be deployed manually or as a DaemonSet. The targeted devices include GPUs, High-performance NICs, FPGAs, InfiniBand, and other similar computing resources that may require vendor specific initialization and setup.”
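
For example, with a vendor device plugin such as NVIDIA's deployed, the alpha.kubernetes.io/nvidia-gpu annotation used above gives way to a vendor resource name in the container spec, roughly:

resources:
  limits:
    nvidia.com/gpu: 1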

 
