UPDATE – Visit SparkR with OpenShift for the latest r-notebook/OpenShift solution. – UPDATE
The cool kids are using Jupyter notebooks. But we are going to step it up a notch by hosting an R-enabled Jupyter notebook on OpenShift. Max Bugger, may he rest in peace, will show you how. This lab is another in the OpenShift MiniLabs series.
Objectives
Let’s demonstrate hosting a Jupyter notebook instance as a container managed by OpenShift. Pulling down a prebuilt r-notebook image means we can get started without any messy local workstation configuration. Moreover, the user work directory is mapped to an external volume, so any installed packages and scripts are preserved when the container restarts.
Setup
Initial Attempt
This tutorial assumes you have completed the OpenShift MiniLabs installation procedure. Then refresh before continuing.
Repeat Attempt
To reset your environment and repeat this tutorial, do the following:
$ cd ~/containersascode
$ ./oc-cluster-wrapper/oc-cluster up containersascode
$ oc login -u system:admin
$ oc delete persistentvolumeclaim workclaim
$ oc delete persistentvolume workvolume
$ rm -rf ~/.oc/profiles/containersascode/volumes/workvolume
$ oc login -u developer -p developer
$ oc delete project jupyter
Instructions
This demonstration begins by creating a persistent volume that a container instance can claim later. This step is typically done by an administrator.
Create the Persistent Volume
Replace $VOLUMEPATH below with your preferred host-path location.
$ oc login -u system:admin
$ oc get pv
$ oc create -f - << EOF!
apiVersion: v1
kind: PersistentVolume
metadata:
  name: workvolume
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: $VOLUMEPATH
EOF!
$ oc get pv
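Because the here-document above is unquoted, the shell substitutes $VOLUMEPATH before oc ever sees the manifest, so an unset variable silently produces an empty path. The following is a minimal sketch of a guard for that, not part of the original lab: `pv_manifest` is a hypothetical helper name, and rejecting relative paths is an assumption about what you want for a host path.

```shell
# Hypothetical helper: print the PersistentVolume manifest for a given host path
# so it can be inspected before it is piped to `oc create -f -`.
pv_manifest() {
  volume_path=$1
  # Refuse an empty or relative path (assumption: host paths should be absolute)
  case $volume_path in
    /*) ;;
    *) echo "pv_manifest: expected an absolute path, got '$volume_path'" >&2
       return 1 ;;
  esac
  # Same manifest as in the lab, with the path filled in
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: workvolume
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: $volume_path
EOF
}
```

With this in your shell, something like `pv_manifest /tmp/workvolume | oc create -f -` should be equivalent to the command above, where /tmp/workvolume is only an example location.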
Create Project
Let’s create a project for our new application. The Jupyter r-notebook container needs some extra privileges, so we will assign them as follows:
$ oc login -u developer -p developer
$ oc new-project jupyter --display-name='Jupyter' --description='Jupyter'
$ oc login -u system:admin
$ oc project jupyter
$ oc adm policy add-scc-to-user anyuid -z default
Create Application
You can create an OpenShift container straight from an image on Docker Hub.
$ oc login -u developer -p developer
$ oc project jupyter
$ docker pull jupyter/r-notebook
$ oc new-app jupyter/r-notebook -l name='r-notebook' --name='r-notebook'
$ oc expose service r-notebook
Claim the Storage
The set volume command will automatically trigger a new deployment of the r-notebook container.
$ oc login -u developer -p developer
$ oc project jupyter
$ oc set volume dc/r-notebook --add \
    --overwrite \
    --name=work \
    --type=persistentVolumeClaim \
    --mount-path=/home/jovyan/work \
    --claim-size=1Gi \
    --claim-name=workclaim \
    --containers=r-notebook
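Once the new deployment settles, it can help to confirm that workclaim actually bound to workvolume. Here is a minimal sketch, assuming `oc get pvc` prints its usual table with NAME, STATUS and VOLUME as the first three columns. `claim_binding` is a hypothetical helper, not an oc command.

```shell
# Hypothetical helper: read `oc get pvc` output on stdin and print
# "<status> <volume>" for the named claim (defaults to workclaim).
claim_binding() {
  awk -v claim="${1:-workclaim}" '$1 == claim { print $2, $3 }'
}

# Intended use (needs a logged-in oc session, so it is left commented out):
#   oc get pvc | claim_binding
```

If the claim matched the volume created earlier, the helper should print the status followed by workvolume. A claim that stays Pending usually means no volume satisfied the requested size or access mode.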
Log in to the r-notebook
To log in to the r-notebook instance we will need to recover the token. You can copy it from the container’s ($PODID) log using the commands below, or from the Console. You can then inspect your shiny new Jupyter notebook with a URL such as http://r-notebook-jupyter.127.0.0.1.nip.io/?token=$TOKEN
$ oc login -u developer -p developer
$ oc project jupyter
$ oc get pods | grep r-notebook
$ oc logs -f $PODID
# Copy the token $TOKEN
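The copy/paste step above can also be scripted. What follows is a sketch under two assumptions: the pod list has the usual NAME, READY, STATUS columns, and the Jupyter log contains a URL ending in `?token=` followed by an alphanumeric token. The function names `pick_pod`, `extract_token` and `notebook_url` are made up for this example.

```shell
# Read `oc get pods` output on stdin and print the first Running r-notebook pod,
# skipping the short-lived deployer pod.
pick_pod() {
  awk '$1 ~ /^r-notebook-/ && $1 !~ /-deploy$/ && $3 == "Running" { print $1; exit }'
}

# Read Jupyter log text on stdin and print the first token found in a URL.
extract_token() {
  sed -n 's/.*[?&]token=\([0-9A-Za-z]*\).*/\1/p' | head -n 1
}

# Build the notebook URL used in this lab from a token.
notebook_url() {
  printf 'http://r-notebook-jupyter.127.0.0.1.nip.io/?token=%s\n' "$1"
}

# Intended use (needs a logged-in oc session, so it is left commented out):
#   PODID=$(oc get pods | pick_pod)
#   TOKEN=$(oc logs "$PODID" | extract_token)
#   notebook_url "$TOKEN"
```

Note that the usage lines call `oc logs` without -f, so the pipeline finishes instead of following the log.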
Verify Lab Success
Here are a couple of tests to verify that your r-notebook can save state.
Install a Package
Create a notebook and then install a new package. Restart the container and verify that it still appears in the installed list. This means you can add a growing list of R packages easily.
[1]: install.packages('plyr', lib='/home/jovyan/work')
[1]: installed.packages(lib.loc='/home/jovyan/work')
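The same check can be made from outside the notebook by looking at the volume's backing directory. This sketch relies on the fact that an installed R package is a directory containing a DESCRIPTION file. It also assumes that the directory you pass in is the host path behind /home/jovyan/work, which depends on how your cluster maps host paths. `package_persisted` is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if <lib_dir>/<pkg>/DESCRIPTION exists,
# which is how an installed R package appears on disk.
package_persisted() {
  lib_dir=$1
  pkg=$2
  [ -f "$lib_dir/$pkg/DESCRIPTION" ]
}

# Intended use after restarting the container (path is your own $VOLUMEPATH):
#   package_persisted "$VOLUMEPATH" plyr && echo "plyr survived the restart"
```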
Create a Folder
Now try something more creative. For example, create a new folder and rename it to “iris”. Then create a new file called “sample” inside the “iris” folder. Paste the following into the first cell [1]. Then run that cell and Save and Checkpoint the file, which should complete with the graphic below. Restart your container and confirm that your new “iris” folder and “sample” are preserved.
library(dplyr)
iris
iris %>%
  group_by(Species) %>%
  summarise(Sepal.Width.Avg = mean(Sepal.Width)) %>%
  arrange(Sepal.Width.Avg)
library(ggplot2)
ggplot(data=iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +
  geom_point(size=3)
Trivia
Knock yourself out at http://jupyter.org/