How to Install and Use RAPIDS on your Windows Workstation


I had trouble finding a good end-to-end article that would help me install RAPIDS on my Windows workstation and teach me how to use it.

At the moment, RAPIDS requires that your workstation runs the Ubuntu or CentOS operating system (more about prerequisites here).

This wasn’t my case. I had Windows 10 Pro running on my computer, and I hit the wall hard when I realized there is little to no documentation on the internet on how to properly install RAPIDS on Windows. Of course, there are references, but I am more oriented towards data science than towards the technical side, so I struggled a bit until I found the solution.

I am writing this article in the hope that it helps anybody who is in the same situation I was in when I started. I will also explain the concepts to the best of my ability, so that you don’t just follow the steps mindlessly but also understand what you’re doing.

As a reference, I have an HP Z8 Workstation with an NVIDIA Quadro RTX 8000, running Windows 10 Pro for Workstations.

I would also like to add some very special thanks to my partner, who is much more technical than me and has been kind enough to help and guide me throughout this journey.

Ok, let’s dive in! (PS: this process might take anywhere from a few hours up to a day, so be patient and set some time aside for it.)

1. Getting Started

This is one of the main articles you’ll be following: CUDA on WSL. I encourage you to open the link and follow it side by side with this article, as I made some small tweaks along the way that you might also need.

So, how are you going to use Ubuntu if you have Windows? This is where WSL 2 comes into action (you want version 2, not the first version).

WSL is a containerized environment within which users can run Linux native applications from the command line of the Windows 10 shell without requiring the complexity of a dual boot environment.

WSL is short for “Windows Subsystem for Linux”, and it’s a pretty great way of running Linux on your machine without setting up a virtual machine yourself (which creates a computer inside your computer) and without having to dual boot (which comes with the effort of switching between operating systems and installing any required packages, applications etc. on both).

So, in the “Getting Started” chapter of the CUDA on WSL tutorial you are required to:

  • Install Microsoft Windows Insider Program Builds: Here you make sure that your Windows OS is up to date and ready for what’s coming, so you won’t get any weird unexpected errors later in the process. You’ll have to register and do the steps in the “2. Flight” section (this might take a few hours). After you’re finished, check your build version by pressing Windows + R and typing winver in the Run box. Your build should be greater than 20145.
[Screenshot: Windows Program Builds]
  • Installing the NVIDIA Drivers: The NVIDIA driver is software that you will have to install on your workstation. It enables the communication between Windows and the NVIDIA GPU. You’ll have to make a free account on the NVIDIA Developer website and follow the steps in the tutorial to download and run the executable (either QUADRO or GEFORCE, depending on your NVIDIA GPU).
  • Installing WSL 2: This is the last step in this section. You’ll need to install WSL 2 by following the “Manual Installation Steps” section of this article. At Step 6, I chose to install the “Ubuntu 20.04 LTS” Linux distribution. I would also recommend installing Windows Terminal from the Store (it provides one interface for multiple terminals like Ubuntu, Windows PowerShell etc., so you don’t have to switch between windows all the time). To check your WSL version, type the following in Windows PowerShell: wsl cat /proc/version
[Screenshot: Check your WSL version]
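To make that last check less of an eyeball exercise, here is a minimal sketch that classifies the kernel string found in /proc/version. The function name wsl_flavor is my own, and the patterns rest on an assumption: WSL 2 kernels usually identify themselves with “microsoft-standard” (lowercase), while WSL 1 kernels carry “Microsoft” (capitalised). That is a naming habit rather than a documented guarantee, so treat the result as a hint.

```shell
#!/bin/sh
# Classify a kernel version string as WSL 2, WSL 1, or neither.
# ASSUMPTION: WSL 2 kernels contain "microsoft-standard" and WSL 1 kernels
# contain "Microsoft"; this is a naming convention, not a documented API.
wsl_flavor() {
    case "$1" in
        *microsoft-standard*) echo "WSL2" ;;
        *Microsoft*)          echo "WSL1" ;;
        *)                    echo "not WSL" ;;
    esac
}

# Inside the Ubuntu terminal, classify the kernel that is currently running.
if [ -r /proc/version ]; then
    wsl_flavor "$(cat /proc/version)"
fi
```

From PowerShell, wsl -l -v also lists every installed distribution together with the WSL version it runs under.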

2. Setting up the CUDA Toolkit

The NVIDIA® CUDA® Toolkit provides a development environment for creating high performance GPU-accelerated applications. The toolkit includes GPU-accelerated libraries, debugging and optimization tools, a C/C++ compiler, and a runtime library to build and deploy your application on major architectures including x86, Arm and POWER. More here.

In the same tutorial that we used to get our workstation ready, there is the “Setting up CUDA Toolkit” chapter. From now on, the commands shown in the tutorial are typed in the Ubuntu terminal:

  • Setting up the CUDA network repository: The commands are exactly as shown in the tutorial; however, if a “Permission denied” error occurs, add sudo at the beginning of each line, like this:
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo sh -c 'echo "deb http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64 /" > /etc/apt/sources.list.d/cuda.list'
sudo apt-get update
  • Installing CUDA: This process might take a bit longer, so you can go grab yourself a beverage until it finishes. Again, I used sudo at the beginning of the command, but the rest is the same as in the tutorial.
sudo apt-get install -y cuda-toolkit-11-0
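Once the installation finishes, you may want to confirm that the toolkit landed where you expect it. The sketch below assumes the usual /usr/local/cuda location (my assumption, although it matches the samples path used in the next section) and keeps in mind that nvcc is often not on your PATH right after installing. The helper name find_nvcc is made up for this example.

```shell
#!/bin/sh
# Print the path to nvcc under a CUDA root, or fail if it is not there.
# ASSUMPTION: the toolkit lives under /usr/local/cuda unless told otherwise.
find_nvcc() {
    root="${1:-/usr/local/cuda}"
    if [ -x "$root/bin/nvcc" ]; then
        echo "$root/bin/nvcc"
    else
        echo "nvcc not found under $root" >&2
        return 1
    fi
}

# If the compiler is present, ask it for its version.
if nvcc_path="$(find_nvcc 2>/dev/null)"; then
    "$nvcc_path" --version
else
    echo "CUDA toolkit not detected in the default location"
fi
```

If nvcc is found, the version it prints should mention release 11.0, matching the cuda-toolkit-11-0 package installed above.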

3. Running CUDA Applications

The next chapter in the tutorial is “Running CUDA Applications”. The alternative method it provides for building the CUDA samples didn’t work for me, so I encourage you to:

  • First go to the /usr/local/cuda/samples/4_Finance/BlackScholes folder (as shown below).
cd /usr/local/cuda/samples/4_Finance/BlackScholes
  • Run sudo make and then change to the directory that the output points you to (highlighted in orange in the screenshot).
sudo make
cd ../../bin/x86_64/linux/release/
  • Run the ./BlackScholes command. It will initialize and execute the GPU kernel. Once it is done, it prints “Test passed”, as in the screenshot.
./BlackScholes
[Screenshot: Test passed]

4. Setting up to Run Containers

Containers are a form of operating system virtualization. A single container might be used to run anything from a small microservice or software process to a larger application. Inside a container are all the necessary executables, binary code, libraries, and configuration files.

Here we’ll be using Docker containers, meaning we first need to install Docker, which will run all the containers. This will then let us run RAPIDS, install other data science libraries, or create Jupyter notebooks, all safely inside the container.

4.1 Installing Docker

Installing Docker was harder and required more attention than I expected. In the tutorial, in the “Install Docker” subsection of the “Setting up to Run Containers” chapter, the command looked simple:

curl https://get.docker.com | sh       # ERROR

But for me it threw an error:

[Screenshot: the error thrown]

It says that the recommendation would be to use Docker Desktop for Windows; however, the tutorial states clearly that the NVIDIA Container Toolkit doesn’t support Docker Desktop at the moment. Hence, I had to manually run the commands highlighted in orange in the screenshot, which had errors of their own (they were missing the single quotes around the command passed to sh -c). Below are the initial commands followed by the corrected version (so use the corrected ones instead):

# === INITIAL (do not recommend) ===
sudo -E sh -c apt-get update -qq >/dev/null
sudo -E sh -c DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl >/dev/null
sudo -E sh -c curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq - >/dev/null
sudo -E sh -c echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable" > /etc/apt/sources.list.d/docker.list
sudo -E sh -c apt-get update -qq >/dev/null
sudo -E sh -c apt-get install -y -qq --no-install-recommends docker-ce >/dev/null
# === HOW IT SHOULD BE ===

sudo -E sh -c 'apt-get update -qq' >/dev/null
sudo -E sh -c 'DEBIAN_FRONTEND=noninteractive apt-get install -y -qq apt-transport-https ca-certificates curl' >/dev/null
sudo -E sh -c 'curl -fsSL "https://download.docker.com/linux/ubuntu/gpg" | apt-key add -qq -'
sudo -E sh -c 'echo "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable" > /etc/apt/sources.list.d/docker.list'
sudo -E sh -c 'apt-get update -qq' >/dev/null
sudo -E sh -c 'apt-get install -y -qq --no-install-recommends docker-ce' >/dev/null
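If you wonder why those single quotes matter so much: sh -c treats only the very next argument as the command to run, and anything after it becomes $0, $1 and so on for that inner shell. The harmless demonstration below uses echo instead of apt-get, so you can try it without touching your system; it is only an illustration of the quoting rule, not part of the installation.

```shell
#!/bin/sh
# Without quotes, sh -c receives only "echo" as its command string;
# "hello" becomes $0 of the inner shell, so just an empty line is printed.
unquoted="$(sh -c echo hello)"

# With quotes, the whole string "echo hello" is the command.
quoted="$(sh -c 'echo hello')"

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

The same logic applies to the > redirection: outside the quotes it is carried out by your own unprivileged shell rather than by the one started through sudo, which is why writing to /etc/apt/sources.list.d/ only works when the redirection sits inside the quotes.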

4.2 Installing the NVIDIA Container Toolkit

The NVIDIA Container Toolkit allows users to build and run GPU accelerated Docker containers. The toolkit includes a container runtime library and utilities to automatically configure containers to leverage NVIDIA GPUs.

This part was again the same process as described in the tutorial, in the “Install NVIDIA Container Toolkit” subsection.

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)

curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -

curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update

sudo apt-get install -y nvidia-docker2
sudo service docker stop

sudo service docker start
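Before moving on, it can be worth confirming that Docker actually picked up the NVIDIA runtime. This is a rough sketch that assumes docker info prints a “Runtimes:” line naming every registered runtime (that is how it looks on the Docker versions I know, but the format is not guaranteed). The helper name has_nvidia_runtime and the sample text are made up for illustration.

```shell
#!/bin/sh
# Succeeds if the text on stdin has a "Runtimes:" line that mentions nvidia.
# ASSUMPTION: `docker info` output contains such a line; the format may vary.
has_nvidia_runtime() {
    grep -i 'runtimes:' | grep -qw 'nvidia'
}

# Made-up sample of the relevant part of `docker info`, for illustration only.
sample=' Runtimes: nvidia runc
 Default Runtime: runc'

if printf '%s\n' "$sample" | has_nvidia_runtime; then
    echo "nvidia runtime registered"
else
    echo "nvidia runtime missing"
fi

# Real use, once the Docker service is running:
#   sudo docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime registered"
```

If the real check fails, restarting the Docker service as shown above is the first thing I would try.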

5. Run CUDA Containers

5.1 Simple CUDA containers

As shown in the tutorial, just run the following command:

sudo docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

5.2 Jupyter Notebook

We’re finally getting closer to the data science. In the same Ubuntu terminal where you’ve been typing commands so far, just run the command from the “Jupyter Notebooks” subsection of the tutorial:

sudo docker run -it --gpus all -p 8888:8888 tensorflow/tensorflow:latest-gpu-py3-jupyter

Afterwards you should see the Jupyter server starting up in the terminal.
[Screenshot: jupyter notebook installation]

6. Installing RAPIDS

RAPIDS has a Get Started section on their website from where you can copy the necessary command to install RAPIDS. Just select the preferred method and copy the 2 commands into the terminal (be wary of the path; I am now in /mnt/c/Users/andrada):

sudo docker pull rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04
sudo docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04

Now a JupyterLab server has been started at localhost:8888 (highlighted in orange in the terminal output). However, besides the fact that the above command is long and hard to remember, this method does not keep anything you create (notebooks, files etc.) after you end the current session. Because of the --rm flag, the container and everything stored inside it is removed once you close it (not very practical, huh?).

There is a way, though, to both ease access to the container and keep the files you create even after closing it. You’ll have to make a local folder visible inside the container by mounting it at /rapids/notebooks/host. To do that, create a folder for it wherever you would like on your computer (in my case, I created a main folder named 9. RAPIDS, in which I created an empty notebooks folder).

Next, we’ll create a .sh file in the 9. RAPIDS folder (or whatever you called yours). Change into that folder and run the following:

vim start.sh

You’ll then be in an empty document. Press i to enter insert mode, then copy-paste the following lines:

docker run --gpus all --rm -it -p 8888:8888 -p 8787:8787 -p 8786:8786 \
-v "$(pwd)/notebooks:/rapids/notebooks/host" \
rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04

Then press Esc, type :x and press Enter to save and exit the document. What the above command does is start a new Docker container every time you run the start.sh file. It also mounts the notebooks folder as host, so you can create notebooks and save files in the Jupyter environment and still keep/reuse them even after terminating the session. Keep in mind that any libraries/packages you install in this container during the session are STILL going to be lost afterwards (I am still working on a solution to this).

Now what remains is:

# make the file executable
chmod +x start.sh
# run this line of code any time you want to open the environment
sudo ./start.sh
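If you would like start.sh to be a little more defensive, here is one possible variation (a sketch, not the official RAPIDS way). It adds a shebang line, makes sure the notebooks folder exists before mounting it, and only prints the command unless you pass run as the first argument. The function name start_rapids and the preview/run switch are my own inventions; the image tag and the ports are the same as above.

```shell
#!/bin/sh
# Sketch of a more defensive start.sh (illustrative, not from the RAPIDS docs).
set -eu

IMAGE="rapidsai/rapidsai:cuda10.1-runtime-ubuntu18.04"   # same tag as above

# start_rapids <mode> <base_dir>
#   mode "run" launches the container; anything else only prints the command.
start_rapids() {
    mode="$1"
    notebooks_dir="$2/notebooks"
    # Keep the command in the positional parameters so that paths
    # containing spaces are passed to docker as a single argument.
    set -- docker run --gpus all --rm -it \
        -p 8888:8888 -p 8787:8787 -p 8786:8786 \
        -v "${notebooks_dir}:/rapids/notebooks/host" \
        "${IMAGE}"
    if [ "$mode" = "run" ]; then
        mkdir -p "$notebooks_dir"   # make sure the host folder exists
        exec "$@"
    else
        echo "$*"                   # preview only, nothing is started
    fi
}

start_rapids "${1:-preview}" "$(pwd)"
```

With this version, sudo ./start.sh shows what would be executed, and sudo ./start.sh run actually launches the container. The preview is meant for reading rather than copy-pasting, since paths with spaces (like my 9. RAPIDS folder) are printed without quotes.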

To open the Jupyter environment, open the browser and write in the address bar one of the following:

http://localhost:8888/lab?    - to open Jupyter Lab
http://localhost:8888/tree    - to open the classic Jupyter Notebook

You’ll see multiple folders there. If you create a new notebook in any of them besides host, it will be deleted at the end of the session. However, the notebooks you write in the host folder will remain and be available in your local “notebooks” folder (which in my case is located in the 9. RAPIDS folder).

[Screenshot: the web environment]
[Screenshot: the local folder with the saved files]

After this, you can safely enjoy any RAPIDS library like cupy, cuml, cudf etc. From a terminal in this environment you can also install libraries, just as you would from the Anaconda Prompt.

To check your memory usage and performance of the GPU devices, you have to run the following:

sudo apt install nvidia-utils-455
nvidia-smi

Afterwards, you should see a table listing your GPU together with its memory usage and utilization.
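If you only care about the memory numbers, nvidia-smi can also print them as plain CSV through its --query-gpu and --format options. The sketch below wraps that in a small helper (gpu_mem_summary is a name I made up) that turns each row into a one-line summary. The sample row is invented purely to show the shape of the data; it is not a measurement from my machine.

```shell
#!/bin/sh
# Turn "name, used, total" CSV rows (values in MiB) into one summary line per GPU.
gpu_mem_summary() {
    awk -F', ' '$3 > 0 { printf "%s: %d / %d MiB (%.1f%%)\n", $1, $2, $3, 100 * $2 / $3 }'
}

# Made-up sample row, only to show the shape of the input and output.
echo "Example GPU, 12000, 48000" | gpu_mem_summary

# Real use, if nvidia-smi is available in your terminal:
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,memory.used,memory.total \
               --format=csv,noheader,nounits | gpu_mem_summary
fi
```

For the sample row this prints “Example GPU: 12000 / 48000 MiB (25.0%)”.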

The drawback to this method is that the libraries and changes to the environment are still going to be deleted after the end of each session, even if the files you created or updated will still be available on your machine.

I am currently working on a solution for this, as the documentation on it is quite sparse, and I will update this article as soon as I find one.


Z by HP & NVIDIA Data Science Global Ambassador. On the highway to becoming a Data Science Master.