These steps have been tested on Ubuntu 18.04, 20.04 and 22.04. Other Linux distributions may work, since most processing takes place in a Docker container, but the script is currently specific to Ubuntu. Your machine needs NVIDIA GPU drivers installed. It does NOT need CUDA installed.
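Before starting, it can help to confirm that the host matches these requirements. The following is a minimal sketch, not part of the repository: `is_tested_release` and `check_host` are hypothetical helper names, and the presence of `nvidia-smi` is used only as a proxy for "NVIDIA drivers are installed".

```shell
# Releases named in the text above as tested.
TESTED_RELEASES="18.04 20.04 22.04"

# is_tested_release ID VERSION_ID
# Status 0 if ID is "ubuntu" and VERSION_ID is one of TESTED_RELEASES.
is_tested_release() {
    [ "$1" = "ubuntu" ] || return 1
    case " $TESTED_RELEASES " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_host (hypothetical helper): print a short report for this machine.
check_host() {
    local id="" version=""
    if [ -r /etc/os-release ]; then
        # Source in a subshell so os-release variables do not leak out.
        id=$( . /etc/os-release && printf '%s' "${ID:-}" )
        version=$( . /etc/os-release && printf '%s' "${VERSION_ID:-}" )
    fi
    if is_tested_release "$id" "$version"; then
        echo "OS: $id $version (tested)"
    else
        echo "OS: ${id:-unknown} ${version:-unknown} (untested, may still work)"
    fi
    # nvidia-smi is installed with the NVIDIA driver, so finding it is a
    # reasonable proxy for "drivers installed".
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "NVIDIA driver: nvidia-smi found"
    else
        echo "NVIDIA driver: nvidia-smi not found"
    fi
}

check_host
```

An "untested" result is informational only, since other distributions may still work.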

  1. Clone the repository:
git clone && cd caiman-asr
  2. Install Docker:
source training/
  3. Add your username to the docker group:
sudo usermod -a -G docker [user]

Then run the following in the same terminal window. With it, you may not need to log out and back in:

newgrp docker
  4. Build the Docker image:
# Build from Dockerfile
cd training
  5. Start an interactive session in the Docker container, mounting the volumes as described in the next section:
./scripts/docker/ <DATASETS> <CHECKPOINTS> <RESULTS>
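To check whether the docker group change has taken effect, compare the groups recorded for your user with the groups of the current shell. This is a sketch under the assumption that the group is named `docker`, as in the steps above. `has_group` is a hypothetical helper.

```shell
# has_group "SPACE SEPARATED GROUPS" GROUP
# Status 0 if GROUP appears as a whole word in the list.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

user="${USER:-$(id -un)}"

# `id -nG USER` reads the group database, so it reflects usermod at once.
if has_group "$(id -nG "$user" 2>/dev/null)" docker; then
    echo "$user is listed in the docker group"
else
    echo "$user is not in the docker group yet"
fi

# `id -nG` without a user shows the groups of the current process, which
# only change after newgrp or a fresh login.
if has_group "$(id -nG 2>/dev/null)" docker; then
    echo "this shell already has docker group permissions"
else
    echo "this shell does not have docker group permissions yet"
fi
```

If the first check passes but the second does not, the membership is recorded but the current shell has not picked it up.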


Currently, the reference uses CUDA-12.2. Here you can find a table listing compatible drivers:
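One way to compare the installed driver against a minimum version is sketched below. `MIN_DRIVER` is an assumption: the value shown is a placeholder and should be replaced with the minimum from the compatibility table. `version_ge` is a hypothetical helper that relies on GNU `sort -V`.

```shell
# Placeholder minimum driver version (assumption); take the real value
# from the driver compatibility table for the CUDA version in use.
MIN_DRIVER="${MIN_DRIVER:-525.60.13}"

# version_ge A B
# Status 0 if dotted version A is greater than or equal to B.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Query the driver version; with several GPUs, use the first line.
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
    if [ -n "$driver" ] && version_ge "$driver" "$MIN_DRIVER"; then
        echo "driver $driver meets the minimum $MIN_DRIVER"
    else
        echo "driver '${driver:-unknown}' is below the minimum $MIN_DRIVER or could not be read"
    fi
else
    echo "nvidia-smi not found; install the NVIDIA driver first"
fi
```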

Information about volume mounts

Setting up the training environment requires mounting three directories: <DATASETS>, <CHECKPOINTS>, and <RESULTS>, which hold the training data, model checkpoints, and results, respectively.
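If these directories do not exist yet, they can be created up front. The sketch below assumes a single base directory with subdirectories named `datasets`, `checkpoints` and `results`. These host-side names are illustrative only, and `make_mount_dirs` is a hypothetical helper. It relies on GNU `realpath -m`.

```shell
# make_mount_dirs BASE
# Creates BASE/datasets, BASE/checkpoints and BASE/results and prints
# their absolute paths, one per line, in the order
# <DATASETS> <CHECKPOINTS> <RESULTS>.
make_mount_dirs() {
    local base name
    # -m resolves the path even if BASE does not exist yet.
    base=$(realpath -m "$1") || return 1
    for name in datasets checkpoints results; do
        mkdir -p "$base/$name" || return 1
        printf '%s\n' "$base/$name"
    done
}

# Example (not run here): make_mount_dirs "$HOME/asr-data"
```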

The following table shows the mappings between directories on a host machine and inside the container.

Host machine | Inside container


The host directories passed to ./scripts/docker/ must have absolute paths.
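A small guard can catch relative paths before the container is started. This is a sketch. `require_absolute` is a hypothetical helper and is not part of the launch script.

```shell
# require_absolute PATH...
# Status 0 only if every argument starts with "/". The first offending
# path is reported on stderr.
require_absolute() {
    local p
    for p in "$@"; do
        case "$p" in
            /*) ;;
            *) echo "not an absolute path: $p" >&2; return 1 ;;
        esac
    done
    return 0
}

# Example (not run here): convert a relative path first, then check it.
#   DATASETS=$(realpath ./datasets)
#   require_absolute "$DATASETS" "$CHECKPOINTS" "$RESULTS"
```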

If your <DATASETS> directory contains symlinks to other drives (e.g. because your data is too large to fit on a single drive), they will not be accessible from within the running container. In this case, you can pass the absolute paths to your drives as the 4th, 5th, 6th, ... arguments to ./scripts/docker/ so that the container can follow symlinks to these drives.
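To find out which extra paths are needed, list where the symlinks under the datasets directory point. The sketch below uses hypothetical helpers: `symlink_targets` relies on GNU `readlink -f`, and `drive_mounts` assumes that "drive" means the mount point reported by GNU `df --output=target`.

```shell
# symlink_targets DIR
# Prints the resolved absolute target of every symlink under DIR,
# one per line, without duplicates.
symlink_targets() {
    find "$1" -type l -exec readlink -f {} \; | sort -u
}

# drive_mounts DIR
# Prints the mount point that holds each symlink target (assumption:
# these are the drive paths to pass as extra arguments).
drive_mounts() {
    symlink_targets "$1" | while IFS= read -r target; do
        # The first line of df output is a header; keep the last line.
        df --output=target "$target" 2>/dev/null | tail -n 1
    done | sort -u
}

# Example (not run here): drive_mounts /absolute/path/to/datasets
```

Targets that sit on the same drive as the datasets directory itself need no extra argument.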


During training, the model checkpoints are saved to the /results directory, so it is sometimes convenient to load them from /results rather than from /checkpoints.
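To pick up the most recent checkpoint from /results, a helper along the following lines could be used. This is only a sketch: `newest_file` is a hypothetical helper, the `*.pt` pattern in the example is an assumption about checkpoint file names, and `-printf` requires GNU find.

```shell
# newest_file DIR PATTERN
# Prints the most recently modified regular file under DIR whose name
# matches PATTERN, or nothing if there is no match.
newest_file() {
    # %T@ is the modification time in seconds since the epoch.
    find "$1" -type f -name "$2" -printf '%T@ %p\n' 2>/dev/null \
        | sort -n | tail -n 1 | cut -d ' ' -f 2-
}

# Example (not run here; the file pattern is an assumption):
#   newest_file /results '*.pt'
```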

Next Steps

Go to the Data preparation docs to see how to download and preprocess data in advance of training.