Deploying Cornserve
Cornserve can be deployed on a GPU cluster managed by Kubernetes.
Important
Audio dependencies are not included by default in our Docker images due to license incompatibilities.
The Cornserve Python package declares audio support as optional dependencies, and Eric's Dockerfile provides an extra build target (eric-audio) that includes them.
Before you install/run Cornserve, please ensure you understand and agree to dependency licensing terms.
Deploying K3s
Tip
If you have a Kubernetes cluster running, you can skip this section.
If you don't have a Kubernetes cluster running, you can deploy Cornserve on a K3s cluster. We also use the K3s distribution of Kubernetes for development. Refer to the K3s documentation for more details.
Tip
If you're deploying on-premise with K3s, make sure you have plenty of disk space under /var/lib/rancher, because containerd stores container images there.
If not, you can create a directory on secondary storage (e.g., /mnt/data/rancher) and symlink it to /var/lib/rancher before starting K3s.
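As a sketch, the relocation might look like the following; the secondary storage path is illustrative, so adjust it to your storage layout:

```shell
# Create the replacement directory on the larger volume
sudo mkdir -p /mnt/data/rancher
# Point /var/lib/rancher at it before K3s runs for the first time
sudo ln -s /mnt/data/rancher /var/lib/rancher
```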
Clone the Repository
git clone https://github.com/cornserve-ai/cornserve.git
cd cornserve/kubernetes
Master Node
Install and start K3s:
curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_ENABLE=true sh -
sudo mkdir -p /etc/rancher/k3s
sudo cp k3s/server-config.yaml /etc/rancher/k3s/config.yaml
sudo systemctl start k3s
Record the master node's address ($MASTER_ADDRESS) and its node token ($NODE_TOKEN); worker nodes will need both to join:
NODE_TOKEN="$(sudo cat /var/lib/rancher/k3s/server/node-token)"
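The node token above is read on the master node. One way to determine the master's address is shown below; it assumes the first address reported by hostname -I is the one reachable from your workers, so verify it against your network setup:

```shell
# Take the master's first reported IP address
MASTER_ADDRESS="$(hostname -I | awk '{print $1}')"
echo "$MASTER_ADDRESS"
```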
Worker Nodes
Install and start K3s:
curl -sfL https://get.k3s.io | K3S_URL=https://$MASTER_ADDRESS:6443 K3S_TOKEN=$NODE_TOKEN INSTALL_K3S_SKIP_ENABLE=true sh -
sudo mkdir -p /etc/rancher/k3s
sudo cp k3s/agent-config.yaml /etc/rancher/k3s/config.yaml
sudo systemctl start k3s-agent
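To confirm that the workers joined, you can list nodes from the master node. K3s bundles kubectl, so this sketch assumes you run it there:

```shell
# On the master node: every worker should appear with STATUS "Ready"
sudo k3s kubectl get nodes
```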
NVIDIA Device Plugin
The NVIDIA GPU Device Plugin is required to expose GPUs to the Kubernetes cluster as resources. You can deploy a specific version like this:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.3/deployments/static/nvidia-device-plugin.yml
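To check that the plugin is advertising GPUs, you can inspect each node's allocatable resources. This is one way to do it, assuming you have kubectl access to the cluster:

```shell
# Each GPU node should report a nonzero nvidia.com/gpu count
kubectl get nodes -o custom-columns='NAME:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
```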
Deploying Cornserve
If you haven't already, clone the Cornserve repository:
git clone https://github.com/cornserve-ai/cornserve.git
cd cornserve
On top of a Kubernetes cluster, you can deploy Cornserve with a single command:
kubectl apply -k kubernetes/kustomize/cornserve-system/base
kubectl apply -k kubernetes/kustomize/cornserve/overlays/prod
Warning
With the prod overlay, the Cornserve Gateway is not exposed via any service by default. Users are expected to expose the Gateway in a way suitable for their infrastructure (e.g., a Load Balancer service). For local development or to quickly test Cornserve, use the local overlay, which will expose the Gateway via a NodePort service on port 30080.
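With the local overlay applied, a quick reachability check against the NodePort might look like this; the node IP is a placeholder for the address of any cluster node, and any HTTP response indicates the Gateway is reachable:

```shell
# Replace <node-ip> with the IP address of any cluster node
curl -v http://<node-ip>:30080/
```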
For more information on other Kustomize overlays we have, please check out the Contributor Guide.
If you'll be using gated models from Hugging Face Hub, you'll need to make the Hugging Face token available to Task Executors:
kubectl create -n cornserve secret generic cornserve-env --from-literal=hf-token=$HF_TOKEN
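You can verify that the secret was created and carries the hf-token key:

```shell
# The DATA column should show 1 entry (the hf-token key)
kubectl get secret cornserve-env -n cornserve
```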
Note
The cornserve namespace holds most of our control plane and data plane objects.
The cornserve-system namespace, on the other hand, holds components that observe and manage the Cornserve system itself (running under cornserve), such as Jaeger and Prometheus.
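After deployment, you can watch both namespaces to confirm everything comes up:

```shell
# Control plane and data plane components
kubectl get pods -n cornserve
# System-management and observability components (e.g., Jaeger, Prometheus)
kubectl get pods -n cornserve-system
```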