Kubernetes Setup Guide

This guide walks you through deploying the Private AI container in a Kubernetes cluster.

1. Prerequisites

1.1 Install and set up `kubectl`

The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters.

Find installation instructions for your OS in the official Kubernetes documentation.

1.2 Set up your Kubernetes cluster

There are many flavours of Kubernetes to choose from. Set up the one that best suits your needs. Here are a few popular Kubernetes services and distributions:

Azure Kubernetes Service (AKS)

Amazon Elastic Kubernetes Service (EKS)

Google Kubernetes Engine (GKE)

Minikube

MicroK8s

Note: For recommendations on machine type, see our System Requirements section.

1.3 Set up a container registry

Create a secret that holds your credentials for Private AI's private container registry. Only after this step will you be able to pull Private AI's private Docker images.

kubectl create secret docker-registry pai-cr-creds --docker-server="crprivateaiprod.azurecr.io" --docker-username="<your docker username>" --docker-password="<your docker password>"

Note: If you don't have credentials for the Private AI Container Registry, log in to the Private AI Customer Portal to generate and retrieve them.

See this blog article for more details on pulling images from a private registry.
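
For orientation, kubectl create secret docker-registry stores the credentials in a Secret of type kubernetes.io/dockerconfigjson. The Python sketch below reconstructs roughly what ends up under the secret's .dockerconfigjson key. The function name docker_config_json and the credentials are placeholders for illustration only and are not part of any Private AI tooling.

```python
import base64
import json


def docker_config_json(server: str, username: str, password: str) -> str:
    """Return a Docker config document like the one stored in the secret."""
    # The "auth" field is base64("username:password"), as in a Docker config file.
    auth = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    config = {
        "auths": {
            server: {"username": username, "password": password, "auth": auth}
        }
    }
    return json.dumps(config, indent=2)


# Placeholder credentials, for illustration only.
print(docker_config_json("crprivateaiprod.azurecr.io", "example-user", "example-password"))
```

Kubernetes base64-encodes this document once more when it stores it in the Secret, so decoding the value returned by kubectl get secret pai-cr-creds -o json should yield a similar structure.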

2. Deploying the Container

The container can be deployed either by following steps 2.1 and 2.2, or with the Helm chart described in 2.3.

2.1 Setting up your license file

Log in to the Private AI Customer Portal and download your license file.

Once you've downloaded the file (license.json), open it in a text editor and paste its contents into a license manifest named pai-license.yaml, indented under the license.info key.

It should look something like this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: pai-license
data:
  license.info: |
    {
    "id": 1234,
    "tier": "demo",
    "expires_at": "2024-01-01T12:00:00+00:00",
    "permissions": [
        {
            "permission_type": "credits",
            "allowed_value": 1234
        },
        ...        
    ],
    "user": "<YourCustomerID>",
    "metering_id": "<anotherID>",
    "licensing_api_key": "<yetAnotherID>",
    "signature": "<SignatureHash>"
    }

Now run the following command to load the ConfigMap with your license into your Kubernetes cluster:

kubectl apply -f pai-license.yaml
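
As an alternative to pasting the license by hand, the manifest can be generated with a short script. This is a minimal sketch that assumes license.json is valid JSON. The function names build_license_manifest and write_license_manifest are hypothetical helpers, not part of Private AI's tooling.

```python
import json
import textwrap

MANIFEST_HEADER = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: pai-license
data:
  license.info: |
"""


def build_license_manifest(license_text: str) -> str:
    """Embed the license JSON in the ConfigMap as an indented YAML block scalar."""
    json.loads(license_text)  # fail early if the file is not valid JSON
    # Every line of a block scalar must be indented deeper than its key.
    body = textwrap.indent(license_text.strip(), " " * 4)
    return MANIFEST_HEADER + body + "\n"


def write_license_manifest(license_path: str, manifest_path: str) -> None:
    """Read the downloaded license file and write the ConfigMap manifest."""
    with open(license_path, encoding="utf-8") as src:
        manifest = build_license_manifest(src.read())
    with open(manifest_path, "w", encoding="utf-8") as dst:
        dst.write(manifest)
```

Calling write_license_manifest("license.json", "pai-license.yaml") would produce a file that can be applied with the kubectl apply command above.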

2.2 Deploying the deid application

Now that everything is in place, create the manifest file deploy-private-ai.yaml using either the CPU or the GPU configuration:

CPU configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-ai-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: private-ai-app
  template:
    metadata:
      labels:
        app: private-ai-app
    spec:
      affinity:
        podAntiAffinity:  # So that only one pod runs per node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - private-ai-app
              topologyKey: "kubernetes.io/hostname"
      imagePullSecrets:
        - name: pai-cr-creds
      containers:
        - name: pai-container
          image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-cpu
          volumeMounts:
            - name: license-volume
              mountPath: /app/license
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 40
            periodSeconds: 60
            successThreshold: 1
            timeoutSeconds: 10
      volumes:
        - name: license-volume
          configMap:
            name: pai-license
            items:
            - key: "license.info"
              path: "license.json"
      terminationGracePeriodSeconds: 120
---
apiVersion: v1  # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
kind: Service
metadata:
  name: private-ai-service
spec:
  type: LoadBalancer
  selector:
    app: private-ai-app
  ports:
    - name: http
      port: 80
      targetPort: 8080

GPU configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-ai-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: private-ai-app
  template:
    metadata:
      labels:
        app: private-ai-app
    spec:
      affinity:
        podAntiAffinity:  # So that only one pod runs per node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - private-ai-app
              topologyKey: "kubernetes.io/hostname"
      imagePullSecrets:
        - name: pai-cr-creds
      containers:
        - name: pai-container
          image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-gpu
          resources:
            requests:
              nvidia.com/gpu: 1
            limits:
              nvidia.com/gpu: 1
          volumeMounts:
            - name: license-volume
              mountPath: /app/license
            - name: dshm-volume
              mountPath: /dev/shm
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 10
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /healthz
              port: 8080
              scheme: HTTP
            initialDelaySeconds: 40
            periodSeconds: 60
            successThreshold: 1
            timeoutSeconds: 10
      volumes:
        - name: license-volume
          configMap:
            name: pai-license
            items:
            - key: "license.info"
              path: "license.json"
        - name: dshm-volume
          emptyDir:
            medium: Memory
      terminationGracePeriodSeconds: 120
---
apiVersion: v1  # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
kind: Service
metadata:
  name: private-ai-service
spec:
  type: LoadBalancer
  selector:
    app: private-ai-app
  ports:
    - name: http
      port: 80
      targetPort: 8080

Now create the deployment and service with this kubectl command:

kubectl apply -f deploy-private-ai.yaml
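
The manifest only works if its labels line up: the Deployment's selector and the Service's selector must both match the labels on the pod template. The sketch below illustrates that equality-based matching rule in Python, with values copied from deploy-private-ai.yaml. The helper selector_matches is a hypothetical name used for illustration.

```python
def selector_matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Values copied from deploy-private-ai.yaml.
pod_labels = {"app": "private-ai-app"}            # spec.template.metadata.labels
deployment_selector = {"app": "private-ai-app"}   # Deployment spec.selector.matchLabels
service_selector = {"app": "private-ai-app"}      # Service spec.selector

print(selector_matches(deployment_selector, pod_labels))  # True
print(selector_matches(service_selector, pod_labels))     # True
print(selector_matches({"app": "deid"}, pod_labels))      # False
```

If you rename the app label in one place, rename it everywhere, including the podAntiAffinity rule, which uses the same label to keep pods on separate nodes.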

2.3 Deploy the container via Helm

Private AI supports installation to a Kubernetes cluster via Helm. Before you begin, ensure that you have Helm installed.

Our public Helm chart is hosted here. Pull the chart, replace the placeholder license with your license file (from your customer portal), and run:

helm package .
helm install --namespace <namespace> private-ai ./private-ai-0.1.0.tgz

Replace <namespace> with the namespace of your choice in your cluster.

Note: The Helm chart for the Private AI container can also be used to deploy on OpenShift Container Platform clusters.

3. Post-deployment

3.1 Checking the status of containers

Once deployed successfully, you’ll be able to check the status of pods with this command:

kubectl get pods

Expected output:

NAME          READY   STATUS    RESTARTS   AGE
<pod-name>    1/1     Running   0          1m
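
For scripted checks, the same information is available in machine-readable form through kubectl get pods -o json. The sketch below reads that JSON and reports which pods have all containers ready. The function names ready_summary and fetch_pods_json are hypothetical, and the script assumes kubectl is on the PATH and pointed at the right cluster and namespace.

```python
import json
import subprocess


def ready_summary(pods_json: str) -> dict[str, bool]:
    """Map each pod name to True when all of its containers report ready."""
    summary = {}
    for pod in json.loads(pods_json).get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A pod with no reported container statuses yet is treated as not ready.
        ready = bool(statuses) and all(s.get("ready", False) for s in statuses)
        summary[pod["metadata"]["name"]] = ready
    return summary


def fetch_pods_json() -> str:
    """Run `kubectl get pods -o json` against the current context and namespace."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling ready_summary(fetch_pods_json()) returns a dictionary keyed by pod name, where a True value corresponds to the 1/1 READY column shown above.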

To check the logs, run this command with your pod name:

kubectl logs <pod-name>    # replace <pod-name> with the name of the pod from the command above

Expected output:

Log level is: info
Image Version: <version>
INFO:     Started server process [9]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://ip:port (Press CTRL+C to quit)
INFO:     ip:port - "GET /healthz HTTP/1.1" 200 OK
 model time is 44.28 ms or 89.05 percent, rx time is 0.42 ms or 0.85 percent, total time: 49.73 ms
Auth call to Private AI servers took 154.39 ms
Got 100000 calls from PAI auth system
 1 API calls used, 99999 remaining until next auth call. Total processing time is 0.05 secs, 19.91 API calls per sec.
INFO:     ip:port - "POST /deidentify_text HTTP/1.1" 200 OK

The deploy-private-ai.yaml manifest above also creates a LoadBalancer service, which exposes an IP address for accessing your application. To check the external IP, run:

kubectl get svc

Expected output:

NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
private-ai-service   LoadBalancer   <cluster-ip>  <external-ip>   80:30456/TCP   27m
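
To pick up the external address in a script, you can parse the output of kubectl get svc private-ai-service -o json (the service name comes from the manifest in 2.2). The sketch below is an assumption-laden helper, with external_address as a hypothetical name. It handles the common case where the provider reports either an IP or a DNS hostname, and returns None while the address is still pending.

```python
import json
from typing import Optional


def external_address(service_json: str) -> Optional[str]:
    """Return the first load-balancer IP or hostname, or None while it is pending."""
    service = json.loads(service_json)
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        # Providers report either an IP address or a DNS hostname.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


# Illustrative input using a documentation-only IP address.
sample = '{"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}'
print(external_address(sample))  # 203.0.113.10
```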

3.2 Making requests

You can use the external IP of the LoadBalancer service (from the command above) to make requests to de-identify text.
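
Before sending traffic, it can help to wait until the container answers on the /healthz endpoint that the readiness and liveness probes use. The following is a minimal sketch using only the Python standard library. It assumes the LoadBalancer service forwards port 80 to the container's port 8080 as in the manifest from 2.2, and the names http_ok and wait_until_healthy are hypothetical.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True when a GET on the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection errors, timeouts, and HTTP error codes all count as unhealthy.
        return False


def wait_until_healthy(url: str, attempts: int = 30, delay: float = 10.0,
                       check: Callable[[str], bool] = http_ok) -> bool:
    """Poll the health endpoint until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if check(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, wait_until_healthy("http://<external-ip>/healthz") returns True as soon as the endpoint responds with 200, and False if it never does within the configured attempts.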

Request body:

{
    "text": ["Hi John, Grace this side. It's been a while since we last met in Berlin."]
}

cURL:

curl --location --request POST 'http://<external-ip>/process/text' \
--header 'Content-Type: application/json' \
--data-raw '{"text": ["Hi John, Grace this side. It'\''s been a while since we last met in Berlin."]}'

Python:

import requests

r = requests.post(url="http://<external-ip>/process/text",
                  json={"text": ["Hi John, Grace this side. It's been a while since we last met in Berlin."]})

results = r.json()

print(results)

Python client:

from privateai_client import PAIClient
from privateai_client import request_objects

client = PAIClient(url="http://<external-ip>")

text_request = request_objects.process_text_obj(text=["Hi John, Grace this side. It's been a while since we last met in Berlin."])
response = client.process_text(text_request)

print(response.processed_text)

You can expect a response like this:

[
  {
    "processed_text": "Hi [NAME_1], [NAME_2] this side. It's been a while since we last met in [LOCATION_CITY_1].",
    "entities": [...],
    "entities_present": true,
    "characters_processed": 1234,
    "languages_detected": {...}
  }
]
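
The response is a JSON list, so client code typically pulls the processed_text field out of each item. The sketch below shows one way to do that with the standard library only, assuming the response contains one item per input text as the example suggests. The helper names build_request, processed_texts, and deidentify are hypothetical, and base_url stands for http://<external-ip>.

```python
import json
import urllib.request


def build_request(base_url: str, texts: list[str]) -> urllib.request.Request:
    """Build the POST request for the /process/text endpoint."""
    payload = json.dumps({"text": texts}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/process/text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def processed_texts(results: list[dict]) -> list[str]:
    """Collect the de-identified text of each item in the response list."""
    return [item["processed_text"] for item in results]


def deidentify(base_url: str, texts: list[str], timeout: float = 30.0) -> list[str]:
    """Send the texts to the container and return their de-identified versions."""
    request = build_request(base_url, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return processed_texts(json.load(response))
```

An explicit timeout keeps a script from hanging indefinitely if the service is unreachable; the 30-second default here is an arbitrary choice, not a recommended value.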

Additional Resources

Autoscaling your Kubernetes deployment
