Kubernetes Setup Guide
This guide walks you through deploying the Private AI container in a Kubernetes cluster.
1. Prerequisites
1.1 Install and set up kubectl
The Kubernetes command-line tool, kubectl, allows you to run commands against Kubernetes clusters.
Find installation instructions for your OS here.
1.2 Set up your Kubernetes cluster
There are many flavours of Kubernetes to choose from. Set up the one that best suits your needs. Here are a few popular Kubernetes services and distributions.
Azure Kubernetes Services (AKS)
info
For recommendations on machine type, see our System Requirements Section.
1.3 Set up a container registry
Create a secret for Private AI's private container registry. You can pull Private AI's private Docker images only after completing this step.
kubectl create secret docker-registry pai-cr-creds --docker-server="crprivateaiprod.azurecr.io" --docker-username="<your docker username>" --docker-password="<your docker password>"
info
If you don't have credentials for the Private AI Container Registry, log in to the Private AI Customer Portal to generate and retrieve them.
See this blog article for more details on pulling images from a private registry.
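If you would rather generate the secret from a provisioning script than run kubectl create by hand, the sketch below builds an equivalent manifest. It assumes the standard kubernetes.io/dockerconfigjson Secret layout; the function name build_registry_secret and the example credentials are illustrative placeholders, not part of Private AI's tooling.

```python
import base64
import json


def build_registry_secret(name: str, server: str, username: str, password: str) -> dict:
    """Return a kubernetes.io/dockerconfigjson Secret manifest as a dict."""
    # The "auth" field is base64("username:password"), as in a Docker config file.
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": password, "auth": auth}}
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "kubernetes.io/dockerconfigjson",
        # Values under "data" in a Secret must be base64-encoded.
        "data": {
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
    }


# Placeholder credentials for illustration only; substitute your own.
secret = build_registry_secret(
    "pai-cr-creds", "crprivateaiprod.azurecr.io", "example-user", "example-password"
)
print(json.dumps(secret, indent=2))
```

Saving the printed JSON to a file and passing it to kubectl apply -f should give the same result as the kubectl create secret command above. Avoid committing the output anywhere, since it contains your password.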
2 Deploying the Container
The container can be deployed via steps 2.1 and 2.2 or via a Helm chart described in 2.3.
2.1 Setting up your license file
attention
If you are using our AWS Marketplace Evaluation offering, you can skip this step: the AWS Marketplace container includes a special license that can only be used on AWS ECS or EKS.
Log in to the Private AI Customer Portal and download your license file.
Once you've downloaded the file (license.json), open it in a text editor and paste its contents into a license manifest named pai-license.yaml, under the license.info key.
It should look something like this:
apiVersion: v1
kind: ConfigMap
metadata:
  name: pai-license
data:
  license.info: |
    {
      "id": 1234,
      "tier": "demo",
      "expires_at": "2024-01-01T12:00:00+00:00",
      "permissions": [
        {
          "permission_type": "credits",
          "allowed_value": 1234
        },
        ...
      ],
      "user": "<YourCustomerID>",
      "metering_id": "<anotherID>",
      "licensing_api_key": "<yetAnotherID>",
      "signature": "<SignatureHash>"
    }
Now run the following command to load the ConfigMap with your license into your Kubernetes cluster:
kubectl apply -f pai-license.yaml
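Pasting the license by hand makes it easy to get the indentation under license.info wrong. As a rough sketch, the manifest can instead be generated from the license text. The helper name build_license_configmap and the sample license are hypothetical; a real script would read the contents of your license.json.

```python
import json


def build_license_configmap(license_text: str, name: str = "pai-license") -> str:
    """Wrap the contents of license.json in a ConfigMap manifest (YAML text)."""
    json.loads(license_text)  # fail early if the license is not valid JSON
    # Lines of a YAML block scalar ("|") must be indented deeper than their key.
    body = "\n".join("    " + line for line in license_text.strip().splitlines())
    return (
        "apiVersion: v1\n"
        "kind: ConfigMap\n"
        "metadata:\n"
        f"  name: {name}\n"
        "data:\n"
        "  license.info: |\n"
        f"{body}\n"
    )


# Made-up stand-in for the real file contents.
sample_license = '{\n  "id": 1234,\n  "tier": "demo"\n}'
manifest = build_license_configmap(sample_license)
print(manifest)
```

Writing the returned string to pai-license.yaml produces a file in the same shape as the example above, ready for kubectl apply.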
2.2 Deploying the deid application
With the prerequisites in place, create the manifest file deploy-private-ai.yaml. Two variants follow. For CPU images, use this manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-ai-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: private-ai-app
  template:
    metadata:
      labels:
        app: private-ai-app
    spec:
      affinity:
        podAntiAffinity: # So that only one pod runs per node.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - private-ai-app
            topologyKey: "kubernetes.io/hostname"
      imagePullSecrets:
      - name: pai-cr-creds
      containers:
      - name: pai-container
        image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-cpu
        volumeMounts:
        - name: license-volume
          mountPath: /app/license
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 40
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 10
      volumes:
      - name: license-volume
        configMap:
          name: pai-license
          items:
          - key: "license.info"
            path: "license.json"
      terminationGracePeriodSeconds: 120
---
apiVersion: v1 # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
kind: Service
metadata:
  name: private-ai-service
spec:
  type: LoadBalancer
  selector:
    app: private-ai-app
  ports:
  - name: http
    port: 80
    targetPort: 8080
For GPU images, use this manifest instead:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: private-ai-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: private-ai-app
  template:
    metadata:
      labels:
        app: private-ai-app
    spec:
      affinity:
        podAntiAffinity: # So that only one pod runs per node.
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - private-ai-app
            topologyKey: "kubernetes.io/hostname"
      imagePullSecrets:
      - name: pai-cr-creds
      containers:
      - name: pai-container
        image: <private-ai-image>:<tag> # replace placeholders with appropriate image name and tag, example: crprivateaiprod.azurecr.io/deid:3.3.2-gpu
        resources:
          requests:
            nvidia.com/gpu: 1
          limits:
            nvidia.com/gpu: 1
        volumeMounts:
        - name: license-volume
          mountPath: /app/license
        - name: dshm-volume
          mountPath: /dev/shm
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 10
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 40
          periodSeconds: 60
          successThreshold: 1
          timeoutSeconds: 10
      volumes:
      - name: license-volume
        configMap:
          name: pai-license
          items:
          - key: "license.info"
            path: "license.json"
      - name: dshm-volume
        emptyDir:
          medium: Memory
      terminationGracePeriodSeconds: 120
---
apiVersion: v1 # To see available service types https://kubernetes.io/docs/concepts/services-networking/service/
kind: Service
metadata:
  name: private-ai-service
spec:
  type: LoadBalancer
  selector:
    app: private-ai-app
  ports:
  - name: http
    port: 80
    targetPort: 8080
Now create the deployment with this kubectl command:
kubectl apply -f deploy-private-ai.yaml
2.3 Deploy the container via Helm
Private AI supports installation to a Kubernetes cluster via Helm. Before you begin, ensure that you have Helm installed.
Our public Helm chart is hosted here. Pull the chart, replace the placeholder license with your license file (from your customer portal), and run:
helm package .
helm install --namespace <namespace> private-ai ./private-ai-0.1.0.tgz
replacing <namespace> with the namespace of your choice in your cluster.
info
The Helm chart for the Private AI container can also be used to deploy on OpenShift Container Platform clusters.
3 Post deployment
3.1 Checking the status of containers
Once deployed successfully, you’ll be able to check the status of pods with this command:
kubectl get pods
Expected output:
NAME         READY   STATUS    RESTARTS   AGE
<pod-name>   1/1     Running   0          1m
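In scripts, the same check is easier to make against the JSON form of the pod list (kubectl get pods -o json) than against the table. The sketch below is a minimal example of that idea; the function name unready_pods and the sample data are made up for illustration, and the sample is trimmed to only the fields the function reads.

```python
import json


def unready_pods(pod_list: dict) -> list:
    """Return names of pods that are not Running with all containers ready."""
    unready = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        # A pod with no reported container statuses is treated as not ready.
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if status.get("phase") != "Running" or not all_ready:
            unready.append(pod["metadata"]["name"])
    return unready


# Trimmed, made-up sample in the shape of `kubectl get pods -o json` output.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "pod-a"},
   "status": {"phase": "Running", "containerStatuses": [{"ready": true}]}},
  {"metadata": {"name": "pod-b"},
   "status": {"phase": "Pending"}}
]}
""")
print(unready_pods(sample))  # ['pod-b']
```

An empty result corresponds to every pod showing READY 1/1 and STATUS Running in the table above.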
To check the logs, run this command with your pod name:
kubectl logs <pod-name> # replace <pod-name> with the name of the pod from the command above
Expected output:
Log level is: info
Image Version: <version>
INFO: Started server process [9]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://ip:port (Press CTRL+C to quit)
INFO: ip:port - "GET /healthz HTTP/1.1" 200 OK
model time is 44.28 ms or 89.05 percent, rx time is 0.42 ms or 0.85 percent, total time: 49.73 ms
Auth call to Private AI servers took 154.39 ms
Got 100000 calls from PAI auth system
1 API calls used, 99999 remaining until next auth call. Total processing time is 0.05 secs, 19.91 API calls per sec.
INFO: ip:port - "POST /deidentify_text HTTP/1.1" 200 OK
The deploy-private-ai.yaml manifest above also creates a LoadBalancer service that exposes an external IP address for your application. To check the external IP, run:
kubectl get svc
Expected output:
NAME                 TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)        AGE
private-ai-service   LoadBalancer   <cluster-ip>   <external-ip>   80:30456/TCP   27m
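On some cloud providers the external address takes a little while to be assigned, and some report a hostname rather than an IP. The sketch below shows one way to read the address from the JSON form of the service (kubectl get svc <service-name> -o json). The function name external_address is hypothetical and the sample address comes from a range reserved for documentation.

```python
from typing import Optional


def external_address(service: dict) -> Optional[str]:
    """Return the LoadBalancer IP or hostname, or None while it is pending."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        # Providers populate either "ip" or "hostname" for each ingress entry.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None


# Made-up samples trimmed to the fields the function reads.
pending = {"status": {"loadBalancer": {}}}
assigned = {"status": {"loadBalancer": {"ingress": [{"ip": "203.0.113.10"}]}}}

print(external_address(pending))   # None
print(external_address(assigned))  # 203.0.113.10
```

A None result corresponds to the EXTERNAL-IP column still showing a pending state; retrying after a short wait is usually enough.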
3.2 Making requests
You can use the external IP of the LoadBalancer service (from the command above) to make requests to deidentify text.
Request body:
{
  "text": ["Hi John, Grace this side. It's been a while since we last met in Berlin."]
}
With curl:
curl --location --request POST 'http://<external-ip>/v3/process/text' \
--header 'Content-Type: application/json' \
--data-raw '{"text": ["Hi John, Grace this side. It'\''s been a while since we last met in Berlin."]}'
With the Python requests library:
import requests

# Replace <external-ip> with the address of the LoadBalancer service.
r = requests.post(url="http://<external-ip>/v3/process/text",
                  json={"text": ["Hi John, Grace this side. It's been a while since we last met in Berlin."]})
results = r.json()
print(results)
With the privateai_client Python package:
from privateai_client import PAIClient
from privateai_client import request_objects

# Replace <external-ip> with the address of the LoadBalancer service.
client = PAIClient(url="http://<external-ip>")
text_request = request_objects.process_text_obj(text=["Hi John, Grace this side. It's been a while since we last met in Berlin."])
response = client.process_text(text_request)
print(response.processed_text)
You can expect a response like this:
[
  {
    "processed_text": "Hi [NAME_1], [NAME_2] this side. It's been a while since we last met in [LOCATION_CITY_1].",
    "entities": [...],
    "entities_present": true,
    "characters_processed": 1234,
    "languages_detected": {...}
  }
]
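To work with the response in code, a small helper can pull out just the redacted strings. The sketch below assumes the response body is a JSON list shaped like the example above, with one item per input text (an assumption, not something this guide states). The names build_payload and redacted_texts are hypothetical, and the sample response is trimmed to two of the fields.

```python
import json


def build_payload(texts):
    """Build the JSON body for POST /v3/process/text from a list of strings."""
    return {"text": list(texts)}


def redacted_texts(response_body: str) -> list:
    """Return the processed_text of each item in the JSON response, in order."""
    items = json.loads(response_body)
    return [item["processed_text"] for item in items]


# Sample response trimmed to two of the fields shown in the example response.
sample_response = json.dumps([
    {
        "processed_text": "Hi [NAME_1], [NAME_2] this side. It's been a while "
                          "since we last met in [LOCATION_CITY_1].",
        "entities_present": True,
    }
])

payload = build_payload(
    ["Hi John, Grace this side. It's been a while since we last met in Berlin."]
)
print(json.dumps(payload))
print(redacted_texts(sample_response)[0])
```

The payload dictionary can be passed as the json argument of requests.post, and the text of the HTTP response can be passed to redacted_texts.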