Processing Image Files

Private AI can scan images for PII and create de-identified or redacted copies. All supported entity types work across every supported file type, including localized variants of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities. Our Supported Languages and Supported Entity Types page provides a more detailed look.

If you'd like to try it yourself, please visit our free interactive web demo. No code or account is necessary.

How Images Are Processed

Image files are processed as follows:

  1. Non-text PII such as faces and license plates are detected and de-identified via blurring or black box redaction.
  2. The resulting image is run through an OCR system to detect any text present.
  3. The output of the OCR system is passed to a PII detection module.
  4. Any PII found in the previous step is de-identified via blurring or black box redaction.
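
Steps 3 and 4 above can be illustrated with a small, self-contained sketch. It is not Private AI's implementation: the `OcrWord` structure, the regex-based detector, and the helper names are all hypothetical, the image is a plain grid of grayscale values, and steps 1 and 2 (non-text PII detection and OCR) are not modeled at all.

```python
import re
from dataclasses import dataclass

# Toy stand-in for a PII detection module: matches 3-3-4 digit phone numbers only.
PHONE_RE = re.compile(r"\b\d{3}-\d{3}-\d{4}\b")


@dataclass
class OcrWord:
    """One word from a hypothetical OCR result, with its pixel bounding box."""
    text: str
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def black_box(image, left, top, right, bottom):
    """Overwrite a rectangular region of a grayscale image (list of rows) with 0."""
    for y in range(max(top, 0), min(bottom, len(image))):
        row = image[y]
        for x in range(max(left, 0), min(right, len(row))):
            row[x] = 0


def redact_text_pii(image, ocr_words):
    """Flag OCR words that look like PII (step 3) and black them out (step 4)."""
    redacted = []
    for word in ocr_words:
        if PHONE_RE.search(word.text):
            black_box(image, word.left, word.top, word.right, word.bottom)
            redacted.append(word.text)
    return redacted


# A 4x8 all-white image and two OCR words; only the second one is treated as PII.
image = [[255] * 8 for _ in range(4)]
words = [OcrWord("Call", 0, 0, 3, 2), OcrWord("555-123-4567", 4, 1, 8, 3)]
print(redact_text_pii(image, words))  # ['555-123-4567']
```

The point of the sketch is the ordering: text is located first, classified second, and only the bounding boxes of flagged words are overwritten.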

Check out our OCR Guide to see the available OCR modes.

Constraints

  • There is limited support for 8-bit PNG images.
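
To check whether a PNG might fall under this constraint before uploading it, the bit depth and color type can be read from the file's IHDR chunk. The sketch below is an illustration only: `read_png_header` is a hypothetical helper, and the source does not say whether "8-bit" refers to palette-based (PNG-8) images or to bit depth in general, so the function simply reports both values.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes):
    """Return (width, height, bit_depth, color_type) from a PNG's IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width (4), height (4), bit depth (1), color type (1).
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file or IHDR chunk missing")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return width, height, bit_depth, color_type


# Hand-built header for a 640x480 image with bit depth 8 and color type 3 (palette).
header = PNG_SIGNATURE + struct.pack(
    ">I4sIIBBBBB", 13, b"IHDR", 640, 480, 8, 3, 0, 0, 0
)
print(read_png_header(header))  # (640, 480, 8, 3)
```

In the PNG format, color type 3 means an indexed (palette) image, which is what "PNG-8" usually denotes.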

Supported File Types

File Type  | Extension   | Content Type              | Added In
JPEG image | .jpg, .jpeg | image/jpg, image/jpeg     | 3.0.0
TIFF image | .tif, .tiff | image/tif, image/tiff     | 3.0.0
PNG image  | .png        | image/png                 | 3.4.0
BMP image  | .bmp        | image/bmp, image/x-ms-bmp | 3.4.0
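
When building a request, the `content_type` value has to match the file being sent. A minimal sketch of deriving it from a filename, assuming a hypothetical helper `content_type_for` (not part of the Private AI client) and picking one of the listed content types for each extension:

```python
from pathlib import Path

# One content type per supported extension, taken from the table above.
CONTENT_TYPES = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".png": "image/png",
    ".bmp": "image/bmp",
}


def content_type_for(filename: str) -> str:
    """Map a filename to a supported image content type, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported image extension: {suffix!r}") from None


print(content_type_for("scan.TIFF"))  # image/tiff
```

Rejecting unknown extensions locally avoids sending a file type the service does not list as supported.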

Support Matrix

Platform         | Supported?
CPU Container    | Yes
GPU Container    | Yes
Community API    | Yes
Professional API | Yes
PrivateGPT UI    | No

Sample Request

Please sign up for a free API key to run this code.

Request Body

{
    "file": {
        "data": "<BASE64 ENCODED FILE CONTENT>",
        "content_type": "image/jpeg"
    },
    "entity_detection": {
        "return_entity": true
    }
}

cURL

echo '{
          "file": {"data": "'$(base64 -w 0 sample.jpeg)'", 
          "content_type": "image/jpeg"}, 
          "entity_detection": {"return_entity": true}
      }' \
| curl --request POST --url 'https://api.private-ai.com/community/v3/process/files/base64' \
       -H 'Content-Type: application/json' \
       -H 'x-api-key: <YOUR KEY HERE>' \
       -d @- \
       | jq -r .processed_file \
       | base64 -d > 'sample.redacted.jpeg'

Python

import base64

import requests

# Download the sample image and base64-encode it for the JSON payload.
file_url = "https://paidocumentation.blob.core.windows.net/$web/sample.jpeg"
filename_out = "/path/to/output/sample.redacted.jpeg"
file_response = requests.get(file_url)
file_response.raise_for_status()
file_content_base64 = base64.b64encode(file_response.content).decode()

url = "https://api.private-ai.com/community/v3/process/files/base64"

headers = {"Content-Type": "application/json", "x-api-key": "<INSERT API KEY>"}

payload = {
    "file": {
        "data": file_content_base64,
        "content_type": "image/jpeg",
    },
    "entity_detection": {
        "return_entity": True
    }
}

response = requests.post(url, json=payload, headers=headers)
# Fail early on HTTP errors instead of hitting a KeyError on an error body.
response.raise_for_status()

# The redacted image comes back base64-encoded in "processed_file".
with open(filename_out, "wb") as f:
    f.write(base64.b64decode(response.json()["processed_file"]))

Python Client

import base64

from privateai_client import PAIClient
from privateai_client.objects import request_objects

filename_in = "sample.jpeg"
filename_out = "sample.redacted.jpeg"

file_type = "image/jpeg"
client = PAIClient(url="https://api.private-ai.com/community/", api_key="<YOUR API KEY>")

# Read the image and base64-encode it as an ASCII string.
with open(filename_in, "rb") as image_file:
    file_data = base64.b64encode(image_file.read()).decode("ascii")

file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
request_obj = request_objects.file_base64_obj(file=file_obj)
resp = client.process_files_base64(request_object=request_obj)

# Decode the redacted image returned by the API and write it to disk.
with open(filename_out, "wb") as redacted_file:
    processed_file = base64.b64decode(resp.processed_file.encode("ascii"), validate=True)
    redacted_file.write(processed_file)
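
The snippets above only write out `processed_file`. As a hedged sketch of handling the other top-level fields listed under Sample Response, the helpers below work on the parsed JSON as a plain dictionary. The helper names are hypothetical, and treating the `languages_detected` values as scores where higher means more likely is an assumption, not something the source states.

```python
import base64


def decode_processed_file(response_json: dict) -> bytes:
    """Decode the base64 "processed_file" field into raw image bytes."""
    return base64.b64decode(response_json["processed_file"], validate=True)


def most_likely_language(response_json: dict):
    """Return the key with the highest value in "languages_detected", or None.

    Assumes the values are scores where a higher number means more likely.
    """
    scores = response_json.get("languages_detected") or {}
    return max(scores, key=scores.get) if scores else None


# Stand-in for response.json(), shaped like the documented sample response.
sample = {
    "processed_file": base64.b64encode(b"redacted image bytes").decode("ascii"),
    "entities_present": True,
    "languages_detected": {"lang_1": 0.67, "lang_2": 0.74},
}

if sample["entities_present"]:
    print(decode_processed_file(sample))  # b'redacted image bytes'
print(most_likely_language(sample))       # lang_2
```

The `entities` field is left alone here because the sample response only describes it as `List[Entity]` without showing its structure.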

Sample Response

{
    "processed_file": "Base64 Encoded File Content of the Redacted File",
    "processed_text": "string",
    "entities": "List[Entity]",
    "entities_present": true,
    "languages_detected": {"lang_1": 0.67, "lang_2": 0.74}
}

© Copyright 2024 Private AI.