Processing Audio Files

Private AI can scan audio files for PII and create de-identified or redacted copies. All supported entity types are detected in every supported file type, including localized variants of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities. The Supported Languages and Supported Entity Types pages list them in detail.

How Audio Files Are Processed

Audio files are processed as follows:

  1. A transcript is produced using an Automatic Speech Recognition (ASR) system.
  2. The resulting ASR transcript is passed through the text-based PII detection engine.
  3. If specified by the user, the audio file undergoes pitch distortion.
  4. Using the ASR timestamps, any sections of the audio file corresponding to PII detections are replaced with a sine wave or bleep tone.
  5. The resulting de-identified or redacted audio file and transcript are passed back to the user.
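
Step 4 can be illustrated with a minimal sketch. It assumes the audio is held as a list of float samples in the range -1 to 1 and that PII detections have already been mapped to (start, end) times in seconds. The helper name bleep_spans, the 1 kHz tone, and the amplitude are illustrative assumptions, not Private AI's implementation.

```python
import math


def bleep_spans(samples, sample_rate, spans, tone_hz=1000.0, amplitude=0.3):
    """Return a copy of `samples` with each (start_s, end_s) span replaced by a sine tone.

    Hypothetical helper: it only illustrates timestamp-based bleeping.
    """
    out = list(samples)
    for start_s, end_s in spans:
        # Convert timestamps (seconds) to sample indices, clamped to the clip.
        first = max(0, int(start_s * sample_rate))
        last = min(len(out), int(end_s * sample_rate))
        for i in range(first, last):
            out[i] = amplitude * math.sin(2 * math.pi * tone_hz * i / sample_rate)
    return out


# One second of silence at 8 kHz, with a detection from 0.25 s to 0.50 s.
clip = [0.0] * 8000
redacted = bleep_spans(clip, 8000, [(0.25, 0.50)])
```

Samples outside the detected span are left untouched, so the rest of the recording stays intelligible.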

Parameters

Please see the audio_options object in the API reference for a full list of the parameters for audio processing.

Supported File Types

File Type | Extension | Content Type          | Added In
wave      | .wav      | audio/wav             | 3.0.0
x-wave    | .wav      | audio/x-wav           | 3.5.0
mp3       | .mp3      | audio/mpeg, audio/mp3 | 3.0.0
mp4       | .mp4      | audio/mp4             | 3.0.0
m4a       | .m4a      | audio/m4a             | 3.5.0
m4a-latm  | .m4a-latm | audio/m4a-latm        | 3.5.0
webm      | .webm     | audio/webm            | 3.5.0
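
When building a request, the content_type field has to match the file being sent. The following sketch maps a file name to a content type using the table above. The helper name content_type_for and the choice of one content type per extension are assumptions made for illustration.

```python
from pathlib import Path

# Extension -> content type, taken from the table above.
# .wav files may also be sent as "audio/x-wav", and .mp3 files as "audio/mp3".
CONTENT_TYPES = {
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".mp4": "audio/mp4",
    ".m4a": "audio/m4a",
    ".m4a-latm": "audio/m4a-latm",
    ".webm": "audio/webm",
}


def content_type_for(filename):
    """Return the content type for a supported audio file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported audio file type: {suffix or filename!r}") from None
```

Raising an error for unknown extensions, rather than guessing, makes unsupported formats such as .vox visible before a request is sent.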

VOX Files

Private AI does not natively support .vox files, but they can be processed by first converting them to wav or mp3 with a conversion tool such as SoX.

Because .vox files are headerless, you need to know the file's sample rate and encoding and pass them to the conversion tool.

For example, to generate a wav file from a mono, mu-law encoded vox file with a sample rate of 8000 Hz:

    sox -t raw -r 8000 -c 1 -e mu-law myfile.vox myfile.wav
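
For batch conversion, the same SoX invocation can be driven from Python. This is a sketch that assumes the sox executable is installed and on the PATH. The function names are hypothetical, and the default values mirror the example above.

```python
import subprocess


def sox_vox_to_wav_command(vox_path, wav_path, sample_rate=8000, channels=1, encoding="mu-law"):
    """Build the SoX argument list for converting a headerless .vox file to .wav."""
    return [
        "sox",
        "-t", "raw",             # input is raw, headerless audio
        "-r", str(sample_rate),  # sample rate in Hz
        "-c", str(channels),     # number of channels
        "-e", encoding,          # sample encoding
        vox_path,
        wav_path,
    ]


def convert_vox_to_wav(vox_path, wav_path, **kwargs):
    """Run SoX and raise CalledProcessError if the conversion fails."""
    subprocess.run(sox_vox_to_wav_command(vox_path, wav_path, **kwargs), check=True)
```

Calling convert_vox_to_wav("myfile.vox", "myfile.wav") runs the command shown above. Pass sample_rate or encoding explicitly when the recording uses other settings.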

Support Matrix

           | CPU Container | GPU Container | Community API | Professional API | PrivateGPT UI
Supported? | Yes           | No            | No            | No               | No

Sample Request

Please sign up for a free API key to run this code.

Request Body:
{
    "file": {
        "data": "<BASE64-ENCODED FILE CONTENT>",
        "content_type": "audio/wav"
    },
    "entity_detection": {
        "return_entity": true
    },
    "audio_options": {
        "bleep_start_padding": 0,
        "bleep_end_padding": 0
    }
}

cURL:
echo '{
          "file": {"data": "'$(base64 -w 0 sample.wav)'",
          "content_type": "audio/wav"},
          "entity_detection": {"return_entity": true}
      }' \
| curl --request POST --url 'https://api.private-ai.com/community/v4/process/files/base64' \
       -H 'Content-Type: application/json' \
       -H 'x-api-key: <YOUR KEY HERE>' \
       -d @- \
       | jq -r .processed_file \
       | base64 -d > 'sample.redacted.wav'

Python:
import base64

import requests

# Download the sample audio file and base64-encode it for the JSON payload.
file_url = "https://paidocumentation.blob.core.windows.net/$web/sample.wav"
filename_out = "/path/to/output/sample.redacted.wav"
file_content = requests.get(file_url).content
file_content_base64 = base64.b64encode(file_content).decode()

headers = {"Content-Type": "application/json", "x-api-key": "<INSERT API KEY>"}

url = "https://api.private-ai.com/community/v4/process/files/base64"

payload = {
    "file": {
        "data": file_content_base64,
        "content_type": "audio/wav"
    },
    "entity_detection": {
        "return_entity": True
    },
    "audio_options": {
        "bleep_start_padding": 0,
        "bleep_end_padding": 0
    }
}

# Send the request and stop on an HTTP error status.
response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()

# Decode the redacted audio from the response and write it to disk.
with open(filename_out, "wb") as f:
    f.write(base64.b64decode(response.json()["processed_file"]))

Python Client:
from privateai_client import PAIClient
from privateai_client.objects import request_objects
import base64

filename_in = "sample.wav"
filename_out = "sample.redacted.wav"

file_type = "audio/wav"
client = PAIClient(url="https://api.private-ai.com/community/v4/", api_key="<YOUR API KEY>")

with open(filename_in, "rb") as b64_file:
    file_data = base64.b64encode(b64_file.read())
    file_data = file_data.decode("ascii")

file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
request_obj = request_objects.file_base64_obj(file=file_obj)
resp = client.process_files_base64(request_object=request_obj)

with open(filename_out, 'wb') as redacted_file:
    processed_file = resp.processed_file.encode("ascii")
    processed_file = base64.b64decode(processed_file, validate=True)
    redacted_file.write(processed_file)

Sample Response

{
    "processed_file": "Base64 Encoded File Content of the Redacted File",
    "processed_text": "string",
    "entities": "List[Entity]",
    "entities_present": true,
    "languages_detected": {"lang_1": 0.67, "lang_2": 0.74}
}
© Copyright 2024 Private AI.
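
A response with the keys shown in the sample response above can be unpacked as in the following sketch. The helper names are hypothetical, and treating the highest value in languages_detected as the most likely language is an assumption about how those scores should be read.

```python
import base64


def decode_processed_file(response_json):
    """Return the redacted audio as bytes from a parsed response body."""
    return base64.b64decode(response_json["processed_file"], validate=True)


def most_likely_language(response_json):
    """Return the key in `languages_detected` with the highest score, or None."""
    scores = response_json.get("languages_detected") or {}
    return max(scores, key=scores.get) if scores else None


# Stand-in response with the same keys as the sample response (values are made up).
example = {
    "processed_file": base64.b64encode(b"RIFF").decode("ascii"),
    "entities_present": True,
    "languages_detected": {"lang_1": 0.67, "lang_2": 0.74},
}
audio_bytes = decode_processed_file(example)
language = most_likely_language(example)
```

Using validate=True makes the decoder reject non-base64 characters instead of silently skipping them, which helps surface a truncated or malformed response.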