Supported File Types
Private AI supports multiple file types for de-identification; the complete list is below. New file types are added regularly. Please contact us if you require a file type that is not listed.
Private AI’s supported entity types apply to every file type, including multilingual equivalents of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities. Our Supported Entity Types page provides a more detailed look at entities.
Document File Types
File Type | Extension | Content Type | Added In |
---|---|---|---|
PDF doc | .pdf | application/pdf | 3.0.0 |
JSON file | .json | application/json | 3.1.0 |
XML file | .xml | application/xml | 3.1.0 |
CSV file | .csv | text/csv | 3.1.0 |
Word Doc | .doc | application/msword | 3.1.0 |
Word Open XML Doc | .docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | 3.1.0 |
Email file | .eml | message/rfc822 | 3.1.1 |
Text file | .txt | text/plain | 3.1.1 |
Excel workbook | .xls | application/vnd.ms-excel | 3.2.0 |
Excel Open XML spreadsheet | .xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 3.2.0 |
DICOM file | .dcm | application/dicom | 3.4.0 |
Image File Types
File Type | Extension | Content Type | Added In |
---|---|---|---|
JPEG image | .jpg, .jpeg | image/jpg, image/jpeg | 3.0.0 |
TIFF image | .tif, .tiff | image/tif, image/tiff | 3.0.0 |
PNG image | .png | image/png | 3.4.0 |
BMP image | .bmp | image/bmp, image/x-ms-bmp | 3.4.0 |
Audio File Types
File Type | Extension | Content Type | Added In |
---|---|---|---|
wave audio file | .wav | audio/wav | 3.0.0 |
mp3 audio file | .mp3 | audio/mpeg, audio/mp3 | 3.0.0 |
mp4 audio file | .mp4 | audio/mp4 | 3.0.0 |
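The document, image, and audio tables above can be turned into a small client-side lookup that picks a content type for a file and checks it against the version that introduced support. The following is a minimal sketch, not part of any Private AI SDK: the names `SUPPORTED_FILE_TYPES` and `content_type_for` are hypothetical, and where a table lists two content types for one extension the sketch arbitrarily keeps one of them.

```python
from pathlib import Path

# (content type, "Added In" version) keyed by lowercase extension,
# copied from the tables above. Where two content types are listed,
# one is chosen arbitrarily for this sketch.
SUPPORTED_FILE_TYPES = {
    ".pdf": ("application/pdf", "3.0.0"),
    ".json": ("application/json", "3.1.0"),
    ".xml": ("application/xml", "3.1.0"),
    ".csv": ("text/csv", "3.1.0"),
    ".doc": ("application/msword", "3.1.0"),
    ".docx": ("application/vnd.openxmlformats-officedocument.wordprocessingml.document", "3.1.0"),
    ".eml": ("message/rfc822", "3.1.1"),
    ".txt": ("text/plain", "3.1.1"),
    ".xls": ("application/vnd.ms-excel", "3.2.0"),
    ".xlsx": ("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "3.2.0"),
    ".dcm": ("application/dicom", "3.4.0"),
    ".jpg": ("image/jpeg", "3.0.0"),
    ".jpeg": ("image/jpeg", "3.0.0"),
    ".tif": ("image/tiff", "3.0.0"),
    ".tiff": ("image/tiff", "3.0.0"),
    ".png": ("image/png", "3.4.0"),
    ".bmp": ("image/bmp", "3.4.0"),
    ".wav": ("audio/wav", "3.0.0"),
    ".mp3": ("audio/mpeg", "3.0.0"),
    ".mp4": ("audio/mp4", "3.0.0"),
}


def _version_tuple(version):
    """Turn '3.10.0' into (3, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def content_type_for(filename, installed_version):
    """Return the content type for `filename`, or None when the extension
    is not in the tables or was added after `installed_version`."""
    entry = SUPPORTED_FILE_TYPES.get(Path(filename).suffix.lower())
    if entry is None:
        return None
    content_type, added_in = entry
    if _version_tuple(installed_version) < _version_tuple(added_in):
        return None
    return content_type
```

For example, `content_type_for("scan.png", "3.3.0")` returns `None` because PNG support is listed as added in 3.4.0, while the same call with `"3.4.0"` returns `"image/png"`.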
Supported Languages
While Private AI's text de-identification service supports more than 50 languages, the file processing service supports only the following languages: Dutch, English, French, German, Italian, Polish, Portuguese, and Spanish.
Limitations
Private AI improves file processing support with every release. The current limitations are:
Document Type | Limitation |
---|---|
XML file | Only element text and node attributes are redacted |
Text file | Text encoding must be UTF-8 |
CSV file | The data must be column-oriented and the headers must be on the first row |
Word Doc | Only the document text and metadata are redacted |
Word Open XML Doc | Only the document text and metadata are redacted |
Email file | Only the email body is redacted; attachments are not redacted |
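Some of the limitations above can be checked locally before a file is submitted. The sketch below uses only the Python standard library and covers two rows of the table: the UTF-8 requirement for text files and unredacted email attachments. The function name `preflight_warnings` is hypothetical and the checks are an illustration, not how the service itself validates input.

```python
import email
from email import policy


def preflight_warnings(filename, data):
    """Return a list of warnings for the raw file bytes in `data`."""
    warnings = []
    name = filename.lower()
    if name.endswith(".txt"):
        # Text files must be UTF-8 encoded.
        try:
            data.decode("utf-8")
        except UnicodeDecodeError:
            warnings.append("text file is not valid UTF-8")
    elif name.endswith(".eml"):
        # Only the email body is redacted, so list any attachments.
        message = email.message_from_bytes(data, policy=policy.default)
        attachments = [
            part.get_filename() or "(unnamed)"
            for part in message.iter_attachments()
        ]
        if attachments:
            warnings.append(
                "attachments will not be redacted: " + ", ".join(attachments)
            )
    return warnings
```

An empty list means neither check found a problem; it does not guarantee that the file will be processed successfully.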