Process Files Uri

Detect entities such as PII, PHI, PCI, or CCI in the file located at the provided URI using Private AI's entity detection engine. After entity detection, a copy of the file with all entities removed is created and placed in the folder specified by PAI_OUTPUT_FILE_DIR on the local host.

This route is similar to /process/files/base64, but relies on URIs instead of base64-encoded strings. As this route avoids the overhead of base64 encoding, it is more suitable for processing large files and large volumes of data.

This route supports the following content types: application/dicom, application/json, application/msword, application/pdf, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/vnd.openxmlformats-officedocument.presentationml.presentation, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, application/vnd.openxmlformats-officedocument.wordprocessingml.document, application/xml, audio/m4a, audio/mp3, audio/mp4, audio/mp4a-latm, audio/mpeg, audio/wav, audio/webm, audio/x-wav, image/bmp, image/jpeg, image/jpg, image/png, image/tif, image/tiff, image/x-ms-bmp, text/csv, text/plain
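Before submitting a file, it can help to check whether its content type appears in the list above. The following is a minimal sketch and not part of the Private AI API: the helper name is_supported is hypothetical, and the content type is guessed from the file extension with Python's standard mimetypes module, which may differ from how the service itself determines the type.

```python
import mimetypes

# Content types listed above as supported by /process/files/uri.
SUPPORTED_CONTENT_TYPES = frozenset({
    "application/dicom", "application/json", "application/msword",
    "application/pdf", "application/vnd.ms-excel",
    "application/vnd.ms-powerpoint",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/xml", "audio/m4a", "audio/mp3", "audio/mp4",
    "audio/mp4a-latm", "audio/mpeg", "audio/wav", "audio/webm",
    "audio/x-wav", "image/bmp", "image/jpeg", "image/jpg", "image/png",
    "image/tif", "image/tiff", "image/x-ms-bmp", "text/csv", "text/plain",
})


def is_supported(path: str) -> bool:
    """Hypothetical helper: guess the type from the extension and look it up."""
    content_type, _encoding = mimetypes.guess_type(path)
    # guess_type returns None for unknown extensions, which is never in the set.
    return content_type in SUPPORTED_CONTENT_TYPES
```

A check like this is only a convenience on the client side; the route itself remains the authority on what it accepts.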

Request
header Parameters
x-api-key
string (X-Api-Key)
Default: ""
Request Body schema: application/json
required
uri
required
string (Uri)

URI of the file to be processed. It must be an accessible file path on the host machine (e.g. /Users/sam/files/file.pdf).
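A request to this route could be assembled as in the sketch below, using only Python's standard library. The base URL http://localhost:8080, the helper name build_request, and the choice to omit the x-api-key header when no key is given are assumptions for illustration; substitute the values for your deployment. The request is built but not sent.

```python
import json
import urllib.request

# Assumed base URL of a locally running service; adjust for your deployment.
BASE_URL = "http://localhost:8080"


def build_request(file_path: str, api_key: str = "") -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a POST to /process/files/uri."""
    body = json.dumps({"uri": file_path}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Header name taken from the header parameters listed above.
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        f"{BASE_URL}/process/files/uri",
        data=body,
        headers=headers,
        method="POST",
    )


request = build_request("/Users/sam/files/file.pdf")
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Note that the path must be readable from the machine that hosts the service, not from the machine that issues the request.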

object (PIIDetectionParams)

This section contains a set of parameters to control the PII detection process. All fields have sensible defaults that can be changed for specific needs.

object (PDFOptions)

Options to process PDF files, such as the rendering quality when each page is turned into an image.

object (ImageOptions)

Options to process image files, such as the masking mode.

object (AudioOptions)

Options to process audio files, such as the padding to add while redacting audio segments.

any (Processed Text)
Default: {"coreference_resolution":"heuristics","marker_language":"en","pattern":"[UNIQUE_NUMBERED_ENTITY_TYPE]","type":"MARKER"}

This section allows the user to generate redacted (default) or masked text.

project_id
string (Project Id) <= 60 characters ^[a-zA-Z0-9\-_\:]*$

Used to categorize requests for reporting purposes. Limited to alphanumeric characters and the special characters ":", "_", and "-".
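The project_id constraints can be checked on the client side before a request is sent. This is a minimal sketch: the pattern and the 60-character limit are copied from the schema above, while the helper name is_valid_project_id is hypothetical.

```python
import re

# Pattern and length limit copied from the project_id schema above.
PROJECT_ID_PATTERN = re.compile(r"^[a-zA-Z0-9\-_\:]*$")
MAX_PROJECT_ID_LENGTH = 60


def is_valid_project_id(project_id: str) -> bool:
    """Hypothetical helper: True if project_id satisfies the documented constraints."""
    return (
        len(project_id) <= MAX_PROJECT_ID_LENGTH
        and PROJECT_ID_PATTERN.fullmatch(project_id) is not None
    )
```

For example, "team-a:2024_q1" passes this check, while "bad id" fails because it contains a space.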

any (Ocr Options)
Default: {"ocr_system":"azure_doc_intelligence"}

Options to provide Optical Character Recognition (OCR) details, such as choice of OCR system.

return_processed_text
boolean (Return Processed Text)
Default: true

Controls whether the response contains the processed_text field. Turning this off can significantly decrease the size of the response.

Responses
200

Successful Response

422

Validation Error

POST /process/files/uri
Response samples
application/json
{
  "result_uri": "string",
  "processed_text": "string",
  "entities": [],
  "entities_present": true,
  "languages_detected": {},
  "audio_duration": 0,
  "page_count": 0
}
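The response fields in the sample above could be consumed as in the following sketch. The helper name summarize_response and all sample values are hypothetical; only the field names are taken from the sample response, and the structure of the items in entities is not assumed.

```python
import json


def summarize_response(payload: dict) -> str:
    """Hypothetical helper: one-line summary of a /process/files/uri response."""
    if not payload.get("entities_present", False):
        return "No entities detected."
    count = len(payload.get("entities", []))
    return f"{count} entities detected; result at {payload.get('result_uri')}"


# Hypothetical response body that reuses the field names from the sample above.
sample = json.loads("""
{
  "result_uri": "/output/file.redacted.pdf",
  "processed_text": "Hello [NAME_1]",
  "entities": [{}, {}],
  "entities_present": true,
  "languages_detected": {},
  "audio_duration": 0,
  "page_count": 1
}
""")

print(summarize_response(sample))
```

Reading fields with .get keeps the helper usable when processed_text is omitted because return_processed_text was set to false.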
© Copyright 2024 Private AI.