Environment Variables
The Private AI container supports a number of environment variables, which can be set in the Docker run command as follows:
docker run --rm -e <ENVIRONMENT_VARIABLE1>=<VALUE> -e <ENVIRONMENT_VARIABLE2>=<VALUE> -p 8080:8080 -it deid:<version number>
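When several variables are set at once, repeating -e can get unwieldy. As one possible alternative, Docker's standard --env-file option reads the variables from a file. The sketch below is an illustration only: the file name pai.env is a hypothetical choice, and the two values are taken from the examples in the table on this page. It writes the file and prints the matching command instead of running it.

```shell
#!/bin/sh
# Write the variables to an env file (pai.env is a hypothetical file name).
# Docker reads each line as VARIABLE=value and keeps quotes literally,
# so the values are written without surrounding quotes.
cat > pai.env <<'EOF'
PAI_ACCURACY_MODES=high
PAI_LOG_LEVEL=warning
EOF

# Print the command for review; run it once the image tag is filled in.
echo 'docker run --rm --env-file pai.env -p 8080:8080 -it deid:<version number>'
```

Keeping the variables in a file also makes it easier to reuse the same configuration across container restarts.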
Supported Environment Variables
Variable Name | Description |
---|---|
PAI_ACCURACY_MODES | Controls which entity detection models are loaded at container start. By default, the container loads all models. Setting this environment variable allows for faster startup and reduced RAM and GPU memory usage. A request specifying an accuracy mode that wasn't loaded will return an error. Allowed values are the accuracy modes specified in the accuracy field in entity_detection, e.g. PAI_ACCURACY_MODES=high |
PAI_ALLOW_LIST | Allows the allow list to be set globally, instead of passing it in each POST request. For example, PAI_ALLOW_LIST='["John","Grace"]'. Please see Processing Text |
PAI_DISABLE_GPU_CHECK | When defined and set to any value, the startup GPU check is disabled. This variable is only applicable to the GPU container and allows the GPU container to run in fallback CPU mode |
PAI_DISABLE_RAM_CHECK | When defined and set to any value, the sufficient RAM check performed at container startup is disabled. Please note that Private AI cannot guarantee container stability if this check is switched off |
PAI_ENABLED_ENTITIES | Allows the enabled classes to be set globally, instead of passing them in each POST request. For example, PAI_ENABLED_ENTITIES="NAME" or PAI_ENABLED_ENTITIES="NAME,AGE,ORGANIZATION". A sample command is shown at the bottom of this page. Please see also Processing Text |
PAI_DISABLED_ENTITIES | Allows the disabled classes to be set globally, instead of passing them in each POST request. For example, PAI_DISABLED_ENTITIES="ORGANIZATION" or PAI_DISABLED_ENTITIES="AGE,LOCATION,ORGANIZATION". A sample command is shown at the bottom of this page. Please see also Processing Text |
PAI_ENABLE_AUDIO | When defined and set to any value, the container loads the functionality to process audio files. This is off by default to save startup time and RAM |
PAI_ENABLE_PII_COUNT_METERING | When defined and set to any value, aggregated entity detection counts are sent back to Private AI servers for reporting and visualization inside the dashboard. Note that this feature is off by default |
PAI_LOG_LEVEL | Controls the verbosity of the container logging output. Allowed values are info, warning or error. Default is info |
PAI_MARKER_FORMAT | Allows the redaction marker format to be set globally, instead of passing it in each POST request. Please see Processing Text |
PAI_OUTPUT_FILE_DIR | The directory that /process/files/uri writes processed files to. Note that this does not need to be specified for /process/files/base64 |
PAI_PROJECT_ID | Sets a default project_id that will be used if a request doesn't contain one. Please see Processing Text |
PAI_SYNTHETIC_PII_ACCURACY_MODES | Same as PAI_ACCURACY_MODES, except for synthetic entity generation models. Unlike PAI_ACCURACY_MODES, this environment variable can be set empty via PAI_SYNTHETIC_PII_ACCURACY_MODES= to disable synthetic entity generation |
PAI_WORKERS | Number of pre/post-processing workers used in the GPU container. Defaults to 16. Increasing this number allows for higher throughput, at the cost of increased RAM usage |
PAI_ENABLE_REPORTING | Enables reporting to a Logstash server |
LOGSTASH_HOST | The host of the Logstash server |
LOGSTASH_PORT | The port of the Logstash server |
LOGSTASH_TTL | Sets the time to live (TTL), in seconds, of the data queued for Logstash. Queued data that is not sent successfully before the TTL expires is lost |
PAI_REPORT_ENTITY_COUNTS | Enables entity counts (per piece of text deidentified) to be added to reporting |
PAI_MAX_IMAGE_PIXELS | Configures the maximum number of pixels allowed in processed images. Default value is 178956970 |
PAI_MAX_FILE_SIZE | Configures the maximum allowed file size in bytes. The file size check occurs before redaction and produces an error message if the file exceeds the specified value, e.g. 2000000 |
PAI_WS_LINK_BATCH | Enables context retention in the websocket endpoint. Default is true |
PAI_WS_CONTEXT_SIZE | Sets the context window size (number of previous messages retained in context). Default is 50 |
PAI_ENABLE_PDF_TEXT_LAYER | Sets the default behaviour for whether a text layer is included in generated PDF files. When set to true, the application includes a text layer in the PDFs, allowing for text selection, search, and accessibility features. When set to false, the text layer is disabled, resulting in marginally smaller file sizes and faster processing. Note that this option is ignored if the pdf_options.enable_pdf_text_layer POST parameter is set in requests to the /process/files/uri and /process/files/base64 endpoints. Default value is true |
PAI_OCR_SYSTEM | Sets the Optical Character Recognition (OCR) system for processing documents and images. Default is paddleocr |

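
Because PAI_LOG_LEVEL only accepts info, warning or error, a launch script may want to check the value before starting the container. The following is a minimal sketch of such a pre-flight check. The helper name check_log_level is hypothetical and not part of the Private AI tooling, and the script prints the resulting command instead of running it.

```shell
#!/bin/sh
# check_log_level is a hypothetical helper, not part of the container.
# It succeeds only for the values the table lists as allowed.
check_log_level() {
    case "$1" in
        info|warning|error)
            return 0
            ;;
        *)
            echo "unsupported PAI_LOG_LEVEL: '$1' (expected info, warning or error)" >&2
            return 1
            ;;
    esac
}

# Fall back to the documented default (info) when the variable is unset.
level="${PAI_LOG_LEVEL:-info}"

if check_log_level "$level"; then
    # Print the command for review; run it once the image tag is filled in.
    echo "docker run --rm -e PAI_LOG_LEVEL=$level -p 8080:8080 -it deid:<version number>"
fi
```

The same pattern could be extended to other variables with a fixed set of allowed values.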
To change the port used by the container, please set the host port as per the command below:
docker run --rm -v "full path to license.json":/app/license/license.json \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>
To use PAI_ENABLED_ENTITIES to redact only NAME and ORGANIZATION, the command will look like:
docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_ENABLED_ENTITIES="NAME,ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>
To use PAI_DISABLED_ENTITIES to redact all entities except ORGANIZATION, the command will look like:
docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_DISABLED_ENTITIES="ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>
Note: If entities are set in both PAI_ENABLED_ENTITIES and PAI_DISABLED_ENTITIES, only PAI_ENABLED_ENTITIES takes effect. If an entity appears in both PAI_ENABLED_ENTITIES and PAI_DISABLED_ENTITIES, and PAI_ENABLED_ENTITIES has no other entity set, the request will fail with 400 Bad Request.
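The precedence rule in the note can be modelled as a small function. This is an illustrative sketch only, not the container's implementation. The helper name resolve_entities and its output strings are hypothetical, and it takes the most literal reading of the note, in which the failure case applies when the enabled list holds exactly one entity that is also in the disabled list.

```shell
#!/bin/sh
# resolve_entities is a hypothetical helper that models the note above.
# Usage: resolve_entities "<enabled, comma-separated>" "<disabled, comma-separated>"
resolve_entities() {
    enabled=$1
    disabled=$2

    # Only one list (or neither) is set: nothing to reconcile.
    if [ -z "$enabled" ]; then
        if [ -n "$disabled" ]; then
            echo "disable: $disabled"
        else
            echo "default"
        fi
        return 0
    fi
    if [ -z "$disabled" ]; then
        echo "enable: $enabled"
        return 0
    fi

    # Both lists are set. Assumed failure case: the enabled list holds a
    # single entity and that entity also appears in the disabled list.
    case "$enabled" in
        *,*)
            ;;  # more than one enabled entity: no conflict in this model
        *)
            case ",$disabled," in
                *",$enabled,"*)
                    echo "400 Bad Request"
                    return 1
                    ;;
            esac
            ;;
    esac

    # Otherwise only the enabled list takes effect.
    echo "enable: $enabled"
}

resolve_entities "NAME,ORGANIZATION" "AGE"
```

How the container treats several enabled entities that all also appear in the disabled list is not stated in the note, so this model leaves that case as "enable wins".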