Environment Variables

The Private AI container supports a number of environment variables. Set them with the -e flag in the docker run command, as follows:

docker run --rm -e <ENVIRONMENT_VARIABLE1>=<VALUE> -e <ENVIRONMENT_VARIABLE2>=<VALUE> -p 8080:8080 -it deid:<version number>
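As an illustration, the following sketch prints (rather than runs) a command with several variables filled in. The helper name env_flags is hypothetical, and the image reference is the same placeholder used above. Each NAME=VALUE pair is single-quoted so that JSON values such as the allow list reach Docker unchanged; values that themselves contain a single quote would need different handling.

```sh
#!/bin/sh
# Sketch only: prints a docker command for review, it does not start a container.
# IMAGE is a placeholder; substitute the image and tag you actually use.
IMAGE='deid:<version number>'

# env_flags (hypothetical helper) prints one " -e 'NAME=VALUE'" per argument.
# Single quotes keep JSON values intact when the command is pasted into a shell.
env_flags() {
    for pair in "$@"; do
        printf " -e '%s'" "$pair"
    done
}

flags=$(env_flags \
    'PAI_ACCURACY_MODES=high' \
    'PAI_LOG_LEVEL=warning' \
    'PAI_ALLOW_LIST=["John","Grace"]')

echo "docker run --rm$flags -p 8080:8080 -it '$IMAGE'"
```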

Supported Environment Variables

Variable Name    Description
PAI_ACCURACY_MODES Controls which entity detection models are loaded at container start. By default, the container loads all models. Setting this environment variable allows for faster startup and reduced RAM and GPU memory usage. A request specifying an accuracy mode that wasn't loaded will return an error. Allowed values are the accuracy modes specified in the accuracy field in entity_detection, e.g. PAI_ACCURACY_MODES=high
PAI_ALLOW_LIST Sets the allow list globally, instead of passing it in each POST request. Example: PAI_ALLOW_LIST='["John","Grace"]'. Please see Processing Text
PAI_DISABLE_GPU_CHECK When defined and set to any value, the startup GPU check is disabled. This variable is only applicable to the GPU container and allows the GPU container to run in fallback CPU mode
PAI_DISABLE_RAM_CHECK When defined and set to any value, the sufficient RAM check performed at container startup is disabled. Please note that Private AI cannot guarantee container stability if this is switched off
PAI_ENABLED_ENTITIES Sets the enabled entity classes globally, instead of passing them in each POST request. Examples: PAI_ENABLED_ENTITIES="NAME" or PAI_ENABLED_ENTITIES="NAME,AGE,ORGANIZATION". A sample command is given at the end of this page. Please see also Processing Text
PAI_DISABLED_ENTITIES Sets the disabled entity classes globally, instead of passing them in each POST request. Examples: PAI_DISABLED_ENTITIES="ORGANIZATION" or PAI_DISABLED_ENTITIES="AGE,LOCATION,ORGANIZATION". A sample command is given at the end of this page. Please see also Processing Text
PAI_ENABLE_AUDIO When defined and set to any value, the container loads the functionality to process audio files. This is off by default to save startup time and RAM
PAI_ENABLE_PII_COUNT_METERING When defined and set to any value, aggregated entity detection counts are sent back to Private AI servers for reporting and visualization inside the dashboard. Note that this feature is off by default
PAI_LOG_LEVEL Controls the verbosity of the container logging output. Allowed values are info, warning or error. Default is info
PAI_MARKER_FORMAT Sets the redaction marker format globally, instead of passing it in each POST request. Please see Processing Text
PAI_OUTPUT_FILE_DIR The directory to which /process/files/uri writes processed files. Note that this does not need to be specified for /process/files/base64
PAI_PROJECT_ID Sets a default project_id that will be used if a request doesn't contain one. Please see Processing Text
PAI_SYNTHETIC_PII_ACCURACY_MODES Same as PAI_ACCURACY_MODES, except for synthetic entity generation models. Unlike PAI_ACCURACY_MODES, this environment variable can be set empty via PAI_SYNTHETIC_PII_ACCURACY_MODES= to disable synthetic entity generation
PAI_WORKERS Number of pre/post-processing workers used in the GPU container. Defaults to 16. Increasing this number allows for higher throughput, at the cost of increased RAM usage
PAI_ENABLE_REPORTING Enables reporting to a Logstash server
LOGSTASH_HOST The Logstash server's host info
LOGSTASH_PORT The port of the Logstash server
LOGSTASH_TTL Sets the time to live (TTL) value, in seconds, of the data queued for Logstash. Queued data that is not sent successfully before the TTL expires is lost
PAI_REPORT_ENTITY_COUNTS Enables entity counts (per piece of text deidentified) to be added to reporting
PAI_MAX_IMAGE_PIXELS Configures the max allowed pixels in the images processed. Default value is 178956970
PAI_MAX_FILE_SIZE Configures the max allowed file size in bytes, e.g. 2000000. The file size check occurs before redaction and produces an error message if the file exceeds the specified value
PAI_WS_LINK_BATCH Enables context retention in the websocket endpoint. Default is true
PAI_WS_CONTEXT_SIZE Sets the context window size (number of previous messages retained in context). Default is 50
PAI_ENABLE_PDF_TEXT_LAYER This environment variable sets the default behaviour for whether a text layer is included in generated PDF files. When set to true, the application will include a text layer in the PDFs, allowing for text selection, search, and accessibility features. When set to false the text layer will be disabled, resulting in marginally smaller file sizes and faster processing. Note that this option is ignored if the pdf_options.enable_pdf_text_layer POST parameter is set in requests to the /process/files/uri and /process/files/base64 endpoints. Default value is true.
PAI_OCR_SYSTEM Sets the Optical Character Recognition (OCR) system for processing documents and images. Default is paddleocr
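
When many variables are set, an env file passed with Docker's --env-file option can be easier to maintain than a long list of -e flags. The sketch below is one way to do this; the file name pai.env and the helper write_pai_env are hypothetical, and the values are taken from the examples and defaults in the table above. Values are left unquoted on the assumption that docker run treats everything after the = sign literally, quote characters included.

```sh
#!/bin/sh
# Sketch only: writes an env file and prints the matching docker command.
# pai.env and write_pai_env are hypothetical names chosen for this example.
write_pai_env() {
    # Assumption: docker run --env-file does not strip quotes, so values are unquoted.
    cat > "$1" <<'EOF'
PAI_ACCURACY_MODES=high
PAI_LOG_LEVEL=warning
PAI_ENABLED_ENTITIES=NAME,AGE,ORGANIZATION
PAI_MAX_FILE_SIZE=2000000
PAI_WS_CONTEXT_SIZE=50
EOF
}

write_pai_env pai.env

# Dry run: print the command instead of starting the container.
# The image reference is a placeholder; substitute your own image and tag.
echo "docker run --rm --env-file pai.env -p 8080:8080 -it 'deid:<version number>'"
```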

To change the port on which the container is reachable, set the host port in the -p mapping, as in the command below:

docker run --rm -v "full path to license.json":/app/license/license.json \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>
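
Only the host side of the mapping changes; the container side stays 8080. A minimal sketch of that rule, using a hypothetical helper port_mapping that validates the host port and prints the -p argument:

```sh
#!/bin/sh
# port_mapping (hypothetical helper) prints the -p argument for a host port.
# The container port is fixed at 8080, as in the command above.
port_mapping() {
    case "$1" in
        ''|*[!0-9]*)
            echo "host port must be a number: $1" >&2
            return 1
            ;;
    esac
    if [ "$1" -lt 1 ] || [ "$1" -gt 65535 ]; then
        echo "host port out of range (1-65535): $1" >&2
        return 1
    fi
    echo "-p $1:8080"
}

port_mapping 9090   # prints: -p 9090:8080
```

With this mapping the container is reached on host port 9090 instead of 8080.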

To use PAI_ENABLED_ENTITIES to redact only NAME and ORGANIZATION entities, the command will look like:

docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_ENABLED_ENTITIES="NAME,ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>

To use PAI_DISABLED_ENTITIES to redact all entities except ORGANIZATION, the command will look like:

docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_DISABLED_ENTITIES="ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>

Note: If entities are set in both PAI_ENABLED_ENTITIES and PAI_DISABLED_ENTITIES, only PAI_ENABLED_ENTITIES takes effect. If an entity appears in both and PAI_ENABLED_ENTITIES contains no other entity, the request will fail with 400 Bad Request.
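
The rule in this note can be checked before the container is started. The sketch below is a hypothetical pre-flight helper, not part of the Private AI container. It assumes comma-separated lists without spaces, and it reads "no other entity" to mean that every enabled entity also appears in the disabled list.

```sh
#!/bin/sh
# check_entity_lists (hypothetical helper) classifies a pair of entity lists:
#   ok               - at most one of the two lists is set
#   disabled-ignored - both are set; only the enabled list takes effect
#   conflict         - every enabled entity is also disabled (requests would fail)
# Assumption: lists are comma-separated with no spaces, e.g. "NAME,AGE".
check_entity_lists() {
    enabled=$1
    disabled=$2
    if [ -z "$enabled" ] || [ -z "$disabled" ]; then
        echo "ok"
        return 0
    fi
    # Count enabled entities that do not appear in the disabled list.
    remaining=0
    old_ifs=$IFS
    IFS=,
    for entity in $enabled; do
        case ",$disabled," in
            *",$entity,"*) ;;
            *) remaining=$((remaining + 1)) ;;
        esac
    done
    IFS=$old_ifs
    if [ "$remaining" -eq 0 ]; then
        echo "conflict"
        return 1
    fi
    echo "disabled-ignored"
}

check_entity_lists "NAME,ORGANIZATION" "ORGANIZATION"   # prints: disabled-ignored
```

A non-zero exit status signals the combination that the note describes as failing, so a launch script could stop at that point.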

© Copyright 2024 Private AI.