Environment Variables

The Private AI container is configured through environment variables. Set them with `-e` flags in the Docker run command as follows:

docker run --rm -e <ENVIRONMENT_VARIABLE1>=<VALUE> -e <ENVIRONMENT_VARIABLE2>=<VALUE> -p 8080:8080 -it deid:<version number>
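As an alternative to writing each value inline, the variables can be exported in the shell first and passed through by name. The sketch below relies on Docker's documented behaviour that `-e NAME` without `=value` copies `NAME` from the host environment. The helper name `start_deid_from_host_env` and the chosen values are illustrative assumptions, not part of the product.

```shell
# Export the settings in the current shell (example values only).
export PAI_LOG_LEVEL=warning
export PAI_ACCURACY_MODES=high

# Hypothetical helper: "-e NAME" with no value forwards NAME from the host environment.
start_deid_from_host_env() {
  docker run --rm -e PAI_LOG_LEVEL -e PAI_ACCURACY_MODES -p 8080:8080 -it "deid:$1"
}

# Usage: start_deid_from_host_env <version number>
```

This keeps the run command short and lets the same exported values be reused across several invocations.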

Supported Environment Variables

Variable Name	Description
`PAI_ACCURACY_MODES`	Controls which entity detection models are loaded at container start. By default, the container loads all models. Setting this environment variable allows for faster startup and reduced RAM and GPU memory usage. A request specifying an accuracy mode that wasn't loaded will return an error. Allowed values are the accuracy modes specified in the `accuracy` field in `entity_detection`, e.g. `PAI_ACCURACY_MODES=high`. This variable also supports a comma-separated list of values to load multiple models, e.g., `PAI_ACCURACY_MODES=high,standard,standard_high`
`PAI_ALLOW_LIST`	Allows for the allow list to be set globally, instead of passing into each `POST` request. An example could be `PAI_ALLOW_LIST='["John","Grace"]'`. Please see Processing Text
`PAI_DISABLE_GPU_CHECK`	When defined and set to any value, the startup GPU check is disabled. This variable is only applicable to the GPU container and allows the GPU container to run in fallback CPU mode
`PAI_DISABLE_RAM_CHECK`	When defined and set to any value, the sufficient RAM check performed at container startup is disabled. Please note that Private AI cannot guarantee container stability if this is switched off
`PAI_ENABLED_ENTITIES`	Allows for the enabled classes to be set globally, instead of passing it into each POST request. An example could be `PAI_ENABLED_ENTITIES="NAME"` or `PAI_ENABLED_ENTITIES="NAME,AGE,ORGANIZATION"`. A command sample is located at the bottom. Please see also Processing Text
`PAI_DISABLED_ENTITIES`	Allows for the disabled classes to be set globally, instead of passing it into each POST request. An example could be `PAI_DISABLED_ENTITIES="ORGANIZATION"` or `PAI_DISABLED_ENTITIES="AGE,LOCATION,ORGANIZATION"`. A command sample is located at the bottom. Please see also Processing Text
`PAI_ENABLE_AUDIO`	When defined and set to any value, the container loads the functionality to process audio files. This is off by default to save startup time and RAM
`PAI_ENABLE_PII_COUNT_METERING`	When defined and set to any value, aggregated entity detection counts are sent back to Private AI servers for reporting and visualization inside the dashboard. Note that this feature is off by default
`PAI_LOG_LEVEL`	Controls the verbosity of the container logging output. Allowed values are `info`, `warning` or `error`. Default is `info`
`PAI_MARKER_FORMAT`	Allows for the redaction marker format to be set globally, instead of passing into each `POST` request. Please see Processing Text
`PAI_OUTPUT_FILE_DIR`	The directory where `/process/files/uri` will write processed files to. Note that this does not need to be specified for `/process/files/base64`
`PAI_PROJECT_ID`	Sets a default `project_id` that will be used if a request doesn't contain one. Please see Processing Text
`PAI_SYNTHETIC_PII_ACCURACY_MODES`	Same as `PAI_ACCURACY_MODES`, except for synthetic entity generation models. This variable also supports a comma-separated list of values to load multiple models, e.g., `PAI_SYNTHETIC_PII_ACCURACY_MODES=standard,standard_multilingual`. Unlike `PAI_ACCURACY_MODES`, this environment variable can be set empty via `PAI_SYNTHETIC_PII_ACCURACY_MODES=` to disable synthetic entity generation
`PAI_WORKERS`	Number of pre/post-processing workers used in the GPU container. Defaults to `16`. Increasing this number allows for higher throughput, at the cost of increased RAM usage
`PAI_ENABLE_REPORTING`	Enables reporting to a Logstash server
`LOGSTASH_HOST`	The Logstash server's host info
`LOGSTASH_PORT`	The port of the Logstash server
`LOGSTASH_TTL`	Sets the time-to-live (TTL) value, in seconds, of the data queued for Logstash. Queued data that is not sent successfully before the TTL expires is lost
`PAI_REPORT_ENTITY_COUNTS`	Enables entity counts (per piece of text deidentified) to be added to reporting
`PAI_MAX_IMAGE_PIXELS`	Configures the max allowed pixels in the images processed. Default value is `178956970`
`PAI_MAX_FILE_SIZE`	Configures the max allowed file size in bytes, e.g. `2000000`. The file size check occurs before redaction and produces an error message if the file exceeds the specified value
`PAI_WS_LINK_BATCH`	Enables context retention in the websocket endpoint. Default is `true`
`PAI_WS_CONTEXT_SIZE`	Sets the context window size (number of previous messages retained in context). Default is `50`
`PAI_ENABLE_PDF_TEXT_LAYER`	This environment variable sets the default behaviour for whether a text layer is included in generated PDF files. When set to `true`, the application will include a text layer in the PDFs, allowing for text selection, search, and accessibility features. When set to `false` the text layer will be disabled, resulting in marginally smaller file sizes and faster processing. Note that this option is ignored if the `pdf_options.enable_pdf_text_layer` POST parameter is set in requests to the `/process/files/uri` and `/process/files/base64` endpoints. Default value is `true`.
`PAI_OCR_SYSTEM`	Sets the Optical Character Recognition (OCR) system for processing documents and images. Default is `paddleocr`
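When several of the variables above are set together, an env file can be easier to maintain than a long list of `-e` flags. The following is a minimal sketch using Docker's `--env-file` option. The file name `pai.env`, the helper name `start_deid_with_env_file`, the chosen values, and the use of decimal megabytes for the size limit are all assumptions made for illustration.

```shell
# Convert a 2 MB limit to bytes for PAI_MAX_FILE_SIZE (decimal megabytes assumed).
max_bytes=$((2 * 1000 * 1000))

# One NAME=value pair per line; values are written without surrounding quotes.
cat > pai.env <<EOF
PAI_ACCURACY_MODES=high,standard
PAI_LOG_LEVEL=warning
PAI_MAX_FILE_SIZE=${max_bytes}
EOF

# Hypothetical helper: start the container with every setting from pai.env.
start_deid_with_env_file() {
  docker run --rm --env-file pai.env -p 8080:8080 -it "deid:$1"
}

# Usage: start_deid_with_env_file <version number>
```

Keeping the settings in a file also makes it straightforward to review or version them separately from the run command.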

To expose the container on a different port, change the host side of the port mapping as shown in the command below; the container side stays `8080`:

docker run --rm -v "full path to license.json":/app/license/license.json \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>

To use `PAI_ENABLED_ENTITIES` to redact only `NAME` and `ORGANIZATION` entities, the command will look like:

docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_ENABLED_ENTITIES="NAME,ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>

To use `PAI_DISABLED_ENTITIES` to redact all entities except `ORGANIZATION`, the command will look like:

docker run --rm -v "full path to license.json":/app/license/license.json \
-e PAI_DISABLED_ENTITIES="ORGANIZATION" \
-p <host port>:8080 -it crprivateaiprod.azurecr.io/deid:<version>

Note: If both `PAI_ENABLED_ENTITIES` and `PAI_DISABLED_ENTITIES` are set, only `PAI_ENABLED_ENTITIES` takes effect. If an entity appears in both variables and `PAI_ENABLED_ENTITIES` contains no other entity, the request will fail with `400 Bad Request`.
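The rule in the note can be checked before starting the container. The function below is a hypothetical pre-flight sketch, not the container's own logic: it takes the two comma-separated lists and reports `ok` when at most one list is set, `enabled-only` when the enabled list contains at least one entity that is not also disabled, and `conflict` when every enabled entity is also disabled. Reading "no other entity set" as "every enabled entity is duplicated" is an assumption, and the function expects lists without spaces.

```shell
# Hypothetical pre-flight check; arguments are the two comma-separated lists.
# Prints one of: ok, enabled-only, conflict.
check_entity_settings() {
  enabled=$1
  disabled=$2
  # Nothing to reconcile unless both variables are set.
  if [ -z "$enabled" ] || [ -z "$disabled" ]; then
    echo ok
    return 0
  fi
  has_unique=no
  old_ifs=$IFS
  IFS=,
  for entity in $enabled; do
    case ",$disabled," in
      *",$entity,"*) ;;        # also listed as disabled
      *) has_unique=yes ;;     # listed as enabled only
    esac
  done
  IFS=$old_ifs
  if [ "$has_unique" = yes ]; then
    echo enabled-only          # the disabled list would be ignored
  else
    echo conflict              # requests would be expected to fail with 400
    return 1
  fi
}

check_entity_settings "NAME,ORGANIZATION" "ORGANIZATION"
check_entity_settings "ORGANIZATION" "ORGANIZATION" || true
```

The nonzero return status in the conflict case lets a startup script stop before launching a container whose requests would be rejected.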