Processing DCM/DICOM files (Beta)

Private AI supports scanning Digital Imaging and Communications in Medicine (DCM or DICOM) files for PII and creating de-identified or redacted copies. Private AI’s supported entity types work across every supported file type, and localized variants of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities are detected. The Supported Languages and Supported Entity Types pages provide a more detailed look.

How DCM/DICOM Files Are Processed

DICOM files are processed as follows:

  1. The DICOM file's contents, which include images, metadata, and other elements described in the DICOM file standard, are extracted.
  2. Text components of the DICOM file, including supported properties and metadata, are processed by Private AI's text module. The fields listed under Constraints are the exception: they are assumed to never contain PII and are not processed. More information about sensitive fields in DICOM files can be accessed here.
  3. Any image components of the DICOM file are processed by Private AI's image module. This step finds and masks sensitive text burned into the image, such as the scan date and operator name. The scan itself, such as an MRI, isn't altered.
  4. A new DICOM file is constructed from the de-identified or redacted equivalents created in the steps above.
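
The four steps above can be pictured with a small, self-contained sketch. This is not Private AI's implementation: the DICOM file is modelled as a plain dictionary, `redact_text` and `mask_burned_in_text` are hypothetical stand-ins for the text and image modules, and `PASSTHROUGH_FIELDS` is only a short excerpt of the fields listed under Constraints.

```python
import re

# Assumption: a short excerpt of the pass-through fields listed under Constraints.
PASSTHROUGH_FIELDS = {"Modality", "Rows", "Columns", "PixelData", "StudyInstanceUID"}


def redact_text(value: str) -> str:
    """Hypothetical stand-in for the text module: masks 'Firstname Lastname' pairs."""
    return re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", "[NAME]", value)


def mask_burned_in_text(image: dict) -> dict:
    """Hypothetical stand-in for the image module: masks overlay text, keeps the pixels."""
    return {
        "pixels": image["pixels"],
        "overlay_text": ["[REDACTED]"] * len(image["overlay_text"]),
    }


def deidentify(dicom: dict) -> dict:
    # Step 1: extract the metadata and image components.
    metadata, images = dicom["metadata"], dicom["images"]
    # Step 2: redact text fields, skipping pass-through fields and non-text values.
    clean_metadata = {
        key: value
        if key in PASSTHROUGH_FIELDS or not isinstance(value, str)
        else redact_text(value)
        for key, value in metadata.items()
    }
    # Step 3: mask sensitive text in each image without touching the scan itself.
    clean_images = [mask_burned_in_text(image) for image in images]
    # Step 4: assemble a new file from the redacted parts.
    return {"metadata": clean_metadata, "images": clean_images}


sample = {
    "metadata": {"PatientName": "Jane Doe", "Modality": "MR", "Rows": 512},
    "images": [{"pixels": [[0, 1], [2, 3]], "overlay_text": ["Operator: John Smith"]}],
}
result = deidentify(sample)
print(result["metadata"])
```

In this toy model the scan pixels come back unchanged while the overlay text and the patient name are masked, which mirrors the behaviour described in steps 2 and 3.
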
Note: Graphical content that contains text is run through OCR and then redacted. You can configure the OCR system by setting it as an environment variable or by sending it in the request object. See the OCR Guide for details on the OCR modes and their usage.

Constraints

  • Private AI only supports DICOM files that contain 2D images. DICOM files containing 3D images or videos are not currently supported; split 3D images into a series of 2D images before processing.
  • The following metadata and parameters are assumed to never contain PII and are always passed through unchanged to the output DICOM file:
    "AcquisitionNumber",
    "BitsAllocated",
    "BitsStored",
    "Columns",
    "ConvolutionKernel",
    "DataCollectionCenter",
    "DataCollectionCenterPatient",
    "DataCollectionDiameter",
    "DetectorBinning",
    "DistanceSourceToDirector",
    "DistanceSourceToPatient",
    "DistanceSourceToDetector",
    "Exposure",
    "ExposureTime",
    "FieldOfViewDimensions",
    "FieldOfViewShape",
    "FieldOfViewOrigin",
    "FilterType",
    "FrameOfReferenceUID",
    "FocalSpots",
    "GantryDetectorTilt",
    "GeneratorPower",
    "Grid",
    "HighBit",
    "ImageOrientationPatient",
    "ImagePositionPatient",
    "ImageType",
    "ImagerPixelSpacing",
    "InstanceCreationDate",
    "InstanceCreationTime",
    "InstanceNumber",
    "IrradiationEventUID",
    "KVP",
    "LargestImagePixelValue",
    "Modality",
    "NumberOfPhaseEncodingSteps",
    "NumberOfStudyRelatedInstances",
    "PhotometricInterpretation",
    "PixelData",
    "PixelPaddingValue",
    "PixelRepresentation",
    "PixelSpacing",
    "PositionReferenceIndicator",
    "ReconstructionDiameter",
    "ReconstructionTargetCenterPatient",
    "RescaleIntercept",
    "RescaleSlope",
    "RotationDirection",
    "Rows",
    "SOPClassUID",
    "SOPInstanceUID",
    "SamplesPerPixel",
    "ScanOptions",
    "SeriesInstanceUID",
    "SeriesNumber",
    "SliceLocation",
    "SliceThickness",
    "SmallestImagePixelValue",
    "SoftwareVersions",
    "SpatialResolution",
    "SpecificCharacterSet",
    "StudyInstanceUID",
    "TableHeight",
    "WindowCenter",
    "WindowWidth",
    "XRayTubeCurrent",
    "XRayTubeCurrentInuA"

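For the first constraint above, the following is a minimal sketch of how a multi-frame volume might be cut into 2D frames before upload. It assumes uncompressed pixel data in which frames are stored back to back with byte-aligned samples; `split_frames` is a hypothetical helper, not part of any Private AI client. Writing each frame into its own DICOM file with suitable metadata is not shown.

```python
def split_frames(pixel_data: bytes, rows: int, columns: int,
                 bits_allocated: int = 16, samples_per_pixel: int = 1) -> list[bytes]:
    """Split uncompressed multi-frame pixel data into one bytes object per 2D frame."""
    if bits_allocated % 8 != 0:
        raise ValueError("this sketch only handles byte-aligned samples")
    # Size of one 2D frame in bytes.
    frame_size = rows * columns * samples_per_pixel * (bits_allocated // 8)
    if frame_size == 0 or len(pixel_data) % frame_size != 0:
        raise ValueError("pixel data length is not a whole number of frames")
    return [pixel_data[i:i + frame_size] for i in range(0, len(pixel_data), frame_size)]


# Toy volume: 3 frames of 2 rows x 2 columns with 16-bit samples (8 bytes per frame).
volume = bytes(range(24))
frames = split_frames(volume, rows=2, columns=2, bits_allocated=16)
print(len(frames), [len(frame) for frame in frames])
```

The values for `rows`, `columns`, and `bits_allocated` correspond to the Rows, Columns, and BitsAllocated fields in the list above. Compressed pixel data would need to be decoded first.
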
Support Matrix

              CPU Container   GPU Container   Community API   Professional API   PrivateGPT UI
Supported?    Yes             Yes             No              Yes                No

Sample Request

Please sign up for a free API key to run this code.

Request Body:
{
    "file": {
        "data": "<base64-encoded file content>",
        "content_type": "application/dicom"
    },
    "entity_detection": {
        "return_entity": true
    }
}
cURL:
echo '{
          "file": {"data": "'$(base64 -w 0 sample.dcm)'",
          "content_type": "application/dicom"},
          "entity_detection": {"return_entity": true}
      }' \
| curl --request POST --url 'https://api.private-ai.com/community/v4/process/files/base64' \
       -H 'Content-Type: application/json' \
       -H 'x-api-key: <YOUR KEY HERE>' \
       -d @- \
       | jq -r .processed_file \
       | base64 -d > 'sample.redacted.dcm'
Python:
import requests
import base64

file_url = "https://paidocumentation.blob.core.windows.net/$web/sample.dcm"
filename_out = "/path/to/output/sample.redacted.dcm"
file_content = requests.get(file_url).content
file_content_base64 = base64.b64encode(file_content).decode("ascii")

url = "https://api.private-ai.com/community/v4/process/files/base64"

headers = {"Content-Type": "application/json", "x-api-key": "<INSERT API KEY>"}

payload = {
  "file":{
    "data": file_content_base64,
    "content_type": "application/dicom",
  },
  "entity_detection": {
    "return_entity": True
  }
}

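response = requests.post(url, json=payload, headers=headers)
response.raise_for_status()  # stop on an HTTP error instead of decoding an error body

# Decode the redacted file from the response and write it to disk.
with open(filename_out, "wb") as f:
    f.write(base64.b64decode(response.json()["processed_file"]))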
Python Client:
from privateai_client import PAIClient
from privateai_client.objects import request_objects
import base64

filename_in = "sample.dcm"
filename_out = "sample.redacted.dcm"

file_type = "application/dicom"
client = PAIClient(url="https://api.private-ai.com/community/v4/", api_key="<YOUR API KEY>")

# Read the raw DICOM bytes and base64-encode them for the request.
with open(filename_in, "rb") as dicom_file:
    file_data = base64.b64encode(dicom_file.read())
    file_data = file_data.decode("ascii")

file_obj = request_objects.file_obj(data=file_data, content_type=file_type)
request_obj = request_objects.file_base64_obj(file=file_obj)
resp = client.process_files_base64(request_object=request_obj)

with open(filename_out, 'wb') as redacted_file:
    processed_file = resp.processed_file.encode("ascii")
    processed_file = base64.b64decode(processed_file, validate=True)
    redacted_file.write(processed_file)
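
Once a response comes back from any of the requests above, it could be unpacked along the following lines. This is a sketch rather than official client code: the field names are the ones shown under Sample Response, `summarize_response` is a hypothetical helper, and the `example` dictionary contains made-up values in place of a real API response.

```python
import base64


def summarize_response(response: dict) -> dict:
    """Decode the redacted file and collect a few headline facts from the response."""
    processed = base64.b64decode(response["processed_file"], validate=True)
    languages = response.get("languages_detected") or {}
    # Pick the language with the highest score, if any were reported.
    top_language = max(languages, key=languages.get) if languages else None
    return {
        "file_bytes": processed,
        "entities_present": bool(response.get("entities_present")),
        "entity_count": len(response.get("entities") or []),
        "top_language": top_language,
    }


# Made-up response values for illustration only.
example = {
    "processed_file": base64.b64encode(b"placeholder bytes").decode("ascii"),
    "processed_text": "string",
    "entities": [],
    "entities_present": False,
    "languages_detected": {"en": 0.97, "fr": 0.02},
}
summary = summarize_response(example)
print(summary["entities_present"], summary["top_language"], len(summary["file_bytes"]))
```

Passing `validate=True` makes the decode fail loudly if the payload is not clean base64, which is safer than silently writing a corrupted file.
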

Sample Response

{
    "processed_file": "Base64 Encoded File Content of the Redacted File",
    "processed_text": "string",
    "entities": "List[Entity]",
    "entities_present": true,
    "languages_detected": {"lang_1": 0.67, "lang_2": 0.74}
}
© Copyright 2024 Private AI.