Supported Languages
Private AI provides core support for 14 languages and extended support for 38 additional languages; core languages have the highest level of performance. The complete list of supported languages below details which languages have core support, which have extended or beta support, and which are upcoming additions. New languages are continually being added. Please contact us if you require a language that is not in the list below.
In addition to supporting 50+ languages, Private AI supports regional language varieties, in recognition of the large differences in vocabulary and grammar that can exist within the same language when it is spoken in different regions. So far, this includes varieties of English (US, UK, Canada and Australia), Spanish (Spain and Mexico), French (France and Canada), and Portuguese (Portugal and Brazil). Private AI also supports code-switching, that is, the mixing of different languages within a single text. This means that in a phrase such as "J’ai payé 76,88RM por ein Haarschnitt da 范玉菲 habang ko ay nasa Україна", multilingual PII is accurately de-identified. The selection of supported regional language varieties is continually being expanded. Please let us know if you have a specific request.
Private AI’s supported entity types work in every supported language: multilingual equivalents of PII (Personally Identifiable Information), PHI (Protected Health Information), and PCI (Payment Card Industry) entities are detected in each language. Our Supported Entity Types page provides a more detailed look at our coverage of language-specific and region-specific entity equivalents. The solution is also sensitive to cross-linguistic differences, such as how names are structured, how place names are referred to, and how monetary units are described.
Supported Languages for File Processing
Note that while the Private AI text de-identification service supports more than 50 languages, the file processing service currently supports the following restricted list of languages: Dutch, English, French, German, Italian, Polish, Portuguese and Spanish. See the file processing documentation for details.
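As an illustration of the restriction above, the following minimal sketch checks a language code against the file processing list before a file is submitted. The helper name `supports_file_processing` and the dictionary are hypothetical and are not part of any Private AI SDK; the ISO codes are taken from the tables on this page.

```python
# Hypothetical client-side helper; not part of any Private AI SDK.
# Language names come from the file processing list above,
# ISO codes from the tables on this page.
FILE_PROCESSING_LANGUAGES = {
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "pl": "Polish",
    "pt": "Portuguese",
    "es": "Spanish",
}


def supports_file_processing(iso_code: str) -> bool:
    """Return True if the language code is on the file processing list."""
    return iso_code.strip().lower() in FILE_PROCESSING_LANGUAGES


print(supports_file_processing("pl"))  # True
print(supports_file_processing("ja"))  # False
```

A check like this only mirrors the list as documented here, so it would need updating whenever the file processing list changes.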
Core Support
Language | ISO Code | Supported Regional Varieties | Support Level | Added In |
---|---|---|---|---|
Dutch | nl | The Netherlands | Core | 3.4.0 |
English | en | Australia, Canada, United Kingdom, United States | Core | 1.0.0 |
French | fr | Canada (Quebec), France | Core | 2.2.0 |
German | de | Germany | Core | 2.2.0 |
Hindi | hi | India | Core | 2.10.0 |
Italian | it | Italy | Core | 2.2.0 |
Japanese | ja | Japan | Core | 3.4.0 |
Korean | ko | Korea | Core | 2.3.0 |
Mandarin (simplified) | zh-Hans | China | Core | 3.1.1 |
Portuguese | pt | Brazil, Portugal | Core | 2.2.0 |
Russian | ru | Russia | Core | 2.10.0 |
Spanish | es | Mexico, Spain | Core | 2.2.0 |
Tagalog | tl | Philippines | Core | 2.10.0 |
Ukrainian | uk | Ukraine | Core | 2.10.0 |
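The "Added In" column can be used to work out which core languages a given deployment covers. The sketch below assumes that "Added In" is the release in which support was introduced and that support remains available in later releases; the names `CORE_ADDED_IN`, `parse_version`, and `core_languages_available` are hypothetical and only illustrate the lookup.

```python
from typing import List, Tuple

# ISO code -> "Added In" version, copied from the Core Support table above.
CORE_ADDED_IN = {
    "nl": "3.4.0",
    "en": "1.0.0",
    "fr": "2.2.0",
    "de": "2.2.0",
    "hi": "2.10.0",
    "it": "2.2.0",
    "ja": "3.4.0",
    "ko": "2.3.0",
    "zh-Hans": "3.1.1",
    "pt": "2.2.0",
    "ru": "2.10.0",
    "es": "2.2.0",
    "tl": "2.10.0",
    "uk": "2.10.0",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.strip().split("."))


def core_languages_available(deployed_version: str) -> List[str]:
    """List core language codes added in or before the given version.

    Assumes a language stays supported in every later release.
    """
    deployed = parse_version(deployed_version)
    return sorted(
        code
        for code, added in CORE_ADDED_IN.items()
        if parse_version(added) <= deployed
    )


print(core_languages_available("2.3.0"))
# ['de', 'en', 'es', 'fr', 'it', 'ko', 'pt']
```

Versions are compared as integer tuples rather than strings because a plain string comparison would sort "2.10.0" before "2.2.0".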
Extended Support
Language | ISO Code | Support Level | Added In |
---|---|---|---|
Afrikaans | af | Extended | 3.4.0 |
Arabic | ar | Extended | 2.10.0 |
Bambara | bm | Extended | 3.2.1 |
Belarusian | be | Extended | 2.13.0 |
Bengali | bn | Extended | 2.10.0 |
Bulgarian | bg | Extended | 2.10.0 |
Burmese | my | Extended | 2.10.0 |
Cantonese (traditional) | zh-Hant | Extended | 3.4.1 |
Catalan | ca | Extended | 2.10.0 |
Croatian | hr | Extended | 2.10.0 |
Czech | cs | Extended | 2.10.0 |
Danish | da | Extended | 2.10.0 |
Estonian | et | Extended | 2.10.0 |
Finnish | fi | Extended | 2.10.0 |
Greek | el | Extended | 2.10.0 |
Hebrew | he | Extended | 2.10.0 |
Hungarian | hu | Extended | 2.10.0 |
Icelandic | is | Extended | 2.13.0 |
Indonesian | id | Extended | 2.13.0 |
Khmer | km | Extended | 2.13.0 |
Latvian | lv | Extended | 2.10.0 |
Lithuanian | lt | Extended | 2.10.0 |
Luxembourgish | lb | Extended | 2.14.0 |
Malay | ms | Extended | 2.10.0 |
Moldovan | ro | Extended | 2.10.0 |
Norwegian (Bokmål) | nb | Extended | 2.10.0 |
Persian (Farsi) | fa | Extended | 2.10.0 |
Polish | pl | Extended | 2.10.0 |
Punjabi | pa | Extended | 2.10.0 |
Romanian | ro | Extended | 2.10.0 |
Slovak | sk | Extended | 2.10.0 |
Slovenian | sl | Extended | 2.10.0 |
Swahili | sw | Extended | 2.14.0 |
Swedish | sv | Extended | 2.10.0 |
Tamil | ta | Extended | 2.10.0 |
Thai | th | Extended | 2.13.0 |
Turkish | tr | Extended | 2.10.0 |
Vietnamese | vi | Extended | 2.10.0 |
Coming Soon
Language | ISO Code | Support Level |
---|---|---|
Haitian Creole | ht | Coming soon |
Mandarin (traditional) | zh-Hant | Coming soon |