Overview of the automated checks applied to identity documents and related artifacts.

v1.0 Last updated: Aug 15, 2025

Controls

This solution is based on in-house technologies for image quality control, data extraction, and image analysis, together with connectors to external repositories.

Artificial intelligence techniques (neural networks, machine learning, and deep learning) are used to obtain the best possible results for each scanning source (scanner, smartphone, etc.).

The checks carried out provide answers to the following questions:

  • Is the image of the document of sufficient quality to be used?
  • Is the document the one you expected?
  • Is the information in the document consistent with the information you already hold and want to retrieve?
  • Is the document authentic?

Control sequence

  1. The first stage of the checks involves verifying the quality of the transmitted images.
  2. The second stage involves verifying that the submitted supporting documents are the expected ones.
  3. The third stage involves comparing the data extracted from the documents with reference information (from the customer file, for example).
  4. The fourth stage involves analyzing the authenticity of the supporting documents submitted.
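
The four stages above can be sketched as an ordered pipeline. This is a minimal illustration, not the product's implementation: the stage names, the dictionary fields, and the placeholder checks are all hypothetical, and marking later stages as "skipped" after the first failure is an assumption that the description above does not state.

```python
from typing import Callable, Dict, List, Tuple

# Each stage is a (name, check) pair. The checks are hypothetical placeholders.
Stage = Tuple[str, Callable[[dict], bool]]


def run_control_sequence(document: dict, stages: List[Stage]) -> Dict[str, str]:
    """Run the stages in order and record 'ok', 'failed' or 'skipped' for each.

    Assumption: once a stage fails, the remaining stages are skipped.
    """
    results: Dict[str, str] = {}
    failed = False
    for name, check in stages:
        if failed:
            results[name] = "skipped"
            continue
        ok = check(document)
        results[name] = "ok" if ok else "failed"
        failed = not ok
    return results


# Placeholder checks mirroring the four stages of the control sequence.
STAGES: List[Stage] = [
    ("quality", lambda doc: doc.get("sharp", False)),
    ("expected_type", lambda doc: doc.get("type") == doc.get("expected_type")),
    ("data_consistency", lambda doc: doc.get("name") == doc.get("reference_name")),
    ("authenticity", lambda doc: not doc.get("tampered", False)),
]
```

For example, calling `run_control_sequence` on a sharp image of an identity card when a passport was expected reports the quality stage as passed, the type stage as failed, and the two remaining stages as skipped.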

Quality analysis

Automatic analyses of document type and data extraction require a high-contrast, sharp, well-resolved image. These quality requirements are higher than those needed for a human to proofread text. This solution manages its own quality thresholds to optimise the resulting analysis.

Before analysing the type of document and its content, a check is carried out on the size and resolution: images that are too small or too large are rejected (see limits below). The type of the image is also checked: for identity documents, 'binary' images (in which every pixel is either pure black or pure white) are rejected.

The image must be sufficiently sharp and have sufficient contrast. For text documents, a more detailed analysis of the text quality in the image is performed before the OCR analysis. The text must be black on a light background and the characters must be large enough to be recognised. The exact minimum size depends on the type of document; it is typically around 20 pixels in character height.

Note: for PDF files, only the first two pages are examined. If the first page is blank, the second page is processed instead, provided that it corresponds to a receipt.
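
The page-selection rule in the note above can be sketched as follows. The function name and the `is_blank` and `is_receipt` predicates are hypothetical stand-ins for the internal analyses, and returning `None` when neither page qualifies is an assumption, since the note does not describe that case.

```python
from typing import Callable, Optional, Sequence, TypeVar

Page = TypeVar("Page")


def select_pdf_page(
    pages: Sequence[Page],
    is_blank: Callable[[Page], bool],
    is_receipt: Callable[[Page], bool],
) -> Optional[int]:
    """Return the index of the page to process, looking at the first two pages only."""
    if not pages:
        return None
    if not is_blank(pages[0]):
        return 0  # normal case: the first page is processed
    if len(pages) > 1 and is_receipt(pages[1]):
        return 1  # blank first page followed by a receipt
    return None  # assumption: nothing is processed otherwise
```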

Controlled element      Authorised limit or values
File MIME type          PDF, JPEG or PNG.
File size               Greater than 0 bytes and less than 10 MB.
Image size              Width ≥ 320 px and height ≥ 256 px; resolution < 20 MP.
Image sharpness         Internal threshold.
Image contrast          Internal threshold.
Image colourimetry      For identity documents, the image must be in colour or greyscale and not binary.
Text size in image      Large enough for OCR; typically ~20 px character height (document-dependent).
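
As an illustration, the published limits in the table above could be checked as in the sketch below. The `ImageInfo` type, the function name, and the violation labels are hypothetical. The sketch also assumes that "10 MB" means decimal megabytes and that "20 MP" means 20,000,000 pixels. Sharpness, contrast, and text size are left out because their thresholds are internal.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_MIME_TYPES = {"application/pdf", "image/jpeg", "image/png"}
MAX_FILE_BYTES = 10 * 1000 * 1000  # "10 MB"; decimal megabytes is an assumption
MIN_WIDTH_PX, MIN_HEIGHT_PX = 320, 256
MAX_PIXELS = 20_000_000  # "20 MP"; width * height must stay strictly below this


@dataclass
class ImageInfo:
    mime_type: str
    file_bytes: int
    width: int
    height: int
    is_binary: bool = False  # True when every pixel is pure black or pure white
    is_identity_document: bool = False


def limit_violations(info: ImageInfo) -> List[str]:
    """Return the labels of the published limits that the image violates."""
    problems: List[str] = []
    if info.mime_type not in ALLOWED_MIME_TYPES:
        problems.append("mime_type")
    if not 0 < info.file_bytes < MAX_FILE_BYTES:
        problems.append("file_size")
    if info.width < MIN_WIDTH_PX or info.height < MIN_HEIGHT_PX:
        problems.append("image_too_small")
    if info.width * info.height >= MAX_PIXELS:
        problems.append("image_too_large")
    # Binary images are rejected for identity documents only.
    if info.is_identity_document and info.is_binary:
        problems.append("binary_image")
    return problems
```

An empty list means that none of the published limits is violated; it does not imply that the image passes the internal sharpness and contrast thresholds.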

Additional controls

Analysis of PDF file metadata

Note: This check applies to all PDFs supplied.


Original Document (Video capture only)

Is this an original document (not a printout, photocopy, screen capture, etc.)? This check is performed on the front side only, or on the first page if no identity photo has been detected.

This check is skipped for submitted document files.

PDF annotations

PDF annotations are checked. These annotations may be the result of a fraudulent attempt to modify certain information in the document.


Picture Not Tampered (Video capture only)

Verifies that the identity photo has not been substituted or tampered with. This check is performed only if an identity photo has been detected.

This check is skipped for submitted document files.

Modification mark

A check is performed to determine whether the text in a document has been modified using Adobe's TouchUp tool.


Document Video Not Tampered (Video capture only)

Checks whether a document video is authentic (recorded with a real camera) or manipulated (e.g., generated by a virtual camera or injected into the stream). Injection Attack Detection (IAD) is applied only if the IAD service is enabled.

Keyword-based controls

Note: these checks apply to all documents except identity documents.

  • Keywords: This control determines the document type by searching for keywords such as 'balance sheet'. It also verifies that the document is consistent with the customer file by reading specific information from the document, such as the company name, the client's name, or the address. These two checks (type and consistency) automate the analysis of documents even if their type is not processed natively.
  • White list of keywords: the absence of keywords from a white list can be detected. Matching is on whole words only. For example, the word "BAS" will not be considered as found in the word "AMBASSADE."
  • Blacklist of keywords: the presence of keywords from a blacklist can be detected. Matching is on whole words only. For example, the word "BAS" will not be considered as found in the word "AMBASSADE."
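
The whole-word matching described in the list above can be sketched with Python's standard `re` module. The function names are hypothetical, and case-insensitive matching is an assumption that the description does not confirm.

```python
import re
from typing import List, Sequence


def find_whole_words(text: str, keywords: Sequence[str]) -> List[str]:
    """Return the keywords that occur in text as complete words or phrases."""
    found = []
    for keyword in keywords:
        # Lookarounds reject matches glued to other word characters,
        # so "BAS" does not match inside "AMBASSADE".
        pattern = r"(?<!\w)" + re.escape(keyword) + r"(?!\w)"
        if re.search(pattern, text, flags=re.IGNORECASE):  # case-insensitivity is an assumption
            found.append(keyword)
    return found


def missing_from_white_list(text: str, white_list: Sequence[str]) -> List[str]:
    """White-list check: report the expected keywords that are absent."""
    found = set(find_whole_words(text, white_list))
    return [keyword for keyword in white_list if keyword not in found]


def present_from_blacklist(text: str, blacklist: Sequence[str]) -> List[str]:
    """Blacklist check: report the forbidden keywords that are present."""
    return find_whole_words(text, blacklist)
```

With these helpers, searching for "BAS" in "AMBASSADE DE FRANCE" finds nothing, while searching for it in "PAYS BAS" finds the keyword.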