Validating Financial Documents at Scale: Reducing Manual Processing and IT Intervention

Introduction:

In the fast-paced world of financial services, precision and speed are critical. In mortgage processing and administration, validating the extensive supporting documents that accompany home loan applications poses a significant challenge. Manual validation is time-consuming and prone to human error, leading to delays, workflow inefficiencies and a poor customer experience.

Our creation of a specialised application to automate the validation of loan documents marks a significant advancement in financial document processing services. This article delves into the engineering and technology behind this innovative solution.

Read the case study: Application fast-tracks validation of home loan documents via automation

A Ninjaneer-built application automates financial document validation, significantly reducing manual processing and IT requirements in mortgage processing.

Main Use Case and Objective:

The primary objective of this project was to empower users to define validation rules dynamically without the need for IT intervention.

By utilising user-friendly text notations to capture and validate data, users can seamlessly create rules to be automatically executed, removing the need for manual or repetitive validations.

Our main use case was to compare information from multiple sources (such as digital documents, ERPs and external databases) to ensure all sources have the same or similar data.

For example, Optical Character Recognition (OCR) is used to extract an Applicant’s Name from a document provided by the applicant, and the extracted value is then checked against the name held in an internal database.

 

Challenges to Overcome:

The development of our automated document validation application needed to integrate diverse technologies while ensuring seamless, user-friendly functionality.

One major obstacle was the complexity of dynamically defining validation rules from a no-code interface operated by business users. This required developing a flexible system to interpret and integrate external REST requests, OCR systems, and other technologies without a predefined structure.

To overcome this specific challenge, we used JSON/XML Path notations, which let users target specific attributes in those requests. Later in the process, the extracted values are stored in runtime variables that users can reference in their own rules to validate the information.

Let’s look at the scenario below to help us paint the picture and visualise this explanation.

First, let’s assume that we have multiple sources of information, and all of them integrate with our application using REST APIs.

Furthermore, each of these sources has its own data structure.

One of them is an ERP that sends the following JSON request to our application:

{
    "Applicant": {
        "FirstName": "Raphael",
        "LastName": "Ranieri"
    },
    "Product": {
        "Name": "Mortgage",
        "Date": "2024-03-07"
    }
}

Another source is an OCR service, such as AWS Textract, which extracts the information from a PDF file and then sends another JSON request to our application:

{
    "Relationships": [
        {
            "Type": "CHILD",
            "Ids": [
                "d7fbd604-d609-4d69-857d-247a3f591238"
            ]
        }
    ],
    "Confidence": 99.63229370117188,
    "Text": "First Name: Raphael",
    "BlockType": "LINE",
    "Id": "d7fbd604-d609-4d69-857d-247a3f591238"
}

Now, let’s assume that one of our requirements during the validation process is to make sure the first name stored in our ERP is the same as the one in the file the applicant provided to prove their identity.

These values are only available once the requests have been received. To create a dynamic rule and validate it at runtime, we first need to tell the application how to get the information.

To do that, we used JSON Path notation, a query syntax for retrieving specific values from JSON documents.

Using the first request as an example, we can give our application a path as text, such as ‘$.Applicant.FirstName’.

Evaluating this path at runtime returns ‘Raphael’, as shown in the picture below:

[Image: Evaluation Results]
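
To make the path evaluation concrete, here is a minimal Python sketch that resolves a simple dotted path against the ERP request. It is an illustration only, not the application's actual implementation: the function name resolve_path is hypothetical, and only plain child steps such as ‘$.Applicant.FirstName’ are handled, not the full JSON Path syntax (filters, wildcards, array indexes).

```python
import json

# The ERP request from the example above.
ERP_REQUEST = json.loads("""
{
    "Applicant": {"FirstName": "Raphael", "LastName": "Ranieri"},
    "Product": {"Name": "Mortgage", "Date": "2024-03-07"}
}
""")


def resolve_path(document, path):
    """Resolve a simple dotted path such as '$.Applicant.FirstName'.

    Hypothetical helper: supports only '$' followed by object keys.
    """
    if not path.startswith("$"):
        raise ValueError(f"Path must start with '$': {path!r}")
    current = document
    # "$.Applicant.FirstName" -> "Applicant", then "FirstName"
    for key in (part for part in path[1:].split(".") if part):
        current = current[key]  # raises KeyError if the attribute is missing
    return current


print(resolve_path(ERP_REQUEST, "$.Applicant.FirstName"))  # Raphael
```

A production system would more likely rely on an existing JSON Path library than on a hand-written resolver like this one.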

With this in mind, we developed a back office area where users can visually select the information they need based on example requests, and store it in a variable to be used by our validation engine at runtime.

Put simply, business users access the back office, create and name a variable, and specify the JSON path the variable should use to get its data at runtime.

Such as:

Variable Name – ‘ERP_FirstName’

Variable Path – ‘$.Applicant.FirstName’
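
As a rough sketch of how such definitions could be turned into runtime variables, the Python example below applies each stored path to the request it belongs to. Everything here is an assumption made for illustration: the VariableDefinition class, its source field, the bind_variables function and the ‘OCR_Text’ variable are hypothetical names, and only simple dotted paths are supported.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class VariableDefinition:
    name: str    # e.g. "ERP_FirstName"
    source: str  # which integration sends the request (assumed field)
    path: str    # simple dotted path, e.g. "$.Applicant.FirstName"


def bind_variables(definitions, requests_by_source):
    """Build runtime variables by applying each path to its source request."""
    variables = {}
    for definition in definitions:
        keys = [key for key in definition.path.lstrip("$").split(".") if key]
        request = requests_by_source[definition.source]
        # Walk down the request one key at a time.
        variables[definition.name] = reduce(lambda node, key: node[key], keys, request)
    return variables


definitions = [
    VariableDefinition("ERP_FirstName", "ERP", "$.Applicant.FirstName"),
    VariableDefinition("OCR_Text", "OCR", "$.Text"),  # hypothetical variable
]
requests_by_source = {
    "ERP": {"Applicant": {"FirstName": "Raphael", "LastName": "Ranieri"}},
    "OCR": {"Text": "First Name: Raphael", "BlockType": "LINE"},
}

runtime_variables = bind_variables(definitions, requests_by_source)
print(runtime_variables)
```

Keeping the definitions as plain data is what allows them to be edited in a back office without redeploying any code.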

Then, after all the variables were defined, we could provide a visual and simple way for the users to create rules using these variables.

Imagine a dynamic formula that compares two variables and returns true if they match or false if they do not. This formula is also configurable by the user in the back office. For example:

‘ERP_FirstName’ = ‘OCR_FirstName’
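
A simple equality rule like this one could be evaluated along the following lines. This is a minimal Python sketch under stated assumptions, not the real rules engine: the function name evaluate_equality_rule is hypothetical, the runtime variables are assumed to be available as a dictionary, and only the form ‘A’ = ‘B’ is recognised.

```python
import re

# Matches: optional quote, variable name, optional quote, "=", and the same again.
# Typographic and straight single quotes are both accepted.
RULE_PATTERN = re.compile(r"^\s*[‘’']?(\w+)[‘’']?\s*=\s*[‘’']?(\w+)[‘’']?\s*$")


def evaluate_equality_rule(rule, variables):
    """Return True when both variables named in the rule hold the same value."""
    match = RULE_PATTERN.match(rule)
    if match is None:
        raise ValueError(f"Unsupported rule: {rule!r}")
    left_name, right_name = match.groups()
    return variables[left_name] == variables[right_name]


# Assumed runtime values, as they might look after both requests were processed.
variables = {"ERP_FirstName": "Raphael", "OCR_FirstName": "Raphael"}
print(evaluate_equality_rule("‘ERP_FirstName’ = ‘OCR_FirstName’", variables))  # True
```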

With this foundation, we could extend the application to support far more complex scenarios, including conditional cases, text manipulations and even mathematical operations.

Some other examples of rules could be: 

Replace(‘ERP_LastName’, ‘ERP_FullName’) = ‘OCR_FirstName’: replaces a portion of a text and compares the result to another text.

If(‘ERP_ApplicantAge’ >= 21, ‘ERP_FirstName’ = ‘OCR_FirstName’, false): compares two texts only if the applicant is 21 years old or older.
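
To illustrate how rules with functions and conditions might be evaluated, the Python sketch below parses a rule with the standard ast module and walks only a small whitelist of node types. It is not the engine described in this article (which relies on C# integrations for expression evaluation). Two assumptions are made loudly here: Replace(a, b) is taken to mean "remove the text a from b and trim the result", and the sample variable values are invented for the demonstration.

```python
import ast
import operator
import re

COMPARATORS = {
    ast.Eq: operator.eq, ast.NotEq: operator.ne,
    ast.Gt: operator.gt, ast.GtE: operator.ge,
    ast.Lt: operator.lt, ast.LtE: operator.le,
}


def _replace(fragment, text):
    # ASSUMPTION: two-argument Replace removes `fragment` from `text`.
    return text.replace(fragment, "").strip()


FUNCTIONS = {"Replace": _replace}


def normalise(rule):
    """Turn the back-office notation into something ast.parse accepts."""
    rule = re.sub(r"[‘’']", "", rule)                 # quoted names become identifiers
    rule = re.sub(r"(?<![<>=!])=(?!=)", "==", rule)   # a lone '=' means equality
    return rule.strip()


def evaluate(rule, variables):
    return _eval(ast.parse(normalise(rule), mode="eval").body, variables)


def _eval(node, variables):
    if isinstance(node, ast.Constant):                # literals such as 21
        return node.value
    if isinstance(node, ast.Name):                    # variable, or true/false
        if node.id in ("true", "false"):
            return node.id == "true"
        return variables[node.id]
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        left = _eval(node.left, variables)
        right = _eval(node.comparators[0], variables)
        return COMPARATORS[type(node.ops[0])](left, right)
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        if node.func.id == "If":                      # only the chosen branch runs
            condition, when_true, when_false = node.args
            branch = when_true if _eval(condition, variables) else when_false
            return _eval(branch, variables)
        if node.func.id in FUNCTIONS:
            return FUNCTIONS[node.func.id](*[_eval(arg, variables) for arg in node.args])
    raise ValueError(f"Unsupported expression: {ast.dump(node)}")


# Invented sample values for the demonstration.
VARIABLES = {
    "ERP_FirstName": "Raphael", "ERP_LastName": "Ranieri",
    "ERP_FullName": "Raphael Ranieri", "OCR_FirstName": "Raphael",
    "ERP_ApplicantAge": 30,
}
print(evaluate("Replace(‘ERP_LastName’, ‘ERP_FullName’) = ‘OCR_FirstName’", VARIABLES))
print(evaluate("If(‘ERP_ApplicantAge’ >= 21, ‘ERP_FirstName’ = ‘OCR_FirstName’, false)", VARIABLES))
```

Walking a whitelist of node types, rather than passing user-written rules to eval, keeps rule authors from running arbitrary code. Because all quotes are stripped, this sketch cannot represent string literals inside rules.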

These are still elementary operations. In our real scenario we had to evolve the rules engine considerably further, but the base concept is the same.

Additionally, ensuring the accuracy and reliability of OCR technology in extracting and interpreting data from digitised documents presented its own engineering complexity.

Furthermore, accommodating the diverse requirements of different document processing scenarios while maintaining scalability and performance was daunting. However, we successfully overcame these challenges through meticulous planning, innovative engineering solutions, and rigorous testing, resulting in a robust and efficient automated document validation application.

 

Technologies Used:

  • OutSystems app development platform
  • JSON/XML path for data capture
  • Optical Character Recognition (OCR)
  • C# integrations for expression evaluation



User-Friendly Interface:

Our solution features a user-friendly interface that enables business users to create validation rules effortlessly. By eliminating the dependency on IT for coding, users can focus on structuring rules for accurate validation, enhancing efficiency and productivity.

 

Conclusion:

Implementing automated document validation technology represents a transformative milestone in the loans and financial services processing industry. Our application has demonstrated remarkable efficacy in accelerating document validation processes by leveraging OCR technology and customisable validation rules. 

By combining a dynamic parser of JSON requests with a dynamic rules engine and document recognition, we delivered tangible benefits such as significant reductions in processing time and labour costs and the elimination of manual errors. Moreover, the flexibility of our back-office system enables organisations to adapt swiftly to changing business and regulatory requirements.

As businesses prioritise customer-centric digital services, our automated document validation application exemplifies technological innovation, driving efficiency, accuracy, and enhanced customer satisfaction. With successful implementation and tangible results, our application underscores the transformative potential of engineering and technology in reshaping traditional processes within the financial services sector.