
Thea Integration Guide

How to architect your solution using Thea.

About Thea

Thea is a robust, well-architected solution for automating document-based workflows, designed around industry best practices for security, data privacy and governance. Thea combines cloud OCR services with custom-built machine learning models to accurately extract business-critical data from documents at scale, reducing costs, saving time and minimising processing errors.

Thea is API-based and securely processes customer documents without storing the extracted information. It supports a number of document types, including bank statements, payslips, passports and invoices, and can also be trained for custom use cases. Thea can handle a range of inputs, including PDFs, images, scans and handwritten documents.

Thea combines OCR and document layout models, and can be integrated with existing customer systems and document workflows in two simple steps:

1. Fine-tune the document models through ML training for the customer's use cases

2. Add calls to the Thea APIs from customer applications and systems

How does Thea work?

Each document passes through a preprocessing pipeline that prepares it for text extraction. Optical character recognition (OCR) then extracts the text from the document. This text is passed to Thea’s natural language processing (NLP) models for document classification, field identification and extraction. Documents are submitted, and the extracted data is retrieved, through our secure, scalable REST API.
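The order of the stages described above can be sketched in a few lines of Python. This is an illustration of the data flow only, not Thea's implementation: the `Document` class, the stage functions and the keyword rule are all hypothetical stand-ins for the real preprocessing, OCR and NLP components.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Document:
    """Carries a document through the illustrative pipeline stages."""
    raw: str                      # stand-in for the uploaded file contents
    text: str = ""                # filled in by the OCR stage
    doc_type: str = ""            # filled in by classification
    fields: Dict[str, str] = field(default_factory=dict)


def preprocess(doc: Document) -> Document:
    # Stand-in for input clean-up; here we only trim surrounding whitespace.
    doc.raw = doc.raw.strip()
    return doc


def ocr(doc: Document) -> Document:
    # Stand-in for an OCR service; the "scan" is already text in this sketch.
    doc.text = doc.raw
    return doc


def classify_and_extract(doc: Document) -> Document:
    # Stand-in for the NLP models: a keyword rule instead of a trained classifier.
    doc.doc_type = "passport" if "passport" in doc.text.lower() else "unknown"
    # Treat every "key: value" line as an extracted field.
    for line in doc.text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            doc.fields[key.strip().lower()] = value.strip()
    return doc


# Stages run in the order described in the text.
STAGES = [preprocess, ocr, classify_and_extract]


def run_pipeline(doc: Document) -> Document:
    for stage in STAGES:
        doc = stage(doc)
    return doc
```

Calling `run_pipeline(Document(raw="PASSPORT\nPassport number: X123"))` returns a document whose `doc_type` and `fields` have been filled in by the later stages.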

Integration Steps

An integration typically comprises the following steps.

1. Review the client data and decide on the training dataset and the amount of data augmentation required

2. Label documents with the desired categories, e.g. passport, bank statement

3. Label the fields to extract from each document, e.g. first name, last name, total, passport number

4. Configure access to the Thea API

5. Call the Thea API endpoints to solve the business problem

6. Fine-tune periodically to maintain and improve model performance
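As a rough illustration of steps 4 and 5, the sketch below builds an authenticated HTTP request that submits a document for prediction. This guide does not specify the API contract, so everything here is an assumption: the base URL, the `/predictions` path, the bearer-token header and the JSON payload shape are hypothetical placeholders, and the Developer Guide remains the authority on the real endpoints.

```python
import base64
import json
import urllib.request

# Hypothetical values: replace with the endpoint details issued for your account.
BASE_URL = "https://thea.example.com/api"
PREDICT_PATH = "/predictions"


def build_prediction_request(
    document_bytes: bytes, filename: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one document."""
    # Assumed payload shape: the file name plus base64-encoded file contents.
    payload = {
        "filename": filename,
        "content": base64.b64encode(document_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        url=BASE_URL + PREDICT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumed authentication scheme: a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The request would be sent with `urllib.request.urlopen(request)`. Keeping construction separate from sending makes the call easy to inspect and test without network access.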

Architectural Components

How the Predictions API can be used

How Training is Initiated

Developer Guide

Want to know more? Read the full guide.
