Overview
SS&C is focused on changing the paradigm of transaction processing through ‘extract and automate’. In March 2020, SS&C acquired Vidado, a machine learning data extraction platform. Renamed Chorus Document Automation, it extracts handwritten or machine-printed data from images with 98% or greater accuracy, producing immediately usable structured data. Chorus BPM then takes that structured data and applies client-specific rules for further processing via APIs and web services.
Low-quality documents are machine-printed documents that traditional OCR technology can’t read because their data was degraded somewhere along the way, typically by faxing or scanning. Traditional OCR can’t read and extract data from handwriting or poor-quality machine-printed documents. While it can recognize all types of clearly printed characters, once the text becomes smudged or skewed, it no longer knows what it’s looking at. This means that, while it can handle 80 percent of document workflows, humans must intervene to take care of the remaining 20 percent.
Chorus Document Automation uses more advanced technology than traditional OCR. It incorporates AI and machine learning to read and extract data from hard-to-read documents. Instead of relying on simple techniques to identify letter shapes, it leverages highly trained machine learning models and advanced computer vision engines to predict what is there. This combination lets it replicate the way humans read low-quality documents.
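The ‘extract and automate’ flow described above, in which structured data comes out of extraction and client-specific rules are then applied to it, can be illustrated with a minimal sketch. This is not the Chorus API: the record shape, the Rule class, the rule names, and the routing labels are all hypothetical, chosen only to show how rules might decide whether a record proceeds automatically or goes to a person.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical shape of an extracted record: field name -> extracted text.
ExtractedRecord = Dict[str, str]


@dataclass
class Rule:
    """A hypothetical client-specific rule: a name plus a check on the record."""
    name: str
    check: Callable[[ExtractedRecord], bool]


def apply_rules(record: ExtractedRecord, rules: List[Rule]) -> List[str]:
    """Return the names of the rules the record fails (empty list = all passed)."""
    return [rule.name for rule in rules if not rule.check(record)]


def route(record: ExtractedRecord, rules: List[Rule]) -> str:
    """Send clean records straight through; send the rest to a person."""
    return "straight_through" if not apply_rules(record, rules) else "manual_review"


# Illustrative rules only; a real client would define its own.
rules = [
    Rule("account_number_present",
         lambda r: bool(r.get("account_number", "").strip())),
    Rule("amount_is_numeric",
         lambda r: r.get("amount", "").replace(".", "", 1).isdigit()),
]

record = {"account_number": "12345678", "amount": "250.00"}
print(route(record, rules))  # prints "straight_through"
```

Keeping each rule as a named check makes it straightforward to report which rule sent a record to manual review.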
Key Features
- Platform delivers form identification, extraction (computer vision), and data enrichment
- Focus on automation-ready data
- Training data set derived from 1 billion+ data points
- Delivers 98%+ accuracy across all data inputs (including handwritten)
- Handwriting and low-quality machine print
- Structured forms
- High-volume operational/analytics use cases
- Checkboxes
Key Benefits
Removes these challenges:
- Inaccuracy: mistyping and exception handling
- Resources: difficult to source talent willing and able to manually extract text from low-quality documents
- Security: the transfer from machine to human and back to machine raises security concerns, especially for use cases in tightly regulated industries with sensitive information, such as financial services, government, and healthcare.