Paving the Way for New Accounting Practices in Russia
ATAPY Software builds a data capture production line for a Russian SaaS Accounting Provider on the basis of ABBYY OCR Toolkits
In 2013 ATAPY completed a project for a Russian Company* offering a line of business consulting services, including company registration, legal aid, audit and accounting services, ISO certification, and a number of others. All these services were offered on the basis of a modern and comprehensive technology infrastructure; the accounting services were provided in the cloud via a SaaS (Software as a Service) model. The SaaS model frees the customer from the necessity of deploying and maintaining a hardware/software platform; instead the customer purchases a subscription to the Company’s cloud service and performs all the necessary operations in their restricted client area on the Company’s web site.
Until recently, uploading a document to the web site was quite complicated: after scanning the document, the customer had to manually enter the key information in the corresponding fields of the document description: the document date and number, Contractor 1, Contractor 2, Grand total (for financial documents), etc. This was far from an optimal process, so the Company started looking for ways to improve usability in this part of the system. The solution had to meet the following requirements:
- Support upload of images in most popular graphic formats: TIFF (including multi-page), JPG, BMP
- Recognize the text and automatically extract the data from the key fields, filling the document metadata with it (for several document types)
- Automatically classify documents by type
- Export the extracted information and full-text recognition results into XML and PDF/A formats, respectively
- Allow for high productivity
- Run on the Company’s servers as a Windows service
No end-user software package was suitable for the task, as the Company needed a solution that could be natively built into their existing infrastructure as a module. This could be implemented as a custom solution on the basis of one of the existing OCR/data capture software toolkits.
The Company turned to ABBYY, a recognized manufacturer of such products, and ABBYY passed this project over to ATAPY Software, a company with 11 years of expertise in developing custom data entry solutions. To satisfy the Customer’s requirements, two ABBYY toolkits were required: ABBYY FineReader Engine for full-text recognition, and ABBYY FlexiCapture Engine for data extraction.
In the first phase of the project, ATAPY developed a server component that:
1. Monitored, in background mode, the hot folder to which users uploaded scanned documents
2. On finding new documents, initiated document processing, which included the following steps:
- Document recognition
- Document classification by type
- Applying the corresponding Flexi layout and extracting the key data; if the document did not belong to any of the defined types, this step was skipped and the document was passed on for manual processing
- Saving the source scanned images and full-text recognition results in PDF/A format, saving the extracted key data in XML
3. Having finished processing, saved the results in a destination folder on the server, from where the documents were then routed into client areas.
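The workflow above can be sketched roughly as follows. This is an illustrative Python outline only, not the component ATAPY built (which was written in C# on top of the ABBYY engines). The names `process_hot_folder`, `fields_to_xml`, `classify`, and `extract`, the folder layout, and the XML element names are all hypothetical; the recognition, classification, and extraction steps are passed in as plain callables because the actual ABBYY engine calls are not described in this text. PDF/A export is omitted.

```python
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, Dict, Optional

# Image formats named in the requirements (TIFF, JPG, BMP).
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg", ".bmp"}


def fields_to_xml(doc_type: str, fields: Dict[str, str]) -> bytes:
    """Serialize extracted key fields to XML (hypothetical schema)."""
    root = ET.Element("document", {"type": doc_type})
    for name, value in fields.items():
        ET.SubElement(root, "field", {"name": name}).text = value
    return ET.tostring(root, encoding="utf-8")


def process_hot_folder(
    hot: Path,
    done: Path,
    manual: Path,
    classify: Callable[[Path], Optional[str]],      # stand-in for the classifier
    extract: Callable[[Path, str], Dict[str, str]],  # stand-in for layout matching
) -> Dict[str, str]:
    """Run one polling pass over the hot folder.

    Returns a mapping of file name to detected type, or "manual" when the
    document matched none of the defined types.
    """
    outcome: Dict[str, str] = {}
    for image in sorted(hot.iterdir()):  # sorted() snapshots the listing first
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore anything that is not a supported image
        doc_type = classify(image)
        if doc_type is None:
            # Unknown type: skip extraction and route to manual processing.
            shutil.move(str(image), str(manual / image.name))
            outcome[image.name] = "manual"
            continue
        fields = extract(image, doc_type)
        (done / (image.stem + ".xml")).write_bytes(fields_to_xml(doc_type, fields))
        shutil.move(str(image), str(done / image.name))
        outcome[image.name] = doc_type
    return outcome
```

A background service would call `process_hot_folder` in a loop with a short pause between passes. Passing the classifier and extractor in as callables keeps the routing logic independent of whichever OCR toolkit sits behind them.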
On the basis of ABBYY FlexiCapture Engine, ATAPY developed Flexi layouts for the two most popular document types: delivery notes (Form TORG-12) and invoices. Two more are planned for the second phase of the project: acceptance reports and bills of lading (Form 1T).
To increase the component’s throughput, ATAPY implemented recognition in several parallel streams, a capability made possible by the multi-core CPU support in the ABBYY toolkits.
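The idea of running recognition in several streams can be illustrated with a small worker pool. The sketch below is a generic Python example, not the ABBYY multi-processing API: `recognize_in_parallel` and its `recognize` argument are hypothetical, and a thread pool is assumed to be adequate because the heavy work would happen inside a native engine rather than in Python itself.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def recognize_in_parallel(
    pages: Iterable[str],
    recognize: Callable[[str], str],  # hypothetical stand-in for an OCR call
    workers: Optional[int] = None,
) -> Dict[str, str]:
    """Recognize pages concurrently and return a page-to-text mapping."""
    page_list = list(pages)
    # Default to one worker per CPU core reported by the operating system.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so zip pairs each page
        # with its own recognition result.
        texts = pool.map(recognize, page_list)
        return dict(zip(page_list, texts))
```

Sizing the pool to the number of cores is only a starting point; the best setting depends on how the engine is licensed and how much memory each stream uses.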
It is anticipated that the new feature, automatic fill-in of the document description during upload, will give the Company a serious competitive edge. In addition, the service will speed up the document processing workflow of the Company division that provides outsourced document processing services.
- ABBYY FlexiCapture Engine 10
- ABBYY FineReader Engine 10
Technologies and programming languages
- Microsoft .NET Framework 4.0
- Microsoft Visual Studio 2010 (C#)
* The name of the Customer is not disclosed due to NDA restrictions.
- Windows Server 2008 R2 (64-bit)