Image Cleaning Solution for the Legal Industry
Cooperation between ATAPY Software and EasyData B.V. continues to bring new advantages to Benelux customers
‘Raad voor Rechtsbijstand’ (www.rvr.org) is a Dutch agency that provides legal advice and public defenders to people who are unable to hire a lawyer. The agency maintains a large archive of legal records. While converting the archive to a digital format, the agency ran into an unexpected problem.
Many documents were printed on intensely colored paper. The same background that made the documents look attractive and distinctive showed up as heavy jitter in black-and-white scans, which inflated file sizes and made the scans difficult to read and impossible to OCR.
The agency sought professional advice from EasyData B.V. EasyData analyzed the situation and concluded that no ready-made solution, such as the despeckling facilities in modern OCR packages, would produce acceptable results, since the jitter was significantly heavier than those tools are capable of removing. EasyData outsourced the problem to ATAPY Software for more thorough research. ATAPY’s engineers designed and implemented a custom algorithm that approached the task more intelligently, taking into account not only the linear characteristics of each dot cluster but also its context (the characteristics of the neighboring clusters). The resulting tool produces nearly clean images with insignificant information loss, and the cleaned files take up to 10 times less disk space:
The original image (OCR rate after built-in despeckling = 1.9%)
The same image after ATAPY’s cleaning algorithm has been applied (OCR rate = 98.9%)
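The idea of judging each dot cluster by its neighbors as well as by its own size can be illustrated with a minimal Python sketch. This is not ATAPY's algorithm, whose details are not published here. It is a simplified stand-in that assumes a binary image stored as a list of rows (1 = black pixel), and the names `clean`, `min_size` and `reach` and their default values are hypothetical. A small cluster is kept only when it sits close to a large one, as the dot of an "i" or a punctuation mark would, and is discarded when it stands alone.

```python
from collections import deque


def label_clusters(image):
    """Group black pixels into 8-connected clusters; returns lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            clusters.append(pixels)
    return clusters


def bounding_box(pixels):
    """Return (top, left, bottom, right) of a cluster."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)


def box_gap(a, b):
    """Distance in pixels between two boxes (0 if they overlap)."""
    dy = max(a[0] - b[2], b[0] - a[2], 0)
    dx = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dy, dx)


def clean(image, min_size=4, reach=2):
    """Keep large clusters, plus small ones lying within `reach` of a large one.

    Both thresholds are illustrative assumptions, not values from the article.
    """
    clusters = label_clusters(image)
    boxes = [bounding_box(p) for p in clusters]
    large = [b for p, b in zip(clusters, boxes) if len(p) >= min_size]
    cleaned = [[0] * len(row) for row in image]
    for pixels, box in zip(clusters, boxes):
        near_text = any(box_gap(box, big) <= reach for big in large)
        if len(pixels) >= min_size or near_text:
            for y, x in pixels:
                cleaned[y][x] = 1
    return cleaned
```

A size-only filter, which looks at each cluster in isolation, would delete the dot of an "i" together with the background specks. The neighborhood test is what lets this sketch tell the two apart, at the cost of keeping any speck that happens to fall next to real text.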
About Raad voor Rechtsbijstand:
The Raad voor Rechtsbijstand (Legal Aid Board) is a Dutch agency that provides legal advice and public defenders to people who are unable to hire a lawyer. For more information visit www.rvr.org.
More information about ATAPY Custom Document Imaging Solutions: