Don’t Just OCR Documents, Interpret Them
Don’t just OCR documents. Interpret them. Document interpretation is central to cognitive document automation for the digital workforce. The shortcomings of treating a document automation project as simply an “OCR problem,” and the limitations of OCR when the ultimate objective is unattended document automation, have become much clearer over time. We’ve even written an ebook (Building Bespoke Document Automation) about it. The focus of this article is “interpretation” in cognitive document automation and what that really means.

When we talk about document automation or advanced capture, we mean the process that takes documents and converts the information held within them into useful information for organizations across many different business processes. Information is converted regardless of whether the documents are images captured by a scanner or mobile device, or “digitally-born” documents such as Word documents, emails or PDFs. Typically, “useful” is a substitute for “structured data.” This is data that can easily be ingested and processed by other automation and business systems.

So for a tax form, document automation is the process of locating specific entries of data and presenting them to another system. For more complex documents such as commercial invoices, it is the process of locating specific data about the transaction and presenting it to an accounting or ERP system. For some of the most complex documents, such as contracts, it is the process of identifying specific terms of the contract and presenting these terms to another system. OCR software cannot do this type of interpretation.
OCR tools merely convert image-based documents into machine-readable text. In each of these cases, the focus is on the ability to reliably and efficiently locate and extract specific, needed data.

As you might imagine, the use of OCR is only a small fraction of the tasks involved. With increasing frequency, OCR is not needed at all. For born-digital documents, there is no OCR step because the information they contain is already machine-readable text. For many document classification tasks, visual analysis alone is enough to get the job done. Again, this is without the use of OCR software.

Unattended Automation for Data Location and Extraction

For data location tasks, the effort involved runs the gamut from employing seemingly easy fixed zones or templates for locating data to complex document structure analysis for identifying headings, paragraphs, sentences and words. Complex document structure analysis enables reliable identification of data such as the list of causes for termination of a contract. Not surprisingly, the effort to reliably locate and extract document-based data has become a key focus of advanced capture vendors. This allows the core OCR technologies to be considered practically a commodity.
Most advanced capture vendors don’t even develop their own OCR!

If we examine the “technology stack” involved with advanced capture, it starts with image perfection (again, not OCR), moves to OCR in selected cases (remember, it isn’t always about images), and then progresses to the interpretation stage, where required data is located within documents. There is another crucial step in this interpretation stage before you get to the extraction/presentation stage: data analysis. This component is responsible for evaluating the located data to determine whether it is reliable. To do this, a variety of alternatives are considered along with other information about the data (called “context”) to arrive at a conclusion. The best systems employ several different evaluation techniques in order to produce consistently reliable outcomes. There is nothing at all within any OCR process or regular expression that will enable this capability. Only this capability can enable true unattended automation for today’s digital workforce-based processes.

Digital Workforce-Based Processes

Parascript has been using these interpretation and analysis processes within document automation for over 25 years, achieving true unattended automation that saves organizations literally billions of dollars per year. We continue to push the automation envelope. Today, we enable true automation with smart learning, which takes the data science approach necessary to achieve high levels of automation and puts it into the hands of the organizations themselves. It’s time to put the “advanced” into your document capture processes.
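
To make the data analysis step described above more concrete, here is a minimal sketch in Python of how alternatives and context might be weighed to decide whether a located value is reliable. It is an illustration only, not Parascript's method: the `Candidate` structure, the `analyze_total` function, the score values, the threshold and the context bonus are all assumptions made up for this example. The "context" used here is a simple one, namely that an invoice total should equal the sum of its line items.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """One alternative reading of a located field (hypothetical structure)."""
    text: str
    score: float  # assumed recognizer confidence in [0, 1]


def parse_amount(text: str) -> Optional[float]:
    """Parse a currency-like string; return None if it is not a number."""
    try:
        return float(text.replace(",", "").replace("$", ""))
    except ValueError:
        return None


def analyze_total(candidates, line_items, threshold=0.8, context_bonus=0.3):
    """Pick the most reliable invoice total, or return None to flag for review.

    Context: a candidate that agrees with the sum of the line items has its
    score raised. The threshold and bonus are illustrative values only.
    """
    expected = round(sum(line_items), 2)
    best_value, best_score = None, 0.0
    for cand in candidates:
        value = parse_amount(cand.text)
        if value is None:
            continue  # not a usable amount, skip this alternative
        score = cand.score
        if abs(value - expected) < 0.005:
            score = min(1.0, score + context_bonus)
        if score > best_score:
            best_value, best_score = value, score
    if best_score >= threshold:
        return best_value, best_score
    return None  # not reliable enough for unattended processing


if __name__ == "__main__":
    alternatives = [Candidate("1,284.50", 0.62), Candidate("1,234.50", 0.58)]
    print(analyze_total(alternatives, line_items=[1000.00, 234.50]))
```

In this sketch the reading with the higher raw score is not the one selected: the second alternative agrees with the line items, so context lifts it above the threshold. When no alternative clears the threshold, the function returns `None`, which stands in for routing the field to a person instead of passing unreliable data downstream.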