Practical Records Management Blog

Shoreline Records Management's blog highlights the latest in Document Management and Records Management - Document Scanning, Document Storage, Enterprise Content Management, and General Filing Tips and Advice.


Google Expands OCR Capabilities for Document Scanning


Google is Now Offering Free OCR Services for Scanned Documents. Google announced yesterday that a new feature in Google Docs lets users import Scanned Documents. The feature, described as "Convert Text from PDF or image files to Google Docs Documents," accepts a Scanned PDF or Image File (JPEG, GIF, or PNG).

The announcement leaves some questions open: whether this new functionality signals an intention by Google to broaden the scope of the Google Docs platform, how the new OCR functionality works, and what it can actually do. These questions include:

What OCR Engine is Google using in the Google Docs Platform?

The OCR Engine used by Google in this process is not immediately clear. Google does Sponsor an Open Source OCR Engine and Document Analysis Platform called OCRopus, but Google hasn't publicly acknowledged that this is the technology being used by any of their services, including Google Books or the new Google Docs OCR Functionality.

Does Google Docs OCR Work with TIF Files?

During our testing, we noticed that the OCR functionality didn't work for one of the most common image formats that we find clients using, TIF Images. TIF, or TIFF (Tagged Image File Format), Images are widely considered an Industry Standard for Scanning Paper Documents, so I found the absence of this functionality surprising.

If you need to convert TIF images, you may want to use Adobe Acrobat or another utility to convert the TIF files to PDF first, or check out ABBYY FineReader Online. For organizations looking to convert large volumes of information, I would recommend using alternative document capture software to run OCR on your images.
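
If you have a folder of mixed scans, it can help to sort them before uploading. The following is a minimal Python sketch, not anything Google provides: it groups file names by extension into files that can go to Google Docs as-is (PDF, JPEG, GIF, PNG, per the announcement) and TIF files that need converting first. The function name `triage_scans` and the group labels are our own, and treating `.jpg` as JPEG is an assumption based on common naming.

```python
from pathlib import Path

# Formats the announcement lists as accepted by the Google Docs OCR import.
# Assumption: ".jpg" is treated the same as ".jpeg".
DIRECT_UPLOAD = {".pdf", ".jpg", ".jpeg", ".gif", ".png"}

# TIF/TIFF did not work in our testing, so these need converting first.
NEEDS_CONVERSION = {".tif", ".tiff"}


def triage_scans(filenames):
    """Group file names by whether they can be uploaded as-is.

    Hypothetical helper: it only looks at the file extension and
    never opens or inspects the files themselves.
    """
    groups = {"upload": [], "convert_first": [], "unknown": []}
    for name in filenames:
        ext = Path(name).suffix.lower()  # case-insensitive match
        if ext in DIRECT_UPLOAD:
            groups["upload"].append(name)
        elif ext in NEEDS_CONVERSION:
            groups["convert_first"].append(name)
        else:
            groups["unknown"].append(name)
    return groups


if __name__ == "__main__":
    batch = ["invoice.PDF", "contract.tif", "receipt.png", "notes.docx"]
    for group, names in triage_scans(batch).items():
        print(group, names)
```

Anything that lands in the "convert_first" group would go through Acrobat or a similar utility before upload. Files in the "unknown" group are simply outside the formats discussed here.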

How Well does Google Docs OCR work?

The technology is still new, as it was only released yesterday, but Ars Technica did some testing and was nice enough to summarize their experiences. Their results were about the same as ours, and they summarized their findings: "There are still cases where this OCR would be better than nothing." Not quite the ringing endorsement that you'd hope to see attached to a Google Service, but the offering is still new.

Because of the way the import mechanism is configured, Google Docs OCR may not be the best document scanning solution for every business case, especially if you're looking to convert a large volume of paper documents to digital images. For ad-hoc, low-volume OCR requirements, however, the Google Docs OCR functionality serves as a solid utility for converting paper into usable text.

Have you tried the Google Docs OCR tool yet? What have your experiences been? Have you had better success with other services or software? Share your experiences in the Comments!


Comments

In today's business world document management is very crucial. Before some days they maintain a large number of file. It is better if the scanned copies of those documents are stored in a database. So there is no chance for misplacement of any document. Goggle now gives us this facility.
OCR Engine is a new Goggle technology. It provides us a online storage space for scanned copy.
Posted @ Friday, July 02, 2010 12:11 AM by Ishita
After touring the site last week, I now know there is a state of the art document storage company I will be proud to recommend to the hospitals and doctor groups work with. Keep up the good work.
Posted @ Wednesday, July 07, 2010 7:09 AM by Bill Baylis
Comments have been closed for this article.