Transport for London
GSK
Boots
The Sandhurst Collection
Archant
Scottish Canals
Rambert Archives
University of the Arts London

Optical Character Recognition (OCR) Scanning

TownsWeb Archiving’s optical character recognition (OCR) scanning service uses leading professional OCR software to identify typed text within digital images and convert it into usable digital text, which can be added to digital archives as metadata and searched against.

OCR makes the information within digitised magazine, journal and newspaper collections far more accessible, allowing fast, efficient retrieval of your valuable content.

Consultation

Get in touch and tell us about your requirements for a free, no-obligation quote.

Digital assets

We can either digitise your material or you can provide us with your digital assets.

Pre-Processing

We can interpolate, deskew, reduce background noise and much more to produce the best possible OCR results.

OCR Scanning

Using our specialist OCR software we scan your digital material, providing the results in any format you require.

System Integration

We can import your OCR data straight into PastView or into your own system, ready to link to your digital images.

Optical Character Recognition (OCR) Frequently Asked Questions

Our professional OCR software scans your JPEG and TIFF image collections (often produced via digitisation), recognises typed or printed text within the images, and converts that text into machine-readable digital text documents.

The OCR’d text can then be added as metadata to a digital archive and associated with the image it was scanned from, to allow keyword searching of the text content, either via collections management software or on a digital archive website.

For example, if you digitised a collection of printed magazines, then put the digital images through the OCR process to extract the article text, this would then allow searching by keyword against the articles’ content.
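The keyword search described above can be illustrated with a small inverted index. This is only a minimal sketch, not how PastView or any particular collections management system works: the function names and the sample filenames and text are hypothetical, and it assumes the OCR'd text for each image is already available as a plain string.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(ocr_text_by_image):
    """Map each word to the set of image filenames whose OCR text contains it."""
    index = defaultdict(set)
    for image_name, text in ocr_text_by_image.items():
        for word in tokenize(text):
            index[word].add(image_name)
    return index


def search(index, query):
    """Return the sorted image filenames that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical OCR output for two scanned magazine pages (illustrative only).
pages = {
    "magazine_1952_p01.tif": "Canal boats return to the Forth and Clyde",
    "magazine_1952_p02.tif": "New ballet season opens; canal walk planned",
}
index = build_index(pages)
print(search(index, "canal"))
print(search(index, "canal boats"))
```

Because each word points back to the image it was read from, a search result can link straight to the relevant scan.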

To see examples of this in action take a look at our PastView digital collections management system.

Using OCR scanning we can capture the full content of digitised items – including books, magazines, newspapers, and diaries.

Our team can also index your digitised files by incorporating metadata created from the OCR within the filenames of the digital images.
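As a rough illustration of building OCR-derived metadata into a filename, the sketch below appends a short, filename-safe version of a recognised title to the original scan name. The naming pattern, the function names and the sample title are assumptions made for this example, not a description of the convention our team uses.

```python
import re
from pathlib import PurePath


def slugify(text, max_words=5):
    """Turn free text into a short, filename-safe string of lower-case words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words[:max_words])


def indexed_filename(original_name, ocr_title):
    """Append a slug of the OCR'd title to the original filename, keeping the extension."""
    path = PurePath(original_name)
    slug = slugify(ocr_title)
    if not slug:
        # Nothing usable was recognised, so keep the original name.
        return path.name
    return f"{path.stem}_{slug}{path.suffix}"


# Hypothetical scan name and OCR'd title (illustrative only).
print(indexed_filename("scan_0042.tif", "Parish Minutes, 12 March 1897"))
```

Keeping the original stem at the front means the files still sort in scan order, while the added words make them findable by name.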

Our OCR process can be very accurate on typed or printed text. If the text is clear (e.g. the text colour contrasts strongly with the background colour), typed in a standard font (e.g. Arial, Times New Roman), and in a standard size (e.g. size 10 upwards), the OCR results are on average 95% accurate. Our OCR software also applies pre-processing techniques to improve the chances of successful recognition, such as de-skewing the document if it is not aligned correctly.

Note, however, that formatting anomalies, such as tables, can negatively impact the results.
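One common way to put a number on OCR accuracy is to compare the OCR output with a hand-checked transcription, character by character. The sketch below does this with edit distance. It is an assumption that accuracy is measured at character level; the accuracy figure quoted on this page may be calculated differently, and the sample strings are invented for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions and substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference, ocr_output):
    """Share of reference characters recognised correctly, between 0.0 and 1.0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    errors = edit_distance(reference, ocr_output)
    return max(0.0, 1.0 - errors / len(reference))


# Hypothetical example: one misread character in a 20-character line.
reference = "The quick brown fox."
ocr_output = "The qu1ck brown fox."
print(f"{character_accuracy(reference, ocr_output):.0%}")
```

On this measure, one wrong character in every twenty corresponds to 95% accuracy, which gives a sense of how many corrections a page of text at that level might still need.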

We can produce output files of the OCR data in any format you specify, such as PDF, PDF/A, MS Word, HTML or Rich Text documents.

We can help you import the data into your own system. Alternatively, we can import the data into our collection management system, PastView. Take a look at our internet-based BookViewing Software, which allows you to link the transcribed data to your digitised images.

If you are interested in publishing your digital collection and OCR data online, find a PastView package here to suit your organisation.

Whilst our OCR service can be very accurate when converting printed or typed text into digital format, in our experience the accuracy of OCR on handwritten text is very poor. For this reason, we recommend using our transcription service to capture handwritten text.

Digitise your physical material with TownsWeb Archiving

Digitising your material is the first step in providing improved access to your collections. We have worked with thousands of organisations throughout the UK, safeguarding their invaluable archives with our digitisation expertise.

Learn more about digitisation

Publishing your digitised content with PastView

Once your collection has been digitised, you can upload this data to PastView for full management control, and publish your collection through a purpose-built, bespoke PastView website.

Learn more about publishing

Would you like more information?

If you would like to learn more about our services or request a free quotation, please feel free to contact us.