Great news! PastView has a brand new website! Your complete publishing platform that showcases your digital collections online. Find out more.

Data Capture - OCR
Transport for London
The Sandhurst Collection
Scottish Canals
Rambert Archives
University of the Arts London
previous arrow
next arrow

Optical Character Recognition (OCR)

TownsWeb Archiving’s optical character recognition (OCR) scanning service uses leading professional OCR software to identify typed text within digital images and convert it into usable digital text, which can be added to digital archives as metadata and searched against.

OCR is invaluable in making the information within digitised magazine, journal and newspaper collections far more accessible, providing the most efficient and instantaneous retrieval of your valuable content. TownsWeb Archiving are happy to deploy their OCR service during the digitisation phase or for already digitised materials. Following an initial survey and sample we can provide examples of our outputs for your approval.

Frequently Asked Questions – Find out more

Get in touch and tell us about your requirements for a free, no obligation quote.

Digital assets

We can either digitise your material or you can provide us with your digital assets.


We can interpolate, deskew, reduce background noise and much more to produce the best possible OCR results.

OCR Scanning

Using our specialist OCR software we scan your digital material, providing the results in any format you require.

System Integration

We can import your OCR data straight into PastView or into your own system, ready to link to your digital images.

Hear what our clients have to say about our data capture services

Optical Character Recognition (OCR) Frequently Asked Questions

Our professional OCR software scans your JPEG and TIFF image collections (often produced via digitisation), recognises typed or printed text within the images, and converts that typed text into machine readable digital text documents.

The OCR’d text can then be added as metadata to a digital archive and associated with the image it was scanned from, to allow keyword searching of the text content, either via collections management software or on a digital archive website.

For example, if you digitised a collection of printed magazines, then put the digital images through the OCR process to extract the article text, this would then allow searching by keyword against the articles’ content.

To see examples of this in action take a look at our PastView digital collections management system.

Using OCR scanning we can capture the full content of digitised items – including books, magazines, newspapers, and diaries.

Our team can also index your digitised files by incorporating metadata created from the OCR within the filenames of the digital images.

Performing our OCR process on typed or printed text can be very accurate. If the text is clear (e.g. the text colour is a strong contrast to background colour), typed in a standard font (e.g. Arial, Times New Roman), and in a standard size (e.g. size 10 upwards); the OCR results are on average 95% accurate. Our OCR software additionally performs pre-process techniques to improve the chances of successful recognition, such as if de-skewing the document if it is not aligned correctly.

Though it should be noted that formatting anomalies, such as tables, can negatively impact the results.

We can produce output files of the OCR data in any format you specify, such as PDF, PDF/A, MS Word, HTML or Rich Text documents.

We can help you import the data into your own system, alternatively we can import the data into our collection management systems, PastView. Take a look at our internet based BookViewing Software, which allows you to link the transcribed data to your digitised images.

If you are interested in publishing your digital collection and OCR data online, find a PastView package here to suit your organisation.

Whilst our OCR service can be very accurate when converting printed type text into digital format, in our experience the accuracy of OCR on handwritten text is very poor. For this reason, in the case of capturing hand written text we recommend using our transcription service.

Calculate the cost of your data capture project

Enter your data capture requirements and tick the box to confirm whether you would like us to get in touch to discuss your requirements further. If you’re not quite ready to use our online quote calculator, feel free to call on 01536 713834 or email us [email protected]


Keep up to date

Sign up to our newsletter and be the first to know about our next funding opportunity, plus the latest cultural heritage news and updates.