For publishers and other professional document creators,
technology offers a wealth of opportunities for the re-use,
re-generation and re-purposing of legacy material.
Legacy text material comes in many forms, principally out-of-print
or previously published books, as well as manuscripts from
typewriters and word processors, and magazine articles, academic
journals and newspaper cuttings. However, regardless of its
form, legacy material always shares three characteristics:
· it exists as printed material…
· it is unavailable in a digital form…
· its content has some perceived value.
If the content of the legacy material is principally good
quality text, and the material is to be re-published, optical
character recognition (OCR), also known as text scanning,
is a tried and trusted technological alternative to re-typing
for bulk text entry. However, if the existing form of the
legacy material is also important, and it is unfeasible or
unnecessary to re-publish the material, capturing the documents
with Adobe Acrobat may prove to be beneficial.
Adobe Acrobat
Adobe's Acrobat family of software products lets users create,
display and print Portable Document Format (PDF) files, independent
of their originating software or computing platform, and regardless
of whether files originate in electronic form or as hard copy.
Legacy documents can be converted into a single, standardised
electronic format suitable for distribution over networks,
on low-cost CD-ROM disks, or via the Internet.
Acrobat Reader
Acrobat Reader, the software application which gives display
and print access to PDF files, is available free from Adobe
Systems Inc. (www.adobe.com or www.adobe.co.uk) as well as
bulletin boards and numerous freeware and shareware websites.
It is also bundled free with many new PCs and software applications.
Acrobat Capture
Acrobat Capture is Adobe's professional-level PDF production
tool. It merges two important technologies - imaging and optical
character recognition - with a host of exciting innovations,
including multiple OCR language dictionaries and workflow
management of jobs and projects.
Absolutely Scantastic Ltd uses Acrobat Capture for all of
its PDF production work.
There are actually four types of captured Acrobat file, all
compatible with Acrobat Reader, but with different strengths:
· PDF image only: Acrobat files containing images of
entire scanned pages.
· PDF searchable image (exact): Acrobat files containing
images of entire scanned pages, with an OCR (text recognition)
process used to create files searchable for key words. The
searchable nature of this type of file is extremely beneficial.
Absolutely Scantastic Ltd recommends this file type for the
majority of our Acrobat archiving projects.
· PDF searchable image (compact): Similar to the PDF
searchable image (exact) files detailed above, but used with
scans of full-colour pages.
· PDF formatted text and graphics: Compact and searchable
files based on Acrobat's interpretation of the structure and
content of a scanned page. Graphics are reproduced and bitmapped
text is replaced with formatted text based on OCR. File sizes
are the smallest of any Acrobat Capture PDF option, but, because
the resulting page is an interpretation of the original scanned
page, it need not necessarily be identical to the original
page. Absolutely Scantastic Ltd does not recommend this file
type.
The benefits of the Portable Document Format
The benefits of Acrobat PDF files - cross-platform availability,
ease of distribution, small file size, searchability, intra-
and inter-document cross-referencing and linking capabilities
- are winning this file format many adherents. Industrial
and commercial companies, government and not-for-profit bodies,
regulatory authorities, library and archive services - all
are finding innovative and profitable means to use this technology.
Re-generation and re-issue of legacy material
Documents which remain valuable and for which there is an
on-going demand, such as library collections of journals or
news clippings, can be scanned into a PDF searchable image
archive, made available through a web server or FTP site,
or distributed on low-cost CD-ROMs.
Archiving
Documents for which there is a low or negligible on-going
demand, but for which there is a retention requirement for
legal purposes, can be scanned and archived on low-cost CD-ROM
or magneto-optical storage media. These documents preserve
their look and remain accessible on any modern computing platform.
Distribution
Documents from diverse sources, such as product data sheets,
white papers, specifications, catalogues and price lists,
can be captured into an Acrobat library and distributed on
low-cost CD-ROMs or via the Internet. This saves the cost
of transporting large volumes of paper, ensures that distributed
documents are current, and allows updates to be made easily
and swiftly.
Absolutely Scantastic Ltd's PDF conversion services
In addition to converting legacy documents to PDF documents,
we can also work with existing libraries from imaging and
document storage systems. This allows for conversion of TIFF,
GIF and JPG archives to PDF files.