
PDF is a great format for turning documents into electronic form, preserving all information and appearance. The drawback is that it remains static: “electronic paper”, with the emphasis on the paper part. Searching within a collection of PDFs, or sorting and cataloging it, quickly becomes unwieldy.
PDFfolio is a collection of tools and methods developed by us for making static PDFs smarter. It “injects” data from a database into a PDF document, and displays and searches that data, quickly retrieving the relevant PDF documents.
PDFfolio can greatly decrease the time and effort required for creating and collating “smart” PDF documents.
PDFfolio has been used in a wide range of applications, several of which are described further below.
PDFfolio is a collection of tools and methods necessary to pass data from a database into a PDF document, to display that data, and to search through that data. It acts as a catalog for information, such as for product catalogs or document collections.
PDFfolio consists of a client component and the development components.
The development components are a database, in which the relevant data is collected and prepared so that it can be introduced into the empty container document, and a series of helper utilities for Acrobat to prepare documents.
The client component is essentially the empty container document. This document provides the user interface to the data, and contains all the logic to manage, access and display the information.
PDFfolio can be used to create navigation systems through document collections. It allows a very fast search that sits between the full-text search provided by Acrobat Catalog and a simple search of the document’s metadata. Because it does not connect directly to the document’s metadata, it can be used with any kind of document, even one that has no formal mechanism for metadata. PDFfolio provides the tools to bring information into a PDF document, to display that information in the document, to find information, and to arrange and present the results.
The searchable data is normally prepared within a database, and can be manually entered, or extracted from the document itself or its metadata. This data is then injected into the PDF document, and can be accessed from any document-based JavaScript. Because an intermediate development database is used, it is possible to add information about the document that would otherwise not be available, such as a synopsis or a specialized categorization.
Search results can be displayed in various ways:
The normal way to display information is in a base form. This means that the space allowed for display has to be sized for the maximum amount of text to be displayed. As a consequence, there will be a lot of empty space if the displayed information is considerably shorter.
PDFfolio has a mechanism to dynamically display the information, so that it appears as if there were no unnecessary space on the page. When displaying the information about a different item, the display gets rearranged accordingly. Physically, we remain on the same page of the PDF document. This feature is called “virtual page”.
Virtual pages are also used for tabular display of information, making it possible to display any number of items without knowing in advance how many pages are needed.
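The idea behind virtual pages can be illustrated with a few lines of plain JavaScript. This is only a sketch under assumptions: the function name `virtualPage` and the shape of its result are invented for illustration and are not part of PDFfolio. It slices an in-memory list into page-size chunks, so that one physical page can show any chunk on request.

```javascript
// Hypothetical sketch: none of these names come from PDFfolio itself.
// Slice an in-memory item list into "virtual pages" of a fixed size.
function virtualPage(items, pageSize, pageIndex) {
  const pageCount = Math.max(1, Math.ceil(items.length / pageSize));
  // Clamp the requested index so navigation buttons cannot run off either end.
  const index = Math.min(Math.max(pageIndex, 0), pageCount - 1);
  const start = index * pageSize;
  return {
    index,
    pageCount,
    rows: items.slice(start, start + pageSize),
  };
}

// Seven items shown three at a time need three virtual pages.
const items = ["A", "B", "C", "D", "E", "F", "G"];
const last = virtualPage(items, 3, 2);
console.log(last.rows, last.pageCount);
```

A “next” button would simply call the function again with `index + 1` and refill the same set of display fields, which is why the number of physical pages never has to change.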
The master information is stored in a database, independent of the display document. Data from the database is exported in a specific way, and “injected” into a “blank” base document which contains the whole display and search engine. This “injection” can even happen when the base document gets opened, providing up-to-date information.
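One plausible way to picture the export step is to have the database emit the records as JavaScript source, which the container document then evaluates. The sketch below assumes exactly that; the function `exportAsScript`, the variable name `catalogData` and the record fields are hypothetical, and the actual export format used by PDFfolio is not described here.

```javascript
// Hypothetical sketch; the export format PDFfolio really uses is not described in the text.
// Turn database rows into a JavaScript source string that defines the data.
function exportAsScript(rows, variableName) {
  // JSON.stringify escapes quotes and line breaks, so free text stays valid code.
  return "var " + variableName + " = " + JSON.stringify(rows, null, 2) + ";";
}

// Invented sample records standing in for a database export.
const rows = [
  { id: "P-001", title: "Sample form", file: "forms/p001.pdf" },
  { id: "P-002", title: "Sample \"guide\"", file: "guides/p002.pdf" },
];

const script = exportAsScript(rows, "catalogData");

// Stand-in for the injection step: evaluate the generated code and read the data back.
const injected = new Function(script + "\nreturn catalogData;")();
console.log(injected.length, injected[1].title);
```

Generating the code from the database, rather than typing it by hand, is what keeps the data in the document consistent with the master records.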
To maintain the database, existing user interfaces can be used; it is also possible to use a PDF-based data entry interface.
Unlike other methods for organizing and searching documents, an internet connection is not required. All it needs to run is Acrobat or Adobe Reader (version 5 or newer). This makes it ideal for use on CD- or DVD-based catalogs. It can of course also run in online environments, such as an intranet.
Using PDFfolio has a number of advantages for the developer and the user.
One of the great advantages of using PDF-based systems to handle information, as opposed to database runtime applications, is that no additional software installation is required. An application developed using PDFfolio only needs the ubiquitous Adobe Acrobat (Reader) to run. Because the container document with data is “just an intelligent PDF document”, it can be used in environments where software installation is restricted.
Not requiring any additional application software allows for fast deployment, even in organizations where extensive software testing and approval procedures must be followed. And because of their architecture, PDF documents provide a secure environment, not putting the system they are running on at risk.
Because the search engine does not use Acrobat Catalog, there are no version and platform incompatibilities. Products using PDFfolio run on any Acrobat version 5 or newer.
This platform independence makes PDFfolio an ideal tool for developing “general public” applications.
An application created using PDFfolio can run strictly autonomously, without any connection to the outside world (such as from a CD-ROM), and all it needs from the host system is Acrobat (Reader). This makes applications created using PDFfolio ideal for sales representatives or to be used in presentations. And they are equally usable in industrial environments, where the documentation for machinery is accessible via that machinery’s control computer.
A connection to the outside world is, of course, needed when data is loaded at run-time. Ordering systems for spare parts can be set up so that the electronic transmission is reduced to a simple SMS message, for example.
Because an application created using PDFfolio is a PDF document, its user interface does not suffer from software development restrictions. The user interface can be designed using a graphic designer’s tools, allowing it to be fully customized. And because the user interface does not need to be programmed in an application language, its development is fast and very cost efficient. As PDF separates the visualization and the active elements, the user interface can be very easily customized, without the need to modify the “engine”.
All interaction occurs via elements of the document. This allows the document to run in full-screen view, making an application created using PDFfolio suitable for kiosk mode.
Because the user interface per se does not need to be coded, but can be “just designed”, and because Acrobat/PDF is among the most cost-efficient rapid application development systems available, an application using PDFfolio is very fast and cost efficient to develop and to maintain. In comparison, developing a comparable catalog for a web browser (using HTML technology) costs 2 to 4 times as much as with PDFfolio, and using Java is about 10 times as expensive. Development times differ by similar factors.
Another great advantage the PDF platform offers is that it does not need a stable connection to a central server, which means that it can equally well be used in a portable environment, making it ideal for sales representatives, or in workshops.
PDFfolio can be used for the development of a wide range of applications. The list shows just a few possible examples; only your imagination sets the limits.
With PDFfolio, finding your way through extensive document collections becomes easy, fast and user-friendly. When setting up the information in the database, it is possible to add arbitrary additional information about the document concerned, which can then be searched.
Although originally intended for use with CD-ROM or DVD, a PDFfolio-based navigation and search system can also be used in an online environment. In this case, the data transfer file can be created either on-the-fly or in a batch process, representing the links to the most current documents. This data transfer file can then be loaded when the navigation system is opened. The user always gets the most current document, without having to worry about its path… PDFfolio knows where the document is. The document collection can be of considerable size: one of the implementations manages access to 18,000 documents with relative ease.
The document collections are not limited to PDF. Because the information about the document is not directly linked to a specific file, but entered and managed in the database, the document can be any file; the link to the document can even be the location of a book in the library.
The same ideas of PDFfolio for displaying information about documents can be applied to any kind of item. PDFfolio can as easily and quickly find chemical substances as it can find tax forms… The product information is stored in the database, which could retrieve that information even from an ERP system, and then transferred to the container document.
PDFfolio also provides a simple shopping cart capability, which allows the user to assemble an order without leaving the Acrobat environment. The ordering information can be submitted electronically or on paper.
PDFfolio also provides tools for the content developer to find item numbers in an existing plain PDF version of a catalog, and to add active elements (such as buttons for displaying more information, or for ordering the item) to that document. This makes it possible to enhance the electronic version of an existing print product.
PDFfolio can be used to build tests and quizzes in an easy way. One application, for example, is a self-assessment test at the end of a learning unit in a textbook, where randomly picked questions are presented to the student, and the answers may be statistically evaluated. It would also be possible to create “real” test systems, where the data gets prepared on-the-fly on a server, and transmitted to the test application (the container file) when it gets opened.
The following examples show the wide range of use of PDFfolio (and its predecessors).
Used by: US Internal Revenue Service, Washington DC, USA
PDFfolio was used to create a fast and easy way to find the right tax forms and documents. It contains a quick overview of all the tax products on that CD-ROM. The detail view shows the complete information about the selected product, consisting among others of the name of the document, its description, a push button to open the document, a reference to related documents (if available), push buttons to open older versions of the document (if available on the CD-ROM), whether the form is fillable, etc.
Used by: Kern AG, Konolfingen, Switzerland.
PDFfolio was used to implement the ordering system (shopping cart) for spare part catalogs. The image-rich catalog contains captions with the item number. When the user moves the mouse over the caption, the item name shows, and clicking on the caption puts the corresponding item in the shopping cart.
The shopping cart is a document of its own, and when opened, displays all previously selected items. It allows for changing the number of items ordered, as well as for deleting items.
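The cart behavior described above (collecting items, changing quantities, deleting items) can be sketched in plain JavaScript. The function `createCart`, its method names and the item numbers are assumptions made for illustration; the actual PDFfolio cart is not documented here.

```javascript
// Hypothetical sketch of cart logic; names and structure are assumptions.
function createCart() {
  const lines = new Map(); // item number -> quantity
  return {
    // Clicking a caption adds one unit, or more if a quantity is given.
    add(itemNumber, quantity = 1) {
      lines.set(itemNumber, (lines.get(itemNumber) || 0) + quantity);
    },
    setQuantity(itemNumber, quantity) {
      // A quantity of zero or less removes the line entirely.
      if (quantity <= 0) lines.delete(itemNumber);
      else lines.set(itemNumber, quantity);
    },
    remove(itemNumber) {
      lines.delete(itemNumber);
    },
    // Return the cart contents in the order the items were first added.
    entries() {
      return Array.from(lines, ([itemNumber, quantity]) => ({ itemNumber, quantity }));
    },
  };
}

const cart = createCart();
cart.add("4711");
cart.add("4711");
cart.add("0815", 3);
cart.setQuantity("0815", 5);
cart.remove("4711");
console.log(cart.entries());
```

Keeping the cart as a simple map from item number to quantity means the same structure can be listed on screen, printed, or serialized for electronic submission.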
Used by: Foerster&Rutow, Attorneys at law, Nürnberg, Germany
PDFfolio was used to implement an e-learning tool that lets readers of legal handbooks check their learning progress. Progress is tested using multiple-choice tests. In practicing mode, random questions are displayed, and the answers are evaluated. In chapter testing mode, questions about a chosen topic are randomly asked, and evaluated as a test. In the timed test mode, questions from various topics are randomly asked, and the user has a limited time for answering, which depends on the number of questions available. The test modes offer a review function which displays the correct answer. Statistics are kept to verify the learning progress.
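Random question selection and answer evaluation of this kind can be sketched as follows. The question format, the function names `pickQuestions` and `evaluate`, and the sample data are all assumptions; the sketch uses a partial Fisher-Yates shuffle so that no question is picked twice.

```javascript
// Hypothetical sketch; question format and function names are assumptions.
// Pick `count` distinct questions at random using a partial Fisher-Yates shuffle.
function pickQuestions(questions, count, random = Math.random) {
  const pool = questions.slice(); // copy, so the master list stays untouched
  const n = Math.min(count, pool.length);
  for (let i = 0; i < n; i++) {
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

// Compare the given answers (keyed by question id) with the correct ones.
function evaluate(picked, answers) {
  const correct = picked.filter((q) => answers[q.id] === q.correct).length;
  return { correct, total: picked.length };
}

// Invented sample questions.
const questions = [
  { id: 1, text: "Question one", correct: "a" },
  { id: 2, text: "Question two", correct: "b" },
  { id: 3, text: "Question three", correct: "c" },
  { id: 4, text: "Question four", correct: "d" },
];
const round = pickQuestions(questions, 2);
console.log(round.map((q) => q.id), evaluate(round, { 1: "a", 2: "a", 3: "c", 4: "a" }));
```

A chapter test would pass in only the questions of the chosen topic, and a timed test could derive its time limit from the length of the returned list.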
The questions are managed in another PDFfolio-based document, making use of the PDF Template feature. Each question with all the relevant information is entered and stored on its own page in a multipage document. The data is exported via the Report functionality, and then imported into the empty container document.
Used by: Bundesamt für Finanzen, Bonn, Germany
PDFfolio was used to implement a form containing a table with an unknown number of entries of VAT exempt transactions with non-German recipients within the EU. The list entries can be added, deleted and modified, and the recipient’s VAT IDs are validated against a set of rules. This form also has an interface to the German tax authorities’ electronic data submission system (Elster).
Used by: KDMZ, Kantonale Drucksachen- und Materialzentrale Zürich, Switzerland
PDFfolio was used to implement the ordering form. Besides the total number of items, the order form also manages an additional item number list which shows the disposition of the furniture items to various offices. In addition to items selectable directly from the catalog, it also allows ordering items from older catalogs.
Used by: Progressive Information Technologies, Hummelstown PA, USA
PDFfolio was used to implement a test preparation and reviewing tool, based on multiple-choice tests for medical handbooks. Randomly picked questions are presented to the users, and their response is evaluated and stored for review. In review mode, the correct answer is displayed, as well as additional information and explanations to the question.
The United States Internal Revenue Service (IRS) has been distributing its tax forms and publications in electronic form for years. The IRS was among the very early adopters of PDF technology, and initially distributed its products via BBS and CD-ROM.
There have been issues with the creation/compilation of the products placed on the CD-ROM, which escalated with the tax year 2003 edition. These included a labor-intensive manual preparation cycle, incorrect (or missing) links to products, difficulty for users in finding the correct products, and early cutoff times. When the production process was reviewed at the end of the production cycle, it was clear that a better solution was needed. The IRS performed an extensive search for solutions to help speed up the preparation process, and the PDFfolio solution was selected.
The data preparation for the 2003 version (and prior years) was done using numerous Word and Excel documents, consisting of up to 70 pages’ worth of redundant information. Keeping these documents up-to-date was tedious and error-prone. The process also required cutoff dates to be set earlier than would normally be desired (mid-November and mid-January), with more than a month needed for pre-mastering. Most of the work on the existing documents had to be done manually. One outcome of this was that products which had previously been retired or obsoleted were included on the CD.
Despite several documents listing the tax products, customers found it rather difficult to find the product they were looking for. There were no detailed descriptions of the products available and no links to associated products. In addition, the file naming convention and document structure were rather complex and sometimes inconsistent.
“I put my neck on the line by switching to this technology, and it’s never let me down.”
- Paul Showalter, Senior Analyst, Internal Revenue Service
From the user’s point of view, a more easily navigable system was required. The ability to perform a fast and efficient search was demanded as well. The user interface needed a redesign. In addition, the design of the overview pages had to be improved.
Of primary importance for the development part of the CD was the need to shorten the production time, allowing later cutoff dates. A change to a single product must not require the complete recreation of the overview documents. The information about the documents on the CD had to be consolidated into one place (a database).
The design issues were addressed by hiring a professional designer. An extensive evaluation demonstrated that PDFfolio was the only application that met the other requirements.
In order to make the transition to a PDFfolio-based application, the database was temporarily suspended, verified and cleansed in the original Excel tables. That data was imported into a FileMaker Pro database whose primary use is to create a data format suitable for the new navigation system.
Concurrently, a new navigation system was developed as a PDF document. This document uses the concept of virtual pages. With virtual pages, information is displayed in page-size “chunks”, and navigation elements (push buttons) are used to display another “chunk” of information. Physically, the PDF document always displays the same page. In order to implement virtual pages, the data has to be embedded into the document, and must reside in the application’s memory. Using the database to create the appropriate program code ensures a correct transfer of information from the database to the document and it guarantees data integrity.
On the second page, a more detailed view of the product information is displayed. The second page also uses the concept of virtual pages, but here the information is additionally laid out dynamically, depending on the contents to be presented. This is accomplished by using the “liquid fields” concept, which changes the size and position of each field, depending on the data that is present for each product.
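The “liquid fields” idea, sizing and positioning each field from its content, might look roughly like the sketch below. The sizing rule used here (a fixed number of characters per line) is a deliberate simplification and an assumption; the function name `layoutFields` and all field names are hypothetical, and the rules PDFfolio actually applies are not given in the text.

```javascript
// Hypothetical sketch; the real sizing rules of "liquid fields" are not given in the text.
// Stack fields from the top, giving each only the height its text needs.
function layoutFields(fields, charsPerLine, lineHeight, gap) {
  let top = 0;
  const placed = [];
  for (const field of fields) {
    if (!field.text) continue; // empty fields take no space
    const lineCount = Math.ceil(field.text.length / charsPerLine);
    const height = lineCount * lineHeight;
    placed.push({ name: field.name, top, height });
    top += height + gap;
  }
  return placed;
}

// Invented sample product with one empty field.
const layout = layoutFields(
  [
    { name: "title", text: "Short title" },
    { name: "related", text: "" },
    { name: "description", text: "A considerably longer description that wraps over several lines." },
  ],
  40, // assumed characters per line
  12, // assumed line height
  4   // assumed gap between fields
);
console.log(layout);
```

Recomputing this layout each time a different product is shown is what makes the page appear to have no unused space.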
All of the data is contained within the application’s memory, which enables the creation of a search subsystem in a rather straightforward manner. A simple string search was implemented for product numbers, product names, and a combination of product name and descriptions. These search functions create a short search list, similar to the complete product list, from where the detailed information can be accessed.
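Because every record is already in memory, such a search can be little more than a filter over an array. The following sketch assumes case-insensitive substring matching and invented field names (`number`, `name`, `description`); the function `searchProducts` and the sample products are hypothetical.

```javascript
// Hypothetical sketch; field names and matching rules are assumptions.
function searchProducts(products, term, scope) {
  const needle = term.trim().toLowerCase();
  if (needle === "") return []; // an empty search term matches nothing
  // Each scope decides which text of a product is searched.
  const haystack = {
    number: (p) => p.number,
    name: (p) => p.name,
    nameAndDescription: (p) => p.name + " " + p.description,
  }[scope];
  if (!haystack) throw new Error("unknown scope: " + scope);
  return products.filter((p) => haystack(p).toLowerCase().includes(needle));
}

// Invented sample data.
const products = [
  { number: "A-100", name: "Pump housing", description: "Cast housing for series A pumps" },
  { number: "A-200", name: "Seal kit", description: "Replacement seals for pump housing A-100" },
  { number: "B-100", name: "Control panel", description: "Operator panel" },
];
const hits = searchProducts(products, "pump", "nameAndDescription");
console.log(hits.map((p) => p.number));
```

The returned array has the same shape as the full product list, so the same display code can show either the complete list or the short search list.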
Using the tools provided with PDFfolio, the production time could be reduced from 6 to 10 weeks to approximately one week. This allowed the cutoff dates to be pushed back by almost one month. Last-second changes required only minor modifications to the product’s database entry, and the recreation of the database output … a matter of minutes.
The navigation system worked extremely well, and the search was (because all the information is in the application’s memory) extremely fast; the search list is displayed almost instantaneously. Feedback from users (and the help desk support staff) was very positive.
The requirements for producing the CD were more than fulfilled, and the 2004 issue of the Tax Product CD-ROM was an unprecedented success. For the first time in many years, the CD was released on schedule. “I put my neck on the line by switching to this technology, and it never let me down”, says Paul Showalter, Producer of the CD-ROM, and he continues: “one of the most amazing things is the size of the navigation system file, which is only a little bit more than 400 Kbytes, and contains all the information for more than 1150 products, and the links to more than 2,500 files.”
“We are extremely proud of the decision to migrate to PDFfolio, and we’ve delivered exciting new projects thanks to this toolkit.”
- Paul Showalter
The producers of the Tax Products CD-ROM try hard to listen to their customers, and there were additional suggestions concerning the user interface that the team tried to address. Users wanted to be able to easily scan the entire list of products, so we developed a scrollable list of the tax products on the CD. This was implemented without any changes in the original data structure, by simply replacing the display function and modifying the search functions. The 2005 issue has 1260 products, and about half of the products changed since the 2004 issue was released.
In addition to the user interface change, the search functionality was fine-tuned, in order to become more user-friendly. Again, they listened to their users and improved the product.
Once again, the shortened production time is considered to be a great advantage of this system, and the chance for errors has been reduced to virtually zero. In the past, you could always count on links not working, and files not being where they should be, but that has been virtually eliminated, in large part due to the redesign of the interface, and the use of the PDFfolio set of tools.