The Making of a Pre-Pub

The Logos Pre-Publication Program allows us to offer electronic books before we produce them. Users place pre-orders that are only processed if and when the proposed project is completed, ensuring that there is enough interest to justify the significant investment required to produce a title.

Logos has offered a variety of titles as Pre-Pubs, ranging from public domain titles to recent single and multi-volume titles, commentaries, and more. In some cases the titles offered were withdrawn when it was clear there wasn't enough interest to support the project. While we'd like to be able to produce every title, we consider even the withdrawn offerings a success: we identified projects that we couldn't afford to do and were able to re-allocate our resources to projects users wanted more.

The titles that are offered as Pre-Pubs come from many places, and each has a different story. This is the story of the one that almost didn't make it.

What About the Big One?

In the early 1990s Logos took over the CDWordLibrary project, a late-1980s CD-ROM-based Bible software package that ran on Microsoft Windows 2.0. Ahead of its time, CDWordLibrary included some heavy-duty reference books, like An Intermediate Greek-English Lexicon (an abridgement of Oxford's huge Greek-English Lexicon by Liddell & Scott) and The Theological Dictionary of the New Testament, Abridged in One Volume (the one-volume "Little Kittel" based on the 10-volume TDNT).

When Logos released these titles in 1995 they became immediate favorites of our users and did a lot to enhance the reputation of Logos Bible Software as a scholarly tool. But with the accolades came the inevitable question: What about the big one?

If you offer the abridged edition, you're sure to stimulate demand for the unabridged.

And Logos had the unabridged editions—on a shelf.

All ten bright-blue volumes of the TDNT sat high on a shelf in the office, prompting innumerable exchanges like:

  • "A customer just called and asked why we don't have the unabridged Kittel. Why don't we just get the rights?"
  • "Rights aren't the problem—there are no electronic files."
  • "Then let's type it—I get lots of requests."
  • "See that blue shelf? That's it. Start typing."

The physical size of the TDNT ended a lot of discussions—prompting some people to look at the seemingly simpler Liddell & Scott:

  • "Well what about Liddell & Scott? It's just one volume," the innocent would say.
  • "Open it," the tired would reply.

The complete Lexicon may be one volume, but it's a huge book. It's tall. It's wide. It's thick.

That's not the problem, though. When you open it you discover that there are more than 2,400 pages, the print is tiny, and the text isn't simple English prose but rather an incredibly detailed mix of etymologies and references in several scripts. It's so big that when Oxford University Press revised it (more than once) in the twentieth century, they didn't typeset it again. Instead, they attached a huge supplement, forcing the user to look every word up in both the supplement and the main text.

The unabridged Kittel was an impossible project. The unabridged Liddell & Scott was inconceivable.

Passion and Prudence Strike a Bargain

Several years ago the optimists at Logos began to believe that we could afford to produce the unabridged TDNT. There were now enough users, and our efficiency at producing electronic texts had improved enough, that the project could work financially.

The more careful among us were afraid to embark on such a huge project for fear that the data-preparation costs could strike the company a fatal blow if the user interest wasn't as strong as the passionate believed:

  • "I could get enough users today to guarantee we wouldn't lose money on this project!" exclaimed the passionate.
  • "Sure, they say they'd buy it, but what if they don't actually place the orders when the project is done?" fretted the prudent.
  • "The users want it so badly that they'd give their credit card numbers today, even if we couldn't deliver for months!" enthused the confident.
  • "Then we could do it," conceded the cautious.

And so the Pre-Pub Program was born.

A Brilliant Success

The Pre-Publication offer of the unabridged TDNT was a tremendous success. Users placed pre-orders for the project and Logos was able to fund the very expensive data-preparation.

The bargain of passion and prudence led to a string of exciting projects. Lexicons, commentaries, and many other books have gone through the pre-pub program, increasing the rate of new releases and making available an even larger library of Bible reference material. Logos now has a means to protect its exposure on large projects and users have a way to speed the release of the titles they need.

Emboldened by the sales success of the TDNT project and the technical success of the complicated Brown-Driver-Briggs Hebrew-English Lexicon project, Logos made arrangements with Oxford University Press, prepared a very rough data-entry cost estimate, and put the Liddell & Scott Lexicon into the Pre-Pub program. Orders started pouring in, along with thanks from patient users who'd been waiting for years to see it offered for use with Logos Bible Software.

Crossing the 90% Bridge

The response to the Liddell & Scott Pre-Pub offer was enthusiastic—but it wasn't enough.

When a title is offered through the Pre-Pub program, a rough estimate is made of the raw data-entry costs—the cost of having the book typed in, before special tagging, indexing, and other processing is done.

This rough estimate is very simply done. A small, typical sample of the book is typed in. The characters in the sample are counted and divided by the number of lines typed to get an average line length, which is then multiplied by the number of lines on a page and the number of pages in the book to get a total character count. This is multiplied by a standard cost-per-thousand-characters and then multiplied again by a complexity factor that accounts for things like the amount of Greek, Hebrew, and other text in non-Roman scripts, the number of tables, the amount of special formatting, and so on.
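The arithmetic behind this estimate can be sketched in a few lines of Python. This is only an illustration of the calculation as described above: the function name, parameter names, and every number in the usage example are hypothetical, not Logos' actual rates or the figures used for any real title.

```python
def estimate_data_entry_cost(sample_chars, sample_lines, lines_per_page,
                             pages, cost_per_thousand_chars, complexity_factor):
    """Rough raw data-entry cost, extrapolated from a small typed sample.

    All parameter names are assumptions made for this sketch.
    """
    # Average characters per line, measured from the typed sample.
    chars_per_line = sample_chars / sample_lines
    # Extrapolate the sample to the whole book.
    total_chars = chars_per_line * lines_per_page * pages
    # Base cost at the standard rate, scaled by the complexity factor
    # (non-Roman scripts, tables, special formatting, and so on).
    base_cost = total_chars / 1000 * cost_per_thousand_chars
    return base_cost * complexity_factor


# Illustrative numbers only: a 20-line sample of 1,200 characters from a
# 500-page book with 50 lines per page, at 2.00 per thousand characters
# and a complexity factor of 1.5.
estimate = estimate_data_entry_cost(
    sample_chars=1200, sample_lines=20, lines_per_page=50,
    pages=500, cost_per_thousand_chars=2.00, complexity_factor=1.5)
print(estimate)
```

Because the complexity factor multiplies the whole estimate, a book dense with Greek, Hebrew, and tables can cost far more than a plain-prose book of the same length.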

The Pre-Pub title typically starts with a red "traffic-light". It stays in this first "Gathering Interest" phase until enough pre-orders are placed to cover the raw data-entry costs. This doesn't cover all of the cost of bringing the title to market, but experience has shown that a new wave of orders follows promotion to the orange "Almost There!" phase, and, later, to the final green "Under Development" phase.

In the word-picture language used at Logos, moving into development with only the raw data-entry costs covered is like crossing a bridge that's 90% complete. The road goes all the way across the river, and you can reach the other side, but there's still work (and pre-orders!) needed before it's 100% done—with the lights, the walkways, the bells and whistles.

Liddell & Scott was turning out to be the other kind of 90% bridge—the one that only covers 90% of the span across the river. It's 90% done, but you can't reach the other side. And as the technical team looked more carefully into the complexity of the project and the amount of post-data-entry processing that would need to be done, the accumulated pre-orders were starting to look like a 60% bridge.

Our Users Shout Jump

In October 2002, Logos announced that Liddell & Scott was unlikely to get enough pre-orders to go into development, and that we would soon need to take it out of the Pre-Pub program. (Since the pre-orders are actual orders, with credit-card numbers, we aren't comfortable keeping them active indefinitely.)

On the Logos forums and by email the users who had pre-ordered the Lexicon pleaded for its production. They recruited friends and colleagues to sign up, and more than one placed a second pre-order themselves. This user-driven push was enough to get the bridge back to a solid 90%—still not a profitable project, but close enough that there was hope.

And so, in response to the user calls and emails, and because we believe it's an important book that should be available to Logos users, we jumped.

Making it So

There's a popular misconception that all that is involved in making an electronic book is typing it in. While this may be true for a simple, plain text delivered in Adobe Acrobat, there's a lot more involved in producing a truly useful electronic title.

There are many formats for "marking-up" electronic text. HTML is the most widely known, using angle-bracket tags to mark the start and end of <b>bold</b> and <i>italic</i> text, for example. (HTML is a descendant of the older and much more powerful SGML, which in its modern simplification became the extremely popular XML.)

Like most electronic publishing companies, Logos has chosen a markup language and built a special vocabulary of markup tags suited to our needs. We've learned over the years, though, that every book is unique, especially reference books. Despite the common interface of page numbers and the use of standard features like tables of contents and indexes, many of the most useful reference books are so useful because they employ special formats, conventions, layout, and vocabulary.

Building a quality electronic reference book involves making a detailed analysis of a book's special features and then translating them into the vocabulary of the digital library system. The goal is to create an electronic book that works like all the others in the ways the user expects but which still exposes all of the unique features that make it so valuable.

The Book Designer

Once the decision was made to move Liddell & Scott into production, the book was handed over to the Electronic Text Development department. The ETD book designer read all of the introductory material to learn how the lexicon was organized and how it was designed to be used. He then spent hours paging through the lexicon, looking for special formatting and unusual characters and scripts, and learning the structure of the text. This phase is called document analysis. The first task is to understand what is normal for the book: its overall structure. The second task is to identify every way and every place in which the book deviates from that usual structure.

In the course of document analysis, the book designer identified a list of special fields that the Logos electronic edition would support in searching and began writing a specification to guide the data-entry team in typing the book. In addition to the specification, a prototype document (essentially a sample of all of the unique structures in Liddell & Scott) was created. Together, the written specification and the prototype gave the data-entry team both a complete set of instructions and a worked example to follow.

A special concern in Liddell & Scott was the huge supplement, which Logos planned to integrate back into the main body of the text. The main body of Liddell & Scott consists of more than 127,000 articles, and the supplement has more than 26,000 articles. A supplement article replaces either part or all of the main-body article with the same headword. Some supplement articles are completely new and have no counterpart in the original Liddell & Scott main-body text. The supplement uses special characters to identify the extent of each enhancement or replacement. The data-entry specification explained how the articles related to each other and gave careful instructions on the integration of the supplement and the main text.
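To make the three cases concrete (partial replacement, whole replacement, and brand-new articles), here is a minimal Python sketch of that kind of merge. It is not the procedure from the Logos specification: the data layout (articles as dictionaries of numbered sections), the "partial"/"whole" mode labels standing in for the supplement's special characters, and the sample entries are all assumptions made for illustration.

```python
def integrate_supplement(main_articles, supplement_articles):
    """Merge supplement articles into the main text, keyed by headword.

    main_articles:       {headword: {section_id: text}}
    supplement_articles: {headword: (mode, {section_id: text})}
    The mode values "partial" and "whole" are hypothetical stand-ins for
    the special characters the printed supplement uses.
    """
    # Copy so the main text itself is left untouched.
    merged = {hw: dict(sections) for hw, sections in main_articles.items()}
    for headword, (mode, sections) in supplement_articles.items():
        if headword not in merged or mode == "whole":
            # A completely new article, or one that replaces the whole entry.
            merged[headword] = dict(sections)
        elif mode == "partial":
            # Replace only the sections the supplement touches.
            merged[headword].update(sections)
        else:
            raise ValueError(f"unknown supplement mode: {mode!r}")
    return merged


# Made-up sample entries, for illustration only.
main = {
    "logos": {"I": "word", "II": "reason"},
    "lithos": {"I": "stone"},
}
supplement = {
    "logos": ("partial", {"II": "reason, account"}),
    "lithos": ("whole", {"I": "stone, rock"}),
    "neos": ("whole", {"I": "new"}),
}
merged = integrate_supplement(main, supplement)
print(merged["logos"])
```

With the supplement folded in this way, a reader looks a word up once instead of checking both the main text and the supplement.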

Figures 1 & 2. Two pages from the data-entry specification for Liddell & Scott.

The Data-Entry

Depending on the quality of the print materials, paper books are turned into electronic ones by either scanning, typing, or a combination of the two. Usually the process is done twice and the two data streams are compared in order to find mistakes. Logos works with companies around the world that specialize in data-entry. Over the years one company has been consistently willing to work with us on complicated projects, doing data-entry of mixed-language books and working with Hebrew, Greek, Arabic, Syriac, and more. It was to this firm's Sri Lanka facility that Liddell & Scott was shipped, along with the 26-page data-entry specification developed specifically for this project. (This project-specific document supplements the 100+ page standard specification used for data-entry for Logos projects.)
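The double-entry check mentioned above (keying the text twice and comparing the two data streams) can be illustrated with Python's standard difflib module. This is a sketch of the general technique only; the function name and the sample lines are hypothetical, and nothing here describes the tooling the data-entry firm actually uses.

```python
import difflib


def find_keying_discrepancies(first_pass, second_pass):
    """Compare two independently keyed versions of the same text.

    Returns a list of (line_number, first_lines, second_lines) tuples,
    one per region where the two passes disagree. line_number is the
    1-based position in the first pass.
    """
    a = first_pass.splitlines()
    b = second_pass.splitlines()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    discrepancies = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # The passes disagree here, so a person checks the printed page.
            discrepancies.append((i1 + 1, a[i1:i2], b[j1:j2]))
    return discrepancies


# Made-up sample: the second typist mis-keyed one line.
first = "alpha\nbeta\ngamma"
second = "alpha\nbeto\ngamma"
for line_number, ours, theirs in find_keying_discrepancies(first, second):
    print(line_number, ours, theirs)
```

The idea is that two typists are unlikely to make the same mistake in the same place, so any disagreement between the passes points to a spot worth checking against the printed page.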

Over the course of five months the data-entry team emailed samples and questions back to the book designer. These queries ranged from minute to global in nature. Special cases were investigated and magnifying glasses were employed to examine broken type and unusual letter forms. The specification was modified as necessary to account for exceptions that were missed in the original document analysis phase. At last the data was returned in an SGML-markup format, ready for processing and indexing by the Logos book designer.


Figures 3 & 4. The marked-up text returned by the data-entry firm.

Processing and Indexing

Even though the text of Liddell & Scott had been returned as specified, the process was not complete. The idea of simply "pushing a button" at this phase to create a Libronix DLS (Logos' engine technology prior to Logos 4) resource would be a vast oversimplification. Much remained to be done in the text. Several enhancements that could be done algorithmically had deliberately been left out of the data-entry specification rather than assigned to the data-entry team. After all, data-entry costs are calculated on a per-character basis: the fewer characters the data-entry team has to enter, the lower the cost. A lower cost in turn lowers the pre-order threshold a Pre-Pub must reach before work can begin, which is good for both Logos and Logos customers.

The book designer resolved various questions concerning indecipherable blobs of ink in the printed text. Some were resolved based on context; others were resolved by examining the citation in available documents or databases. A few of these questions, unfortunately, could not be resolved. These occurrences (fewer than fifty in all) were marked as such in the text.

Enhancements to the files were made using short programs (Perl scripts) custom-written for the task. These scripts identified Bible references and references to the works of Josephus and enabled them as links. Topics were deduced and entered where appropriate. Most importantly, a Logos text developer visually examined each screen of the text in the Libronix Digital Library System to ensure that no gross formatting mistakes had been made. Liddell & Scott's A Greek-English Lexicon was now a fully functional Libronix DLS resource.
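As a rough illustration of the kind of reference tagging described above, the sketch below uses a regular expression to find simple Bible references and wrap them in link markup. It is written in Python rather than Perl and is not the Logos script: the tiny book table, the pattern, and the <ref target="..."> tag are all hypothetical, and the citation conventions actually used in Liddell & Scott are considerably more varied.

```python
import re

# Hypothetical, deliberately tiny table of book abbreviations.
BOOKS = {
    "Gen": "Genesis",
    "Ps": "Psalms",
    "Matt": "Matthew",
    "John": "John",
    "Rom": "Romans",
}

# Longest abbreviations first, so a longer name wins over a shorter prefix.
_book_alternatives = "|".join(
    sorted((re.escape(name) for name in BOOKS), key=len, reverse=True))

# Matches e.g. "John 3:16" or "Rom. 8:28": book, optional period,
# whitespace, chapter, ":" or ".", verse.
REF_PATTERN = re.compile(
    r"\b(" + _book_alternatives + r")\.?\s+(\d+)[:.](\d+)\b")


def link_bible_references(text):
    """Wrap each recognized reference in a (hypothetical) <ref> link tag."""
    def to_link(match):
        book = BOOKS[match.group(1)]
        chapter, verse = match.group(2), match.group(3)
        return f'<ref target="{book}.{chapter}.{verse}">{match.group(0)}</ref>'
    return REF_PATTERN.sub(to_link, text)


print(link_bible_references("cf. John 3:16 and Rom. 8:28."))
```

Running a script like this after data-entry means the typists key only the plain citation text, which keeps the per-character cost down.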

Figures 5 & 6. Screenshots of Liddell & Scott (on the right in each image): BDAG and LSJ aligned on the same article, and LEH and LSJ aligned on the same article.

The Final Product

Once the resource itself was complete, a few steps remained. An actual product needed to be created. This involved making a product-specific setup to install the latest version of the Libronix DLS and unlock Liddell & Scott on the user's computer. After the setup was created, it was tested on various operating systems. Once these tests were complete, the disc, referred to as a "Gold Master" or "GM," was created and sent off for duplication. Within the next two to three weeks, the manufactured CDs arrived at Logos, ready to be distributed to those who had purchased the title.

The entire process, from writing a document specification to the completion of work on the resource, involved people working on three continents—North America, Africa, and Asia—over the course of eight months. The result of this effort was the most useful edition of Liddell & Scott's A Greek-English Lexicon that has been produced to date. The process was long and the work sometimes complex, but the result is surely worth it, and we at Logos trust that your study of God's Word will be enhanced as a result of it.

