NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Journal Article Tag Suite Conference (JATS-Con) Proceedings 2010 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2010.

eXtyles, Typefi, and the NLM Journal Publishing DTD

The Federation of Animal Science Societies recently implemented an XML-based journal publishing workflow with the eXtyles software and Typéfi Publish at the core. With eXtyles, we export XML (validated to the NLM Journal Publishing DTD) from edited Word files, which is used by the Typéfi Publish system to automatically generate a composed journal article in minutes. This new workflow combines the efficiency of batch pagination with the rich design capabilities and ease of use of InDesign. We are able to supply NLM XML to our online journal hosts, enabling rapid online publication (now within hours instead of days). The new workflow has allowed us to eliminate a labor-intensive, time-consuming, and expensive typecoding process. In this presentation, we will review the eXtyles-Typéfi workflow and the Typéfi-InDesign template and XSLT and discuss some of the challenges encountered in implementing this new composition process for STM journals.

Introduction

The Federation of Animal Science Societies (FASS) is an association management organization that provides a range of services (membership management and support, meeting planning, accounting, information technology, and publishing) to its founding member associations and client associations. The publications department at FASS produces 4 monthly journals, 1 bimonthly journal, and 1 quarterly journal, as well as 1 bimonthly magazine, totaling approximately 18,000 published pages in 2009. In addition, the department produces association newsletters, marketing materials, conference program books, conference proceedings and abstract books, and a variety of other print and electronic publications. These additional products account for another 4,000 published pages each year. Staffing in the department includes 2.75 full-time compositors, 1 graphic artist, 5.5 technical editors, and 1 full-time proofreader.

In 2007, our publications department was using Miles33 (OASYS) and LaTeX to produce the journals. These code-driven typesetting systems provided effective batch-pagination engines for journal publishing; however, the workflow surrounding them was time-consuming and cost-intensive.

In 2008, we transitioned to an XML-based workflow using eXtyles software (www.inera.com) and Typéfi Publish (www.Typefi.com). Our technical editors work in the familiar environment of Word, compositors export XML (validated to the NLM Journal Publishing DTD) from Word using eXtyles, and the Typéfi Publish engine uses InDesign Server, a journal-specific template, NLM XML, and graphics files (figures and math equations) to compose and produce an InDesign article and PDF in minutes. Details of the transition and implementation process have been published previously (Adam, 2009).

The objective of this paper is to describe our journal publishing workflow, highlight the benefits of using an XML workflow, and discuss some of the challenges encountered in an end-to-end XML process using the NLM DTD.

eXtyles and Typéfi

Although some might argue that an XML workflow should begin with content creation in XML, that probably isn’t feasible in scholarly publishing today: authors use the tools they have available, usually Microsoft Word, and they are not usually concerned about document structure or tagging. Moreover, most web-based manuscript submission and peer-review systems accept Word or LaTeX files and create PDFs from them to facilitate peer review. To go from Word to XML, we use eXtyles from Inera Inc. eXtyles is a Word plug-in that combines XML creation and export with a variety of macro-based editorial tools. It was specifically designed to facilitate publishing workflows in which content starts life in Word and XML is needed for composition. eXtyles allows editors to work in a familiar software environment without ever having to interact with or have specialized knowledge of XML.

eXtyles

To begin the editorial/composition process, we run the original manuscript through eXtyles (a process called tooling). Tooling runs through several steps to

  • Add metadata
  • Clean up unnecessary formatting
  • Apply custom Word styles to all elements (including tables and figure captions)
  • Run auto-redaction (routine editorial clean-up)
  • Slice, dice, and style elements in references
  • Match references to PubMed and/or CrossRef databases
  • Match references to in-text citations

Compositors run the tooling process; tooling an average manuscript takes about 15 to 20 min. Figures 1 and 2 show a title page and references, respectively, before and after tooling.
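The idea behind matching references to in-text citations can be illustrated with a minimal Python sketch. This is not how eXtyles works internally; the citation pattern, the `first_author` and `year` field names, and the function name are assumptions made only for illustration.

```python
import re

# Assumed citation style: "(Surname, 2009)", "(Surname and Other, 2010)",
# or "(Surname et al., 2010)". Real manuscripts need far more patterns.
CITATION = re.compile(
    r"\(([A-Z][A-Za-z]+)(?: and [A-Z][A-Za-z]+| et al\.)?, (\d{4})\)"
)

def unmatched_citations(text, references):
    """Return (surname, year) pairs cited in text but absent from references."""
    # Index the reference list by first-author surname and year.
    ref_keys = {(ref["first_author"], ref["year"]) for ref in references}
    cited = {(m.group(1), m.group(2)) for m in CITATION.finditer(text)}
    return sorted(cited - ref_keys)
```

Called with a paragraph that cites (Adam, 2009), (Cole and VanRaden, 2010), and (Tufte, 2004) and a reference list that lacks the Tufte entry, the function reports only the Tufte citation as unmatched.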

Fig. 1. Title page of a manuscript before and after tooling using the eXtyles software. Note the custom Word styles applied to all elements of the title page after tooling.

Fig. 2. Reference section of a manuscript before and after tooling using the eXtyles software. Note the addition of citation type tags, styling (tagging) of each element of the reference (e.g., author names, year, article title), and links to PubMed citation [...]

Now the manuscript is ready for copyediting. One of the most valuable aspects of using eXtyles in the editorial office is that it automates many mundane editorial tasks: changing British to American spelling, formatting statistical terms (for example, settling on one capitalization and typeface for "P-value") and units, and changing spelled-out numbers to numerals, according to house style. Moreover, having consistent styles applied before editing allows the editors to focus on content instead of formatting. The bibliographic processing tools add immense value, too, in terms of time and quality.
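Rule-based clean-up of this kind can be sketched in a few lines of Python. The rules below are hypothetical examples, not the eXtyles rule set or our house style, and a production tool would also handle capitalization, context, and exceptions.

```python
import re

# Hypothetical house-style rules, for illustration only.
SPELLING = {"colour": "color", "analyse": "analyze", "fibre": "fiber"}
NUMBER_WORDS = {"ten": "10", "eleven": "11", "twelve": "12"}

def auto_redact(text):
    """Apply simple whole-word substitutions and normalize P-value notation."""
    for british, american in SPELLING.items():
        text = re.sub(rf"\b{re.escape(british)}\b", american, text)
    for word, numeral in NUMBER_WORDS.items():
        text = re.sub(rf"\b{re.escape(word)}\b", numeral, text)
    # Write "p<0.05", "P =0.01", and similar variants as "P < 0.05".
    text = re.sub(r"\b[pP]\s*([<>=])\s*", r"P \1 ", text)
    return text
```

For example, `auto_redact("The colour changed in twelve samples (p<0.05).")` returns `"The color changed in 12 samples (P < 0.05)."`.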

Once the manuscript is edited, it is returned to the compositor. The compositor adds processing instructions in the form of sizing tags (Figure 3) from a custom eXtyles menu. These allow tables and figures of various sizes to be placed in the appropriate Typéfi templates (e.g., 1-column, 1.5-column, 2-column, or side-turned format). The custom Word styles applied during tooling facilitate XML export and tagging. The Word styles are used to tag different elements of the article according to the NLM Journal Publishing DTD: article title; authors; affiliations; abstract; text; equations (as inline or display graphics); table titles, column heads, body, and footnotes; figure captions; and references.

Fig. 3. Figure and table sizing tags (processing instructions) are added to figure captions (top panel) and table titles (bottom panel), respectively. The XSLT maps the processing instruction to the InDesign/Typéfi template variant, ensuring that the [...]
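In our workflow this mapping is done in XSLT; the Python sketch below only illustrates the lookup logic. The processing-instruction target (`size`), its values, and the template names are all hypothetical.

```python
import re

# Hypothetical sizing values and template variant names.
TEMPLATE_VARIANTS = {
    "1col": "Figure-1-Column",
    "1.5col": "Figure-1.5-Column",
    "2col": "Figure-2-Column",
    "turn": "Figure-Side-Turned",
}

# Matches a processing instruction such as <?size 2col?>.
SIZE_PI = re.compile(r"<\?size\s+(\S+?)\s*\?>")

def template_for(xml_fragment, default="Figure-1-Column"):
    """Pick a template variant from the sizing instruction, if one is present."""
    match = SIZE_PI.search(xml_fragment)
    if match is None:
        return default
    return TEMPLATE_VARIANTS.get(match.group(1), default)
```

A fragment without a sizing tag, or with an unrecognized value, falls back to the default variant, so a missing tag does not stop composition.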

eXtyles offers many tagging/export options to users; we currently export XML validated to the NLM Journal Publishing DTD, version 2.3, with the CALS table model. Because export includes a validation step, errors in the XML can be corrected before composition. Most parsing errors arise from elements mistagged in the Word file (i.e., human error) and are easily resolved by our compositors. Figure 4 shows a sample of XML exported from an edited manuscript.

Fig. 4. Sample of a journal manuscript XML export showing publisher and article metadata. Metadata is added to the manuscript file during tooling and editing and updated throughout the composition process.
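To give a sense of how such metadata can be read downstream, here is a minimal Python sketch using the standard library. The sample fragment is invented (journal name, publisher, and DOI are placeholders), it is not a complete valid NLM instance, and `xml.etree.ElementTree` checks only well-formedness, not validity against the DTD.

```python
import xml.etree.ElementTree as ET

# Invented, abbreviated fragment using NLM Journal Publishing element names.
SAMPLE = """<article article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="publisher-id">EXJ</journal-id>
      <journal-title>Example Journal</journal-title>
      <publisher><publisher-name>Example Publisher</publisher-name></publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.0000/example.0001</article-id>
      <title-group><article-title>An example article</article-title></title-group>
    </article-meta>
  </front>
</article>"""

def read_metadata(xml_text):
    """Collect a few publisher and article metadata fields from <front>."""
    front = ET.fromstring(xml_text).find("front")
    return {
        "journal": front.findtext("journal-meta/journal-title"),
        "publisher": front.findtext("journal-meta/publisher/publisher-name"),
        "doi": front.findtext("article-meta/article-id[@pub-id-type='doi']"),
        "title": front.findtext("article-meta/title-group/article-title"),
    }
```

Fields that are absent from the XML come back as `None`, which makes missing metadata easy to flag before composition.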

Typéfi Publish

Typéfi Publish is a design-driven platform that adds batch pagination to an InDesign workflow. Because the design and composition rules are embedded in the template, no external coding or scripting is required for basic layout. However, custom scripting is used to facilitate complex page and column balancing and complex table data alignment. The FASS workflow includes 2 custom JavaScripts running within InDesign. The first of these is a math alignment script that is used to correctly adjust the baseline of inline MathType .eps files; without the script, inline equations would sit above the baseline. The second is a table alignment script that analyzes table cell content and automatically adds leading and trailing space to achieve the correct alignment (e.g., alignment on decimal or ± symbol).
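The table alignment idea can be illustrated with a short Python sketch. The actual script is JavaScript running inside InDesign and works with typeset widths; this simplified version counts characters and pads with spaces, and the function name is hypothetical.

```python
def align_on(values, symbol="."):
    """Pad cell strings with spaces so that `symbol` lines up in every row."""
    split = []
    for value in values:
        # Text before the symbol, then the symbol plus whatever follows it.
        head, sep, tail = value.partition(symbol)
        split.append((head, sep + tail))
    left = max(len(head) for head, _ in split)
    right = max(len(rest) for _, rest in split)
    # Leading space aligns the symbol; trailing space equalizes cell width.
    return [head.rjust(left) + rest.ljust(right) for head, rest in split]
```

The same function aligns on the ± symbol when called as `align_on(cells, "±")`; cells that lack the symbol are treated as if it followed their last character.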

Figure 5 shows the title page template for one of our journals. The Typéfi templates use containers to hold specific elements; for example, the title box contains article title, authors, and affiliations. The title page box is linked, in this case, to the container for the abstract; as the title box grows to contain the title page info, it pushes the abstract box further down the page. The container for the title page footnotes is anchored to the bottom of the page and grows upward as needed, pushing the introduction (the main story container) out of the way. A set of templates for a journal includes the opener page, such as this one, main story template, and templates for figures and tables of different sizes. Two versions of these templates are required for articles so that they can begin on recto or verso pages.

Fig. 5. Title page template showing Typéfi containers for title, abstract, key words, and the main story. Some content in the template is fixed (e.g., copyright line), whereas other content is pulled in from metadata exported from the Word file (see red [...]

Fixed elements of a page (e.g., copyright line on the title page, slug line, running heads) are also added to the template. Some containers hold metadata from the XML file; metadata is added to the Word file during the tooling and editing processes.

The Typéfi template also embeds rules for handling floating elements, or floats. Users can set journal-specific priorities for automatic placement of floats: bottom of page, top or bottom, one float per page, and so on. Floats are placed according to the corresponding call-out in the text and then according to the priority rules for that element.

Content XML

Content XML (CXML) is a DocBook-based (http://www.docbook.org/) schema that underlies the Typéfi engine; it is optimized for composition rather than for information storage or retrieval. Typéfi uses XSLT to transform the incoming NLM XML to CXML. The purpose of CXML is to provide input to the composition process (e.g., for print, HTML, or ePub outputs). For example, CXML implementation of CALS tables allows paragraphs to appear within a table, enabling complex table formatting that is not allowed in the base CALS table model.

Article composition is initiated from a simple browser-based interface. The compositor selects the journal (thus selecting the correct template and XSLT) and enters the manuscript number. The Typéfi Engine ingests the XML file and any graphics (figures and MathType files) associated with the manuscript and composes the article. Composition takes from 2 to 10 min for the average article, although manuscripts with lots of references (>50) or with very large tables can take 30 to 60 min. Figure 6 shows a screen capture of recent activity showing the range of composition times for a sample of jobs.

Fig. 6. Typéfi Publish job monitor, showing actual composition time ("Duration" column) for a range of articles. Note the "Job Option" column specifying "odd" or "even": all articles are initially run (first build) using the "odd" template (article starts [...]

The Typéfi Publish system produces an InDesign file and a PDF. Our compositors open the InDesign file and review the composed article. At this point, changes can be made to the layout; all of the tools available in InDesign are available for use. Typically, very few changes are required at this point, and the automatically generated PDF can be sent to authors as a proof. Less-than-ideal layouts might result from manuscripts that have little text relative to the number of floats; such a situation might require manual intervention for a more pleasing layout. That, however, is not a limitation of Typéfi Publish—similar adjustments were needed with Miles33 and LaTeX workflows but without the ease of drag-and-drop that InDesign allows. Unusual symbols might also require attention. If a symbol is not present in the base font or is presented as a non-Unicode symbol in the Word file, the symbol may not display correctly in the InDesign file. eXtyles and Typéfi are fully Unicode compliant.

Math content

Many of the journals produced at FASS have content with complex math. In particular, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control (UFFC) is very math-intensive. For complex math, we use MathType from Design Science (www.dessci.com); MathType is the full-featured version of the Equation Editor in Word (pre-Word 2007). It provides a straightforward, point-and-click math editor and allows export of mathematical content in graphical (.eps) or XML (MathML) formats, as well as TeX and LaTeX formats. In our current workflow, we export MathType equations as graphics, and those files are handled by Typéfi in the same way as other figures. Figure 7 shows MathType in Word, in XML, and in the composed article.

Fig. 7. Complex equations in Word/MathType (top left), XML (bottom), and in the composed article (top right).

We anticipate that the inclusion of MathML in the HTML5 standard will allow us to move away from math in a graphical format and toward math rendered in the browser directly from the MathML. A recent initiative, called MathJax (http://www.mathjax.org/), from Design Science, the American Mathematical Society, and the Society for Industrial and Applied Mathematics is an exciting development. MathJax is an open-source, JavaScript display engine that uses CSS and web fonts to display math in any modern browser and even on smartphones.

Inline graphics

We can also handle a variety of inline graphics in text, tables, and figure captions, such as chemical structures, icons for multimedia content, or biography photos. For example, we recently published an article containing numerous inline sparklines (Cole and VanRaden, 2010; Figure 8), all automatically placed within the text and rendered in the XML as <inline-graphic>. The term “sparkline” was coined by Edward Tufte (Tufte, 2004) to describe “data-intense, design-simple, word-sized graphics.” The flexibility of the NLM Journal Publishing DTD and the eXtyles-Typéfi workflow allows us to incorporate unusual elements into journal articles without extensive manual workarounds.

Fig. 8. Inline graphics are usually shown inline (top right) but can also be shown as for a display equation (bottom right).

Corrections and roundtripping

The Typéfi system enables corrections to be made in the InDesign file and “roundtripped” to the NLM XML. However, InDesign is not a validating XML editor, so corrections made in this way could disrupt the NLM structure. This is an inherent limitation of an XML-InDesign workflow (Inera Inc., 2008), and we do not currently use the roundtripping function. Instead, major corrections are made in the working Word file, XML is re-exported, and the article is rebuilt in Typéfi. Minor (late stage) corrections are made in 2 places: Word (and exported to final XML) and InDesign; checks and balances ensure the fidelity of this process. All articles are composed at least twice: the first time in draft mode (the proof may contain author queries) and at least once in final mode (which cannot be run if author queries remain). The speed and accuracy of the Typéfi composition process means that recomposing articles from scratch after the correction stage is still an efficient approach.
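The rule that final mode cannot run while author queries remain can be expressed as a simple gate. The sketch below is only an illustration: how author queries are actually marked in our files is not described here, so the `<?AQ ...?>` marker and the function name are assumptions.

```python
import re

# Hypothetical marker for an unresolved author query.
AUTHOR_QUERY = re.compile(r"<\?AQ\b.*?\?>", re.DOTALL)

def can_build(xml_text, mode):
    """Draft builds always run; final builds require that no queries remain."""
    if mode == "draft":
        return True
    if mode == "final":
        return AUTHOR_QUERY.search(xml_text) is None
    raise ValueError(f"unknown build mode: {mode!r}")
```

A check like this, run before composition, prevents a proof with open queries from being released as a final article.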

Online publication

Most of the FASS journals are hosted online by Stanford University Libraries’ HighWire Press (http://highwire.stanford.edu/). HighWire publishes PDF and full-text versions of journal articles. In our old workflow, we sent PDF files to HighWire and a third-party vendor converted the content to full text. Around the time we moved to an XML-based process, HighWire transitioned from a proprietary DTD to the NLM Journal Publishing DTD. This has allowed us to submit PDF and NLM XML to HighWire, facilitating rapid online publication, now within hours instead of days.

Submitting NLM-validated XML to HighWire was not as simple as anticipated. In fact, several tweaks to our tagging and export were needed:

  • Added the <license> element to identify open-access papers (and define the embargo period) on HighWire and in deposits to PubMed:
    • <license license-type="open-access">
  • Added a paragraph style, Related Article, and a metadata field to tag and link errata and corrected papers; letters and replies; and companion papers:
    • <related-article>
  • Added support to map hard and soft returns in table cells to <break/> during XML export
  • Added a footnote-type attribute to all <fn> types [fn-type="current-aff"]
  • Moved the <fn-group> that holds footnotes related to financial disclosure to <back> matter
  • Changed
    <custom-meta-wrap><custom-meta><meta-name>Primary Audience</meta-name><meta-value><bold>Primary Audience:</bold>Nutritionists, Meat Scientists, Poultry Producers</meta-value></custom-meta></custom-meta-wrap>
    to
    <notes><p><bold>Primary Audience:</bold>Nutritionists, Meat Scientists, Poultry Producers</p></notes>

Overall, great collaboration among our colleagues at HighWire Press and Inera allowed us to transition to XML delivery to HighWire; we are very happy with the speed and ease of online publication today.
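The last change in the list above, from custom metadata to a <notes> element, is made during eXtyles export in our workflow. The Python sketch below shows an equivalent transformation on already-exported XML, purely as an illustration; the function name is hypothetical, and where the resulting <notes> element is placed in the article is not addressed.

```python
import copy
import xml.etree.ElementTree as ET

def custom_meta_to_notes(wrap):
    """Build a <notes> element with one <p> per <meta-value> in the wrapper."""
    notes = ET.Element("notes")
    for value in wrap.iter("meta-value"):
        p = ET.SubElement(notes, "p")
        # Carry over leading text and inline markup such as <bold>.
        p.text = value.text
        for child in value:
            p.append(copy.deepcopy(child))
    return notes
```

Each child is deep-copied so that the original tree is left unchanged and the text following inline markup (stored as the element's tail) is carried over.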

Results

FASS has seen many benefits of transitioning to an XML-based editorial and composition workflow. The most important has been a significant reduction in the time it takes to generate an author proof. Figure 9 shows the change in proof turnaround time across all journals, for the math-intensive UFFC journal, and for our largest journal, Journal of Dairy Science® from 2008 (mostly Miles33 workflow) to 2009 (first full year in XML workflow). In that time, the number of pages published increased from 16,148 to 18,259 (total of all journals). We have not increased composition staff time despite the 13% increase in pages published in this period. The most noticeable change apparent to authors has been the reduced time to receive a proof following acceptance.

Fig. 9. Production time in the Miles33/LaTeX (2008) and eXtyles-Typéfi (2010) workflows. Production time (acceptance of manuscript until proof sent to authors) is shown for all journals, for the largest journal (Journal of Dairy Science, JDS), and for [...]

The impact on our editorial and composition staff has been positive. Our copyeditors have minimal exposure to the XML, and the powerful editorial tools within eXtyles have allowed the editors to focus on content instead of routine editorial clean-up and formatting. Our compositors have more direct interaction with the XML: they export the XML, resolve parsing errors (with only occasional assistance from eXtyles Support), build the articles in Typéfi, and troubleshoot any composition problems that arise. The compositors are also responsible for changes to the templates (e.g., updating the date for copyright lines each year, changing a font or other design element). Changes that affect the conversion of NLM XML to Typéfi’s internal content XML are made by Typéfi support staff, although this is rare.

We have reduced the costs associated with using freelance typecoders while retaining control over all aspects of the workflow. Many small publishing operations have made the choice to outsource the composition process (and sometimes the editorial work) to a third party. By keeping the editorial and composition work in house, we have retained complete control over quality, timeliness, and cost-effectiveness of the publishing operation. Our member societies benefit from the reduction in composition costs achieved by elimination of typecoding. Our authors, members, and journal readers benefit from the reduced time to publication of high-quality, accessible research.

Next steps

FASS continues to explore new ways to leverage the editorial and composition tools we have. eXtyles is used in many of our nonjournal (non-XML) projects because it allows us to apply consistent editorial and formatting rules, which is especially useful for projects with multiple authors. When working on a nonjournal layout project, the production specialist or graphic artist creates an InDesign template with paragraph styles that match the style names used in the “eXtyled” Word document.

We use a customized eXtyles module to apply a consistent editorial style to meeting abstracts submitted for the member society meetings. Again, using eXtyles with abstracts permits consistent formatting and a consistent set of editorial rules to be applied without incurring extensive copyediting time. To do this manually for 2000 abstracts (a typical joint society meeting) would be time-consuming and cost-prohibitive. By using eXtyles, we are able to produce better quality programs and abstract books without additional costs to the member societies.

Although Typéfi’s template-based system is ideal for journal composition (because the design elements do not change from article to article or from issue to issue), it can also be used for one-off book designs. Experienced InDesign users can modify an existing template or create a new one in a matter of hours. Creating a new template for each book project would be an efficient approach to automating long-document layout. We plan to integrate the full eXtyles-Typéfi workflow into book projects in the future.

Another benefit of transitioning to an XML workflow is that our journal content is stored in an accessible, searchable, and reusable platform-neutral format. We hope to add a content management system (CMS) to our XML workflow in the next few years. Addition of a CMS will facilitate greater automation and better long-term storage and reuse of our content within and between the journals published by our member societies.

The eXtyles-XML-Typéfi combination is a flexible, scalable, and powerful workflow. The combination should allow the FASS publications department to continue producing high-quality journals effectively and to adapt quickly to new delivery channels (e.g., ePub and applications) and end-user devices (e.g., smartphones, iPads, and eReaders) adopted by the STM publishing community.

References

  1. Adam, L. R. 2009. A brave new workflow for the composition of scientific, technical, and medical journals. http://www.stcsig.org/sc/newsletter/html/2009-4.htm.
  2. Cole, J. B., and P. M. VanRaden. 2010. Visualization of results from genomic evaluations. J. Dairy Sci. 93:2727–2740. [PubMed: 20494182]
  3. Inera Inc. 2008. White paper: NLM DTD XML and InDesign workflows for scholarly publishing. Inera Inc., Newton, MA.
  4. Tufte, E. R. 2004. Sparklines: Theory and practice. http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR.
Copyright 2010 by Louise R. Adam.

The copyright holder grants the U.S. National Library of Medicine permission to archive and post a copy of this paper on the Journal Article Tag Suite Conference proceedings website.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License

Bookshelf ID: NBK47080
