Journal Article Tag Suite Conference (JATS-Con) Proceedings 2016 [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2016.

Write once, use everywhere — making an oXygen Framework accessible on the Web

An oXygen XML editor framework for checking JATS articles against business rules and building journal issue packages for Atypon Literatum is presented. In addition, the re-use of the framework's unaltered Schematron, XProc, and XSLT in the context of a Web-based service is presented. This demonstrates the power of the standards-based XML stack in conjunction with diverse runtime platforms such as oXygen XML editor and le-tex transpect.

Introduction

Hogrefe Publishing is making their journal content available through the Atypon Literatum [1] platform. In addition to the JATS DTD that the articles must conform to, there are additional constraints imposed, some of them as additional DTDs (submission manifest, issue XML), some of them as documented naming and tagging conventions. In addition, there are Hogrefe’s copy editing conventions, a list of journals and their proper abbreviations, a list of permitted article types, etc.

Many of these conventions may be formalized as Schematron rules, both on the article and on the issue packaging level. The packaging itself is an error-prone process that may be automated. This process involves checking the content and the ancillary files, copying image files to their package locations, checking the presence, correct names, and file types of the referenced images and article PDFs, creating an issue table of contents, and finally zipping the whole issue.
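The presence checks and the final zipping step can be illustrated with a short sketch. The following Python code is not the framework's XProc pipeline; it only shows, under assumed file names, how referenced files might be verified before an issue directory is zipped. The function names missing_files and zip_issue and all paths are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List


def missing_files(issue_dir: Path, referenced: List[str]) -> List[str]:
    """Return the referenced relative paths that are not files below issue_dir."""
    return [name for name in referenced if not (issue_dir / name).is_file()]


def zip_issue(issue_dir: Path, target: Path) -> List[str]:
    """Zip every file below issue_dir into target; return the member names."""
    # Collect the members first so that the archive itself is never included.
    members = sorted(p for p in issue_dir.rglob("*") if p.is_file())
    names = [p.relative_to(issue_dir).as_posix() for p in members]
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, name in zip(members, names):
            archive.write(path, name)
    return names
```

A caller would typically refuse to build the package while missing_files returns a non-empty list, and call zip_issue only afterwards.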

The author forked and extended Wendell Piez’s JATS framework [2] with these Schematron rules and XProc pipelines and published the derived framework on GitHub [3]. The framework has been successfully employed by typesetters who use oXygen. There are, however, typesetters who use a different XML editor and don’t want to switch. They also had difficulties making the article Schematron run at all, because it relies on XPath 2 and XSLT 2 functions, access to external files, and other features that are well supported by oXygen and the default ISO Schematron implementation, but not necessarily by other implementations. In addition, the messages need to be rendered in a digestible form, so some XSLT rendering of the resulting SVRL is necessary. This is cumbersome to set up for arbitrary heterogeneous editing/production environments.

The author suggested making the framework’s Schematron and XProc available for use as Web services [4], utilizing the open source transpect framework with its HTML reports [5].

Framework Customization

Schematron Rules

There are currently 142 report and assert statements in the article Schematron file. The effects of using this framework in oXygen may be seen in Figures 1 and 2.

Fig. 1

Schematron validation in oXygen text mode

Fig. 2

Schematron validation in oXygen author mode

It can be seen from the screenshots that there are different types of checks: DTD validation (the iss element) and Schematron validation for enforcing naming, metadata, and styling conventions.

XProc Package Building, Consolidated Report

As stated above, the package building is implemented as an XProc pipeline that may be invoked from oXygen as a transformation scenario (Figure 3). It processes a manifest file and then collects the issue’s content from a folder that corresponds to the ext-id element. It checks the directory structure and generates an issue ToC.

Fig. 3

Invoking the package building transformation scenario

It will also give a summary of all checks for the articles, differentiated by severity (fatal-error, error, warning, info), in an HTML page (Figure 4). If an article is not DTD-valid, it won’t be included but flagged as an error. The user then needs to go back to oXygen and react to the interactive validation messages.

Fig. 4

The summary report created by the oXygen transformation scenario

In the absence of fatal errors, the HTML page will contain a link to the package in the file system.
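The severity summary and the rule that a package link appears only in the absence of fatal errors can be sketched as follows. This Python code is an illustration, not the framework's XSLT rendering of SVRL. It assumes that each svrl:failed-assert and svrl:successful-report carries its severity in a role attribute and that messages without a role count as errors; both are assumptions, as are the function names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

SVRL_NS = "http://purl.oclc.org/dsdl/svrl"
MESSAGE_TAGS = (
    "{%s}failed-assert" % SVRL_NS,
    "{%s}successful-report" % SVRL_NS,
)


def count_by_severity(svrl_text: str, default: str = "error") -> Counter:
    """Count SVRL messages per severity, read from the assumed role attribute."""
    root = ET.fromstring(svrl_text)
    counts = Counter()
    for element in root.iter():
        if element.tag in MESSAGE_TAGS:
            counts[element.get("role", default)] += 1
    return counts


def package_may_be_linked(counts: Counter) -> bool:
    """Mirror the rule from the text: link the package only without fatal errors."""
    return counts["fatal-error"] == 0
```

The counts could then be written into the HTML summary, one row per severity, with the package link emitted only when package_may_be_linked returns True.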

Putting the Framework on the Web

Although it might be desirable to have a fully validating (including custom Schematron) JATS editor on the Web, the premise on which this Web interface was built is as follows:

Typesetters continue using their preferred desktop XML editor. They zip their files and upload them to a service that checks them. In principle, a summary report like the one in Figure 4 will be sufficient if the DTD checks are performed by their editors (which all editors that they use support). However, they’d need to be able to copy the error’s XPath from the report and navigate to the corresponding location in their document. This is a feature that not every XML editor supports.

So we wanted to use transpect’s HTML reports, which display the messages in an HTML rendering of the input, at the error locations. transpect HTML reports may consolidate validation messages of multiple intermediate steps, or, in our case, schema and Schematron validations.

In order to use schema validation, the DTD had to be converted to Relax NG (which is straightforward) because transpect is only able to render Relax NG and Schematron validation output back into the original document or a rendering thereof (cf. [5]).

GitHub Project

The transpect project resides in a git repository that mostly consists of submodule specifications. Most of the functionality is contained in xpl/process-manifest-transpect.xpl, which is a front end to the original framework pipeline, build-issue/process-manifest.xpl. This original pipeline had to be tweaked a bit for transpect use: It now has options to also read DTD-invalid files, to automatically patch location information into the source files, and to include this information in the Schematron messages, so that the messages may be attached to the error locations. The oXygen framework now uses a different front-end pipeline to this common pipeline. The necessary changes to the pipeline have been quite moderate, and there is no duplicated code in the two git repositories.
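The idea of patching location information into the source can be illustrated with a small sketch. The following Python code is not the pipeline's implementation; it merely writes a positional XPath into a srcpath attribute on every element, so that a later message could be matched to its location. The path format and the function name add_srcpaths are assumptions, and the values that transpect actually generates may look different.

```python
import xml.etree.ElementTree as ET


def add_srcpaths(root: ET.Element, attr: str = "srcpath") -> None:
    """Set a positional XPath (assumed format) as an attribute on every element."""

    def visit(element: ET.Element, path: str) -> None:
        element.set(attr, path)
        seen = {}  # number of preceding siblings per tag name
        for child in element:
            if not isinstance(child.tag, str):
                continue  # skip comments and processing instructions
            seen[child.tag] = seen.get(child.tag, 0) + 1
            visit(child, "%s/%s[%d]" % (path, child.tag, seen[child.tag]))

    visit(root, "/%s[1]" % root.tag)


# Hypothetical, heavily reduced article used only for demonstration.
article = ET.fromstring("<article><body><p>a</p><p>b</p></body></article>")
add_srcpaths(article)
```

Namespaced elements would appear in ElementTree's {namespace}local notation in these paths, so a real implementation would need to map them to prefixed names.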

HTML Reports

Maintaining the location information (the @srcpath attributes) when rendering JATS to HTML turned out to be a problem with the framework’s bundled jats-html.xsl XSLT. The XSLT is simply not designed with extensibility in mind. A major rework of these third-party stylesheets would be necessary to be able to pass @srcpath attributes from the patched JATS to the HTML rendering.

Therefore, we used transpect’s jats2html renderer, whose default output looks less fancy but which is more customizable.

The consolidated report may be seen in Figure 5. Please note that this is a rendering of the whole package, including directory structure and ancillary files, not just of the articles.

Fig. 5

The full HTML report, with Relax NG errors and Schematron errors/warnings attached to their locations in a rendering of the package contents

Upload Interface

The transpect front-end pipeline may be invoked on the command line, using the XML Calabash runtime that is bundled with the transpect project as a submodule.

There is a simple Web GUI and WebDAV upload interface that is in use with many transpect projects (and that will be open-sourced soon, too). This Rails application is already in use at Hogrefe for other conversion pipelines (IDML→BITS→EPUB, BITS→docx), so it was quite easy to add this conversion pipeline to that server.

The upload interface may be seen in Figure 6.

Fig. 6

transpect upload interface

Conclusion/Outlook

It has been demonstrated that an oXygen framework that uses standard technologies such as schema/Schematron validation, XSLT and XProc may be ported to transpect to yield something with similar functionality (apart from interactive editing) on the Web, with moderate effort.

Currently, the oXygen framework is being enhanced with Schematron Quick Fixes (SQF, [6]) for more ease of use, i.e., correction suggestions. It would be desirable to also port SQF to transpect’s HTML reports. The form elements (text entry fields, acceptance buttons, drop-down lists) may easily be rendered in the HTML report, and the users’ choices may be posted to another pipeline that patches the changes into the source files. At least for a single-step XML conversion as seen in the current application, applying the changes to the input should not be too complicated.

In summary, typesetters who cannot run Schematron checks, access XPath locations, or run XProc pipelines may use transpect’s Web interface to check and package their journal production work for a friction-reduced upload to Atypon Literatum.

Copyright 2016 by le-tex publishing services GmbH.

The copyright holder grants the U.S. National Library of Medicine permission to archive and post a copy of this paper on the Journal Article Tag Suite Conference proceedings website.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Bookshelf ID: NBK350145