Scanning technology selection impacts acceptability and usefulness of image-rich content

Methods: Residency coordinators selected eighteen figures from studies from radiology, clinical pathology, and anatomic pathology journals. With original PDF controls, each figure was prepared in three or four experimental conditions: PDF conversion to TIFF, and scans from print in B&W, grayscale, and color. Twelve independent observers indicated whether they could identify the features and whether the image quality was acceptable. They also ranked all the experimental conditions of each figure in terms of usefulness.

Interlibrary loan (ILL) and document delivery (DD) operations are an essential component of libraries.
Color remains a special request for ILL/DD in spite of how common color content has become. Scientific disciplines are keenly aware of the importance of color images in their work. In 2001, the Journal of Histochemistry and Cytochemistry began offering one full page of color figures per article at no cost to authors because the majority of its content required color images [1]. Scholarly disciplines that need color to convey meaning are not having their needs met by regular ILL/DD processes. Anecdotal reports from faculty and residents at the North Carolina State University (NCSU) College of Veterinary Medicine indicate that black and white (B&W) article scans of color or grayscale originals are not of sufficient quality for their needs. Requests for acceptable replacements add time and cost to the process for both the lending and borrowing libraries.
Literature about the quality of images in ILL/DD-supplied articles is scarce. None of the recently published ILL handbooks addresses the issue of color in discussions of quality [2][3][4]. In a single-paragraph discussion of quality, Forro mentions "obscured images" (p. 28) as a type of problem in the general discussion of whether or not to review quality [5]. She does not mention best practice regarding review versus waiting for users to contact ILL regarding quality problems, but she states that most articles are sent without quality problems. This may be true of ILL articles overall, as Connell and Janke reported rates of 1 patron resend request per 196 Odyssey-delivered articles (0.5%), and 4 of 1,319 (0.3%) of all staff-mediated articles that were received and supposedly reviewed by staff before being sent to patrons [6]. In their 2004 quality survey, Naylor and Wolfe reported that of 233 faculty and graduate students responding to a survey question about satisfaction with overall print quality, only 2 ranked it poor, but there were 17 negative comments about print quality of the portable document format (PDF) documents where "photographs or tables are unclear and the overall print quality is poor" (p. 360) [7].
It remains unclear whether the receiving ILL staff, or perhaps even the user, would recognize a subtle image quality problem having not seen the original image. In 2004, Warner compared the quality of print original journals, custom supplied photocopies from the Canada Institute for Scientific and Technical Information, and the online and printed quality of Ariel transmitted files and found the Ariel copies lacking [8]. Neither color nor image quality is mentioned in the American Library Association (ALA) Interlibrary Loan Code for the United States [9], its explanatory supplement, or the sample ALA interlibrary loan request forms. File size delivery concerns for color images have been raised and continue to be mentioned as an issue in emailing articles directly to clients [10]. Most libraries' forms and processes assume that a readable B&W scan fulfills user needs. User experiences have shown that this is not always the case, but more systematic data were needed to strengthen this argument and advocate for improvements to the fulfillment of requests where the original is likely in color or grayscale.
In September 2009, a sample page of a color article was captured in B&W, grayscale, and color scans to document file size and image quality for the variety of delivery and scanning mechanisms available at the Veterinary Medicine Library. Sharing those data as part of an In the Library with the Lead Pipe invited editorial with a visual comparison of original to B&W scan led to an online discussion [11]. Issues raised included fears of ILL services becoming irrelevant if a library is unable to provide high-quality documents, the extent to which decent grayscale images would work in situations where color might be cost prohibitive, degraded quality from printing and rescanning in response to interpretations of electronic content licenses, and requests for best practices for scanning that would result in consistently useful articles while still falling within constraints of Ariel or Odyssey delivery.
Many libraries have turned to online-only subscriptions for their journals. Licenses generally include ILL provisions, but these may contain restrictions on the format or manner in which the content can be distributed. For licenses that require that the article be printed or made into a static image such as a tagged image file format (TIFF), libraries can convert an original PDF to a TIFF using software programs, and in some cases, color present in the original document would display as black and white. To retain color, some libraries have turned to Adobe Acrobat Pro and other programs to export PDFs as images. The capabilities of ILL/DD systems change regularly, but at the time of this research, libraries using Ariel or Odyssey for transmission of articles between libraries would upload TIFFs to send electronically to the borrowing library. The borrowing library's Odyssey software would recompile the TIFF pages into a PDF for secure web delivery to the end user. This recompilation could also be done using a program like Adobe Acrobat to create a PDF from the group of TIFF pages representing the article.
Radiology and pathology are scientific fields that rely heavily on visual images for research studies and diagnostic purposes. It is critical that published and library-provided copies be of sufficient quality that readers can not only confidently believe the authors' work and interpretation, but also use these images as learning tools to enhance their own research and diagnostic skills. The research question was to what extent radiology and pathology faculty and residents perceive figures converted from PDF to TIFF or scanned in B&W, grayscale, or color, as typically provided by ILL/DD services, as acceptable replacements for original digital figures in conveying the content intended by the author of the article. The authors hypothesized that users would find PDF to TIFF conversions as acceptable as the original article, but that the rates of acceptability of the scans would vary significantly by the scanning conditions. Acceptability is a subjective assessment. Since the important concepts for imaging standards are that images should be useful to the clinician or scientist, not necessarily better or worse than direct examination of a slide [12] or a radiographic study, we decided to measure usefulness as a second characteristic. The clinical and research usefulness of an image depends to a great extent on its quality. At least one previous study used a five-point Likert scale to rate the display quality of different anatomic image structures [13].
The specific aims of this study were to capture rates of usefulness and acceptability for each type of scan and to document the reasons for which images were found to be unacceptable.

Discipline and assessor selection
The three study areas for this project were anatomic pathology, clinical pathology, and radiology. These disciplines were chosen because they are image intensive, their faculty and residents have expressed quality concerns about images in articles received from ILL requests, and they were willing to participate in this research. The faculty residency coordinators participated in the research by contributing to the grant application and institutional review board (IRB) proposal and selecting images for the training set and the study set. They also introduced the study to the population of residents and board-certified specialists who were eligible for recruitment to serve as independent image reviewers. To prevent any sense of coercion, the faculty coordinators were not informed who from their programs participated in the study. The study was approved by the NCSU IRB administrator as exempt human subject research. Each coordinator received $200 from the grant funding that could be used to purchase materials relevant to their residency program.
The recruitment goal was four to six independent observers in each discipline, preferably comprising one board-certified pathologist or radiologist and three to five residents in various years of the three-year training program. We anticipated that time to complete the observations would vary based on residents' years of experience, as over time in radiology and pathology, it is reported to take significantly less time to view an individual slide [14]. Three residents and one board-certified specialist in each of the disciplines completed the study. One additional anatomic pathology resident was enrolled but did not complete the study.

Image selection
The sample size of images for the study was based on the anticipated time that a resident would be willing to devote to the study and the need to include a wide variety of relevant types of images. We estimated residents would be willing to devote up to two hours to the study in addition to the fifteen minutes for training. We decided on eighteen images to be evaluated in up to five conditions, a maximum of ninety images, based on the initial proof-of-concept testing showing a time of less than one minute per image. The same images were then used for the comparison exercise in which the conditions within the eighteen sets of images would be ranked.
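The workload ceiling described above can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and the one-minute figure is treated as an upper bound because the proof-of-concept testing showed less than one minute per image.

```python
# Upper bound on the viewing workload, using the figures stated in the text.
IMAGES = 18                # figures selected for the study
MAX_CONDITIONS = 5         # original, TIFF conversion, B&W, grayscale, color
MINUTES_PER_IMAGE = 1      # upper bound; testing showed less than this
TIME_BUDGET_MINUTES = 120  # two hours, excluding the fifteen-minute training

max_views = IMAGES * MAX_CONDITIONS
worst_case_minutes = max_views * MINUTES_PER_IMAGE

print(f"maximum image views: {max_views}")
print(f"worst-case viewing time: {worst_case_minutes} min")
print(f"fits the time budget: {worst_case_minutes <= TIME_BUDGET_MINUTES}")
```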
We selected discipline-specific, high-quality images representing a wide range of imaging modalities from high-impact journals to increase relevance of the study to broad audiences. The articles selected for the study by the project investigator and the residency program coordinators met the following criteria: (1) be available in the NCSU Libraries collection both in print and online as born-digital content, (2) be of superior quality in the original, (3) be relevant to human and veterinary medicine, and (4) preferably be unfamiliar to the residency and faculty independent reviewers. The quality of the original, as assessed by the residency coordinators, was especially important to eliminate the possibility that the poor quality of an original be perceived as a quality problem with the scan. We used the 2010 Journal Citation Reports (JCR) Science Edition to identify the highest impact factor (IF) titles with original articles for which the NCSU Libraries had print and online issues. In the category "Radiology, Nuclear Medicine & Medical Imaging," the highest two IF titles that met the criteria were the Journal of Nuclear Medicine (IF 7.022), for which print was owned through 2010, and Radiology (IF 6.066), for which print was owned through 2008. To reduce the possibility that image quality had changed from 2008 to 2010, we limited our selections to articles from 2008 in our search of PubMed for articles from those journals indexed with the "Animals" limit. The lead author chose 12 papers with multiple images to send to the radiology residency coordinator to identify 18 images representing a diverse range of diagnostic imaging techniques, including magnetic resonance imaging (MRI), computed tomography (CT), positron emission tomography (PET) scan, and plain film. After the initial selection, we returned to the journals to select additional images from human articles to include images of the thoracic region and long bones, which were not represented in the original articles.
Methodologies specifically requested by the residency coordinator selectors were electron microscopy and histopathology. Images were selected from articles where the image was a critical component of the article, either for documentation of study results or confirmation of an interpretation or diagnosis. Groups of potential articles were sent to the residency coordinator, and images were selected after further discussion. We were unable to identify a urine image of sufficient quality for the study, so the urine image was used in the training set.
The three selecting faculty were offered twelve possible articles containing more than three images each, from which they were to select six articles for the study. Within those six articles, if there were more than three images each, they could also indicate which images they preferred for the study. In several cases, only some of the images in the paper met the study criteria, and therefore, additional articles were identified. In November 2011, the three selecting faculty, one coordinator in each discipline, selected at least twenty-one acceptable-quality images representing the types of images typically found in articles that the residents would normally be reading. Eighteen of these were held for the study, and the remaining were used for the pretest/training set to check the functionality and comprehension of the online assessment system.

Figure processing
The born-digital PDF was downloaded from the publisher website to serve as the control figure. Three or four experimental conditions were then created based on the original nature of the image: PDF to TIFF conversion, B&W scan, grayscale scan, and color scan. Images originating in B&W or grayscale were not scanned in color. This primarily involved electron micrograph images and various types of diagnostic imaging. All images were shown in the context of the full page of the article with caption, as would be received via a scanned copy of the article for normal ILL.
The original download and the TIFF conversion were performed by the first author. The PDF to TIFF to PDF conversions took place in Adobe Acrobat Pro, version 9. The original PDFs were exported as images and saved in TIFF format; then the TIFF files were converted back to a PDF. For the sake of internal validity, all of the scans of the original paper articles were made by a single student worker in the NCSU Libraries' Interlibrary and Document Delivery Services, using the standard parameters, resolutions, and procedures used by the scanning staff. B&W was done on a Minolta PS7000 at 300 dots per inch (dpi), grayscale was done on a Minolta PS5000c at 300 dpi, and color was scanned on the same Minolta PS5000c at 200 dpi. The 200 dpi on color scans was the standard used by the NCSU Interlibrary and Document Delivery Services unit to control the file size for color articles.
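To illustrate why a lower resolution helps control the file size of color scans, the following sketch estimates the uncompressed size of one page at each of the settings above. The page size (US letter, 8.5 by 11 inches) and the bit depths (1 bit for B&W, 8 bits for grayscale, 24 bits for color) are assumptions not stated in the study, and real scanner output is normally compressed, so only the relative sizes are meaningful.

```python
def raw_page_bytes(dpi, bits_per_pixel, width_in=8.5, height_in=11.0):
    """Uncompressed size in bytes of one scanned page.

    Page dimensions and bit depths are illustrative assumptions.
    """
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Scanning conditions from the text, with assumed bit depths.
conditions = {
    "B&W, 300 dpi": (300, 1),
    "grayscale, 300 dpi": (300, 8),
    "color, 200 dpi": (200, 24),
    "color, 300 dpi (not used in the study)": (300, 24),
}

for label, (dpi, depth) in conditions.items():
    size_mb = raw_page_bytes(dpi, depth) / 1_000_000
    print(f"{label}: {size_mb:.1f} MB per page before compression")
```

Under these assumptions, a 300 dpi color page holds 2.25 times the raw data of a 200 dpi color page, which is consistent with using the lower resolution to limit file size.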
A training set was created in Qualtrics for each discipline and included articles with more than 1 figure, where the second figure was not assessed in order to train participants to only evaluate the specified study figure. The number of training set items varied slightly by discipline: anatomic pathology (n=14), radiology or diagnostic imaging (n=13), and clinical pathology (n=10).
The study population for each of the disciplines was 18 images each for original, TIFF conversion, B&W, and grayscale. The only variation was in the number of color images, which was highest for anatomic pathology (n=15), lower for clinical pathology (n=11), and lowest for radiology (n=5). Of the 247 total images, 87 were for anatomic pathology, 83 for clinical pathology, and 77 for radiology.
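The image counts reported here follow directly from the design: 18 figures in each of the 4 conditions common to all disciplines, plus the discipline-specific number of color scans. A minimal sketch of that arithmetic, with illustrative variable names:

```python
FIGURES_PER_DISCIPLINE = 18
BASE_CONDITIONS = 4  # original, TIFF conversion, B&W scan, grayscale scan

# Number of figures that also received a color scan, by discipline.
color_scans = {
    "anatomic pathology": 15,
    "clinical pathology": 11,
    "radiology": 5,
}

images = {
    discipline: FIGURES_PER_DISCIPLINE * BASE_CONDITIONS + n_color
    for discipline, n_color in color_scans.items()
}
total_images = sum(images.values())

for discipline, count in images.items():
    print(f"{discipline}: {count} images")
print(f"total: {total_images} images")
```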

Operationalization of quality and utility
Due to our research question about acceptability rather than subjective quality, we used a three-item response rather than a five-point Likert scale. To have a mechanism to evaluate image usefulness objectively, we added a question about the figure feature identification. The assessment study consisted of two parts designed in Qualtrics. Participants were advised to view and respond to images in the order they were presented and not to go back to compare with previous images in the individual assessment exercise, although it was still possible for participants to look back. Usefulness was operationalized as whether the viewer could identify the feature described by the author in the caption accompanying the image. Quality was asked about separately, as we thought even a poor-quality image might still be useful in terms of meeting the content-carrying function.
In the first individual image assessment, each condition of the image was randomized across the study population and presented on a single screen with the image link and the questions about figure identification and acceptability, with one answer allowed by radio button. Article pages were hosted on the integrated library system (ILS) server and opened in new windows or tabs depending on how the user chose to interact with the link (Figure 1, online only). After all individual ratings were completed, assessors moved on to the second task, which was ranking all four or five conditions of the image in order of quality (Figure 2, online only). The order in which the images were presented in their groups was also random, based on the random numbers assigned to the individual images. Participants were encouraged to not spend a great deal of time trying to discriminate between their top choices because we hypothesized that there would be little difference between the original and conversion condition.

Participants
Between December 2011 and January 2012, the primary investigator recruited, consented, and trained in person thirteen independent assessors from the population of residents and board-certified specialists (instructors and faculty) available across the three disciplines at NCSU. At training, each participant created a unique identifier to use with the online survey system and was emailed the links to the two assessments for completion. Twelve independent assessors completed the assessments: three residents and one board-certified specialist in each of the three disciplines.

RESULTS
The total number of possible identifications for all 247 conditions of the images would have been 988, but several identifications were skipped by respondents in each question, so the number of observations is different for each question. Table 1 shows the 982 figure identification responses. The percentage of positive figure identification by discipline ranged from 44% for anatomic pathology to 56% for clinical pathology and radiology/diagnostic imaging.
In regard to the first hypothesis that conversions are acceptable, Table 2 shows that the overall rates of acceptance are 100% for originals and 99% for conversions. However, participants did recognize a difference, as evidenced by the statistically significant difference (P<0.0001) in proportion of originals rated superior (148/216, 68%) compared with conversions (93/215, 43%). The percentage of images deemed not acceptable by discipline ranged from 37% for clinical pathology, to 39% for radiology/diagnostic imaging, to 46% for anatomic pathology. The variation in rates between anatomic pathology and the other 2 disciplines likely has to do with the greater number of original color images (n=15 versus n=11 or n=5) and the fact that grayscale and B&W would be mostly unacceptable and not identifiable for these images.
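The comparison of "superior" ratings can be illustrated with a pooled two-proportion z-test. The text does not name the test the authors used, so the choice of test here is an assumption, and the sketch treats all ratings as independent even though each assessor rated many images. It is meant only to show that the reported counts are compatible with the stated significance level.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts of "superior" ratings reported in the text.
z, p_value = two_proportion_z_test(148, 216, 93, 215)
print(f"originals rated superior: {148 / 216:.1%}")
print(f"conversions rated superior: {93 / 215:.1%}")
print(f"z = {z:.2f}, below the 0.0001 threshold: {p_value < 0.0001}")
```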
The high rates of not acceptable for grayscale and B&W needed further investigation to assess whether For images rated unacceptable, 402 responses explaining that rating were captured: 120 from radiology, 121 from clinical pathology, and 161 from anatomic pathology. The responses were reviewed by the first author, and codes were created inductively [16] to capture various components of the comments. There frequently were multiple comments within a single response, for a total of 567 comments in all. During the coding process, the draft list of code names and definitions was shared with 3 of the authors to address comments that did not fall neatly into categories. The comments were then coded with the final list of themes, and those themes were grouped into overarching categories of content and quality as represented by Table 3. The language used to describe unacceptable images varied by discipline. For example, pathologists tended to describe the B&W scans of grayscale materials as dark, while radiologists tended to describe them in terms of contrast. Certain comments like "pixelated" were used primarily by radiologists. These categories provide a starting point for librarians in understanding the feedback they may receive from requestors about unacceptable images.
The second study exercise completed by the participants was the ranking exercise shown in Figure 2 (online only). The ranking order selected by the participants in 149 (69%) of the 215 assessments followed the hypothesized order of (1) original, (2) conversion, (3) color (when available), (4) grayscale, and (5) B&W. Assuming equivalence between the original and conversion, order congruence rose to 162 (75%).
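One way to express the congruence check described above is as a comparison of each submitted ranking against the hypothesized order, optionally treating the original and the conversion as interchangeable. The function and condition labels below are hypothetical and are not taken from the study's analysis; the example rankings are invented for illustration.

```python
HYPOTHESIZED_ORDER = ["original", "conversion", "color", "grayscale", "bw"]


def follows_hypothesis(ranking, original_equals_conversion=False):
    """Return True if a best-to-worst ranking matches the hypothesized order.

    Conditions absent from the ranking (for example, no color scan) are
    skipped. With original_equals_conversion=True, swapping the original
    and the conversion still counts as congruent.
    """
    expected = [c for c in HYPOTHESIZED_ORDER if c in ranking]

    def canon(condition):
        if original_equals_conversion and condition in ("original", "conversion"):
            return "original/conversion"
        return condition

    return [canon(c) for c in ranking] == [canon(c) for c in expected]


# Invented example rankings.
strict_match = ["original", "conversion", "grayscale", "bw"]
swapped_top = ["conversion", "original", "color", "grayscale", "bw"]
color_first = ["color", "original", "conversion", "grayscale", "bw"]

print(follows_hypothesis(strict_match))
print(follows_hypothesis(swapped_top))
print(follows_hypothesis(swapped_top, original_equals_conversion=True))
print(follows_hypothesis(color_first, original_equals_conversion=True))

# Congruence rates reported in the text.
print(f"{149 / 215:.0%} strict, {162 / 215:.0%} with equivalence")
```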
No participants made substantive comments about the survey by using the space at the end of both the individual and the ranking assessments.

Table 3
Categorization of reasons for image unacceptability, number and percentage of comments among combined disciplines (n=567)

DISCUSSION
These findings have implications for library collections and services. All partners in the borrowing and lending chain, including requestors, have a role in ensuring that the highest quality information is provided. Like-item provision (digital original or scanning color to color, grayscale to grayscale) at no additional charge would be an ideal default practice. Pages of original print articles containing color or grayscale images should be scanned using those modalities. Note that color and grayscale scanning may negatively impact text readability, and duplicate scanning in B&W for text and color or grayscale for images may be warranted. Although it would be preferable for libraries to provide color or grayscale automatically by policy, if that is not possible, then libraries' request pages and instructions should remind users to ask for color or grayscale if they anticipate that type of content is conveyed in what they are requesting. NCSU Libraries will encourage users to specify their image needs when making requests. These results were presented to residents and interns at the NCSU College of Veterinary Medicine to highlight the importance of asking for color and grayscale when requesting content that is likely to be image intensive.
For purchasing or licensing digital content that cannot be provided in the native PDF, a PDF to TIFF to PDF conversion maintaining color could be used. Librarians should pay special attention to the provisions in ILL and DD components of online journal licenses to ensure that digital copies can be provided to libraries regardless of the systems in place on the receiving end. In special cases where additional image content is supplemental to the article, paying to download visual content from the publisher website may be more straightforward and cost effective than ILL. Brown noted that when it is cost effective to obtain an article through pay per view, the patron benefits by receiving an article at the quality that the publisher intended, but did not otherwise compare the quality of scanned ILL versus pay-per-view downloaded articles [17]. Authors and journal editors should want to ensure that those relying on their investigations have access to a reasonable facsimile of the original.
Due to the small number of board-certified specialists recruited, we did not perform a subgroup analysis to see if there was a significant difference between residents and specialists, such as an experience effect, in these assessment tasks. This experience effect has been recently considered in clinical pathology: a study of microscopic tissue analysis by experts (clinical pathologists) and intermediates (pathology residents) showed equal levels of diagnostic accuracy in spite of different visual and cognitive strategies [18].
Librarians commenting on the blog post in 2009 asked for recommendations about resolutions and related scanning settings. We did not experiment at various resolutions, primarily because we were using our ILL/DD department default scanning practices in the fall of 2011, and radiology literature cited by Parissis et al. suggests that quality in radiographic images did not improve at scanning resolutions higher than 400 dpi [19]. It is possible that scanning the color images at 300 dpi rather than 200 dpi would have made them more usable and acceptable. Butler and Bankole studied measures of time needed to scan articles in ways that preserve the data in grayscale and color figures, with the goal of generating more discussion of library practice standards for scanning journal articles [20].

Limitations
Subjects in this study were limited to one institution and one library service, which may affect generalizability.

Future service and research implications
File size limitations in resource sharing software are often cited as a driver behind the choice of lower resolution scans in ILL/DD services. Resource sharing software should provide options to deliver better compressed versions of files that reduce the file size burdens for file transfer that currently discourage the use of grayscale or color scans. These data might also influence institutional information technology departments to be more flexible in allowing large file size attachments or providing easy-to-use, secure file transfer services. Providing ILL/DD from scanned vendor files with low image quality may also be problematic. A study by Joseph of digitized geology dissertations found that 82% had at least 1 figure with unacceptable quality and cited additional research on image quality in online journals [21].
There are several institutional and association-wide mechanisms that the ILL community could influence to improve the situation for requestors.
Librarians and requestors need improved ways to share image needs and preferences in the request mechanisms used in OCLC and ALA, as well as encourage greater participation in the DOCLINE color option. Resource sharing systems should provide an automated way to match the user's request for color materials with lending libraries' capacities for filling requests in color. Lending libraries should indicate whether they provide color or grayscale scanning or copying services and any associated charges. The interlibrary loan code and handbooks on best practices for ILL/DD could enhance their treatment of user satisfaction as it relates to quality overall. Butler and Bankole made several of these recommendations in 2013 after reviewing the practices that ALA, Rapid, and the Greater Western Library Alliance endorsed [20].
Future research could include investigation into user satisfaction with actual requests; however, in our discussions with users, many remarked that they did not complain about the quality of the ILL/DD articles received and proceeded to get the articles from other sources, so the reported rejections might be lower than the true rate of unsatisfactory requests. One possibility from the radiology literature is the concept of reject analysis. Reject analysis is an accepted standard of practice for quality assurance in conventional radiology [22]. This practice involves a system or process to accommodate identifying, isolating, and archiving repeated examinations, and then studying the frequency and reasons for repeated examinations in order to improve processes. A form of reject analysis could be a very useful process to engage in for ILL/DD requests that are rejected by users or requested more than once.
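As a rough illustration of what a reject analysis for ILL/DD might tabulate, the sketch below counts rejected deliveries by stated reason and computes a reject rate per scanning mode. The record fields, reason labels, and all of the numbers are invented for illustration; they are not data from this study or from any ILL system.

```python
from collections import Counter

# Hypothetical reject log: one record per delivery a requestor rejected.
reject_log = [
    {"request_id": "R-001", "scan_mode": "bw", "reason": "too dark"},
    {"request_id": "R-002", "scan_mode": "bw", "reason": "color needed"},
    {"request_id": "R-003", "scan_mode": "grayscale", "reason": "color needed"},
    {"request_id": "R-004", "scan_mode": "bw", "reason": "too dark"},
]

# Hypothetical number of articles delivered in each scanning mode.
delivered = {"bw": 200, "grayscale": 50, "color": 25}

# Frequency of each rejection reason, most common first.
reasons = Counter(record["reason"] for record in reject_log)

# Reject rate per scanning mode (modes with no rejects get a rate of 0).
rejects_by_mode = Counter(record["scan_mode"] for record in reject_log)
reject_rates = {
    mode: rejects_by_mode.get(mode, 0) / count
    for mode, count in delivered.items()
}

for reason, count in reasons.most_common():
    print(f"{reason}: {count}")
for mode, rate in reject_rates.items():
    print(f"{mode}: {rate:.1%} of deliveries rejected")
```

Reviewing such a table periodically would mirror the radiology practice of studying the frequency of, and reasons for, repeated examinations.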

For anatomical and clinical pathology, in the JCR 2010 category "Pathology," we selected the American Journal of Clinical Pathology (IF 2.506) from the American Society of Clinical Pathologists, for which we held print up to 2010. In that title, we searched 2008-2010 for "animals" and 2008-2010 for "urine [sb]." Additional titles used for clinical and anatomic pathology were the American Journal of Pathology (IF 5.224) and Journal of Pathology (IF 7.274), both held in print up to 2009. We searched in the American Journal of Pathology for studies with either "blood [sh] OR urine [sh]

Table 2
Acceptability observations across all disciplines