NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Institute of Medicine (US) Roundtable on Translating Genomic-Based Research for Health. Establishing Precompetitive Collaborations to Stimulate Genomics-Driven Product Development: Workshop Summary. Washington (DC): National Academies Press (US); 2011.




5 The Use of Biospecimens in Precompetitive Collaborations

Key Points Raised by Speakers

  • Viewing biobanks as national or international resources requires strict adherence to high standards of quality, confidentiality, and fairness.
  • Ensuring that consent provisions are observed requires centralized control of biobanks and associated databases.
  • High-quality data can be derived only from high-quality biospecimens.
  • Strong incentives are essential for companies to invest in biobanks that are used for collaborative efforts.

Biospecimens stored by investigators in industry and academia, and the data derived from those biospecimens, represent a considerable resource of genetic and genomic information that can be used to develop individualized treatment regimens or even drugs and diagnostic devices. Since the value of these samples lies in their inherent potential for use in discovery, biospecimens are prime candidates for collaboration and coordination. However, because the majority of collections are fragmented and isolated, achieving this promise requires that their quality be high, that related data be annotated, and that their accessibility be assured. With this in mind, the workshop participants were asked to address the following points in regard to potentially sharing biospecimens in a precompetitive collaboration:

  • What are the unique issues in sharing biospecimens?
  • What have speakers learned from their initiatives that could be used to find the best practices for biospecimen and data sharing?
  • What incentives should or need to be in place to encourage sharing of specimens and data?
  • What key structures or rules are required to establish a framework for sharing biospecimens and data?


The quality of the data derived from biospecimens can never be higher than the quality of the analytes from which the data are derived, said Carolyn Compton, director of the Office of Biorepositories and Biospecimen Research at the National Cancer Institute (NCI). However, the quality of human biospecimens is often either unknown or low, which can harm research that uses these samples. The question she posed was not “can I get access to existing samples?” but “do I want them?” “The roadblock to success and to greater efficiency in the translational research realm is related to the lack of high-quality human samples,” said Compton.

The NCI and other government agencies are making major investments of public dollars in research projects that depend on high-quality human samples. The NCI alone invests $50 million to $70 million annually in biobanking efforts of various kinds, but many researchers do not understand what needs to be done to achieve high quality. An evaluation of the biobanking system found different collection, processing, and storage procedures; differing degrees and types of annotation; variations in the scope of patient consent; differing material transfer agreements; inconsistent information technology support; and inadequate access policies. Most researchers also have few incentives to share their samples, further contributing to wide variation in the quality and accessibility of samples for research.

In acquiring biospecimens for the Cancer Genome Atlas Project, which is seeking to identify genomic changes occurring in cancer and make those data publicly available, NCI identified a large number of problems with existing samples. The quality of existing samples in biobanks is typically overestimated, said Compton. The collection of normal control samples is not routine, and clinical data on specimen donors are not readily available. Even if a specimen looks pristine under a microscope, according to Compton, its molecular quality may be low. The NCI was seeking to collect 1,500 high-quality samples for the pilot of the Cancer Genome Atlas Project, yet in 3 years it was unable to do so, said Compton.

Biospecimens are subjected to “industrial-strength biologic stresses” once they are collected, said Compton. Even the extent and type of molecular changes induced in acquiring a sample for research are largely unknown. Variables before acquisition include exposure to antibiotics or other drugs, the type and duration of anesthesia used during surgery, and the arterial clamp time. Variables after acquisition include the time at room temperature, the temperature of the room, the type of fixative used, the time spent in the fixative, the rate of freezing, and the size of aliquots. Any of these can have dramatic effects on a biospecimen (Figure 5-1).

Principal components analysis chart and microarray expression profile of the changes in gene expression induced by increasing the time between artery ligation and tumor resection in colon cancer specimens.


Figure 5-1. Changes in gene expression over time after intrasurgical ischemia. SOURCE: Compton, NCI Indivumed Study, 2010.

As an example, Compton cited a study of colon cancers showing that the time of intrasurgical ischemia while a tumor is anoxic dramatically affects gene expression in removed tissues (Spruessel et al., 2004). Even standard biomarkers for colon cancer alter their expression depending on the amount of time a tumor spends sitting at room temperature after removal. “If you let this specimen sit around for a half an hour before you fix or freeze it and then do your tests, you will be led to think that artifactual upregulation of this protein or gene actually represents the disease,” related Compton. This will require surgeons to change some of their practices. “For surgeons, it’s a cultural issue of their not feeling a sense of professional responsibility toward the sample. In other words, when they remove the tumor, all of their focus goes back on the patient on the table, and the tumor itself, the resection specimen, falls into no-man’s land where custodianship is concerned. It has not yet become the pathologist’s professional responsibility, and the surgeon does not regard it as his or her professional responsibility.” Either new ways need to be found to control the influences on a biospecimen or it should not be assayed, Compton said, “because it will give you the wrong answer.”

Liquid specimens such as blood plasma are subject to even more variables. According to Compton, “with the stakes going up with proteomics, genomics, metabolomics, and other ‘omics’ experiments, you stand a great risk of misinterpreting artifacts as a biomarker unless you know what happened to your specimen before it went into your analysis.”

To counter these problems, the NCI has been taking a stepwise approach. It began by developing best practices for biospecimen resources (NCI, 2007) that provide a baseline on which to build as the science evolves. These best practices represent a set of unifying policies and procedures for biospecimen resources across the United States, although NCI does not have the authority to enforce these guidelines, which means that they must be adopted voluntarily. However, Compton did suggest that a new focus on the quality of samples will create incentives for scientists and publishers to question the quality of the samples on which research or a publication was based, and “this will change the culture.” Also, the existence of a guidebook and an expanding science base on the factors that affect samples will lead researchers to want to do the right thing. Finally, NIH could decide that best practices should be more than voluntary and make adherence to standards a condition of receiving a research award. “We require our scientists to report on the care and feeding of the animals but not the care and feeding of the human specimens that they use in our funded research,” said Compton.

The guidelines include recommendations for technical, operational, and safety practices; quality assurance, control, and management; implementation of enabling informatics systems; ethical, legal, and policy issues; reporting mechanisms; and administrative and management structures. “A biobank is no longer regarded as the minus 80 degree freezer in your hallway with an Excel spreadsheet Scotch-taped to the door,” said Compton.

Recognizing that the science of biobanking is weak, the NCI also has set out to strengthen the evidence base for biobanking. It is building a national disease biobank called the Cancer Human Biobank (caHUB) in which all the variables currently known to be important are being investigated. Launched with funds from the American Recovery and Reinvestment Act of 2009, it is initially a government-owned and government-operated enterprise, but the vision is that ultimately it will be a public–private partnership. “It will be a unique, centralized, not-for-profit public resource that will be a source of adequate and continuous supply of human biospecimens and associated data of measurable high quality within an ethical framework, and a source of high-quality biobanking services for the community,” said Compton.

All the data associated with the biobank will be in the public domain, according to Compton. Tumor samples and data will be collected from hospitals approved by the Commission on Cancer, which will ensure that patients receive a certain standard of care and that standardized data elements are collected on every cancer patient. The biobank will be centrally managed and quality controlled but will provide access to all members of the community.

Compton expressed the firm belief that the biobank should be in the precompetitive space. “No one can do this alone, and everyone will benefit from it.” In particular, drug development relies in crucial ways on the quality of biospecimens, and the decision to move forward based on results from biospecimens of unknown quality is one reason why drug development is so expensive and so often fails. “This is going to require a seismic shift in the way we think,” said Compton. “If we think it’s too expensive, or too labor-intensive, or too time-consuming to do it right [the first time], I don’t know when we’ll have the time or money to do it over and so I would suggest that this is a just-in-time step to invest in the right stuff that is going to move the agenda forward, and it’s a perfect space for a public–private partnership.”


There are many potential obstacles to the access and utilization of biospecimens, said Cynthia Helphingstine, president and CEO of the Fairbanks Institute for Healthy Communities. Biospecimens may not be available, or their quality may be highly variable. Phenotypic data may be incomplete, or access to the biospecimens or associated data may be restricted. Longitudinal outcomes data associated with the biospecimens may not be available. Biospecimens may not have adequate consent, or biospecimens from appropriate controls may not be available.

Overcoming these barriers was the motivation behind the creation of the nonprofit Fairbanks Institute for Healthy Communities in Indianapolis in 2006, noted Helphingstine, with seed funding of $10.5 million from the Richard M. Fairbanks and Guidant Foundations in collaboration with BioCrossroads, which is the State of Indiana’s life sciences initiative; the Indiana University School of Medicine; the Regenstrief Institute, Inc.; and other community partners from central Indiana.

The vision of the institute, according to Helphingstine, is to conduct a longitudinal study of Indiana’s population in which biological specimens are linked with clinical outcomes data from the Indiana Network for Patient Care to create a novel and powerful research platform that will facilitate basic and translational research breakthroughs and lead to improved patient outcomes. Also, by engaging the community as a partner in the creation of the institute’s research platform, resultant medical breakthroughs would have the potential to create civic pride among the citizens of Indiana and facilitate community participation in all Indiana research studies, said Helphingstine.

As is the case elsewhere, Indiana has high rates of smoking, obesity, cancer, and heart disease. “Finding health-challenged populations to study would be, unfortunately, relatively easy in Indiana,” said Helphingstine. The state also has the Indiana Network for Patient Care, which is a regional health information exchange that serves as a repository of clinical data for health systems in central Indiana and represents about 1.6 million patients. The network was founded by the Regenstrief Institute more than a decade ago with the idea of providing clinicians with the information they need whenever they need it. The network also has access to health information going back about four decades to some of the nation’s first electronic medical records. “You can go back and pull out someone’s blood glucose or glycated hemoglobin from 1978, and it’s there. Not on everybody, but it does go back a long way . . . and importantly, these health systems are continuing to generate information.”

This ability to link data and perform retrospective and prospective analyses on large numbers of patients contributed to the vision of creating a resource for the genomics, proteomics, and metabolic studies that are needed today. With the ability to query more than 8,000 clinical, laboratory, and outcome variables, “you can design studies that never were possible to think about before.” As an example, Helphingstine mentioned a study enabled by the institute on coronary artery disease, which is examining 1,500 individuals—750 individuals with documented history of the disease and 750 in the control group, with the control population being annotated to ensure they are not on statins, do not have diabetes or hypertension, and so on. Clinical information on study participants can be retrospectively and prospectively updated from the electronic medical record at any time to gather additional phenotypic information on study participants and to select biospecimens for discovery and validation studies.
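The control-annotation step described above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the institute's actual system: the `Participant` record, its field names, and the function `split_cases_and_controls` are hypothetical, and only the exclusion criteria named in the text (statins, diabetes, hypertension) are modeled; the "and so on" in the original is left unfilled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical, simplified view of one participant's linked clinical record."""
    participant_id: str
    has_cad: bool            # documented history of coronary artery disease
    on_statins: bool
    has_diabetes: bool
    has_hypertension: bool


def split_cases_and_controls(participants):
    """Return (cases, controls).

    Cases have a documented history of coronary artery disease. Controls have
    no such history and meet none of the exclusion criteria named in the text.
    Participants who fit neither group are left out of the study.
    """
    cases, controls = [], []
    for p in participants:
        if p.has_cad:
            cases.append(p)
        elif not (p.on_statins or p.has_diabetes or p.has_hypertension):
            controls.append(p)
    return cases, controls


if __name__ == "__main__":
    cohort = [
        Participant("P1", True, True, False, True),
        Participant("P2", False, False, False, False),
        Participant("P3", False, True, False, False),  # excluded: on statins
    ]
    cases, controls = split_cases_and_controls(cohort)
    print([p.participant_id for p in cases], [p.participant_id for p in controls])
```

Because the clinical record can be updated over time, a selection like this would presumably be re-run whenever new phenotypic information arrives, so that a control who later starts statin therapy can be flagged.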

The institute has 16 sites at which patients are enrolled and consents obtained. Partner organizations provide storage and analysis services. The Regenstrief Institute links the information with the biological samples, which are available to academic, government, and commercial researchers from all around the world.

The institute has made a major effort to engage with communities through roundtables, partnerships with community organizations, and community events (Figure 5-2). Consent is a major issue with the institute’s work, and “we make sure the people in our study understand that these samples will be used by academics, government, and commercial researchers.” People in the study are told that their medical record data will be made available and they may be contacted for more samples or information in the future. They also are informed that information may be obtained from prescription and payer databases.

Diagram of the community engagement strategy for the Fairbanks Institute for Healthy Communities detailing their plan to enroll, connect, and build trust with participants.


Community engagement strategy. NOTE: CHEP = Community Health Engagement Program; CTSI = Indiana Clinical and Translational Sciences Institute. SOURCE: Fairbanks Institute for Healthy Communities, 2010.

Although medical record data are disclosed, people are informed in the consent process that they will not receive any information about themselves, said Helphingstine in response to a participant question. However, she continued, if an investigator found something he or she felt was imperative to be reported back to a participant, the Institutional Review Board could make a determination on contact and, if appropriate, the Regenstrief Institute can reconnect the data to the person’s name.

In terms of IP, the institute does not pursue any, said Helphingstine. “It is not our job to own the IP. We just provide the samples, we hope industry does something with it, we hope academics do things with it.”

Moving forward, Helphingstine related, the Fairbanks Institute would like to initiate new longitudinal studies as collaborative efforts. “We really would like to do things with collaborators where people will really want to put data back into our database, so that [information] would then be available for everyone that wants to use our samples.”


The UK DNA Banking Network (UDBN) is an “infrastructural research project,” said Martin Yuille, a reader in biobanking and co-director of the Centre for Integrated Genomic Medical Research at the University of Manchester. It is both an infrastructure and a research project designed to ensure access to annotated samples and data including genomic, genetic, and phenotypic information. The UDBN does DNA extraction and quality control; maintains cell lines; stores, retrieves, distributes, and tracks samples and data; and maintains an identifying link to all annotations. It holds all data centrally to achieve the goals and standards of the UDBN, which also allows for consent provisions to be attached to a sample and its associated data so that samples and data can be withdrawn from the biobank if requested. It currently manages about 60,000 samples and has distributed 80,000 aliquots. Assurance from the International Organization for Standardization (ISO) ensures consistency, imposes continuous quality improvement, and reduces problems associated with staff turnover and succession.
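The idea of attaching consent provisions to each sample in a centrally held store can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `Sample` and `CentralRegistry` names, the `permitted_uses` field, and the methods are hypothetical and do not describe the UDBN's actual software.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """Hypothetical sample record with its consent provisions attached."""
    sample_id: str
    donor_id: str
    permitted_uses: set = field(default_factory=set)  # uses the donor consented to
    withdrawn: bool = False


class CentralRegistry:
    """Hypothetical central store; one place to check and withdraw consent."""

    def __init__(self):
        self._samples = {}

    def register(self, sample):
        self._samples[sample.sample_id] = sample

    def withdraw_donor(self, donor_id):
        """Mark every sample from this donor as withdrawn; return how many changed."""
        count = 0
        for s in self._samples.values():
            if s.donor_id == donor_id and not s.withdrawn:
                s.withdrawn = True
                count += 1
        return count

    def available_for(self, use):
        """Samples whose consent covers `use` and that have not been withdrawn."""
        return [
            s for s in self._samples.values()
            if not s.withdrawn and use in s.permitted_uses
        ]
```

The design point is the one made in the text: because the link between a sample, its data, and its consent is kept in one place, a withdrawal request becomes a single operation rather than a search across scattered collections.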

In the UDBN’s first phase, it established a biobank for advanced sample management by collating samples from disparate collections, standardizing them and their associated data, and making the samples available for further investigation. In phase 2, it is undertaking advanced management of the sample annotations, including phenotypic data from electronic health records and other sources (Figure 5-3). Any researcher will be able to browse the resources and request access to the data and the associated samples. The samples and data are treated as “national resources” that must be readily available to those deemed to be bona fide collaborators, whether they are from commercial companies, universities, or some other organization.

Flow diagram of the UK DNA Banking Network’s process for annotating samples received from patients and managing requests for sample access from researchers.


UK DNA Banking Network advanced management of annotation. SOURCE: UK DNA Banking Network, 2010.

This arrangement has generated some new issues that have not been dealt with before, said Yuille. The idea of biospecimens as a “national resource” is a shift comparable to that of the transition from hunting and gathering to agriculture. The development of such a resource requires very broad consent. It also requires a chain of quality at all steps, “from the patient to the paper,” said Yuille. Additionally, a system needs to be implemented to guarantee return of data back to the repository from those that have been granted access to specimens. The movement of samples and data across borders requires a global vision for sharing. “Biobankers in the [United] States need to be talking to biobankers in Europe and in the Far East and everywhere else in the world,” continued Yuille, “about how we’re going to improve the movement of samples and data.”

Finally, there needs to be respect for all of the stakeholders involved in this work, added Yuille. The UDBN has adopted a “fair access” charter that spells out how various stakeholders should be treated (Yuille et al., 2010). Fairness to the subject involves guaranteeing privacy and confidentiality, the ethical use of samples and data, consent management (including national and open methods to permit effective withdrawal of consent), and public engagement with the work and its goals. Yuille pointed out, however, that “The UK Biobank model is that you have donated your sample to UK Biobank and that’s the end of the matter. . . . We will not tell you anything about the genetic outcomes, for example, of an analysis. We will tell you if your blood pressure is too high when we see you, on the day that we see you, or we’ll tell your GP, but beyond that we won’t go.”

Fairness to the collection entails giving the people who made the collection the right of first access to the samples. In this way, said Yuille, they can carry out the research for which they received funding to create the biospecimen collection.

Fairness to the recipient of the samples, according to Yuille, requires collaborative management to ensure transparency of a sample’s use, access to published and unpublished data about the sample, long-term availability of the sample, and a minimum of administration. “We don’t want recipients having to spend forever filling in forms in order to get access to samples or data.”

Finally, fairness to the collectors’ and investigators’ institutions requires long-term tracking and management of samples.

The sharing of samples and data is not a major technical problem, concluded Yuille. It is more of a cultural problem. Extensive national and international discussions and collaborations have been and will continue to be essential to win an understanding that competition should be between ideas, not about de facto control of biological samples.


The goal of the Clinical and Translational Science Awards (CTSA) Biobank Consortium is to develop a virtual biobank using an automated online sample request management system for use across multiple CTSA centers with samples maintained at their home institutions, said Lorraine Frazier, professor of nursing and assistant dean and department chair of Nursing Systems at the University of Texas Health Science Center in Houston. At the Houston site, the biobank began as a manual system with more than 48,000 patients and more than 188,000 samples. About 70 percent of the biobank was collected with standardized protocols, and more than 12,000 samples have been distributed to 46 researchers since 2002. “We’re not about giving millions and millions of samples away,” said Frazier. “We’re about being effective and making sure samples are used effectively.”

The consortium has created and tested a prototype custom biobank software application and associated technologies. It also piloted the electronic capture of patient consent variables. A requirement for joining the biobank is the automation of an investigator’s samples. Currently, the University of Texas Health Science Center at Houston has partnerships with the University of Michigan, the University of Texas Health Science Center at San Antonio, the University of California at Davis, Indiana University-Purdue University Indianapolis, and the Baylor College of Medicine. “We’re all coming together to collaborate and connect with an automated system where we can ask for samples and disburse those samples.”

The Biobank Executive Steering Committee agrees on policies and procedures for the biobank. These policies and procedures are agreed to and used by all sites and by the Web application when requesting and managing samples. The Web application searches all the individual sites and coordinates request procedures, notifications, and NIH reporting requirements. Samples are then delivered to the researcher from the sample owners at the local sites.
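The "virtual biobank" arrangement, in which one application searches every site while samples stay with their local owners, might be sketched as follows. This is an illustrative assumption, not the consortium's software: the catalog layout, the field names, and the functions `search_sites` and `group_requests_by_site` are all hypothetical.

```python
def search_sites(sites, predicate):
    """Query each site's local catalog and collect matching sample references.

    `sites` maps a site name to a list of catalog entries (dicts). Only
    references are returned; the samples themselves stay at the home site.
    """
    hits = []
    for site_name, catalog in sites.items():
        for entry in catalog:
            if predicate(entry):
                hits.append((site_name, entry["sample_id"]))
    return hits


def group_requests_by_site(hits):
    """Group hits so each local sample owner receives one request for its samples."""
    requests = {}
    for site_name, sample_id in hits:
        requests.setdefault(site_name, []).append(sample_id)
    return requests


if __name__ == "__main__":
    sites = {
        "site_a": [{"sample_id": "A-1", "type": "plasma"},
                   {"sample_id": "A-2", "type": "dna"}],
        "site_b": [{"sample_id": "B-1", "type": "dna"}],
    }
    hits = search_sites(sites, lambda e: e["type"] == "dna")
    print(group_requests_by_site(hits))
```

In this sketch each local owner still decides whether to fulfil the request it receives, which matches the point Frazier makes about principal investigators retaining say over their samples.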

“The samples belong to the PIs [principal investigators], they don’t belong to us,” said Frazier. “PIs don’t like to let go of their samples. They talk about sharing them, they love to talk about collaborating, but when it gets right down to it, they have to have . . . say over the samples.” PIs thus have the ability to decide whether or not to collaborate on a research project.

Stephen Eck from Eli Lilly and Company pointed out that the federal government paid for the collection of samples and that giving PIs complete control over how they are used is unlikely to be in the public’s interest. Frazier agreed but observed that “most of the PIs across this nation feel that they own those samples. A big cultural shift needs to occur. We’re asking a lot. I think we’re making progress, but I agree it could hinder innovation and investigation.” Kelly Edwards added that ownership is not just about control, it is also about responsibility. PIs often feel a stewardship obligation toward the participants they have recruited into their studies. The participants trust the fact that the PI is going to make decisions about data release, data use, and so on. This trust will have to be maintained in moving toward a more federated model, she said.

Cost and time are some of the issues that have prompted PIs to refuse to share their samples, according to Frazier. In response, the consortium is developing a business model that would entail charging for the samples. “We don’t make money, but we’ll have enough money to pay those individuals to actually pull those samples and get some of that work done.” Martin Yuille pointed out that his group has been required by the Medical Research Council to implement a cost recovery policy as well, but the strategy “doesn’t really work.” Sample requests and the subsequent payments are irregular, but employees need to be paid on a regular basis. As a result, his organization has to take on other projects and charge for those projects to pay the people who are manning the robots in the UDBN, and even that does not cover all the personnel costs. “The cost recovery element isn’t that great,” he said.

Participating sites are expected to provide oversight of the biobank team personnel and of sample holders who contribute at their sites, to participate on the executive steering committee, and to provide funding for personnel and technical resources, said Frazier. Individual sites also make sure that sample-related data, clinical data, and patient consent variables are migrated from paper to electronic format; that samples and data are consented for secondary use; and that the data are validated.

The benefits of membership include improved synergy and interactions across the institutions through sample sharing, acceptance of a business plan and cost recovery model, use of the online sample request management system, and lower costs for entry and maintenance than for the closed data models inherent in commercial software solutions.

Cultural considerations are important, said Frazier. “We’re shifting from competing to collaborating. We’ve made great strides, but we have a long way to go.”


Pfizer studies 30,000 to 40,000 people in randomized clinical studies every year, which represents a rich source of information for research collaborations, said Sally John, head of Human Genetics at Pfizer. Since 1996, Pfizer has been recruiting people in its clinical trials to contribute to a biobank of genomic, biofluid, and tissue samples. However, informed consent for the trials allows for only the assessments described in the protocol, requiring a different approach to use specimens for other purposes. Thus, participants are asked for additional consent to participate in genomic, proteomic, metabolomic, biomarker, and other studies. John said that 70 to 80 percent of clinical trial participants give samples when asked if they want to participate. These samples are of use within Pfizer, but they also have many potential uses outside the company; thus far, only 35 percent of the samples have been used in research projects, leaving an intact, substantial biological resource, said John.

Industry has many incentives for investing in such biobanking efforts, according to John. Such investments demonstrate a commitment to improving science in both academia and industry with a particular focus on human research. They contribute to the development of robust predictive models of disease. They enable meta-analyses of multiple data sets, which provide more power to detect and estimate modest effect sizes and conduct rapid replication and validation of exploratory findings. They also attract funding for clinical research to address questions that are common across the industry.

John cited two examples of collaborative studies, one that benefited from industry inclusion and one that could have. A paper published in Nature Genetics used almost all of the academic samples that were available at the time to investigate a genetic variant involved in drug-induced liver injury due to flucloxacillin (Daly et al., 2009). The research brought together several academic partners and was funded by pharmaceutical companies and the Wellcome Trust. “The beauty of this paper is it was overwhelmingly robust and validated the result in its first publication. We didn’t have to wait for another publication to come out and validate it,” said John. The second paper used large-scale association analysis to identify twelve type 2 diabetes susceptibility loci (Voight et al., 2010). The sample size in this study was 45,000, but according to John, “we could have doubled the power of this study by making [industry] samples available for this type of research.”
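One way to see why adding samples matters is a rough power calculation. The sketch below uses a standard normal approximation for a two-sided test, in which the non-centrality parameter grows in proportion to sample size, so doubling the sample doubles that parameter (statistical power itself rises, but not by a fixed factor). The per-sample effect value and the function name `approx_power` are hypothetical, the default threshold of 5e-8 is the conventional genome-wide significance level rather than a figure from the workshop, and nothing here reproduces the cited study's analysis.

```python
from statistics import NormalDist


def approx_power(n, ncp_per_sample, alpha=5e-8):
    """Normal-approximation power of a two-sided test at level `alpha`.

    The expected z-score is sqrt(n * ncp_per_sample), so the non-centrality
    parameter n * ncp_per_sample scales linearly with the sample size n.
    `ncp_per_sample` is a hypothetical per-sample effect contribution.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    z_expected = (n * ncp_per_sample) ** 0.5
    # Probability that the observed z-score falls beyond either threshold.
    upper_tail = 1 - std_normal.cdf(z_crit - z_expected)
    lower_tail = std_normal.cdf(-z_crit - z_expected)
    return upper_tail + lower_tail


if __name__ == "__main__":
    effect = 2e-4  # hypothetical modest effect, chosen only for illustration
    for n in (45_000, 90_000):
        print(n, approx_power(n, effect))
```

For modest effects of the kind described in the text, power at a stringent threshold can change sharply as the sample grows, which is the argument for pooling industry and academic collections.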

Participation in large-scale studies is also beneficial for junior faculty. They “provide opportunities for young researchers to ask questions that were simply not addressable prior to having all of this data together in the same place. I think we will see, over the years, that this type of multicollaborative research actually provides more opportunities for young researchers to flourish,” said John.

Several factors can inhibit the use of Pfizer’s samples, though, continued John. The consents allow access to data only through a collaboration with Pfizer. However, the business value of the collaboration is considered low internally, partly because there is a view that large consortia never deliver anything. Additionally, external and internal stakeholders may need to okay or endorse a specific project, and academic partners sometimes expect that industry will foot the bill for the research.

Pfizer is trying to think internally about how to implement some best practices related to its collection of biospecimens. The company is looking to simplify the process to access its biospecimens and data and is contemplating seeking endorsements from stakeholders early. One approach would be to standardize the conditions of access across studies, which is something that John said should be done across the entire industry. As an example, John said she personally would not have a problem with putting all of the data associated with the controls for a study in a publicly accessible database, though that may “not necessarily represent the views of Pfizer. I think, in terms of standardizing and simplifying the process, . . . we don’t see any reason why we shouldn’t make all baseline data available for comparative arms.”

Additionally, all data generated from the studies must come back to a central database for additional meta-analysis so that knowledge and information are continuously built. Industry partners also need to ensure that skilled experts are fully engaged with precompetitive efforts. “You won’t get much out of it if you dial into a telecom every six months and stand on the sidelines,” said John.

A variety of incentives can motivate companies and individuals to participate in this work. Academic and industry partners both need rewards for making data available. All work coming from the collaboration has to be published. Funding to support the operational aspects of precompetitive research also can encourage participation. “It’s not a trivial task to order 20,000 DNA samples for a collaborative research project, and so ways or mechanisms to make it easier for pharmaceutical industries to do that would be welcome.” What is needed, said John, is “a commitment in these large consortia to enable all of the parties to access and use the data effectively.”

Copyright © 2011, National Academy of Sciences.
Bookshelf ID: NBK54315

