Copyright ©2008 AMIA - All rights reserved.
Discovering Synergistic Qualities of Published Authors to Enhance Translational Research
Oregon Health and Science University, Portland OR, USA
This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
Abstract
Translational research is the process of bringing together basic scientific research and improvements in patient care. This process, by its very nature, requires a wide range of skills and resources, typically not found within any single individual. This project investigates the synergistic features of published researchers at the Oregon Clinical and Translational Research Institute (OCTRI) to see how scientists with different specializations can be brought together to improve translational research. The investigated features were author connectivity and complementarity of research subjects. Author connectivity was measured by taking the Average Path Length (APL) and cluster coefficients [1] over an OCTRI coauthor network. A high degree of connectivity, or low APL value, would indicate that a researcher has participated in many collaborations and published many papers with other OCTRI researchers. This would imply that they have some experience we could leverage to build teams for translational research. Subject complementarity was established by using pairs of frequently co-occurring MeSH terms. Those terms were then used to bridge researchers together through indirect Swanson matching [2] and present evidence of topic synergy, or potential collaborative synergy. Our initial investigation supports the development of a collaborative browsing tool to assist the creation of new translational research teams. Such a tool is being developed at the OCTRI and will include a user-centric evaluation in the near future.
Introduction
Translational biomedical research involves translating scientific discoveries into new clinical diagnostics and using the outcomes to guide further research [3]. Accomplishing this task often requires the expertise of researchers across a broad range of disciplines [4]. OCTRI (the Oregon Clinical and Translational Research Institute), a partnership between OHSU and Kaiser Northwest, is particularly suited to this task as an academic medical center because it has a pool of accomplished researchers practicing in a wide range of areas from the basic sciences to the clinical fields [5]. A simple PubMed query showed that roughly six thousand published authors came from the two institutions between 1990 and 2006.
The two institutions have decided to hold thematic retreats to encourage collaboration and novel translational research in various areas. For this to be successful, the most effective multidisciplinary group of scientists relevant to the topic at hand must be identified. Invitees should have different but complementary areas of expertise in order to encourage work that transcends disciplines and helps drive the translational research engine. For example, consider the differences between a pairing of two cancer biologists and a pairing of a biologist and a clinician. Two biologists might discover a cure for a mouse model of cancer and publish their work, but this is unlikely to lead directly to clinical applications. On the other hand, a collaboration between the biologist and a clinician might focus on a means of applying the discovery to patients. The latter case presents a better opportunity to develop more rapid clinical applications of bench-side research. To foster such opportunities, this work investigates the feasibility of using coauthor relations and MeSH associations to explore the collaborative space.
A key requirement in constructing multidisciplinary groups is to ensure that members have different but complementary skills. For this project, the MeSH terms associated with an author's publications are considered to represent their expertise. One method of finding complementary concepts, or in this case, authors, is related to Swanson matching [2]. Swanson matching is a method of finding an intermediary concept B that connects two disparate concepts A and C. ARROWSMITH applies Swanson matching as described in [2]. A similar method was used to match researchers together synergistically. Biomedical researchers publish documents which are assigned MeSH terms by the National Library of Medicine (NLM). These terms attach semantic information to each document, which aids in indexing. The collection of MeSH terms from an author’s publications can give semantic information about their specialty and areas of expertise. If two authors (who have never met and have different MeSH terms) can be joined indirectly by a mutually related third concept, this would constitute evidence of potential synergy in their areas of expertise.
With this modification of the Swanson matching method providing evidence of synergy, the next important task is to select a set of scientists whose experience could provide the most guidance on fruitful areas in which to pursue translational research. For this, we can perform a Small World analysis to measure their past level of collaboration. The Small World concept, originally devised by Milgram [6], describes a phenomenon in social networks in which individuals who do not know each other are usually connected through a short chain of roughly six acquaintances. This type of analysis has been adapted for practical uses such as optimizing the distribution of skills among a call-center's network of employees [1] and developing recommender systems on customer-product networks [7].
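The author-level Swanson matching described above can be illustrated with a minimal sketch. It assumes each author is represented by a set of MeSH terms and that a map of frequently co-occurring terms is already available; the function name `closed_swanson_match`, the input shapes, and the toy data are hypothetical and are not taken from the original study.

```python
from itertools import product


def closed_swanson_match(terms_a, terms_c, cooccurring):
    """Return (A, B, C) tuples that indirectly link two authors.

    terms_a, terms_c: sets of MeSH terms for the two authors (assumed input).
    cooccurring: dict mapping a MeSH term to the set of terms it frequently
    co-occurs with (assumed to be computed elsewhere from corpus statistics).
    """
    # Use only the terms unique to each author, so the pair is matched on
    # complementary rather than shared expertise.
    only_a = terms_a - terms_c
    only_c = terms_c - terms_a
    matches = []
    for a, c in product(sorted(only_a), sorted(only_c)):
        # A bridging term B must co-occur with both A and C.
        bridges = cooccurring.get(a, set()) & cooccurring.get(c, set())
        for b in sorted(bridges - {a, c}):
            matches.append((a, b, c))
    return matches


# Toy profiles and co-occurrence map, for illustration only.
author_1 = {"Prostatic Neoplasms", "Receptors, Calcitriol"}
author_2 = {"Leukemia", "Exons"}
toy_cooccurring = {
    "Receptors, Calcitriol": {"Dihydroxycholecalciferols"},
    "Exons": {"Dihydroxycholecalciferols"},
}
print(closed_swanson_match(author_1, author_2, toy_cooccurring))
```

Restricting the search to terms unique to each author is one way to keep the match from collapsing into plain similarity; a real profile would likely also need a frequency cutoff on the co-occurrence map.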
In this case, co-author networks from OHSU and Kaiser publications can be used to identify how researchers are connected to other published researchers and give some indication of who may be in a position to participate in or provide guidance on initiating translational collaboration.
Methods
The focus of this study was to develop a method of finding groups of researchers with synergistic features. We looked at two features to connect groups of authors: (1) experience in working with many different coauthors and (2) background knowledge in a particular field. Authors were measured on the first point by computing their APL within the network of OCTRI researchers. This would give some indication as to who could act as advisors in working with researchers of different backgrounds. The second point was measured by building a profile of MeSH terms for each author. This gave a succinct description of their background and was further used to indirectly connect complementary researchers. Selecting the researchers involved extracting the authors and their associated information from the PubMed queries in Figure 2.
The first query corresponded to OHSU researchers and the second query, Center for Health Research in Portland, corresponded to the subgroup of Kaiser researchers affiliated with OHSU. These queries resulted in 3091 Medline entries and 5943 authors. The Medline entries contained publication information, such as author names, MeSH terms, and titles, which was used in discovering researcher connectivity. Data from the TREC [8] Medline corpus was also used to collect baseline co-occurrence statistics on MeSH terms; these data were used in the Swanson matching to find complementary MeSH terms. Authors were extracted by making a unique profile for each AU, or author, field. The AU field was used because the names were unique enough to distinguish between individual researchers within OCTRI and it provided a consistent format of Last Name, First Initials. One drawback, however, was that author data could be split between two similar names, e.g. Hersh W versus Hersh WR. Since the Medline records did not contain sufficient information to determine whether the names corresponded to the same person, this was tolerated as a limitation of the current work.
Small world analysis: The published researcher data was studied in two ways. A small-world analysis was performed to highlight experienced researchers and reduce the overall number of researchers that had to be processed. Then, Swanson matching was performed on those researchers' data to find synergistic candidate pairings for collaboration. Small-world analysis used two measures: the Average Path Length (APL) and the cluster coefficient. A coauthor network was constructed from the Medline data by linking authors together with their known coauthors.
Figure 3
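The two small-world measures can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the study's implementation: each Medline record is assumed to be reduced to a list of AU strings, APL is taken as the mean shortest-path length from one author to every author reachable from them, and the cluster coefficient is the usual local clustering coefficient. All function names are hypothetical.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_coauthor_network(records):
    """records: iterable of author-name lists, one list per paper."""
    graph = defaultdict(set)
    for authors in records:
        unique = set(authors)
        for name in unique:
            graph[name]  # create a node even for single-author papers
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def average_path_length(graph, source):
    """Mean shortest-path length from source to each reachable author."""
    dist = {source: 0}
    queue = deque([source])
    while queue:  # breadth-first search over coauthor links
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    reachable = len(dist) - 1
    return sum(dist.values()) / reachable if reachable else float("inf")


def cluster_coefficient(graph, author):
    """Fraction of an author's coauthor pairs who are themselves coauthors."""
    nbrs = graph[author]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return links / (k * (k - 1) / 2)


# Toy records, for illustration only.
toy_records = [["A", "B"], ["B", "C"], ["C", "D"], ["B", "C", "E"]]
net = build_coauthor_network(toy_records)
apl = {name: average_path_length(net, name) for name in list(net)}
print(sorted(apl.items(), key=lambda item: item[1]))
```

Sorting authors by ascending APL gives the kind of connectivity ranking used later in the paper; on a network of a few thousand authors, one breadth-first search per author is still inexpensive.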
In addition to the APL and cluster coefficient measures, the stability of the overall network was measured by iteratively removing authors with lower publication counts. This would indicate whether there were any key authors with low publication counts connecting other researchers together. The top 10 researchers with the smallest APL, or greatest connectivity to other researchers, were selected to search for synergistic candidates.
Pairing authors together: A key aspect of synergy was to find researchers with skills that were not the same but instead complementary. Authors were associated with the MeSH terms that occurred in their publications, and they were connected through an intermediary collection of MeSH terms found to frequently co-occur with their own specific terms. The intermediary collection was made by calculating the mutual information score, or I(A,B) =
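The formula in the source text breaks off at this point. As a hedged illustration only, the sketch below assumes the score takes the common pointwise form, log2 of P(A,B) / (P(A) P(B)), with probabilities estimated as the fraction of citations containing each term or term pair. The exact definition used in the study may differ, and the function name and input shape are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information_scores(documents):
    """Score every co-occurring MeSH term pair.

    documents: list of sets of MeSH terms, one set per citation (assumed
    input shape). Returns a dict keyed by alphabetically ordered term pairs.
    """
    n = len(documents)
    term_counts = Counter()
    pair_counts = Counter()
    for terms in documents:
        term_counts.update(terms)
        pair_counts.update(combinations(sorted(terms), 2))
    scores = {}
    for (a, b), n_ab in pair_counts.items():
        p_ab = n_ab / n
        p_a = term_counts[a] / n
        p_b = term_counts[b] / n
        # Assumed pointwise form: positive when A and B co-occur more often
        # than independence would predict.
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# Toy citations, for illustration only.
toy_docs = [
    {"Exons", "Receptors, Calcitriol"},
    {"Exons", "Receptors, Calcitriol"},
    {"Leukemia"},
    {"Prostatic Neoplasms"},
]
print(mutual_information_scores(toy_docs))
```

Pairs scoring above a chosen cutoff could then populate the co-occurrence map used for bridging; the cutoff itself is a tuning choice that the source does not specify.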
Results
In constructing the co-author network, certain topological features became immediately apparent. Namely, there was a super-cluster containing more than 80% of the authors and several smaller clusters ranging in size from 5 to 30 authors. This meant that the vast majority of the authors could be associated with one another purely through past collaborative acquaintances. The smaller clusters were often composed of isolated groups that published one or two papers on their own.
When identifying the most connected researchers, it is important to ensure that the results are stable across multiple samples of the co-author data. The stability of the super-cluster was tested by applying a publication threshold between 1 and 10; that is, authors with fewer than 1, 2, 3, etc. publications were removed from the network. This was done to see if there were any significant sub-populations of researchers who were loosely connected through a few key author associations. If there were, this would represent opportunities for establishing connections between them. It would also require us to investigate the meaning of being ‘well connected’ (low APL) if linkages through authors with small numbers of publications played a large role. The thresholds were applied iteratively to remove authors from the cluster. Throughout the thresholding the super-cluster remained, indicating that the backbone was made up of highly published authors and that the newer, less published authors lay at the fringes of the network.
For authors in the super-cluster, APL values were computed. The mean of the APL values was 6.37, which, interestingly, was very close to the value derived in the Milgram studies [6]. Publication thresholding was applied up to 10, and the order of the top-ranked individuals did not change drastically, showing that junior researchers typically did not generate significant links between veterans.
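The thresholding test described above can be sketched as follows. This is a minimal illustration, assuming the same list-of-author-lists input as a coauthor network would be built from; it reports what share of the retained authors sit in the largest connected cluster at a given publication threshold. The function name and toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def largest_component_fraction(records, min_pubs):
    """Share of retained authors that fall in the largest connected cluster.

    records: list of author-name lists, one per paper (assumed input).
    min_pubs: authors with fewer publications than this are removed.
    """
    pubs = Counter(name for authors in records for name in set(authors))
    keep = {name for name, count in pubs.items() if count >= min_pubs}
    graph = defaultdict(set)
    for authors in records:
        kept = [a for a in set(authors) if a in keep]
        for a in kept:
            graph[a]  # retain authors even if all their coauthors were dropped
        for a, b in combinations(kept, 2):
            graph[a].add(b)
            graph[b].add(a)
    seen, largest = set(), 0
    for start in list(graph):
        if start in seen:
            continue
        # Depth-first traversal to measure this connected component.
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        largest = max(largest, size)
    return largest / len(graph) if graph else 0.0


# Toy records, for illustration only.
toy_records = [["A", "B"], ["A", "B", "C"], ["C", "D"], ["E", "F"]]
for threshold in range(1, 4):
    print(threshold, largest_component_fraction(toy_records, threshold))
```

If the fraction stays high as the threshold rises, the backbone of the network does not depend on low-publication authors, which is the pattern the results above report.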
The APL values were used to discover highly connected, and by implication, highly collaborative researchers. The researchers in Table 2 were ranked as the top 10 because they had the shortest APL to all other researchers in the super-cluster. Although their disciplines were varied, three of the top 10 most connected researchers were biostatisticians. This was considered significant, as biostatisticians might have more experience in collaborating with multidisciplinary teams by working on different projects in a supportive role rather than focusing on a specific topic in a given area of research.
While the APL measured an author's connectivity, the cluster coefficient was used to measure the density of their coauthors, or how well the coauthors knew one another. The cluster coefficient ranged from 0 to 1, with 0 meaning that none of the coauthors knew one another and 1 meaning that all of the coauthors knew one another. Authors with a low cluster coefficient may therefore interface with a wider variety of researchers. The correlation between publication count and cluster coefficient for authors in the super-cluster was -0.69, with a p-value of 0.001. This suggests that junior investigators tend to work with past coauthors for a time before they participate in more diverse collaboration.
Swanson matching was used to provide evidence of synergy between pairwise matchings of the top 10 authors. Table 3 shows a sample pairing between scientist A, BTM, a physician studying prostate cancer, and scientist B, DBJ, an MD specializing in leukemia. The evidence shows that BTM and DBJ are linked together through the ABC tuple: Receptors, Calcitriol; Dihydroxycholecalciferols (Vitamin D); and Exons. Calcitriol receptors are activated by Vitamin D, which has been shown to inhibit prostate cancer [10, 11]. This evidence suggests that they could work together on studying the common role of Vitamin D in reducing various types of cancer. BTM could characterize how the biological system with cancer reacts to Vitamin D, while DBJ could provide expertise on the receptor's transcriptional and translational features and on what molecules or drugs might affect it. From there, they could develop potential therapies to reduce prostate cancer.
Discussion
The specific aim of this project is to help basic science and clinical research lab managers and translational research organizers to discover how their particular interests are connected to other areas of research, and consequently help them locate synergistic experts in those fields. In the network analysis, researchers were profiled by experience through past collaboration. Authors with the greatest connectivity, as evidenced by low APL and cluster coefficient values, tended to be in positions of authority. This has enabled them to gain a vast amount of experience by coordinating and collaborating with other researchers. Note that this can be both good and bad. A researcher may have a wealth of knowledge in working with researchers of different disciplines, but they may also be too busy managing those people to work on new translational research projects. Conversely, authors just starting their publishing careers have low connectivity, or high APL and cluster coefficient values, because they have yet to establish their professional contacts. They may have more time to devote to a project due to fewer responsibilities, but they may not have the necessary skills. This suggests that it may be possible to measure an increase in translational collaborative research among junior scientists by showing an earlier drop in the cluster coefficient.
The top 10 most highly connected researchers (by APL), who also had 10 or more publications, had many MeSH terms associated with them. This made it easier to find complementary skills when pairing against the other researchers. Constructing a good pairing depends on the type of research project one wants to accomplish. This could be done by filtering for a specific set of MeSH terms according to their semantic type. This is considered closed Swanson matching, in that we are given the two researchers and are trying to find a rationale for connecting them.
Another type of search would be considered the open Swanson match, in which we have only one author and are trying to find the best complement to that author. This would be done by taking the terms that frequently co-occur with an author's MeSH terms and using them to look for other researchers whose background covers those terms well. Ultimately, we have a way of saying that two authors might work well together based on the evidence in their publications. One thing to note is that this method of Swanson matching was able to generate at least 20 sets of ABC terms for each pair of authors. If one considers that those with the top APL ranks have the most experience, then it is not surprising that each has some knowledge that can contribute to the others' work. Generating an ABC list can therefore be helpful in identifying the types of research that should be investigated in a collaborative project.
This study presents a proof of concept for a set of techniques that can be used to identify groups of potentially collaborative researchers. Currently, this process is done by constructing MeSH association statistics and manually looking up authors that might be paired together. The next step would be to create an interactive visual tool that lets translational research organizers browse the “collaborative space” for researchers who might provide the most impact in developing and translating human-related treatments. We are currently developing this application and plan to test it in a user-centric evaluation.
Limitations
In our study, we have found that we could connect authors to one another through association and complementary topics. There are, however, several limitations in the current work. The author data we used to build the coauthor network was noisy in two respects: two distinct authors could have the same name, and the coauthors may have belonged to non-OCTRI institutions.
To reduce the noise, we have found it necessary to have a list of valid authors to filter on. This would reduce both collisions between authors who share a name and the inclusion of authors that we are not investigating. Unfortunately, MEDLINE does not use a unique author identifier, and these issues likely need to be resolved using ad hoc methods. The MeSH-MeSH mutual information scores were computed using the TREC corpus, which was used out of convenience while waiting for licenses to the MEDLINE corpus. This biased the connections between authors toward genomics-related MeSH terms. We have subsequently re-computed these numbers using the full MEDLINE data. Future research will be performed using the updated mutual information scores. We are currently developing an interactive collaborative space browser application built on this research. The application itself is nearly completed and will include a user-centric evaluation in the near future.
Conclusion
This study presents a method for finding synergistic properties among an experienced group of researchers with the end goal of increasing and enabling translational research. OHSU and Kaiser authors are searched for and ranked by their connectivity to other authors using the APL measure. The top-scoring authors represent the researchers with the most experience that could be leveraged for translational research purposes. These researchers were then paired against each other indirectly through Swanson matching. This method looked for intermediary concepts that frequently co-occurred with their prior work and used them to bridge to different authors, identifying potential synergistic partnerships. This ensured that the authors were not being matched purely on similarity and that they had some level of synergy or complementary background that would imply they could collaborate on translational research projects.
In this work, the exploration of an author's potential collaborations is a somewhat manual process that will be replaced with a tool for browsing the “collaborative space” consisting of both linked researchers and research topics. The tool will then be tested for its effectiveness, with the aim of providing a more efficient way of identifying potential synergies for translational research.
Acknowledgments
We wish to acknowledge the support of an Oregon Clinical Translational Research Institute (OCTRI) PILOT8 grant in this research.
References
1. Iravani S, Kolfal B, Oyen M. Call-Center Labor Cross-Training: It's a Small World After All. Management Science. 2007;53:1102–1112.
2. Smalheiser NR, Swanson DR. Using ARROWSMITH: a computer-assisted approach to formulating and assessing scientific hypotheses. Comput Methods Programs Biomed. 1998;57:149–153.
3. Marincola FM. Translational Medicine: A two-way road. J Transl Med. 2003;1:1.
4. Ioannidis JP. Materializing research promises: opportunities, priorities and conflicts in translational medicine. J Transl Med. 2004;2:5.
5. Pober JS, Neuhauser CS, Pober JM. Obstacles facing translational research in academic medical centers. FASEB J. 2001;15:2303–2313.
6. Milgram S. The small-world problem. Psychology Today. 1967;1:61–67.
7. Huang Z, Zeng DD, Chen H. Analyzing Consumer-Product Graphs: Empirical Findings and Applications in Recommender Systems. Management Science. 2007;53:1146–1164.
8. Voorhees E. Text Retrieval Conference (TREC). [updated 2007 August 22; viewed 2007 August 22]. Available from: http://trec.nist.gov/
9. Newman MEJ. The structure and function of complex networks. SIAM Review. 2003;45:167–256.
10. Ali MM, Vaidya V. Vitamin D and cancer. J Cancer Res Ther. 2007;3:225–230.
11. Serda RE, Bisoffi M, Thompson TA, Ji M, Omdahl JL, Sillerud LO. 1alpha,25-Dihydroxyvitamin D(3) down-regulates expression of prostate specific membrane antigen in prostate cancer cells. Prostate. 2008.