Hypermedia-based software architecture enables Test-Driven Development

JAMIA Open. 2023 Oct 17;6(4):ooad089. doi: 10.1093/jamiaopen/ooad089. eCollection 2023 Dec.

Abstract

Objectives: Using agile software development practices, develop and evaluate an architecture and implementation for reliable and user-friendly self-service management of bioinformatic data stored in the cloud.

Materials and methods: Comprehensive Oncology Research Environment (CORE) Browser is a new open-source web application for cancer researchers to manage sequencing data organized in a flexible format in Amazon Simple Storage Service (S3) buckets. It has a microservices- and hypermedia-based architecture, which we integrated with Test-Driven Development (TDD), the iterative writing of computable specifications for how software should work prior to development. Relying on repeating patterns found in hypermedia-based architectures, we hypothesized that hypermedia would permit developing test "templates" that can be parameterized and executed for each microservice, maximizing code coverage while minimizing effort.
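The test "template" idea can be illustrated with a minimal sketch. It is not CORE Browser's actual test code: `FakeService`, `make_get_tests`, the `folders` and `buckets` resources, and the `data`/`links` response shape are all hypothetical stand-ins, chosen only to show how one set of tests written against a uniform hypermedia pattern can be parameterized and run once per microservice.

```python
import unittest


class FakeService:
    """Hypothetical in-memory stand-in for one microservice."""

    def __init__(self, resource, items):
        self.resource = resource
        self.items = {item["id"]: item for item in items}

    def get(self, path):
        # Paths look like /<resource>/<id>; returns (status, body).
        parts = path.strip("/").split("/")
        if len(parts) != 2 or parts[0] != self.resource or parts[1] not in self.items:
            return 404, None
        item = self.items[parts[1]]
        # Every service answers with the same hypermedia-style envelope.
        body = {"data": item, "links": {"self": f"/{self.resource}/{item['id']}"}}
        return 200, body


def make_get_tests(service, known_id):
    """Test template: builds a TestCase class bound to one service."""

    class GetTemplate(unittest.TestCase):
        def test_get_returns_item(self):
            status, body = service.get(f"/{service.resource}/{known_id}")
            self.assertEqual(status, 200)
            self.assertEqual(body["data"]["id"], known_id)

        def test_self_link_resolves(self):
            # Follow the link in the response instead of rebuilding the URL.
            _, body = service.get(f"/{service.resource}/{known_id}")
            status, linked = service.get(body["links"]["self"])
            self.assertEqual(status, 200)
            self.assertEqual(linked["data"], body["data"])

        def test_unknown_id_is_404(self):
            status, _ = service.get(f"/{service.resource}/missing")
            self.assertEqual(status, 404)

    GetTemplate.__name__ = f"Get{service.resource.capitalize()}Tests"
    return GetTemplate


# Illustrative services; names and sample items are assumptions.
SERVICES = [
    FakeService("folders", [{"id": "f1", "name": "run-2023"}]),
    FakeService("buckets", [{"id": "b1", "name": "sequencing"}]),
]


def run_all():
    """Instantiate the template once per service and run every test."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for svc in SERVICES:
        known_id = next(iter(svc.items))
        suite.addTests(loader.loadTestsFromTestCase(make_get_tests(svc, known_id)))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_all()` executes the same three template tests against each of the two services. In this sketch, adding a service to `SERVICES` adds test executions without any new test code, which is the kind of reuse the hypothesis describes.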

Results: After one-and-a-half years of development, the CORE Browser backend had 121 test templates and 875 custom tests that were parameterized and executed 3031 times, providing 78% code coverage.

Discussion: Architecting to permit test reuse through a hypermedia approach was a key success factor for our testing efforts. CORE Browser's application of hypermedia and TDD illustrates one way to integrate software engineering methods into data-intensive networked applications. Separating bioinformatic data management from analysis distinguishes this platform from others in bioinformatics and may provide stable data management while permitting analysis methods to advance more rapidly.

Conclusion: Software engineering practices are underutilized in informatics. Similar informatics projects are more likely to succeed when they apply good architecture and automated testing. Our approach is broadly applicable to data management tools involving cloud data storage.

Keywords: cloud computing; data management; high-throughput nucleotide sequencing; software.

Associated data

  • Dryad/10.5061/dryad.pvmcvdnrv