Ten Common Misconceptions About Galaxy (And Why They're Wrong!)
We are thrilled to announce that 'Ten Common Misconceptions About Galaxy (And Why They Are Wrong!)' has just been published in PLOS Computational Biology! This paper is the result of passionate discussions, collaborative debates, and a shared commitment to clarifying what Galaxy truly is and what it can do. Whether you are a longtime Galaxy user or new to the platform, this paper will challenge assumptions and highlight Galaxy's versatility, scalability, and impact across disciplines.
The Story Behind the Paper
Every great idea starts with a spark. For this paper, that spark was a feeling, a nagging sense that Galaxy, despite its growing popularity and impact, was still misunderstood. That feeling grew into a thought, and that thought grew into a conversation. A lively conversation.
It began with a group of nerds (affectionately referred to as such) gathering to vent their frustrations about the persistent myths surrounding Galaxy. Another group of nerds disagreed with their conclusions. Then, life got busy, and progress stalled. Later, a third group joined the fray, leading to a grand, collective whinge session in Australia. After countless hours in Google Docs, a write-a-thon, numerous online meetings, and moments of near-despair, something remarkable emerged: a paper that not only addresses misconceptions but does so with evidence, humor, and a touch of defiance.
Why This Paper Matters
Galaxy is an open-source platform designed for accessible, reproducible, and scalable data analysis. It’s used by researchers, educators, clinicians, and industry professionals worldwide. Yet, despite its success, misconceptions persist. Some believe Galaxy is only for genomics, lacks scalability, or is just a teaching tool. Others question its security, software quality, or relevance outside academia.
This paper tackles these myths head-on. It’s not just a defense of Galaxy: it’s a celebration of its versatility, maturity, and impact across disciplines.
The 10 Misconceptions (And Why They’re Wrong)
Let’s dive into the myths and the reality:
1. “It is only useful for genome scientists.”
Reality: Galaxy supports -omics and beyond. While it originated for genome analysis, its data-type-agnostic architecture enables broad applicability, from proteomics and metabolomics to ecology, climate science, and even astronomy. Galaxy’s flexibility allows tool developers to contribute tools for any domain: its datatype system supports over 700 formats, and more than 9,000 tools are available.
2. “It offers nothing to coders.”
Reality: Coders can write their own tools and make their analysis reproducible with Galaxy. Galaxy brings decades of accumulated expertise, allowing developers to create versioned, documented, and reproducible analyses. Tools are easily shared, and Galaxy’s API enables programmatic interaction, making it a powerful platform for both developers and researchers.
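As a rough illustration of what programmatic interaction can look like, here is a minimal sketch that uses only Python's standard library to prepare a request against Galaxy's REST API. The server URL and API key are placeholders, and the helper function names are made up for this post. The `/api/histories` endpoint and the `x-api-key` header reflect our understanding of the Galaxy API, so please check your own server's API documentation before relying on them.

```python
import json
import urllib.request


def build_histories_request(galaxy_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the current user's histories.

    Assumption: the server exposes /api/histories and accepts the API key
    in an "x-api-key" header.
    """
    url = galaxy_url.rstrip("/") + "/api/histories"
    return urllib.request.Request(url, headers={"x-api-key": api_key}, method="GET")


def list_history_names(galaxy_url: str, api_key: str) -> list[str]:
    """Send the request and return the name of each history.

    Assumption: the response body is a JSON list of objects with a "name" field.
    """
    request = build_histories_request(galaxy_url, api_key)
    with urllib.request.urlopen(request) as response:
        histories = json.load(response)
    return [history["name"] for history in histories]
```

Calling `list_history_names("https://galaxy.example.org", "YOUR_API_KEY")` with a real server URL and key would contact that server. In practice, many developers prefer a dedicated client library such as BioBlend over hand-built requests.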
3. “It does not scale to large and complex problems.”
Reality: Galaxy scales to global analyses! UseGalaxy.* instances offer massive computing power, with thousands of CPU cores, TBs of RAM, and PBs of storage. Galaxy’s ability to handle large datasets is proven by its use in COVID-19 research, where it analyzed over 500,000 samples in near real-time.
4. “It’s hard to use.”
Reality: Galaxy is easier to use than the alternatives! Its standardized environments, simple user interface, and extensive training materials make it accessible to users of all skill levels. The Galaxy community actively improves usability, ensuring that trainees can focus on science rather than technical hurdles.
5. “It is only useful for teaching, not research.”
Reality: Galaxy is widely used for high-impact studies and industry applications. It supports large-scale data analyses, from the Vertebrate Genomes Project to the Human Cell Atlas. Galaxy’s reproducibility and scalability make it a powerful tool for both education and cutting-edge research, as evidenced by its 20,000+ citations.
6. “It cannot be used on secure data.”
Reality: Galaxy is actively used in secure settings. It supports secure data analysis through features like Bring Your Own Compute (BYOC), encryption, and role-based access control. Galaxy is deployed in clinical and public health settings, ensuring compliance with data protection regulations.
7. “It is not suitable for industry.”
Reality: Galaxy is widely used in industry. Companies in biotech, pharma, and agritech leverage Galaxy for R&D and pipeline development. Its openness and reproducibility reduce vendor lock-in and accelerate innovation, making it a valuable tool for industry professionals.
8. “It is not suitable for advanced users.”
Reality: Galaxy is highly customizable and extensible. Advanced users can develop their own tools, integrate Galaxy with other platforms, and use its APIs for automation. Galaxy’s flexibility makes it suitable for both beginners and experts.
9. “It is not sustainable.”
Reality: Galaxy is a mature and sustainable platform. Its active global community, robust governance, and continuous development ensure long-term support and innovation. Galaxy’s open-source model fosters collaboration and shared responsibility.
10. “It is a black box.”
Reality: Galaxy prioritizes transparency. Every step of an analysis is documented, shareable, and reproducible. Users can inspect tools, workflows, and data provenance, ensuring full transparency and trust in the results.
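To make "inspect tools and workflows" a little more concrete, here is a small sketch that lists the tool and version recorded for each step of an exported workflow. It assumes the workflow was exported as a JSON document (a `.ga` file) with a top-level `steps` mapping whose entries carry `tool_id` and `tool_version` fields. The example workflow and its tool names are entirely hypothetical and exist only to show the idea.

```python
def summarize_workflow_steps(workflow: dict) -> list[tuple[int, str, str]]:
    """Return (step number, tool id, tool version) for every tool step.

    Assumption: `workflow` is the parsed JSON of an exported workflow,
    where "steps" maps step numbers (as strings) to step descriptions.
    Steps without a tool_id (for example, data inputs) are skipped.
    """
    rows = []
    for step_id, step in workflow.get("steps", {}).items():
        tool_id = step.get("tool_id")
        if tool_id is None:
            continue
        version = step.get("tool_version") or "unknown"
        rows.append((int(step_id), tool_id, version))
    return sorted(rows)


# Hypothetical, hand-written workflow used purely for illustration.
EXAMPLE_WORKFLOW = {
    "name": "Example workflow",
    "steps": {
        "0": {"type": "data_input", "tool_id": None, "tool_version": None},
        "2": {"type": "tool", "tool_id": "example_aligner", "tool_version": "2.3.1"},
        "1": {"type": "tool", "tool_id": "example_trimmer", "tool_version": "1.0.0"},
    },
}

for number, tool, version in summarize_workflow_steps(EXAMPLE_WORKFLOW):
    print(f"step {number}: {tool} (version {version})")
```

For a real exported file, you would load it with `json.load` and pass the resulting dictionary to the same function.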
Why This Matters for the Galaxy Community
This paper isn’t just about correcting the record; it’s about empowering users. By addressing these misconceptions, we hope to:
- Encourage researchers to explore Galaxy’s full potential.
- Reassure decision-makers about Galaxy’s robustness and security.
- Inspire developers to contribute to its growth.
- Foster collaboration across disciplines and sectors.
Galaxy is more than a platform; it’s a community-driven effort toward open, reproducible science. This paper is a testament to that spirit.
Acknowledgments
This paper and its impact would not have been possible without the entire Galaxy community, a vibrant, global network of researchers, developers, educators, and advocates who continuously push the boundaries of open, reproducible science. We are deeply grateful for your contributions, feedback, and passion.
We also acknowledge the founding vision and ongoing support of the Galaxy Project, which has grown from a small initiative into a cornerstone of accessible, scalable data analysis.
Special thanks go to:
- Andrew J. Page and Ross Lazarus for initial discussions;
- Anne Fouilloux, Simon Gladman, Sveinung Gundersen, Yvan Le Bras, Gareth Price, Jennifer Hillman-Jackson, Florian Heyl, Pavankumar Videm, Dave Clements, Jeremy Goecks, Bradley W. Langhorst, and Daniel Blankenberg for participating in the initial thought process;
- Aysam Guerler for historical Galaxy interface images;
- Nate Coraor and Catherine Bromhead for providing the active users and jobs data;
- Pratik Jagtap, Anton Nekrutenko, and Nikos Pechlivanis for conversations and contributions throughout the development of this manuscript.
Spread the Word: Poster Available!
To help share these insights, we’ve created a poster summarizing the 10 misconceptions. This poster was presented at the French Bioinformatics Conference (JOBIM 2025) and is now available for everyone to use. You can download, print, and share it as-is, or adapt it into a flyer to advocate for Galaxy in your own community. Let’s break those misconceptions together!