The August 2015 Galactic News!
Welcome to the August 2015 Galactic News, a summary of what is going on in the Galaxy community. It's been, and will be, a busy few months. If you have anything to include in the September News, please send it to [Galaxy Outreach](mailto:outreach AT galaxyproject DOT org).
Events
GCC2015 Report
The 2015 Galaxy Community Conference (GCC2015) was held 4-8 July 2015 in Norwich, UK, and was hosted by The Sainsbury Lab. By all measures, the meeting was a tremendous success.
GCC2015 registrations reached an all-time high, and GCC2015 featured more events than any previous GCC:
- A record 230 researchers signed up for one or more GCC2015 events:
  - 41 registered for the 2nd Annual GCC Coding Hackathon, held at TGAC, 4-5 July,
  - 17 registered for the first ever GCC Data Wrangling Hackathon, held simultaneously at TGAC,
  - 72 registered for the first ever (and sold out) Training Sunday on 5 July,
  - 178 registered for Training Day on 6 July,
  - 191 registered for the GCC Meeting, on 7-8 July.
  - 15 topics were offered in 18 sessions by 32 instructors over two days of training.
- The GCC2015 Meeting featured
  - 28 accepted and keynote talks (slides for most are online, videos are coming)
  - 34 posters (most are online)
  - 14 lightning talks (slides for most are online, videos are coming)
  - 10 Birds-of-a-feather meetups (writeups for most are available)
- There was a record amount of Twitter traffic at GCC2015. Peter Cock has created several Storify pages that captured the stream.
The GCC2015 Organising Committee would like to thank the Scientific Committee, Coding Hackathon Organisers, Data Wrangling Hackathon Organisers, Conference Sponsors, BoF Organisers, Training Instructors and absolutely everyone else who contributed to the success of GCC2015.
Which brings us to ...
GCC2016: June 25-29, 2016, Bloomington, Indiana, United States
We are pleased to announce that the 2016 Galaxy Community Conference (GCC2016) will be held June 25-29 at Indiana University in Bloomington, Indiana, United States.
GCC2016 is the 7th annual gathering of the Galaxy community. The conference will include keynotes, accepted talks, poster sessions, birds-of-a-feather meetups, exhibitors, and plenty of networking opportunities. There will also be three days of pre-conference activities, including hackathons and training. Mark your calendars now. Registration will open in the fall.
And, please help get the word out by posting the announcement at your organization.
If you work in data-intensive biomedical research, there is no better place than GCC2016 to present your work and to learn from others.
August GalaxyAdmins Meetup
Please join us for the next GalaxyAdmins meetup on August 20 when Aaron Petkau will cover
Genomic data management at Canada's National Microbiology Laboratory with IRIDA and Galaxy
GalaxyAdmins is a special interest group for Galaxy community members who are responsible for Galaxy installations.
July 2015 Pitagora-Galaxy Meetup
The Pitagora-Galaxy community in Japan had a meetup on 23 July 2015 at the National Institute of Informatics in Tokyo. The discussion is summarized here (in Japanese).
Galaxy at ISMB/ECCB and BOSC 2015
BOSC 2015 and ISMB/ECCB 2015 followed immediately after GCC2015 this year, and Galaxy again had a significant presence at both.
Slides, posters and videos for most presentations are now available on the Galaxy @ ISMB/ECCB and BOSC 2015 page.
Other Events
There are upcoming events in four countries on three continents. See the Galaxy Events Google Calendar for details on other events of interest to the community.
New Papers
143 new papers referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group in June and July. Highlights include:
- AGEseq: Analysis of Genome Editing by Sequencing, Xue & Tsai
- Metaproteomic analysis using the Galaxy framework, Jagtap et al.
- Darwintree: A Molecular Data Analysis and Application Environment for Phylogenetic Study, Meng et al.
- BioMaS: a modular pipeline for Bioinformatic analysis of Metagenomic AmpliconS, Fosso et al.
- xHeinz: an algorithm for mining cross-species network modules under a flexible conservation model, El-Kebir et al.
- The Plant Genome Integrative Explorer Resource: PlantGenIE.org, Sundell et al.
- Genome-wide analysis of signatures of selection in populations of African honey bees (Apis mellifera) using new web-based tools, Fuller et al.
- Enabling Data-Intensive Biomedical Studies, Simone Leo
- Enabling cloud bursting for life sciences within Galaxy, Afgan et al.
- Building and Provisioning Bioinformatics Environments on Public and Private Clouds, Afgan et al.
- LiSIs: An Online Scientific Workflow System for Virtual Screening, Kannas et al.
- RNA-Rocket: an RNA-Seq analysis resource for infectious disease research, Warren et al.
- Context influences on TALE–DNA binding revealed by quantitative profiling, Rogers et al.
- Compact graphical representation of phylogenetic data and metadata with GraPhlAn, Asnicar et al.
- ARC Control Tower: A flexible generic distributed job management framework, Nilsen et al.
The new papers were tagged with:
| # | Tag | # | Tag | # | Tag | # | Tag |
|---|---|---|---|---|---|---|---|
| 8 | Cloud | 2 | Project | 11 | Tools | 11 | UsePublic |
| 1 | HowTo | 6 | RefPublic | - | UseCloud | 1 | Visualization |
| 10 | IsGalaxy | 5 | Reproducibility | 6 | UseLocal | 44 | Workbench |
| 57 | Methods | 3 | Shared | 16 | UseMain | | |
NGS 101 Tutorial Videos are Now Online
The new Galaxy NGS 101 Tutorial was published this spring and introduces a wide range of topics related to analyzing next generation sequencing data with Galaxy. We are happy to announce that videos to accompany this tutorial are now online as well. The 20 videos cover:
- Uploading data from your computer
- Uploading data from FTP
- Uploading data from the Web
- Uploading data from EBI SRA
- Uploading data from NCBI SRA
- QC'ing with FastQC
- A quick intro to read mapping
- Mapping to YOUR reference
- Tweaking BAM
- Non-diploid variant calling with NVC
- Non-diploid variant calling with FreeBayes
- Looking at multiple datasets in IGV
- Calling variants with FreeBayes
- RNA-Seq: Mapping with TopHat
- RNA-Seq: Assembling and quantifying transcripts with CuffLinks
- RNA-Seq: Playing with CuffDiff output and Cummerbund to find differentially expressed transcripts
There are also 4 new videos that provide an overview of different sequencing platforms.
The videos join over 60 other videos on the Galaxy Project Vimeo Channel.
Genomic Data Science with Galaxy Launched on Coursera
A Genomic Data Science with Galaxy course was launched on the online teaching platform Coursera in July. The course is part of the Genomic Data Science Specialization. A new session starts every month from now through at least the end of 2015. Coursera courses can be taken for free, or you can pay the course fee and receive a certificate upon completion.
The video lectures for the Galaxy material are on Vimeo.
Who's Hiring
The Galaxy is expanding! Please help it grow.
- Assistant Computational Scientist at VIB, Gent, Belgium
- The Galaxy Project is hiring software engineers and post-docs
Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.
New Public Galaxy Servers
July was a banner month, with the addition of 5 new publicly accessible Galaxy servers, bringing the list of public servers to over 70.
- FingeRprinting Ontology of Genomic variations: FROG fingerprints capture genomic variations at different levels. See the FROG server description for more.
- AGEseq Galaxy @ AspenDB, from the Tsai Lab, Warnell School of Forestry and Natural Resources and Department of Genetics, University of Georgia.
- AGEseq (Analysis of Genome Editing by Sequencing) compares amplicon sequences with expected target sequences and finds insertion/deletion sites in the amplicon sequences. See the AGEseq server description for more.
- Taxonomic studies of environmental microbial communities. See the server description for more.
- MISSISSIPPI Server from The ARTbio bioinformatics facility of the Institut de Biologie Paris Seine based at the University Pierre & Marie Curie.
- RNA and small RNA sequencing dataset analysis as well as for epigenetics or metagenomics studies. See the server descriptions for more.
- SIFTED from Bulyk Lab, Division of Genetics in the Department of Medicine at Brigham & Women's Hospital and Harvard Medical School
- Specificity Inference For TAL-Effector Design (SIFTED) is a computational model for predicting the DNA-binding specificity of any transcription activator-like effector (TALE) protein.
Community Committers!
Galaxy now has a formal and open policy for managing the project source code, including how to add and remove committers. Anyone can issue pull requests and join in the conversation, but committers are trusted to decide how these pull requests are integrated and can participate in formal voting. Read more in Pull Request 314. Based on their frequent contributions, both in code written and in discussions, the Galaxy Project added three new committers as part of this process: Nicola Soranzo, Björn Grüning, and Helena Rasche.
Galaxy Community Hubs
Three new Community Log Board entries were added in July:
- Moving from MySQL to PostgreSQL by Hans-Rudolf Hotz
- BioJS2Galaxy – A step by step guide by Benedikt Rauscher and Benjamen White
- Exposing Galaxy reports via nginx in a production instance by Peter Briggs
The new Galaxy NGS 101 set of exercises and the Galaxy Coursera material were also added to the Teaching Resources Directory:
- Galaxy NGS 101 by Anton Nekrutenko
- Genomic Data Science with Galaxy on Coursera by James Taylor
Updated Galaxy Logos
You may have noticed in the past two months that an updated Galaxy logo has been slowly appearing on project web sites and presentations. The new logo was created by Petr Kadlec of Pura Design, and it is slowly making its way into all Galaxy Project logos. As this happens, they will be added to the Images/GalaxyLogos page.
Releases
BioBlend 0.6.0 and 0.6.1
BioBlend versions 0.6.0 and 0.6.1 were released in June and July. BioBlend is a Python library for interacting with CloudMan and the Galaxy API. (CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration.) From the release CHANGELOG:
- BioBlend.objects: Rename `ObjDatasetClient` abstract class to `ObjDatasetContainerClient`.
- BioBlend.objects: Add `ABCMeta` metaclass and `list()` method to `ObjClient`.
- BioBlend.objects: Add `io_details` and `link_details` parameters to `ObjToolClient.get()` method.
- Open port 8800 when launching cloud instances for use by NodeJS proxy for Galaxy IPython Interactive Environments.
- When launching cloud instances, propagate error messages back to the caller. The return types for methods `create_cm_security_group`, `create_key_pair` in `CloudManLauncher` class have changed as a result of this.
- Add support for Python >= 3.3.
- Add `get_library_permissions()` method to `LibraryClient`.
- Add `update_group()`, `get_group_users()`, `get_group_roles()`, `add_group_user()`, `add_group_role()`, `delete_group_user()` and `delete_group_role()` methods to `GroupsClient`.
- Add `full_details` parameter to `JobsClient.show_job()` (thanks to Rossano Atzeni).
- BioBlend.objects: add `ObjJobClient` and `Job` wrapper (thanks to Rossano Atzeni).
- BioBlend.objects: add check to verify that all tools in a workflow are installed on the Galaxy instance (thanks to Gianmauro Cuccuru).
- Remove several deprecated parameters: see commits 19e168f and 442ae98.
- Verify SSL certificates by default.
- Add documentation about the Tool Shed and properly link all the docs on ReadTheDocs.
- Solidify automated testing by using tox and flake8.
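As a rough illustration of two of the additions listed above (the `full_details` parameter on `JobsClient.show_job()` and `get_group_users()` on `GroupsClient`), here is a minimal sketch. The helper names `connect`, `job_summary`, and `group_user_emails` are hypothetical and not part of BioBlend, the URL and API key are placeholders, and the dictionary keys read from the responses (`id`, `state`, `tool_id`, `email`) are assumptions about what a given Galaxy server returns.

```python
def connect(url, api_key):
    """Create a Galaxy client. Requires the bioblend package to be installed."""
    # Imported lazily so the helpers below can be used with any object
    # that exposes the same `jobs` and `groups` clients.
    from bioblend.galaxy import GalaxyInstance
    return GalaxyInstance(url=url, key=api_key)


def job_summary(gi, job_id):
    """Return a few fields of a job (hypothetical helper).

    Passes full_details=True, the parameter added to JobsClient.show_job().
    The keys picked out here are assumed to be present in the response;
    missing keys come back as None.
    """
    job = gi.jobs.show_job(job_id, full_details=True)
    return {key: job.get(key) for key in ("id", "state", "tool_id")}


def group_user_emails(gi, group_id):
    """List the e-mail addresses of a group's members (hypothetical helper).

    Uses GroupsClient.get_group_users(); the "email" key is an assumption
    about the shape of each returned user record.
    """
    return [user.get("email") for user in gi.groups.get_group_users(group_id)]
```

With a real server this might be called as `job_summary(connect("https://galaxy.example.org", "YOUR_API_KEY"), "JOB_ID")`, where all three strings are placeholders to be replaced with your own values.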
Planemo 0.13.0 through 0.13.2
Planemo saw one major and several minor releases in the past two months. The most recent is v0.13.2. From the release history:
- Fix for `shed_init` producing non-standard type hints. Issue 243, f0610d7
- Fix tool linting for parameters that define an `argument` but not a `name`. Issue 245, aad1eed
- Many doc updates including a tutorial for developing tools in a test-driven fashion and instructions for using the planemo appliance through Kitematic (with Kitematic screenshots from Helena Rasche).
- If planemo cannot find a Galaxy root, it will now automatically fetch one (specifying `--galaxy_install` will still force a fetch). Pull Request 235
- Documentation has been updated to reflect that new and vastly improved Docker and Vagrant virtual appliances are now available, as well as a new VirtualBox OVA variant.
- Update linting for new tool XML features (including `detect_errors` and output collections). Issue 233, 334f2d4
- Fix `shed_test` help text. Issue 223
- Fix code typo (thanks to Nicola Soranzo). Pull Request 230
- Improvements to algorithm used to guess if an XML file is a tool XML file. Issue 231
- Fix configuration file handling bug. Issue 240
Planemo is a set of command-line utilities to assist in building tools for the Galaxy project.
Others
**May 2015 Galaxy Release (v 15.05)** Release Notes v 15.05
**Pulsar** Pulsar 0.5.0 was released in May. Pulsar is a Python server application that allows a Galaxy server to run jobs on remote systems (including Windows) without requiring a shared mounted file system. Unlike with traditional Galaxy job runners, input files, scripts, and config files may be transferred to the remote system, the job is executed, and the results are transferred back to the Galaxy server, eliminating the need for a shared file system.
**CloudMan** The most recent edition of CloudMan was released in August.
**blend4j v0.1.2** blend4j v0.1.2 was released in December 2014. blend4j is a JVM partial reimplementation of the Python library bioblend for interacting with Galaxy, CloudMan, and BioCloudCentral.
Other News
- Michael Crusoe created a checklist for Galaxy DataSource docs
- Michael Crusoe is also working on supporting the Open Science Framework as a data source in Galaxy.
- The main Galaxy Project repository on GitHub has over 100 forks! Thank you very much for all the contributions!
- Samuel Lampa wrote a blog post on Workflow tool makers: Allow defining data flow, not just task dependencies
- The semi-annual update of the Galaxy Project Statistics page is done.