December 2012 Galaxy Update
Welcome to the December 2012 edition of Galaxy Update, a monthly summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
New Papers
These papers may be of interest to the Galaxy community:
- Enis Afgan, Brad Chapman and James Taylor, "CloudMan as a platform for tool, data, and analysis distribution." BMC Bioinformatics 2012, 13:315
- Jeremy Goecks, Nate Coraor, The Galaxy Team, Anton Nekrutenko & James Taylor, "NGS analyses by visualization with Trackster." Nature Biotechnology 30, 1036–1039 (2012)
- Samantha Baldwin, Roopashree Revanna, Susan Thomson, et al., "A Toolkit for bulk PCR-based marker design from next-generation sequence data: application for development of a framework linkage map in bulb onion (Allium cepa L.)," BMC Genomics, Vol. 13, No. 1. (2012), 637
- Jeremy C. Morgan, Robert W. Chapman, Paul E. Anderson, "A next generation sequence processing and analysis platform with integrated cloud-storage and high performance computing resources." Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
- Bo Liu, Borja Sotomayor, Ravi Madduri, Kyle Chard, "[Deploying Bioinformatics Workflows on Clouds with Galaxy and Globus Provision](http://bit.ly/UPBbRI)." Third International Workshop on Data Intensive Computing in the Clouds (DataCloud 2012)
These papers were among 37 papers added to the Galaxy CiteULike group since the last Galaxy Update.
New Public Galaxy Servers
- Andromeda: The Netherlands Bioinformatics Centre (NBIC) and BiG Grid/SARA have launched a public and fully populated Galaxy instance that runs on a cloud infrastructure. See the announcement for more details.
- RepeatExplorer: Features utilities for graph-based clustering and characterization of repetitive sequences in next-generation sequencing data, and tools for the detection of transposable element protein coding domains. RepeatExplorer is hosted by the Laboratory of Molecular Cytogenetics, Institute of Plant Molecular Biology, Biology Centre ASCR.
See the list of public Galaxy servers for more open and accessible Galaxy instances.
Who's Hiring
The Galaxy is expanding! Please help it grow.
- Research Fellow (Molecular Biologist) @ MRC Human Genetics Unit at the MRC Institute of Genetics and Molecular Medicine, Edinburgh, UK
- M2 PRO internship offer @ Mathématique, Informatique et Génome (MIG), a unit of INRA at Jouy-en-Josas
- Engineer position in bioinformatics: structural polymorphism analysis from NGS data @ UMR de Génétique Végétale, INRA-Université Paris Sud-CNRS
- R/Bioconductor and Genomics Expert @ the Friedrich Miescher Institute, affiliated with the University of Basel.
- Computational Biologist @ the Harvard Stem Cell Institute's (HSCI) Center for Stem Cell Bioinformatics
- The Galaxy Project is hiring post-docs @ Penn State and Emory
Got a Galaxy-related opening? Send it to outreach@galaxyproject.org and we'll put it in the Galaxy News feed and include it in next month's update.
Upcoming Events and Deadlines
See the Galaxy Events Google Calendar for details on these and other events.
Events
Deadlines
We only know of one deadline coming up in December:
- Genomic Standards Council Meeting (GSC 15) (Dec 20: Deadline for submission of abstracts)
However, there are a few high-profile meetings:
Source Code Documentation
The Galaxy Project is now using Sphinx hosted at Read the Docs to document the Galaxy code base. Two versions are available:
- galaxy-dist: Code documentation for latest official release of Galaxy.
- galaxy-central: Code documentation for the latest committed version of the code. This should never lag behind the Bitbucket checkins by more than an hour.
This documentation is a work in progress and should improve incrementally with each release.
New Galaxy Distributions
new: $ hg clone http://www.bx.psu.edu/hg/galaxy galaxy-dist
upgrade: $ hg pull -u -r 5dcbbdfe1087
New Galaxy CloudMan Release
CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration.
This update brings a large number of updates and new features, the most prominent ones being:
- Support for Eucalyptus cloud middleware. Thanks to Alex Richter. Also, CloudMan can now run on the HPcloud in basic mode (note that no public image is currently available on the HPcloud, so you would need to build one yourself).
- Added a new file system management interface on the CloudMan Admin page, allowing control and providing insight into each available file system
- Added quite a few new user data options. See the UserData page for details. Thanks to John Chilton.
- Galaxy can now be run in multi-process mode. Thanks to John Chilton.
- Added Galaxy Reports app as a CloudMan service. Thanks to John Chilton.
- Introduced a new format for cluster configuration persistence, allowing more flexibility in how services are maintained
- Added a new file system service for an instance's transient storage, allowing it to be used across the cluster over NFS. The file system is available at /mnt/transient_nfs. Note that any data stored there will not be preserved after the cluster is terminated.
- Support for Ubuntu 12.10
- Worker instances are now also SGE submit hosts
This update comes as a result of 175 code changesets; for a complete list of changes, see the commit messages.
Any new cluster will automatically start using this version of CloudMan. Existing clusters will be given an option to do an automatic update once the main interface page is refreshed.
Tool Shed Contributions
Several new repositories were added to, and existing repositories updated in, the Galaxy Tool Shed in the past month.
- bsmap: A short read mapping tool for bisulfite sequencing reads
- bowtie_wrappers, bowtie_color_wrappers, lastz, lastz_paired_ends: Wrappers all created by the Galaxy Team
- semweb_tools: A collection of Semantic Web tools, including a (pure Python) SPARQL-to-tabular format tool
- ea_utils: FASTQ processing utilities; sam-stats added
- regex_replace: Regular expression replacement using Python
- blast_datatypes and ncbi_blast_plus: Added BLAST database support
- homer: Motif discovery and next generation sequencing analysis
- meme_chip: Motif based sequence analysis
- SVDetect: Detect genomic structural variations from paired-end and mate-pair sequencing data
- cloudmap_in_silico_complementation: Perform in silico complementation analysis on multiple tabular snpEff output files
- sniploid: Compares SNPs detected from a polyploid to those derived from its parental genomes
- tabular_label_convert: Takes a tabular format file of numerical values and converts the labels of the rows or columns using an alias map
- tabular_edit: Edit the contents and row/column labels of a tabular file using Python statements
Other News
- GCC2013 is coming. Help get the word out!
- Training Day Topic Nominations for GCC2013 will open in December. Please start thinking of ideas now.
- Slides and Screencast from the November GalaxyAdmins Meetup are online. The next GalaxyAdmins Meetup will be on January 16 and will feature John Chilton discussing "Deploying Galaxy on OpenStack with CloudBioLinux & CloudMan".
- New Tutorial: Analysis of ChIP-seq data in Galaxy from the BaRC, Whitehead Institute
- A short "Getting started with JGalaxy" document (with screenshots), by John Chilton
- 生物人的電腦教室:高通量定序分析一次搞定 ("A computer classroom for biologists: high-throughput sequencing analysis in one go"), including Galaxy, by Eric Lee
- Batch Workflow starting using the Galaxy API : Practical Example by Geert Vandeweyer