Journal article

Rapid protein alignment in the cloud: HAMOND combines fast DIAMOND alignments with Hadoop parallelism


Authors list: Yu, J; Blom, J; Sczyrba, A; Goesmann, A

Publication year: 2017

Pages: 58-60

Journal: Journal of Biotechnology

Volume number: 257

ISSN: 0168-1656

eISSN: 1873-4863

Open access status: Hybrid

DOI Link: https://doi.org/10.1016/j.jbiotec.2017.02.020

Publisher: Elsevier


Abstract
The introduction of next-generation sequencing has caused a steady increase in the amount of data that has to be processed in modern life science. Sequence alignment plays a key role in the analysis of sequencing data, for example in whole-genome sequencing or metagenome projects. BLAST is a commonly used alignment tool that was the standard approach for more than two decades, but in recent years faster alternatives have been proposed, including RapSearch, GHOSTX, and DIAMOND. Here we introduce HAMOND, an application that uses Apache Hadoop to parallelize DIAMOND computation in order to scale out the calculation of alignments. HAMOND is fault tolerant and scales by utilizing large cloud computing infrastructures such as Amazon Web Services. HAMOND has been tested in comparative genomics analyses and showed promising results in both efficiency and accuracy.
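
As a rough illustration of the scale-out idea described in the abstract, the sketch below partitions a set of FASTA query sequences into independent chunks that separate workers could align in parallel. It is not taken from HAMOND and makes no claim about how HAMOND splits its input; the function names `parse_fasta` and `split_queries` and the round-robin strategy are assumptions chosen for illustration only.

```python
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    seq_parts: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(seq_parts)
            header = line[1:]
            seq_parts = []
        elif header is not None:
            seq_parts.append(line)
    if header is not None:
        yield header, "".join(seq_parts)


def split_queries(text: str, n_chunks: int) -> List[str]:
    """Distribute FASTA records round-robin over at most n_chunks chunks.

    Hypothetical helper: each returned string is a self-contained FASTA
    fragment that one worker could align independently of the others.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks: List[List[str]] = [[] for _ in range(n_chunks)]
    for i, (header, seq) in enumerate(parse_fasta(text)):
        chunks[i % n_chunks].append(f">{header}\n{seq}\n")
    # Drop empty chunks when there are fewer records than chunks.
    return ["".join(c) for c in chunks if c]
```

Because each chunk contains whole records only, the per-chunk alignment results can simply be concatenated afterwards, which is what makes this kind of workload easy to distribute.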



Citation Styles

Harvard Citation style: Yu, J., Blom, J., Sczyrba, A. and Goesmann, A. (2017) Rapid protein alignment in the cloud: HAMOND combines fast DIAMOND alignments with Hadoop parallelism, Journal of Biotechnology, 257, pp. 58-60. https://doi.org/10.1016/j.jbiotec.2017.02.020

APA Citation style: Yu, J., Blom, J., Sczyrba, A., & Goesmann, A. (2017). Rapid protein alignment in the cloud: HAMOND combines fast DIAMOND alignments with Hadoop parallelism. Journal of Biotechnology, 257, 58-60. https://doi.org/10.1016/j.jbiotec.2017.02.020





Last updated on 2025-10-06 at 10:46