The Vertebrate Genomes Project (VGP) aims to create reference-quality genome assemblies of all vertebrate species on Earth. In Phase I of the project, reference genomes will be assembled for mammals, birds, reptiles, amphibians, and fish from 260 vertebrate orders. This ambitious collaboration brings together researchers from across the globe and uses several sequencing technologies, including long reads from Pacific Biosciences, optical maps from Bionano Genomics, Hi-C data generated with Arima-HiC, and linked reads from 10X Genomics.
Here we present the proposed analysis pipeline for VGP Phase I genome assemblies. This approach uses PacBio’s Falcon and Falcon Unzip for assembly, followed by multiple rounds of scaffolding with 10X Genomics linked reads, Bionano optical maps, and Hi-C data. Tools used include Scaff10X, Pilon, PBJelly, and PacBio’s Arrow algorithm. Several species have been assembled using this approach, including Anna’s hummingbird, the kakapo, and the Canada lynx. The required software tools are available in an easy-to-run form on the DNAnexus Platform, which removes dependency headaches and allows researchers to run a single, consistent analysis pipeline on all genome assemblies. Data on the platform can be easily shared with collaborators and with the public. As sequencing technologies evolve, further tools and data types can be incorporated into the pipeline to create chromosome-length, reference-quality assemblies.
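The stage order described above can be sketched as a small data structure. This is only an illustrative sketch, not the VGP or DNAnexus implementation: the `Stage` class, the `plan` function, and the data-type labels are hypothetical names introduced here, and only the tools that the text ties to a stage are filled in. Polishing and gap-filling steps are left out because the text does not say where they fall in the sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str      # hypothetical label for the stage
    tool: str      # tool named in the text, or "unspecified"
    requires: str  # hypothetical label for the input data type


# Order follows the text: assembly first, then three rounds of scaffolding.
PIPELINE = [
    Stage("contig assembly", "Falcon + Falcon Unzip", "pacbio_long_reads"),
    Stage("linked-read scaffolding", "Scaff10X", "10x_linked_reads"),
    Stage("optical-map scaffolding", "unspecified", "bionano_optical_maps"),
    Stage("Hi-C scaffolding", "unspecified", "hic_reads"),
]


def plan(available):
    """Return the stages that can run, in pipeline order, for the given data types."""
    if PIPELINE[0].requires not in available:
        raise ValueError("long reads are needed for the initial assembly")
    # Scaffolding rounds whose data type is missing are simply skipped.
    return [stage for stage in PIPELINE if stage.requires in available]


if __name__ == "__main__":
    data = {"pacbio_long_reads", "10x_linked_reads", "hic_reads"}
    for stage in plan(data):
        print(f"{stage.name}: {stage.tool}")
```

Keeping the stages in a plain ordered list reflects the closing point of the abstract: a new data type or tool can be added as one more entry without changing the rest of the plan.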
Learn how DNAnexus can accelerate your de novo assembly: http://go.dnanexus.com/de-novo-assembly