This directory contains datasets for the 7,903 UBA MAGs described in:

Parks DH, et al. 2017. Recovery of nearly 8,000 metagenome-assembled genomes substantially expands the tree of life. Nat Microbiol, doi:10.1038/s41564-017-0012-7
https://www.nature.com/articles/s41564-017-0012-7

uba_ar_genomes.tar.gz:
  The 623 archaeal UBA genomes. These genomes have been submitted to NCBI and should appear in INSDC repositories shortly.

uba_bac_genomes.tar.gz:
  The 7,280 bacterial UBA genomes. These genomes have been submitted to NCBI and should appear in INSDC repositories shortly.

uba_assemblies.tar.gz:
  CLC assemblies of the 1,550 SRA metagenomes along with coverage information.

uba_assemblies.part-*:
  Same as above, split into 10 GB files.

uba_ar_prokka.tar.gz:
  Gene prediction and annotation for the 623 archaeal genomes using Prokka v1.12 with Pfam v31 and UniProt databases created on April 17, 2017 according to the Prokka instructions.

uba_bac_prokka.tar.gz:
  Gene prediction and annotation for the 7,280 bacterial genomes using Prokka v1.12 with Pfam v31 and UniProt databases created on April 17, 2017 according to the Prokka instructions.

uba_bac_prokka.part-*:
  Same as above, split into 10 GB files.

uba_ssu.tar.gz:
  16S rRNA gene sequences identified in the 7,903 UBA genomes using the 'ssu_finder' feature of CheckM.

The files provided as multiple parts can be uncompressed by concatenating the parts and piping the result to tar, for example:
> cat uba_bac_prokka.part-* | tar xz

If you make use of the Prokka annotations please cite:
Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics, 30, 2068-9.
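
The multi-part extraction described above can be wrapped in a small shell function that checks the parts are present before extracting. This is only a sketch: the function name extract_parts and the output directory argument are illustrative and not part of this release, and it assumes the part suffixes sort in the order in which the archive was split.

```shell
#!/bin/sh
# Sketch: reassemble a split .tar.gz archive and extract it into a directory.
# extract_parts is a hypothetical helper name, not something shipped with the data.
extract_parts() {
    prefix=$1        # common prefix of the parts, e.g. uba_bac_prokka.part-
    dest=${2:-.}     # output directory (defaults to the current directory)

    # Expand the glob into the positional parameters and fail early
    # if no part files are found.
    set -- "$prefix"*
    if [ ! -e "$1" ]; then
        echo "no files match ${prefix}*" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # The shell expands the glob in sorted order, so the parts are
    # concatenated in sequence (assuming their suffixes sort correctly).
    cat "$@" | tar -xzf - -C "$dest"
}
```

For example, "extract_parts uba_bac_prokka.part- prokka_out" would unpack the bacterial Prokka annotations into a prokka_out directory, and "extract_parts uba_assemblies.part- assemblies_out" would do the same for the assemblies.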