Release 226.0:
--------------

GTDB release R10-RS226 comprises 732,475 genomes organised into 143,614 species clusters. Additional statistics for this release are available on the GTDB Statistics page.

Release notes:
--------------

- After the curation cycle, we identified an updated spelling for one taxon and a valid name for one placeholder:
    g__Prometheoarchaeum (updated name: Promethearchaeum)
    f__MK-D1 (updated name: Promethearchaeaceae)
  Note that the LPSN linkouts point to the correct updated names. We encourage users to use the updated names as these will appear in the next release.
- The QC criteria for GTDB were modified to consider both CheckM v1 and v2 completeness and contamination estimates. To pass QC, a genome must have completeness >=50%, contamination <5%, and quality (completeness - 5*contamination) >=50% using both the CheckM v1 and v2 estimates. The exception is that a genome comprised of <10 contigs passes QC if these criteria are met by either CheckM v1 or v2.
- Mash is no longer used as a prefilter for establishing GTDB species clusters, as this was found to be unnecessary given the prefiltering provided internally by skani (Shaw et al., Nat Methods, 2023).
- The 20% most heterogeneous sites were removed from the archaeal MSA using alignment_pruner.pl (https://github.com/novigit/broCode/blob/master/alignment_pruner.pl).
- The GTDB taxonomy tree now provides links to Sandpiper (https://sandpiper.qut.edu.au) results, which provide information about the geographic and environmental distribution of a taxon.
- We thank Jan Mares for his assistance in curating the class Cyanobacteriia, Peter Golyshin for bringing Ferroplasma acidiphilum strain Y (GCF_002078355.1) to our attention, and Brian Kemish for providing IT support to the project.
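
For illustration, the QC rule stated in the release notes above can be sketched in Python. This is a minimal sketch and not the GTDB pipeline code: the names CheckMEstimate, meets_thresholds and passes_qc are hypothetical, and the sketch assumes that the <10 contig exception applies to a genome's contig count and that completeness and contamination are given as percentages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckMEstimate:
    """Completeness and contamination (both in percent) from one CheckM version."""
    completeness: float
    contamination: float


def meets_thresholds(est: CheckMEstimate) -> bool:
    """Apply the three stated criteria to a single CheckM estimate."""
    quality = est.completeness - 5 * est.contamination
    return est.completeness >= 50 and est.contamination < 5 and quality >= 50


def passes_qc(v1: CheckMEstimate, v2: CheckMEstimate, contig_count: int) -> bool:
    """Require both estimates to pass, or either one for genomes with <10 contigs."""
    ok_v1 = meets_thresholds(v1)
    ok_v2 = meets_thresholds(v2)
    if contig_count < 10:  # assumed reading of the stated exception
        return ok_v1 or ok_v2
    return ok_v1 and ok_v2


if __name__ == "__main__":
    v1 = CheckMEstimate(completeness=95.0, contamination=1.0)  # quality 90
    v2 = CheckMEstimate(completeness=60.0, contamination=4.0)  # quality 40
    print(passes_qc(v1, v2, contig_count=50))
    print(passes_qc(v1, v2, contig_count=5))
```

In this example the CheckM v2 estimate fails the quality criterion, so the genome is rejected when split across 50 contigs but accepted under the exception when it has only 5 contigs.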