In 2011, the first fully reconstructed ancient bacterial genome sequence was published—that of Yersinia pestis—which confirmed at least one of the etiological agents of the Black Death pandemic (16) and put to rest years of controversy that had dogged polymerase chain reaction (PCR)–based attempts to identify the pathogen in archaeological samples (41, 57, 58, 154, 155). Other ancient microbial genome sequences quickly followed, including those from other plague epidemics (14, 47, 158, 178, 192) and additional pathogens, such as Mycobacterium leprae (170), Mycobacterium tuberculosis (13), Tannerella forsythia (196), Brucella melitensis (85), and Helicobacter pylori (112). The key turning point was the availability of high-throughput sequencing (HTS), a transformative innovation in DNA sequencing (114, 115), and sequence capture enrichment methods (16, 20, 62, 75)—two techniques that increased both sample throughput and data output by orders of magnitude. These advancements revolutionized ancient DNA (aDNA) research and, more broadly, ushered in the era of genomics nearly overnight (92, 140).
However, with these technological advances come new challenges. Tools and techniques are needed to sort, evaluate, authenticate, and interpret the hundreds of millions of DNA sequences that have now become the standard output of genomics and paleogenomics laboratories alike. Numerous protocols, scripts, pipelines, and computational environments are available, as are a myriad of genetic and genomic databases, but the rapid proliferation of these tools has left many uncertain about which ones to use and when to use them. For example, the decision to use either alignment-based or alignment-free taxonomic classifiers can have a strong impact on microbial community reconstruction (106). Likewise, the choice of reference databases can greatly affect taxonomic assignment (149) and, consequently, the false positive and false negative rates of pathogen detection. Similar but nonequivalent choices in parameter settings can introduce systematic biases, leading to spurious sequence alignments and false claims, and failure to statistically account for both biological and taphonomic factors in the selection of appropriate analysis pipelines and statistical tests can result in inaccurate conclusions.
As the complexity of paleogenomic data analysis increases, standards and guidelines for best practices are required to ensure not only high-quality data generation, but also accurate and meaningful data interpretation. Numerous challenges face the growing field of microbial archaeology—some stemming from the way microbes reproduce and recombine during life, others shared with genomics more generally, and still others specific to ancient and degraded samples. Concerted effort will be required by the research community to identify and address these challenges in order to achieve a robust and established scientific discipline.
In March 2016, the Max Planck Institute for the Science of Human History hosted the first Standards, Precautions, and Advances in Ancient Metagenomics (SPAAM) conference in order to identify and discuss the challenges involved in analyzing ancient microbial metagenomic data. Here, we present the outcomes of this meeting and outline a series of precautions and best practices for the emerging field of microbial archaeology.
2. RESEARCH DIRECTIONS IN MICROBIAL ARCHAEOLOGY
Research directions within the growing field of microbial archaeology can generally be divided into two paths: pathogenomics and microbiome studies. The former focuses on understanding pathogen evolution and host-microbe interactions involved in disease states (138), whereas the latter focuses on understanding the diversity, structure, and function of endogenous microbial communities and their interactions with the host during both health and disease states (78, 79). In general, pathogenomics is concerned primarily with individual disease-causing microorganisms, such as those causing plague (Y. pestis), tuberculosis (M. tuberculosis), or leprosy (M. leprae), whereas microbiome studies focus more on the distribution and diversity of the microbes native to a given host and their role in host functions, such as digestion, immune system stimulation, and chronic inflammation.
There is a great deal of overlap between these two disciplines in practice, as pathogenomics may include polymicrobial infections (e.g., dental caries) or mixed coinfections (e.g., pneumonia and tuberculosis), and microbiome studies may focus on keystone taxa that disproportionately drive community behavior (e.g., Streptococcus mutans or Porphyromonas gingivalis). Additionally, both disciplines rely heavily on metagenomic sequence data, and thus many of their analytical tools are shared or similar.
2.1. The Growth of the Field
Microbial archaeology can trace its origins back several decades, and early research in the field focused on targeted PCR amplification of short specific loci, followed by electrophoretic characterization or Sanger DNA sequencing. Mycobacterial spoligotyping of skeletal lesions (208) and sequencing of amplified 16S ribosomal RNA (rRNA) gene clones from paleofeces (23) are characteristic of paleomicrobiology approaches in the pre-HTS era. However, these low-throughput techniques, which were adapted from protocols originally developed for clinical and ecological applications, have several drawbacks when applied to ancient and degraded samples from environmental contexts. First, targeted PCR typically requires long (>100 base pairs), well-preserved DNA templates, which are not characteristic of the vast majority of authentic aDNA fragments (64, 181); second, ancient samples typically require a large number (>35) of PCR cycles for successful target amplification, which makes this approach particularly sensitive to background and environmental contamination; third, cloning and Sanger sequencing do not allow efficient investigation of template damage patterns in order to authenticate aDNA sequences; fourth, targeted PCR is particularly susceptible to amplification biases, including both off-target and skewed PCR amplification, as well as taxonomic dropout; and finally, the experimental replicability of studies using these techniques is generally low, and the results have proven to be difficult to independently authenticate or validate (57, 64, 199, 207). Such problems reached a critical point in 2005, when a prominent review of aDNA research summed up the field of microbial archaeology as “the microbial problem” and largely dismissed it as a discipline (200).
The advent of HTS technologies in the mid-2000s presented a powerful solution to the inherent shortcomings of conventional PCR-based approaches, and this new technology has dramatically influenced the field of microbial archaeology. Today, nearly all ancient microbial research utilizes HTS-based techniques, and multiple sequencing platforms and analytical strategies are available. The situation mirrors that of genetic research on ancient humans, which at first was hampered by contamination concerns resulting from PCR amplification and Sanger sequencing–based approaches but is now flourishing in the post-HTS era (65, 157, 159).
2.2. Definition of Terms
This article focuses on the analysis of metagenomic (all available DNA) data obtained from HTS shotgun-sequenced (untargeted) or sequence-captured (target-enriched) genes and genomes obtained from a microbiota (an assemblage of microorganisms) present within a microbiome (a defined microbial ecosystem) (113). Archaeological samples typically contain mixtures of endogenous (antemortem) and exogenous (postmortem) microbial DNA that may include host-associated commensal taxa (e.g., oral microbes in dental calculus), epidemic pathogens (e.g., Y. pestis in the pulp cavity of teeth), and environmental bacteria (e.g., soil microbes involved in decomposition). Additionally, contaminating DNA sequences from handling (e.g., skin microbes), storage conditions (e.g., bacteria and fungi overgrowth), and laboratory sources (e.g., reagents contaminated with enzyme expression vectors) may also be present.
This addition of both ancient and modern exogenous microbial DNA in archaeological remains makes ancient pathogen and microbiome studies more complicated than investigations of fresh samples. For example, in contrast to a freshly cultured clinical specimen, which would typically contain a single clonal pathogen and no other major DNA sources, analysts of ancient pathogens must grapple with complex host and environmental backgrounds, potentially including nonpathogenic, soil-derived relatives of the pathogen of interest. Postmortem colonization and contamination also present challenges for microbiome analysis by skewing diversity metrics and inflating community membership. It is thus important to note the distinction between ancient endogenous microbiota, which are the host-associated microbes that were present during life, and exogenous microbiota, which include both decomposition-related and recent contaminant taxa.
3. WHAT IS A MICROBIAL SPECIES?
Before endogenous microbiota can be analyzed, it is first important to define what microbes are. For the purposes of this review, we define microbes as members of the prokaryotic domains Bacteria and Archaea. Microeukaryotes and viruses are thus beyond the scope of this review, even though noteworthy achievements have been made in the successful genetic characterization of potato late blight evolution (116, 205), barley stripe mosaic virus (176), Spanish influenza strains (189), early simian immunodeficiency virus (150), and seventeenth- and eighteenth-century smallpox strains (12, 42).
Although species annotations are routinely applied to microbial taxa, there is relatively little consensus on what a microbial species actually is (2, 40). Unlike Ernst Mayr's birds, microbes adhere to few, if any, of the tenets of the biological species concept (35, 72, 117), and although many microbial species concepts have been proposed, none have been widely accepted (2). This is largely because although microbes reproduce asexually by binary fission at a cellular level, they also exchange genetic information horizontally—including across taxonomically divergent groups. The discovery of such microbial mating systems earned the 1958 Nobel Prize in Physiology or Medicine for molecular biologist Joshua Lederberg (82), who, incidentally, later went on to popularize the term microbiome in 2001 (101).
At the heart of the microbial species problem is a tension between methods-based and methods-free species definitions, in part reflecting philosophical differences in the fields of microbial systematics and evolutionary biology. At a crude level, methods-based approaches are objectively measurable, but they suffer from the fact that methods continuously change with new technologies and that the species criteria that have been established are largely arbitrary. Methods-free definitions are more grounded in evolutionary theory but are often unmeasurable (2). In the genomics era, methods-based definitions currently prevail as pragmatic solutions to allow researchers to name and discuss taxonomic groups using Linnaean taxonomy, a stopgap measure that is both unsatisfying and at times misleading but is also necessary to allow investigations of what are essentially phenotypic and genetic clusters (29) of metapopulation lineages (2) that defy easy classification.
3.1. Pragmatic Definitions
For historical reasons, the gold standard of pragmatic solutions to the microbial species problem is the characterization of genome similarity based on reciprocal, pairwise DNA reassociation values under controlled conditions. Microbes whose reassociation values are ≥70% in DNA hybridization experiments are generally considered to belong to the same species, in part because this threshold generally recapitulates classical species distinctions based on phenotypic traits (2). Because the method is empirical and requires purified genomic DNA from both microbes being tested, it can be applied only to cultivable microbes. Given that only a small fraction of microbial taxa can currently be cultivated using known techniques (147, 193), this definition is poorly suited to the identification of most microbial species. Moreover, because of the highly fragmented nature of aDNA, this method cannot be applied to ancient samples.
Alternatively, the 16S rRNA gene can be PCR amplified from a pool of noncultured microbes, and the resulting sequence identities can be calculated as a proxy for DNA reassociation values. A cutoff of roughly 97–99% sequence identity for the full gene generally correlates with species boundaries determined by DNA reassociation (118, 190). Taxa defined by their 16S rRNA gene sequence alone are not described as species, but rather as operational taxonomic units (OTUs)—convenient measurable proxies for microbes that are related by descent. Although the term OTU is generally used to refer to a species-like unit, it can theoretically represent microbial biodiversity at any level as long as its definition is clear and consistent (77).
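The identity cutoff described above can be illustrated with a minimal Python sketch. It assumes the 16S rRNA gene sequences have already been aligned to a common length (with `-` marking gaps), and it uses a simple greedy, centroid-based grouping that is only one of several possible clustering strategies. The function names `percent_identity` and `greedy_otus` and the default 97% threshold are illustrative choices, not part of any published tool.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def greedy_otus(seqs: dict[str, str], threshold: float = 97.0) -> list[list[str]]:
    """Group sequences into OTUs by identity to the first member (centroid).

    Each sequence joins the first OTU whose centroid it matches at or above
    the threshold; otherwise it founds a new OTU. The result depends on the
    input order, which is a known property of greedy clustering.
    """
    otus: list[tuple[str, list[str]]] = []  # (centroid sequence, member names)
    for name, seq in seqs.items():
        for centroid, members in otus:
            if percent_identity(seq, centroid) >= threshold:
                members.append(name)
                break
        else:
            otus.append((seq, [name]))
    return [members for _, members in otus]
```

Calling `greedy_otus` with a threshold of 99.0 instead of 97.0 gives a stricter, more species-like grouping, which reflects the point that an OTU can represent diversity at any level as long as its definition is stated and applied consistently.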
Because 16S rRNA gene amplification and sequencing can be performed on mixed microbial communities without cultivation, it is a powerful method for the discovery of novel taxa; however, this method also has important limitations. Current short-read HTS technologies, such as Illumina sequencing by synthesis, do not allow for deep sequencing of the full ∼1,540-base-pair-long 16S rRNA gene; instead, maximum achievable read lengths typically limit analysis to one or more of the gene's nine shorter hypervariable regions. However, even these short regions are generally longer than most aDNA fragments (207). Taxonomic resolution varies across these regions (207), effectively reducing confident taxonomic assignment to the level of genus or family for many groups. This reduction in resolution is not consistent across microbial phyla and tends to disproportionately affect certain groups (207). Emerging technologies, such as Pacific Biosciences’ single-molecule real-time sequencing, are capable of sequencing full genes and may soon replace hypervariable-region short-read sequencing in metataxonomic studies of modern samples (167); however, the highly fragmented nature of aDNA strongly limits the benefit of this technology for ancient samples. Nevertheless, even with full-length 16S rRNA gene sequences, taxonomic assignment can be problematic for some microbial groups (Figure 1). For example, gut bacteria belonging to the family Enterobacteriaceae are generally poorly resolved by 16S rRNA sequences, with the clinically distinctive genera Escherichia, Salmonella, Shigella, and Klebsiella essentially forming one 16S rRNA gene sequence cluster (Figure 1a), whereas other groups, such as the oral genera Porphyromonas and Tannerella, are monophyletic and can be easily distinguished on the basis of 16S rRNA sequences alone (Figure 1b).
Finally, unlike most microbial genes, the number of 16S rRNA gene copies per genome is highly variable, ranging from 1 to 5 in archaea and from 1 to 15 or more in bacteria (3, 102). Microbial rRNA (rrn) genes are typically colocated into an operon, and operon copy number is associated with microbial habitat and lifestyle (102, 180). Operon copy number is only weakly correlated with taxonomic ranks of genus and higher, and in some cases copy number even varies within species (190). Among archaea, >60% of taxa have a single rrn operon, but among bacteria, >60% of taxa have three or more copies, and up to seven copies are commonly found (3). Although 16S rRNA gene copies may undergo homogenization through gene conversion (68), different sequences are observed within a single species and even within a single genome. Fewer than 40% of taxa with multiple 16S rRNA genes have identical 16S rRNA sequences in each operon, although sequence divergence between copies is typically <1% (3, 190). 16S rRNA gene reference databases generally do not take this into account, and instead contain composite or consensus sequences obtained from simultaneous PCR amplification and pooled sequencing of all 16S gene copies on a genome (90). The combined effects of multiple rrn operons per genome and different 16S rRNA gene sequences per operon result in systematic skewing of relative taxonomic abundance and overestimation of microbial diversity in mixed microbial communities (190). The Ribosomal RNA Operon Copy Number Database (rrnDB) maintains updated, annotated lists of rRNA operon copy numbers (180).
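One common way to reduce the abundance skew described above is to divide each taxon's 16S rRNA read count by its operon copy number before normalizing. The Python sketch below shows the arithmetic only. The function name is hypothetical, and in practice the copy numbers would have to be looked up in a resource such as rrnDB; none of the values used with it here are real measurements.

```python
def copy_number_corrected_abundance(
    read_counts: dict[str, float],
    copy_numbers: dict[str, float],
) -> dict[str, float]:
    """Convert 16S read counts to copy-number-corrected relative abundances.

    read_counts  : taxon -> number of 16S reads assigned to that taxon
    copy_numbers : taxon -> assumed number of 16S gene copies per genome
    """
    adjusted = {}
    for taxon, count in read_counts.items():
        copies = copy_numbers.get(taxon)
        if copies is None or copies <= 0:
            raise ValueError(f"no usable copy number for {taxon!r}")
        # Reads per gene copy approximates the number of genomes sampled.
        adjusted[taxon] = count / copies
    total = sum(adjusted.values())
    if total == 0:
        return {taxon: 0.0 for taxon in adjusted}
    return {taxon: value / total for taxon, value in adjusted.items()}
```

For two hypothetical taxa with 700 and 100 reads and 7 and 1 copies per genome, the uncorrected proportions would be 87.5% and 12.5%, whereas the corrected proportions are equal. This correction does not address sequence differences between operon copies, which inflate diversity estimates separately.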
Other widely used methods for defining species boundaries include multilocus sequence analysis and multilocus sequence typing, which are similar to the method described above but rely on a panel of usually seven to ten core genes rather than a single gene (60, 146), as well as genome-wide average nucleotide identity, which compares the sequences of all orthologous genes in the complete genomes of species pairs (87, 160). Average nucleotide identity is in some ways simply a methodological update of the DNA reassociation approach, in which a 95–96% average nucleotide identity is roughly equivalent to a 70% DNA reassociation value (87, 160). It therefore shares the same limitation: complete genomes are required, either from cultivated isolates or from genomes painstakingly assembled in silico from deep-sequenced shotgun or sequence-enriched metagenomic data sets. This makes the method difficult to apply in general, and especially in the context of microbial archaeology.
When assigning taxonomy to metagenomic data, many popular tools use a combination of core gene–focused and whole-genome data, and such an approach is favored in emerging taxonomic tools such as the Metagenomic Intra-Species Diversity Analysis System (MIDAS) pipeline (129), which seeks to characterize strain-level differences in mixed microbial communities. Departing from these approaches are those based on k-mer binning, such as algorithms implemented by Kraken (201), which differ from a gene-centered focus and instead mine taxonomic information from reference databases containing frequency distributions of short sequence fragments (k-mers) across a range of known taxa. Because both of these tools rely on relatively short DNA sequences for taxonomic classification, they are particularly amenable to studies of ancient microbes.
3.2. Complicating Factors
Although microbes reproduce asexually, they do not transmit genetic information in a strictly vertical manner. Microbes can—and frequently do—horizontally transfer genes, plasmids, transposons, and other genetic elements by a wide range of means, including transformation (uptake of DNA from the environment), conjugation (direct transfer of DNA between cells via a pilus), and transduction (transfer of DNA by viruses). Collectively, these processes are referred to as horizontal gene transfer or lateral gene transfer (132, 185), and the transferred DNA can subsequently gain enhanced permanence in the cell through homologous recombination or insertion into the host chromosome.
Although most horizontal gene transfer occurs between related taxa, DNA can also be transferred across higher taxonomic ranks, and even across domains (63). Horizontal gene transfer can also transcend time through the uptake of short, degraded aDNA fragments into living cells (136). Within the context of the microbiome, some bacterial members of a biofilm are prolific producers of extracellular DNA, which they use as a scaffold to anchor themselves in space (61, 67, 198). Given the close proximity and metabolic cooperation of diverse taxa within biofilms, such extracellular DNA serves as an important source of genetic material for horizontal gene transfer via transformation and is thought to be a major factor in the spread of virulence and antibiotic resistance genes within host-associated microbiota (144).
The fluidity by which microbes can acquire—and also lose—large portions of their genomes has no parallel among macroorganisms. Within a given microbial species, the number of genes frequently varies by as much as 20% across strains. For example, genome size among 17 strains of the periopathogen P. gingivalis ranges from 2.2 to 2.4 Mb, a difference of 8%, but the number of genes differs by 22%, ranging from 1,870 genes in strain F0569 to 2,405 genes in strain JCVI SC001. This is true even though these strains exhibit >99.4% sequence identity in the 16S rRNA gene and >98.8% sequence identity across a panel of 11 housekeeping genes (coa, dnaK, ef-tu, ftsQ, gdpxJ, hagB, mcmA, nah, pga, recA, and pepO) (93) [analysis performed on all complete or nearly complete (scaffolded) genomes available in GenBank as of November 2016; for details, see Supplemental Appendix 1]. By contrast, all members of a eukaryotic species carry a nearly identical gene set, and 75% of human genes have homologs in the genome of the puffer fish Takifugu rubripes, which diverged from mammals more than 450 Mya (8, 148).
To account for these vast differences in genome size and gene content among strains, the collective genomes of all members of a microbial species-level clade (a monophyletic group of related taxa) are conceptualized as having two parts: a core genome and a pan-genome. The pan-genome, a term first introduced in 2005, consists of all genes within all strains of a species-level clade (184), whereas the core genome represents a subset of genes that are generally shared among strains. The core genome is variably defined, but the National Center for Biotechnology Information defines it as comprising the genes that are present in >80% of all genomes within a species-level clade (183). By contrast, the minimum core genome is defined as the set of genes shared by all genomes within a species-level clade or equivalent. By either definition, the core genome comprises primarily housekeeping genes involved in replication, transcription, translation, and other basic cell functions required for life (118). In general, core genes of well-studied clades make up ∼70–80% of the pan-genome (183).
Despite being relatively central to cell function, core genes can undergo homologous recombination, a process known as core genome transfer. Core genes involved in transcription and translation, such as rRNAs, recombine only rarely (28, 89, 99), but recombination rates of other core genes can be high (202). Streptococcus is an important host-associated genus known for high levels of genome plasticity and recombination. In one study of streptococcal human and agricultural pathogens, core genome recombination was detected in all investigated streptococcal lineages, and 18–37% of the core genome was estimated to be recombinant (103). Core genome transfer rates vary considerably among taxa. H. pylori, Salmonella enterica, Streptococcus pneumoniae, Neisseria meningitidis, and Neisseria lactamica are human-associated bacteria with unusually high core genome recombination rates, in which nucleotide changes resulting from recombination exceed those arising from mutation by more than fivefold; by contrast, recombination rates in Staphylococcus aureus, Lactobacillus casei, Bartonella henselae, and Bordetella pertussis are fivefold lower than mutation rates (191). Core genome recombination occurs most frequently in taxa that are naturally competent (genetically capable of transformation), but it has also been documented in noncompetent cells at genetic loci in proximity to mobile elements (44, 145, 191).
Noncore genes of the pan-genome include many gene types that may be involved in adaptation to various nutrient sources or environmental conditions, and they may or may not be carried on mobile elements. Noncore genes that are found within >20% of strains are called accessory genes. Those found in 1–20% of strains and in <1% of strains are called dispensable and unique genes, respectively, and they are more common than accessory genes (122, 183).
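The frequency-based categories described above (core, accessory, dispensable, and unique) can be expressed as a small Python sketch operating on a gene presence table. The function names and the input layout are hypothetical, and the handling of values that fall exactly on a boundary (for example, a gene found in exactly 20% of strains being counted as dispensable) is an assumption, since the text does not specify it.

```python
def gene_category(n_with_gene: int, n_strains: int) -> str:
    """Categorize a gene by the share of strains that carry it.

    Integer arithmetic is used so boundary cases are not affected by
    floating-point rounding.
    """
    if n_strains <= 0 or not 0 < n_with_gene <= n_strains:
        raise ValueError("need 0 < n_with_gene <= n_strains")
    if n_with_gene * 100 > 80 * n_strains:
        return "core"          # present in >80% of genomes
    if n_with_gene * 100 > 20 * n_strains:
        return "accessory"     # noncore, present in >20% of strains
    if n_with_gene * 100 >= n_strains:
        return "dispensable"   # present in 1-20% of strains
    return "unique"            # present in <1% of strains


def partition_pan_genome(
    presence: dict[str, set[str]], n_strains: int
) -> dict[str, list[str]]:
    """Split a pan-genome (gene -> set of strains carrying it) into categories.

    Also reports the minimum core genome: genes carried by every strain.
    """
    result: dict[str, list[str]] = {
        "core": [], "accessory": [], "dispensable": [], "unique": [],
        "minimum_core": [],
    }
    for gene in sorted(presence):
        carriers = len(presence[gene])
        result[gene_category(carriers, n_strains)].append(gene)
        if carriers == n_strains:
            result["minimum_core"].append(gene)
    return result
```

Note that the "unique" category can only be populated when at least 101 strains have been compared, because a gene found in a single strain out of 100 or fewer is already present in at least 1% of them.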
The nonvertical transfer of DNA among microbes serves as a mechanism to increase genetic diversity beyond that introduced through mutation alone, and it plays a major role in microbial evolution (34, 132). This fundamental process, however, complicates attempts to define species boundaries and to trace the evolutionary history of microbial lineages, and it has led some to argue that no natural classification system can be described for microbes because their evolutionary relationships are web-like rather than tree-like (10, 39). However, not all taxa freely exchange genetic information (191), and not all genes transfer easily or frequently (202, 203). For example, monomorphic pathogens that reproduce primarily by clonal expansion show little evidence of recombination over broad timescales (1, 203), and core housekeeping genes that are informational in nature rarely transfer or recombine (99). Consequently, the ancient genomes of monomorphic pathogens, such as M. leprae and M. tuberculosis, are easier to reconstruct than those of commensal taxa, such as H. pylori or T. forsythia (13, 112, 170, 196). Despite the messiness of microbial phylogenies (66), however, microbes generally behave as ecologically coherent entities at the levels of species, genus, family, and order, as currently defined by 16S rRNA gene sequence cutoffs (148).
4. THE POWER AND PITFALLS OF NAMES
Names are powerful entities that allow microbial taxa to be discussed and analyzed in a meaningful way. However, given the heterogeneous phenotypic, genetic, genomic, and metagenomic means by which microbial taxa are detected and observed, it is difficult to devise a single nomenclature system. Instead, overlapping systems of both formal and provisional schemes are currently in use, which both facilitate and limit the study of individual microbes and communities, as well as the reconstruction of ancient microbial genomes and microbiota.
4.1. Valid Species Names and Microbial Systematics
Despite the difficulty of defining what a microbial species is, methods for granting valid microbial species names are outlined by the International Code of Nomenclature of Bacteria set forth by the International Committee on Systematics of Prokaryotes (ICSP; http://www.the-icsp.org) (98, 186). This code governs all microbial taxonomic assignments at and below the Linnaean rank of class (141); however, only the rank of species has a formal definition: “[A] species is a category that circumscribes a (preferably) genomically coherent group of individual isolates/strains sharing a high degree of similarity in (many) independent features, comparatively tested under highly standardized conditions” (179, p. 1044; see also 162). The ICSP requires all new taxa to be published in the International Journal of Systematic and Evolutionary Microbiology, and minimal standards for the description of new species have been established by ICSP subcommittees (51). These standards include (a) isolation of the new species in pure culture, (b) 16S rRNA gene sequencing to establish phylogenetic position, (c) morphological description, (d) chemotaxonomic characterization to establish genus affiliation, (e) explanation of the genotypic and phenotypic basis for species differentiation, and (f) deposition of the type strain in at least two permanently established culture collections in two different countries (84). Genome sequencing is not currently required for the establishment of new microbial species, nor is genome sequencing alone sufficient to establish a new species. The List of Prokaryotic Names with Standing in Nomenclature (LPSN; http://www.bacterio.net) maintains an updated list of valid taxa (141).
4.2. Naming the Nameless
Given the emphasis placed on structural and functional properties of microbial isolates, taxa that cannot be grown in pure culture—either because their growth conditions are unknown or because they are parasitic and require the presence of other microbes to grow—are typically limited to candidatus (candidate) status. For example, the candidate phylum Saccharibacteria (formerly TM7), which includes at least 12 members in the human oral cavity, has proven very difficult to isolate in pure culture (21). The only successfully cultivated phylotype to date, provisionally named TM7x, was determined to be an epibiont (an organism that lives on the surface of another organism) of the host bacterium Actinomyces odontolyticus, suggesting that oral Saccharibacteria may play an important role in bacterial predation in the oral cavity (70). However, the apparent parasitic lifestyle of such taxa precludes attempts to classify them using conventional systematics criteria. Similar challenges face other microbial groups that are resistant to isolation in pure culture, making them difficult to discuss and study (21, 50). Additionally, such a standard could never be applied to ancient microbes, effectively shutting the door to the possibility of discovering and naming extinct species.
As a consequence of the high bar set by the ICSP for obtaining a valid species name, comparatively few microbial species have been officially named and validated: 15,974 as of 2014 (141), compared with the >645,000 for which there is currently 16S rRNA gene sequence evidence (OTUs clustered at 99% in the SILVA SSU Ref NR 99 database, release 128; https://www.arb-silva.de) (152, 204). Most microbial taxa are therefore currently nameless but not necessarily unknown.
The challenge of how to devise a functioning nomenclature scheme for such a situation is clearly illustrated by the taxon table maintained by the Human Oral Microbiome Database (HOMD), a public scientific resource that curates an up-to-date list of human oral microbes (27). As of November 2016, the HOMD included 687 species-level oral taxa, of which 335 had both a valid species name and at least one sequenced genome, 36 had a valid species name and no sequenced genome, 88 had no valid species name but at least one sequenced genome, and 228 had no valid species name and no sequenced genome. The HOMD addressed this problem by developing a provisional naming scheme based on binning 16S rRNA gene sequences into unique phylotypes that are then assigned a Human Oral Taxon number. This number then allows phenotypic, phylogenetic, genomic, clinical, and other data types to be linked within the HOMD portal. This scheme, however, is limited to microbes of the human oral cavity and is not generalizable to other microbiomes, such as those of the human gut, soil, or ocean.
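The four HOMD categories reported above form a two-by-two table (valid name or not, sequenced genome or not). As a minimal sketch, the counts can be tabulated to check that they sum to the stated total and to read off the marginal totals; the variable and function names here are hypothetical, and only the counts come from the text.

```python
# Counts reported in the text for the HOMD taxon table (November 2016).
# Keys are (has_valid_name, has_sequenced_genome).
HOMD_COUNTS = {
    (True, True): 335,
    (True, False): 36,
    (False, True): 88,
    (False, False): 228,
}


def summarize(counts: dict[tuple[bool, bool], int]) -> dict[str, int]:
    """Return the total and the marginal totals of the two-by-two table."""
    total = sum(counts.values())
    named = sum(n for (has_name, _), n in counts.items() if has_name)
    with_genome = sum(n for (_, has_genome), n in counts.items() if has_genome)
    return {
        "total": total,
        "named": named,
        "unnamed": total - named,
        "with_genome": with_genome,
        "without_genome": total - with_genome,
    }


summary = summarize(HOMD_COUNTS)
for key, value in summary.items():
    # Print each count with its share of all species-level oral taxa.
    print(f"{key}: {value} ({value / summary['total']:.1%})")
```

The four categories sum to the reported 687 taxa: 371 have a valid species name and 316 do not, while 423 have at least one sequenced genome and 264 have none.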
Alternatively, as of March 2017, GenBank (11, 131) contains 14,022 sequenced microbial genomes and maintains a taxonomy common tree of 23,653 named and candidatus microbial species (993 archaea and 22,660 bacteria) that “does not follow a single taxonomic treatise but rather attempts to incorporate phylogenetic and taxonomic knowledge from a variety of sources” (127). By taking this pragmatic approach, GenBank is able to use a diverse range of existing phenotypic, genetic, and genomic microbial data in a common phylogenetic framework (165).
4.3. Taxonomy Versus Phylogeny
Although species names are practical entities that allow microbial taxa to be discussed and analyzed in a meaningful way, they can also be misleading. Ideally, taxonomy (microbial classification) should reflect phylogeny (evolutionary history), and species are periodically renamed to reflect improved understanding of phylogenetic relationships. However, there are also many well-known examples of named microbes for which the taxonomy is incongruent with phylogeny. In some cases, such discrepancies apply to clinically important taxa that differ from nonpathogenic taxa mainly because of horizontally transferred virulence factors that result in major phenotypic changes, as in the case of Yersinia pseudotuberculosis and Y. pestis (25). In other cases, genera that are clearly polyphyletic or paraphyletic, such as Klebsiella (Figure 1a), Clostridium, and Ruminococcus, persist despite repeated attempts at taxonomic reorganization (100, 133, 153). The ICSP has procedures for correcting such problems (98), and many taxa have been reclassified under this scheme. For example, Bacteroides forsythus