Since we are "married" to microorganisms, we need Contig Binning to get to know them!

Time: Jun 15, 2019
Source: 爱生生命


Microorganisms are closely tied to human life. Beyond the many microorganisms we encounter in the environment, our own bodies are inseparable from them. By one estimate, the adult intestinal tract contains about 100 trillion microorganisms, which together carry roughly 100 times as many genes as the human genome and weigh about 1.5 kg (Jesse D, 2013).

As a result, K Ray once jokingly stated, "We are more like microbes than humans, and we are not just friends with our intestinal flora but closer: we have 'married' them" (K Ray, 2012).




Since we have "married" microorganisms, we need to understand them fully, and metagenomics is a powerful tool for studying the structure and function of microbial communities.

As the cost of genome sequencing falls, more researchers are using big data to analyze species diversity, identify functional genes and reconstruct metabolic pathways. Large metagenomic datasets can now also address a problem that traditional techniques cannot: the Contig Binning analysis method can recover the genomes of microorganisms that cannot be grown in pure culture.


Contig Binning works on assembled metagenomic sequences. Contigs with similar or identical composition are clustered into the same species to produce a draft genome assembly of a single bacterium, and the function of that genome is then analyzed with conventional bioinformatics methods. For microorganisms that are difficult to cultivate, draft genomes can therefore be obtained through Contig Binning.

The following article, published in recent years, describes the assembly of difficult-to-cultivate microorganisms by Contig Binning and may serve as a reference:
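To make "similar composition" concrete, here is a minimal Python sketch of one common composition signature, the tetranucleotide frequency profile, together with GC content. It is an illustration only, not the method of any particular binning tool: the random test contigs, the helper names and the choice of Euclidean distance are all assumptions made for this example.

```python
from itertools import product
import math
import random

# All 256 possible tetranucleotides, in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def tetra_profile(seq):
    """Relative frequency of each tetranucleotide in a sequence."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # skip windows containing N or other symbols
            counts[kmer] += 1
    total = sum(counts.values()) or 1
    return [counts[k] / total for k in KMERS]


def distance(p, q):
    """Euclidean distance between two composition profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def random_contig(length, gc, rng):
    """Hypothetical test data: a random sequence with a given GC fraction."""
    weights = [(1 - gc) / 2, gc / 2, gc / 2, (1 - gc) / 2]  # A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))


rng = random.Random(0)
a1 = random_contig(5000, 0.30, rng)  # two contigs from a low-GC "genome"
a2 = random_contig(5000, 0.30, rng)
b1 = random_contig(5000, 0.70, rng)  # one contig from a high-GC "genome"

d_same = distance(tetra_profile(a1), tetra_profile(a2))
d_diff = distance(tetra_profile(a1), tetra_profile(b1))
print("GC content:", round(gc_content(a1), 2), round(gc_content(b1), 2))
print("same source closer than different source:", d_same < d_diff)
```

Contigs drawn from the same simulated source lie much closer together in this profile space than contigs from different sources, which is the property that composition-based clustering relies on.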





Given that Contig Binning can produce draft genomes of microorganisms that are difficult to cultivate, how is such a study designed? In fact, both environmental and human-associated microorganisms can be studied by Contig Binning to obtain high-quality microbial genomes.

As early as 2012, researchers applied Contig Binning to intestinal microecological samples and obtained high-quality microbial genomes. Mads published an article entitled Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes in Nature Biotechnology, which laid out the following Contig Binning workflow:




Step 1: Sample preparation

Step 2: DNA extraction by two different methods (with and without hot phenol, HP+ and HP-), so that the same microbial populations have different relative abundances in the two extracts.

Step 3: PE 150 sequencing (HP+ and HP- yielded 29 G and 57 G of high-quality sequencing data, respectively)

Step 4: Metagenome assembly to obtain non-redundant scaffold sequences

Step 5-8: The abundance of each scaffold under the two extraction methods was calculated and normalized. Tetranucleotide frequency, GC content and conserved single-copy marker genes (such as the 16S rDNA sequence) were also determined as reference criteria for sequence clustering.

Step 9: The scaffold abundances obtained with the two extraction methods were used as the two axes of a coordinate system, and scaffolds were binned according to their abundances. Each bin represents a putative genome from the microbial community; in the plot, circle size represents sequence length and colors represent single-copy genes. Because scaffolds from different species may share the same abundance, the bins need to be refined by principal component analysis of tetranucleotide frequencies in order to separate different species.

Step 10: Screening repetitive elements or multi-copy genes from the sequencing data

Step 11-12: Collecting all reads associated with each bin, re-assembling the genome de novo, and obtaining a draft genome. The assembly can be validated with conserved single-copy genes, which helps identify misassemblies or other genome structure problems. Repeated sequences and representative sequences of microbial diversity can be visualized with Cytoscape.
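The abundance calculation and the two-axis binning of Steps 5 to 9 can be sketched in Python as follows. This is a simplified illustration rather than the authors' pipeline: the scaffold names, lengths and read counts are invented, "29G" and "57G" are assumed to mean gigabases, and the distance threshold and single-linkage grouping are choices made only for this example.

```python
import math

READ_LENGTH = 150  # PE 150 reads (Step 3)
TOTAL_BASES = {"HP+": 29e9, "HP-": 57e9}  # assumption: "29G"/"57G" are gigabases

# Hypothetical scaffolds: name -> (length in bp, reads mapped in HP+, reads mapped in HP-)
SCAFFOLDS = {
    "s1": (100000, 20000, 2000),
    "s2": (50000, 10500, 950),
    "s3": (80000, 1600, 16000),
    "s4": (40000, 760, 8400),
}


def coverage(mapped_reads, scaffold_length, read_length=READ_LENGTH):
    """Average sequencing depth of one scaffold."""
    return mapped_reads * read_length / scaffold_length


def normalised_log_coverage(length, reads, total_bases):
    """log10 of coverage per gigabase sequenced, so samples of different
    size are comparable. Assumes at least one mapped read."""
    return math.log10(coverage(reads, length) / (total_bases / 1e9))


def bin_by_coverage(points, max_dist=0.3):
    """Group scaffolds whose (HP+, HP-) log-coverage points lie within
    max_dist of each other (single-linkage grouping via union-find)."""
    names = list(points)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dx = points[a][0] - points[b][0]
            dy = points[a][1] - points[b][1]
            if math.hypot(dx, dy) <= max_dist:
                parent[find(a)] = find(b)

    bins = {}
    for n in names:
        bins.setdefault(find(n), []).append(n)
    return sorted(sorted(members) for members in bins.values())


points = {
    name: (
        normalised_log_coverage(length, r_plus, TOTAL_BASES["HP+"]),
        normalised_log_coverage(length, r_minus, TOTAL_BASES["HP-"]),
    )
    for name, (length, r_plus, r_minus) in SCAFFOLDS.items()
}
bins = bin_by_coverage(points)
print(bins)
```

In this toy data, s1 and s2 are abundant in HP+ and rare in HP-, while s3 and s4 show the opposite pattern, so the two pairs fall into separate bins even though nothing about their sequence composition was used.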

Through Contig Binning, 31 bacterial genomes were obtained, including genomes of species with a relative abundance below 1%, and the 31 bacterial species were classified at the phylum level.
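The validation with conserved single-copy genes mentioned in Step 11-12 rests on a simple idea: a gene expected exactly once per genome should appear once in a good bin. The Python sketch below shows that idea in its simplest form. The marker names and the two scores are illustrative assumptions, not the procedure of the paper or of any specific quality-assessment tool.

```python
from collections import Counter


def assess_bin(marker_hits, expected_markers):
    """Estimate completeness and contamination of a bin.

    marker_hits: marker gene names found in the bin, one entry per copy.
    expected_markers: set of genes expected exactly once per genome.
    """
    counts = Counter(h for h in marker_hits if h in expected_markers)
    found = len(counts)                                  # distinct markers present
    extra_copies = sum(c - 1 for c in counts.values())   # copies beyond the first
    completeness = found / len(expected_markers)
    contamination = extra_copies / len(expected_markers)
    return completeness, contamination


# Hypothetical marker set and hits for one bin.
expected = {"rpoB", "recA", "gyrB", "rpsC", "rplB"}
hits = ["rpoB", "recA", "gyrB", "rpsC", "rpoB"]

completeness, contamination = assess_bin(hits, expected)
print(completeness, contamination)
```

Here four of the five expected markers are present and one of them occurs twice, so the bin looks incomplete and possibly mixed with sequence from a second organism.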



Similarly, Mohamed used metagenomic data to study microbial communities at eight locations in the Red Sea. Through sequence clustering and Contig Binning, 136 microbial genomes were obtained (Mohamed et al., 2016). Unlike Mads, Mohamed mainly used tetranucleotide frequency and the relative abundance of sequences as the reference indicators for Contig Binning clustering. Specific analysis strategies are as follows:


Mohamed then classified the 136 assembled microbial genomes. For archaea, the authors constructed a phylogenetic tree based on 122 single-copy marker genes. For bacteria, the classification was based on 120 single-copy marker genes. The results are as follows:



Because Contig Binning lets researchers explore hard-to-cultivate microbial communities in their samples, many researchers have worked on Contig Binning methods since the strategy first appeared.

In a review published in the journal Microbiome in 2016, Contig Binning strategies are classified into three categories: nucleotide composition (NC), nucleotide composition and abundance (NCA), and differential abundance (DA). Each method has its own advantages and disadvantages (Naseer Sangwan, 2016). Combining nucleotide composition with sequence abundance is considered a good strategy, which not only ensures the quality of Contig Binning but also saves computing resources. Specific information is shown in the following table:



In addition, this review summarizes the software available for Contig Binning and describes its functions. MetaBAT is the Contig Binning software used by Mohamed in A Catalogue of 136 Microbial Draft Genomes from Red Sea Metagenomes:

The review also summarizes the overall approach of Contig Binning. On close reading, the ideas summarized in this review coincide with those advocated by Mads in Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Both advocate Contig Binning based on sequence abundance and nucleotide composition analysis (nucleotide frequency, GC content, essential single-copy genes). Specific ideas can be referred to in the following figure:


It should be noted that the choice of Contig Binning strategy differs subtly between studies, sample sizes and sample types. We therefore need to choose carefully when using Contig Binning to assemble the genomes of microorganisms that cannot be cultured.
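One way to see why the choice of strategy matters is to look at what each of the three categories from the review (NC, NCA, DA) would cluster on. The Python sketch below is only a schematic: the dictionary layout, the function names and the toy numbers are assumptions for illustration, and real tools scale and weight these features in their own ways.

```python
import math


def features(contig, strategy):
    """Build the feature vector a binning method would cluster on.

    contig: dict with "composition" (e.g. k-mer frequencies) and
            "abundance" (coverage in each sample).
    strategy: "NC" (composition only), "DA" (abundance across samples
              only) or "NCA" (both).
    """
    if strategy == "NC":
        return list(contig["composition"])
    if strategy == "DA":
        return list(contig["abundance"])
    if strategy == "NCA":
        return list(contig["composition"]) + list(contig["abundance"])
    raise ValueError(f"unknown strategy: {strategy}")


def distance(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Two hypothetical contigs with identical composition but opposite
# abundance patterns across two samples.
contig_x = {"composition": [0.25, 0.25, 0.25, 0.25], "abundance": [30.0, 3.0]}
contig_y = {"composition": [0.25, 0.25, 0.25, 0.25], "abundance": [3.0, 30.0]}

for strategy in ("NC", "DA", "NCA"):
    d = distance(features(contig_x, strategy), features(contig_y, strategy))
    print(strategy, round(d, 2))
```

With composition alone the two contigs are indistinguishable, while the abundance-based and combined feature sets separate them. The reverse case, equal abundance but different composition, favors the composition-based features, which is why the best choice depends on the samples at hand.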

References:

[1] Jesse D. Aitken and Andrew T. Gewirtz. Toward understanding and manipulating the microbiome to treat intestinal disease. 2013

[2] K Ray. Gut microbiota: married to our gut microbiota. Nature Reviews Gastroenterology & Hepatology. 2012

[3] Mads Albertsen et al. Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nature Biotechnology, 2012

[4] Mohamed et al. A catalogue of 136 microbial draft genomes from Red Sea metagenomes. Scientific Data, 2016

[5] Naseer Sangwan. Recovering complete and draft population genomes from metagenome datasets. Microbiome, 2016



(This article is reproduced from Wood Bug)