The person-to-person transmission landscape of the gut and oral microbiomes

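  A total of 9,715 samples from 31 human metagenomic datasets (total: 5.17 × 10¹¹ reads; average: 5.32 × 10⁷ reads per sample) with metadata enabling assessment of microbiome transmission between healthy mothers and offspring, households, twin pairs, villages and populations (that is, cohabitation information) were selected for inclusion in this study (Supplementary Tables 1 and 2). We also included publicly available stool shotgun metagenomic datasets with samples from at least 15 healthy individuals who had not undergone interventions (such as antibiotic or drug treatments or specific diets) and with at least 2 samples taken no more than 6 months apart, in order to assess within-subject strain retention and to set species-specific operational definitions of strain. Twenty-five datasets were publicly available; some of these were expanded in the current study, by […] (FerrettiP_2018; ref. 9), 32 (Ghana dataset; ref. 34) and 61 (Tanzania dataset; ref. 34) samples. The newly included samples were collected and processed following the protocols described in the original publications. In addition, eight datasets (total: 2,800 samples) were newly collected and sequenced in the context of this study, as described below, using similar methods (although differences in sample processing, DNA extraction and sequencing library preparation do not directly affect the phylogenetic distances that we use to infer strain sharing).

  Metadata on sample and subject identifiers, time point, participant age, sex, delivery mode (vaginal or caesarean section), household identifier, family relationship, twin zygosity and twin separation, village and country were retrieved from curatedMetagenomicData 3.0.0 (ref. 61) where the dataset was included in that resource, and from the materials of the original studies otherwise. Metadata for all metagenomes (including the newly sequenced samples) were curated and organized in the curatedMetagenomicData format and are available in Supplementary Table 2. Partners were defined as couples sharing a household. Populations were classified on the basis of their westernization status (westernized or non-westernized), considered as the adoption of a westernized lifestyle and not in geographical terms, and defined as intake of diets typically rich in highly processed foods (with high fat content, low in complex carbohydrates and rich in refined sugars and salt), access to healthcare and pharmaceutical products, hygiene and sanitation conditions, reduced exposure to livestock, and increased population density. This classification was based on the information available on the populations included in the study with regard to the above criteria, and on how the samples were reported in the original publications. Although we acknowledge that this binary classification has clear limitations (ref. 62), it provides insight into the association between person-to-person microbiome transmission and host lifestyle.

  A total of 14 mothers (16–37 years old) and their infants under 1 year of age from rural areas of Argentina (the villages of Villa Minetti, Esteban Rams, Pozo Borrado, Las Arenas, Cuatro Bocas, Logroño, Montefiore and Belgrano; Santa Fe) were enrolled in the study. DNA was extracted from stool samples with the QIAamp DNA Stool kit (Qiagen) following the manufacturer's instructions. Sequencing libraries were prepared with the Nextera DNA Flex Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform following the manufacturer's protocols.

  A total of 12 mothers (15–40 years old) and their 12 infants under 6 months of age from Wayuu ethnic communities in the Caribbean region of Colombia (Etkishimana, Koustshachon, Paraiso, Invasión, Tocomana, Warruptamana and Wayawikat; considered here as a non-westernized population) were enrolled in the study. DNA was extracted from stool samples with a MasterPure DNA extraction kit (Epicentre) following the manufacturer's instructions, with the following modifications: samples were treated with lysozyme (20 mg ml⁻¹) and mutanolysin (5 U ml⁻¹) at 37 °C, together with a 1-min bead-beating step in a FastPrep 24-5G homogenizer (MP Biomedicals). DNA was purified with a DNA purification kit (Macherey-Nagel) according to the manufacturer's instructions. DNA concentration was measured with a Qubit 2.0 fluorometer (Life Technologies) for further analysis. Sequencing libraries were prepared with the Nextera DNA Flex Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform following the manufacturer's protocols.

  In Qidong (Jiangsu province, China; considered here as a westernized population), a total of 116 nonagenarians and centenarians (97 females and 19 males; 94–105 years old) and 231 offspring (79 females and 152 males; 50–85 years old) were enrolled. At enrolment, none of the participants had major diseases. Fresh stool samples were collected at Shanghai Tenth Hospital and stored at −20 °C after collection. DNA was extracted with the E.Z.N.A. Stool DNA Kit (Omega Bio-Tek) following the manufacturer's instructions. DNA integrity and size were assessed by 1% agarose gel electrophoresis, and DNA concentration was determined with a NanoDrop (Thermo Fisher Scientific). DNA libraries were constructed following the TruSeq DNA Sample Prep v2 guide (Illumina), with 2 μg of genomic DNA and an average insert size of 500 bp. Library quality was evaluated with a DNA LabChip 1000 kit (Agilent Technologies). Sequencing was performed on the Illumina HiSeq 4000 platform with 150-bp paired-end reads.

  A total of 8 mothers and 19 infants under 1 year of age from a rural population in Bin County (north-western China) were enrolled as part of a larger study (ClinicalTrials.gov NCT02537392); they are considered here as a non-westernized population. DNA was extracted with the QIAamp Fast DNA Stool Mini Kit (Qiagen) and precipitated with ethanol. Sequencing libraries were prepared with the Nextera DNA Flex Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform following the manufacturer's protocols.

  Samples from 342 volunteers (0–85 years old) from 74 households on Bubaque Island (Bijagos Archipelago, Guinea-Bissau), considered here as a non-westernized population, were collected and had their DNA extracted as part of a previous study. In brief, samples were frozen at −20 °C at the reference laboratory. After homogenization and washing, DNA was extracted with the DNeasy PowerSoil Pro Kit (Qiagen) with custom modifications (ref. 64). Sequencing libraries were prepared with the Nextera DNA Flex Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform following the manufacturer's protocols.

  A total of 4 mothers (37–46 years old) and their 8 children (0–2 years old) were enrolled at Santa Chiara Hospital in Trento, Italy; they are considered here as a westernized population. Maternal stool samples were collected by hospital staff using faecal material collection tubes (Sarstedt). Infant stool samples were collected by the mothers, frozen at −20 °C after collection and moved to a −80 °C facility within one week. A total of 48 samples were collected (Supplementary Table 2). DNA was extracted with the PowerSoil DNA Isolation Kit (MoBio Laboratories) as described in the HMP protocol (Human Microbiome Project Consortium), with the addition of a preliminary heating step (65 °C for 10 min, then 95 °C for 10 min). DNA was recovered in 10 mM Tris pH 7.4 and quantified with a Qubit 2.0 fluorometer (Thermo Fisher Scientific) according to the manufacturer's instructions. Sequencing libraries were prepared with the Nextera XT DNA Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina HiSeq 2500 platform.
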
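  As part of a larger study, a total of 19 mothers (30–47 years old) and 37 healthy children (0–11 years old) were enrolled at the IRCCS Istituto Giannina Gaslini in Genoa, Italy; they are considered here as a westernized population. Stool samples were collected in DNA/RNA Shield faecal collection tubes (Zymo Research) and stored at −80 °C until DNA extraction. DNA was extracted with the DNeasy PowerSoil Pro Kit (Qiagen) according to the manufacturer's procedures. DNA concentration was measured with a NanoDrop spectrophotometer (Thermo Fisher Scientific), and the DNA was stored at −20 °C. Sequencing libraries were prepared with the Nextera XT DNA Library Preparation Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform following the manufacturer's protocols.

  A total of 1,929 saliva samples from 646 families were obtained from the New York Genome Center (Western IRB (https://www.wcgirb.com/), protocol tracking number WIRB20151664) and are considered here as a westernized population. They include samples from 640 fathers (22–55 years old), […] and 658 typically developing offspring (0–18 years old). Saliva was collected with the OGD-500 kit (DNA Genotek), and DNA was extracted with a Chemagic MSM1/360 DNA extraction instrument and eluted into 110 µl of TE buffer at PreventionGenetics (Marshfield). Sequencing libraries were prepared with the Illumina DNA PCR-Free Library Prep Kit (Illumina) following the manufacturer's guidelines. Sequencing was performed on the Illumina NovaSeq 6000 platform using S2/S4 flow cells, following the manufacturer's protocols.

  Newly sequenced stool samples were pre-processed using the pipeline described at https://github.com/SegataLab/preprocessing. In brief, metagenomic reads were quality-controlled, and reads of low quality (quality score […]) […] 2 ambiguous nucleotides were removed with Trim Galore (v0.6.6). Contaminant and host DNA was identified with Bowtie2 (v2.3.4.3; ref. 66) using the --sensitive-local parameter, allowing confident removal of the phiX 174 Illumina spike-in and human-associated reads (hg19 human genome release). The remaining high-quality reads were sorted and split to create standard forward, reverse and unpaired read output files for each metagenome.

  Newly sequenced saliva samples were pre-processed using a custom version of the pipeline described at https://github.com/SegataLab/preprocessing. In brief, metagenomic reads were quality-controlled, removing reads of low quality (quality score […]) […] 2 ambiguous nucleotides. Contaminant and host DNA was identified with Bowtie2 (v2.3.5.1; ref. 66) in 'end-to-end' global mode, allowing confident removal of human-associated reads (hg19). The remaining high-quality reads were sorted and split to create standard forward, reverse and unpaired read output files for each metagenome.

  Read statistics of stool and saliva samples (number of reads, number of bases, minimum and median read length per sample) are detailed in Supplementary Table 2.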
Metagenomes with ≥3 million reads were included in the analysis (n = 7,646 stool, n = 2,069 oral), while metagenomes with insufficient sequencing depth were excluded (n = 97 stool, n = 0 oral).   A custom database containing 160,267 MAGs and 75,446 isolate sequencing genomes was retrieved from ref. 30, and expanded with 184 MAGs from the Italian mother–infant dataset9 expanded in the current study, 1,439 MAGs from Italian centenarians67, 3,584 MAGs obtained from stool samples of individuals in non-westernized populations34, 2,985 MAGs from stool samples of non-human primates68, 20,404 MAGs from cow rumen69, 14,097 MAGs from mouse samples70,71,72,73,74,75,76,77,78,79,80,81,82,83, 1,235 MAGs from termites (PRJNA365052, PRJNA365053, PRJNA365054, PRJNA365049, PRJNA365050, PRJNA365051, PRJNA405700, PRJNA405701, PRJNA405702, PRJNA405782, PRJNA405783, PRJNA366373, PRJNA366374, PRJNA366375, PRJNA366251, PRJNA405703, PRJNA366252, PRJNA366766, PRJNA366357, PRJNA366358, PRJNA366361, PRJNA366362, PRJNA366363, PRJNA366255, PRJNA366256, PRJNA366257, PRJNA366253, PRJNA405704, PRJNA366254 and PRJNA405781), 7,760 MAGs available from a previous catalogue84, 2,137 MAGs from NCBI GenBank, and 63,142 reference genomes from NCBI GenBank (see https://github.com/SegataLab/MetaRefSGB for details). MAGs from the Italian mother–infant dataset, and those of non-human hosts were assembled using MEGAHIT85, while those of the Italian centenarian dataset and non-westernized populations were assembled with metaSPAdes86, using default parameters in both cases.   For the newly added MAGs we employed the following protocol on the metagenomic assemblies. Assembled contigs longer than 1,500 nucleotides were binned into MAGs using MetaBAT287. Quality control of all genomes was performed with CheckM version 1.1.3 (ref. 88), and only medium- and high-quality genomes (completeness ≥50% and contamination ≤5%) were included in the database. Prokka version 1.12 and 1.13 (ref. 
89) were used to annotate open reading frames of the genomes. Coding sequences were then assigned to a UniRef90 cluster90 by performing a Diamond search (version 0.9.24)91 of the coding sequences against the UniRef90 database (version 201906) and assigning a UniRef90 ID if the mean sequence identity to the centroid sequence was above 90% and covered more than 80% of the centroid sequence. Protein sequences that could not be assigned to any UniRef90 cluster were de novo clustered using MMseqs292 within SGBs following the Uniclust90 criteria93.   Genomes were clustered into species-level genome bins (SGBs) spanning ≤5% genetic diversity, and those to genus-level genome bins (GGBs, 15% distance) and family-level genome bins (FGBs, 30% distance), as described in ref. 30. MAGs were assigned to SGBs by applying ‘phylophlan_metagenomic’, a subroutine of PhyloPhlAn 3 (ref. 94), which uses Mash95 to compute the whole-genome average nucleotide identity among genomes. When no SGB was below 5% genetic distance to a genome, new SGBs were defined, based on the average linkage assignment and hierarchical clustering (allowing a 5% genetic distance among genomes in the dendrogram). The same procedure was followed to assign SGBs to novel GGBs and FGBs when those were not yet defined.   SGBs containing at least one reference genome (kSGBs) were assigned the taxonomy of the reference genomes following a majority rule, up to the species level. SGBs with no reference genomes (uSGBs) were assigned the taxonomy of its corresponding GGB (up to the genus level) if this contained reference genomes, and of its corresponding FGB (up to the family level) if the latter contained reference genomes. If no reference genomes were present in the FGB, a phylum was assigned based on the majority rule applied on up to 100 closest reference genomes to the MAGs in the SGB as provided by ‘phylophlan_metagenomic’. 
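The assignment logic described above (closest SGB within 5% genetic distance, otherwise a new SGB; taxonomy by majority rule over reference genomes) can be illustrated with a minimal sketch. This is not the `phylophlan_metagenomic` implementation: the function names, the input dictionaries and the handling of the exact 5% boundary are assumptions made for illustration only.

```python
from collections import Counter

SGB_MAX_DISTANCE = 0.05  # 5% genetic distance, as stated in the text


def assign_sgb(distances_to_sgbs):
    """Return the closest SGB within the 5% cutoff, or None if a new SGB is needed.

    distances_to_sgbs: hypothetical dict mapping SGB id -> genetic distance
    (for example, a Mash-based distance computed beforehand).
    """
    if not distances_to_sgbs:
        return None
    best_sgb, best_dist = min(distances_to_sgbs.items(), key=lambda kv: kv[1])
    # Treating a distance of exactly 5% as "within" is an assumption of this sketch.
    return best_sgb if best_dist <= SGB_MAX_DISTANCE else None


def majority_taxonomy(reference_taxa):
    """Majority rule over the taxonomic labels of the reference genomes.

    Returns None when there are no reference genomes (the uSGB case);
    ties are resolved in favour of the label seen first.
    """
    if not reference_taxa:
        return None
    return Counter(reference_taxa).most_common(1)[0][0]
```

A genome for which `assign_sgb` returns `None` would then go through the hierarchical clustering step described in the text to define a new SGB.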
Taxonomic assignment of SGBs profiled at strain level in this study can be found in Supplementary Tables 3 and 4.   Species-level profiling was performed on all the 9,715 samples with MetaPhlAn 4 (refs. 38,39) with default parameters and the custom SGB database. uSGBs with less than 5 MAGs were discarded as potential assembly artefacts or chimeric sequences and unlikely to reach the prevalence thresholds in the profiling. SGB core genes were defined as open reading frames in an existing UniRef90 or in a de novo clustered gene family (following the Uniclust90 clustering procedure93) present in at least half of the genomes (that is, ‘coreness’ 50%) of the SGB. Core genes were further optimized by selecting the highest coreness threshold that allowed retrieval of at least 800 core genes. Core genes of each SGBs were then screened to identify marker genes by checking their presence in other SGBs. This was done by a procedure that first divided core genes into fragments of 150 nt and then aligned the fragments against the genomes of all SGBs using Bowtie2 (version 2.3.5.1; -sensitive option)66. Marker genes were defined as core genes with no fragments found in at least 99% of the genomes of any other SGB. For SGBs with less than 10 marker genes, conflicts were defined as occurrences of more than 200 core genes of an SGB in more than 1% of genomes of another SGB, and conflict graphs were generated by retrieving all conflicts for that SGB. Each conflict graph was processed iteratively, retrieving all the possible merging scenarios, in order to get the optimal merges for the conflict that both minimize the number of merged SGBs and maximize the number of markers retrieved. Finally, for each SGB, a maximum of 200 marker genes were selected based first on their uniqueness and then on their size (bigger first), and SGBs still with less than 10 markers were discarded. Merged gut and oral SGBs (SGB_group) can be found in Supplementary Tables 3 and 4, respectively. 
The resulting 3.3M marker genes (189 ± 34 marker genes per SGB (mean ± s.d.)) were used as a new reference database for MetaPhlAn and StrainPhlAn profiling.   Strain profiling was performed with StrainPhlAn 4 (refs. 38,39) using the custom SGB marker database, with parameters "--marker_in_n_samples 1 --sample_with_n_markers 10 --phylophlan_mode accurate --mutation_rates". To reduce noise, only SGBs detected in ≥20 samples and in at least 10% of the samples of a dataset with ≥10 markers (--print_clades_only argument in StrainPhlAn) were selected for strain-level profiling (n = 646 and n = 252 SGBs in stool and oral samples, respectively). The full set of 200 marker genes was available for the majority of SGBs (n = 481/646 gut SGBs and n = 148/252 oral SGBs). The average coverage across SGBs was 1.3×. For the SGBs potentially derived from fermented foods, sequences of MAGs assembled in ref. 40 were added using the parameter "-r". Compared with an assembly-based approach (high-quality MAGs defined as >90% completeness and <5% contamination; assembly method reported in the section "Expanded SGB database" above), strain-level profiling with StrainPhlAn allowed strain-sharing assessment among species in many more samples (median of 355 strain-level profiles per SGB and interquartile range (IQR) = [185, 806] versus median of 69 high-quality MAGs per SGB and IQR = [7, 60]).   To detect strain-sharing events, we first set SGB-specific normalized phylogenetic distance (nGD) thresholds that optimally separated same-individual longitudinal strain retention (same strain) from unrelated-individual (different strain) nGD distributions in five published stool metagenomic datasets from four different countries (Germany, Kazakhstan, Spain and United States) on three continents (refs. 20,22,27,28,31). nGDs were calculated as leaf-to-leaf branch lengths normalized by total tree branch length in phylogenetic trees produced by StrainPhlAn, which are built on marker gene alignments on positions with at least 1% variability.
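As an illustration of the nGD definition just given (leaf-to-leaf branch length divided by total tree branch length), the following sketch computes it on a toy rooted tree. The tree encoding (a dictionary from each node to its parent and branch length) and the function names are assumptions of this example; StrainPhlAn works on its own tree files.

```python
def ancestor_distances(leaf, parent):
    """Map the leaf and each of its ancestors to its distance from the leaf.

    parent: dict mapping node -> (parent_node, branch_length); the root has no entry.
    """
    dist = {leaf: 0.0}
    node, d = leaf, 0.0
    while node in parent:
        up, length = parent[node]
        d += length
        dist[up] = d
        node = up
    return dist


def ngd(leaf_a, leaf_b, parent):
    """Leaf-to-leaf branch length normalized by the total branch length of the tree."""
    total = sum(length for _, length in parent.values())
    dist_a = ancestor_distances(leaf_a, parent)
    # Climb from leaf_b until reaching a node that is also an ancestor of leaf_a.
    node, d = leaf_b, 0.0
    while node not in dist_a:
        up, length = parent[node]
        d += length
        node = up
    return (dist_a[node] + d) / total


# Toy tree: root -> X (0.2), root -> C (0.5), X -> A (0.1), X -> B (0.2)
toy_tree = {"X": ("root", 0.2), "C": ("root", 0.5), "A": ("X", 0.1), "B": ("X", 0.2)}
```

On this toy tree the total branch length is 1.0, so the nGD between A and B is about 0.3 and that between A and C about 0.8.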
For SGBs detected in at least 50 pairs of same-individual stool samples obtained no more than 6 months apart (n = 145 SGBs; the two samples for a certain individual in which the species could be profiled at the strain level and that were closest in time were selected), nGD thresholds were defined based on maximizing Youden’s index, and limiting at 5% the fraction of unrelated individuals to share the same strain as a bound on a false discovery rate (Extended Data Fig. 3). The assumption of frequent strain persistence in an individual for at least 6 months is supported by the distribution of phylogenetic distances in the longitudinal sets: for all species this has a peak at nGD approaching 0 (Extended Data Fig. 3), notably higher than that observed for inter-individual sample comparisons. For SGBs detected in less than 50 same-individual close pairs (n = 501) and in oral samples (n = 252), for which species-specific nGD cannot be reliably estimated, the nGD corresponding to the 3rd percentile of the unrelated individual nGD distribution was used. This value is the median percentile of the inter-individual nGD distribution corresponding to the nGD maximizing the Youden’s index of SGBs with at least 50 same-individual comparisons. The three sets of thresholds are thus three technical definitions of the same principle—that is, the individual specificity and the persistence of strains in the gut microbiome, and did not lead to significant differences in nGD values (Kruskal–Wallis test, χ2 = 2.34, P = 0.31; Extended Data Fig. 10a). nGD thresholds also did not significantly differ by phylum (Extended Data Fig. 10b), and those set in stool and oral samples were similar (median nGD difference = 0.006). If not limiting at 5% the fraction of unrelated individuals to share the same strain as a bound on a false discovery rate, the resulting percentile would only be of a median of 8.2% (range = [5.2–22.3%]) on these 38 SGBs (Supplementary Table 4). 
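A minimal sketch of the threshold selection described above: it scans candidate nGD cutoffs, computes Youden's index (true-positive rate minus false-positive rate), and keeps the best cutoff whose false-positive rate among unrelated pairs does not exceed 5%. Implementing the 5% bound as a constraint on the candidate cutoffs is an assumption of this example, as are the function name and the use of `<=` when comparing distances with a candidate cutoff; the published procedure may differ in these details.

```python
def youden_threshold(same_ngd, unrelated_ngd, max_fpr=0.05):
    """Pick the nGD cutoff maximizing Youden's index subject to FPR <= max_fpr.

    same_ngd: nGDs between same-individual longitudinal samples (same strain).
    unrelated_ngd: nGDs between samples of unrelated individuals (different strain).
    Returns None if no candidate satisfies the FPR bound.
    """
    candidates = sorted(set(same_ngd) | set(unrelated_ngd))
    best_t, best_j = None, float("-inf")
    for t in candidates:
        tpr = sum(d <= t for d in same_ngd) / len(same_ngd)
        fpr = sum(d <= t for d in unrelated_ngd) / len(unrelated_ngd)
        if fpr > max_fpr:
            continue  # too many unrelated pairs would be called "same strain"
        j = tpr - fpr  # Youden's index = sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t


# Hypothetical toy data, for illustration only.
same = [0.0, 0.001, 0.002, 0.05]
unrelated = [0.03, 0.04, 0.06, 0.07, 0.08, 0.09, 0.10, 0.11, 0.12, 0.13,
             0.14, 0.15, 0.16, 0.17, 0.18, 0.19, 0.20, 0.21, 0.22, 0.23]
```

With these toy values the unconstrained Youden optimum is 0.05, but it would label 10% of unrelated pairs as sharing a strain, so the bounded search settles on 0.002 instead.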
When using single metagenomic datasets instead of the five datasets we included to set the strain identity thresholds, often not enough longitudinal samples were available (<50 same-individual pairs) and some variation was observed (Extended Data Fig. 10c), which supports the use of the largest set of samples available.   Overall, the median SNV rate nGD thresholds corresponded to is 0.005, below the estimated >0.1% sequencing error rate by Illumina HiSeq and NovaSeq platforms96 (Supplementary Table 4). The nGD thresholds correspond to a SNV rate of 0 for some SGBs (n = 16 out of 646—that is, 2.5%), mostly those encompassing very low genetic variation (for example, B. animalis SGB17278). In SGB trees containing MAGs of microorganisms obtained from fermented foods, we identified and discarded any strains with high similarity (≤0.0015 SNV rate as determined by PhyloPhlAn 3 (https://github.com/biobakery/phylophlan/wiki#mutation-rates-table)—that is, the number of positions that have nucleotide differences divided by the length of the alignment) to food MAGs (Supplementary Table 6). For B. animalis (SGB17278), 62 strains profiled in 7 public mouse metagenome datasets73,75,97,98,99,100,101 were added to better assess its phylogenetic diversity. The trees produced by StrainPhlAn together with the SGB-specific nGD thresholds were used in StrainPhlAn4’s strain_transmission.py script (-threshold argument) (https://github.com/biobakery/MetaPhlAn/blob/master/metaphlan/utils/strain_transmission.py). Pairs of strains with pairwise nGD below the strain identity threshold were defined as strain-sharing events. Centred nGD is defined as the nGD divided by the median nGD in the phylogenetic tree. 
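The two definitions that close the passage above (strain-sharing events as pairs with nGD below the SGB-specific threshold, and centred nGD as nGD divided by the median nGD of the tree) can be sketched as follows. This is an illustration and not a reimplementation of `strain_transmission.py`; the input format (a dictionary of pairwise nGDs for one SGB) and the function names are assumptions.

```python
from statistics import median


def strain_sharing_events(pairwise_ngd, threshold):
    """Return the sample pairs whose nGD is below the strain identity threshold.

    pairwise_ngd: hypothetical dict mapping (sample_a, sample_b) -> nGD for one SGB.
    """
    return sorted(pair for pair, d in pairwise_ngd.items() if d < threshold)


def centred_ngd(pairwise_ngd):
    """Divide every pairwise nGD by the median pairwise nGD of the tree.

    Assumes the median is non-zero.
    """
    med = median(pairwise_ngd.values())
    return {pair: d / med for pair, d in pairwise_ngd.items()}


# Hypothetical pairwise distances for a single SGB.
toy_ngd = {("a", "b"): 0.001, ("a", "c"): 0.2, ("b", "c"): 0.4}
```

Here only the pair ("a", "b") falls below a threshold of 0.01, and the centred nGD of ("b", "c") is 2.0 because the median pairwise nGD is 0.2.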
We opted for strain identity thresholds based on phylogenetic distances in contrast to SNV rates due to (1) the rather low coverage that we obtain for species in metagenomic samples even after passing our sequencing depth threshold (mean coverage = 7.2×, median = 0.69 and IQR = [0.14, 3.09]) that would add noise especially to SNV rate estimations; (2) the limited length of the marker gene alignment of some SGBs (mean trimmed alignment length = 74,348 nt, median = 70,879 and IQR = [42,513, 104,347]) that would make SNV rates rather unreliable; and (3) the valuable information on evolutionary models (for example, distinguishing synonymous from non-synonymous nucleotide changes) that is provided by phylogenetic trees.   We compared the new species-specific strain identity thresholds with the nGD = 0.1 threshold (that is, considering the lowest 10% phylogenetic distances to be between the same strains) used in some previous publications and StrainPhlAn versions prior to version 4 (refs. 9,32,102). We found that while the previous threshold would produce a median 44% mother–infant strain-sharing rate—in contrast to the 50% strain-sharing rate we obtain here—the novel method yields a lower strain-sharing rate between infants and unrelated mothers, which are likely to be false positives: 3.5% versus 4%. This supports the better performance of the species-specific strain identity thresholds as they detect—at the same time—more strain-sharing events between matched mothers and infants and fewer strain-sharing events between unrelated mother–infant pairs.   To assess the reproducibility of the species-specific strain identity thresholds on additional unrelated data, we used independent datasets of patients undergoing faecal microbiome transplantation (FMT). 
As we used the publicly available metagenomic cohorts with no intervention and longitudinal sampling20,22,27,28,31 to set the species-specific thresholds, we used for validation the completely independent FMT datasets as a distinct setting in which strain transmission can be expected. In FMT, part of the strains from a healthy donor are successfully transferred to a patient, while some strains from the donor’s original sample remain after the intervention. We included 1,371 samples from 25 different cohorts of patients undergoing FMT103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123 that were analysed as part of a meta-analysis124. In this evaluation, similar to what we did in the set of longitudinal samples, we assessed the separation between the distribution of the nGD distances of strains from the same SGB in the two following situations: (1) the strains are from samples of the same individual or from a FMT donor and their recipient after the FMT, and (2) the strains are from samples belonging to different FMT triads (defined by the samples from the donor, those of the patient before FMT, and those of the patients after FMT). We performed this analysis for each of the 95 SGBs of our set that were also profiled in the Ianiro et al study. We considered as true positives pairwise phylogenetic distance (nGD) values between samples in (1) that were below the species-specific strain identity threshold (defined on the independent longitudinal datasets), false positives as those from (2) that were below the threshold, true negatives as those from (2) above the threshold, and false negatives as those from (1) above the threshold. 
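The confusion-matrix definitions above translate directly into precision, recall and F-score. The sketch below assumes that the F-score is the balanced F1 (the text does not specify which variant is used) and uses hypothetical function and argument names.

```python
def classification_metrics(within_ngd, between_ngd, threshold):
    """Precision, recall and F1 for a strain identity threshold.

    within_ngd: nGDs for situation (1), same individual or donor and post-FMT recipient.
    between_ngd: nGDs for situation (2), samples from different FMT triads.
    """
    tp = sum(d < threshold for d in within_ngd)   # (1) below the threshold
    fn = len(within_ngd) - tp                     # (1) above the threshold
    fp = sum(d < threshold for d in between_ngd)  # (2) below the threshold
    tn = len(between_ngd) - fp                    # (2) above the threshold
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall, "f_score": f_score}
```

In the study these metrics are computed per SGB and then summarized as medians and interquartile ranges across SGBs.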
We found that StrainPhlAn4 with the species-specific strain identity thresholds defined here performed very well in distinguishing strains in the same individual or FMT triad from different strains in different FMT triads: median recall = 0.97 and IQR = [0.95,0.99], precision = 0.72 [0.67,0.82], F-score = 0.97 [0.96,0.98] (Supplementary Table 35).   Person-to-person strain-sharing rates were calculated as the number of strains shared between two individuals divided by the number of shared SGBs profiled by StrainPhlAn (number of shared strains/number of shared SGBs). When multiple samples were available for an individual, detection of strain or SGB sharing at any time point was considered as the strain or SGB was shared. For a robust calculation, person-to-person strain-sharing rates were only assessed when at least ten SGBs were shared between two individuals. The same calculation was used to assess same-individual strain retention between two time points in longitudinal datasets. Strain acquisition rates by the offspring (Extended Data Fig. 6a) were defined as the proportion of strains profiled in the offspring that were shared with the mother, thus putatively originating from her. For a robust calculation, strain acquisition rates by the offspring were only assessed when at least ten SGBs were shared between the mother and the offspring. As StrainPhlAn36,38,39 profiles the dominant strain for each species, the total number of strains shared between two samples ranges between 0 and the total number of shared profiled SGBs, whereas strain-sharing rates and strain acquisition rates by the offspring are bound between 0 and 1.   SGB transmissibility was defined as the number of strain-sharing events detected for an SGB divided by the total potential number of strain-sharing events based on the presence of a strain-level profile by StrainPhlAn4. 
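The person-to-person strain-sharing rate defined above (shared strains divided by shared profiled SGBs, computed only when at least ten SGBs are shared) can be sketched as below. The set-based inputs and the function name are assumptions of this example; each set is taken to pool all time points of an individual, in line with the any-time-point rule in the text.

```python
def strain_sharing_rate(profiled_a, profiled_b, shared_strain_sgbs, min_shared_sgbs=10):
    """Number of shared strains divided by number of shared SGBs.

    profiled_a, profiled_b: sets of SGB ids with a strain-level profile in each individual.
    shared_strain_sgbs: set of SGB ids for which a strain-sharing event was detected.
    Returns None when fewer than min_shared_sgbs SGBs are shared.
    """
    shared_sgbs = profiled_a & profiled_b
    if len(shared_sgbs) < min_shared_sgbs:
        return None  # too few shared SGBs for a robust estimate
    # StrainPhlAn profiles one dominant strain per SGB, so the rate lies in [0, 1].
    return len(shared_strain_sgbs & shared_sgbs) / len(shared_sgbs)
```

For example, two individuals sharing 10 profiled SGBs with strain-sharing events in 3 of them have a strain-sharing rate of 0.3.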
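When multiple samples were available for an individual, detection of strain sharing at any time point was taken to mean that the strain was shared. For a robust calculation, SGB transmissibility was only assessed on SGBs with at least ten potential strain-sharing events in multiple datasets, and with at least three potential strain-sharing events for single-dataset calculations. To assess concordance of SGB transmissibility among datasets, Spearman's correlations (cor.test function in R (https://www.R-project.org/)) were performed between datasets with at least ten SGBs with assessed transmissibility. Highly transmitted SGBs were defined as those with SGB transmissibility >0.5 and significantly higher within-group than between-group transmissibility (chi-squared test, Padj < 0.05). We found no significant association between SGB transmissibility and the length of the trimmed alignment (Spearman's test, ρ = 0.06, P = 0.13).

  We assessed strain sharing for three main transmission modes: mother-to-infant (defined as between mothers and their offspring up to one year of age), household (defined as between cohabiting individuals) and intra-population (defined as between non-cohabiting individuals in a population with no evidence of kinship).

  For a proper analysis of compositional microbiome data, the species-level abundance matrices obtained with MetaPhlAn were centred log-ratio transformed with the codaSeq.clr function in the CoDaSeq R package (v0.99.6; ref. 125), using the minimum proportional abundance of each taxon. Principal component analysis plots on Aitchison distance were generated with the ordinate and plotting functions, using one randomly selected sample per individual (n = 4,840 gut samples, n = 2,069 oral samples). To compare species-level similarity with strain-sharing rates, beta-diversity measures (Aitchison distance, Bray–Curtis dissimilarity and Jaccard binary distance) computed with the vegan R package (v2.5-7) were converted into similarity indices (1 − (distance or dissimilarity […])).

  Unsupervised networks based on shared strains and species were visualized with the R packages […] (v2.0.5), igraph (v1.2.6; ref. 127) and tidygraph (v1.2.0), displaying connections between individuals (nodes) with ≥5 shared strains or ≥50 shared species.

  Experimentally determined bacterial phenotypes were obtained from the Microbe Directory v2.0 (ref. 128) and matched to kSGBs by NCBI taxonomy identifier. Phenotypic traits were also predicted for all SGBs with the tool cited as ref. 60 (version 1.1.12), on the 50% core genes (that is, genes present in 50% of the genomes available in the expanded SGB database). Only annotations from the phypat and phypat+PGL classifiers (the latter including additional evolutionary information on phenotype gains and losses) were retained. Associations between SGB transmissibility and microbial phenotypes were assessed with Wilcoxon rank-sum tests, comparing the 25% most transmissible SGBs with the 25% least transmissible SGBs.

  Statistical analyses and graphical representations were performed in R using the packages vegan (version 2.5-7), phyloseq (v1.28.0; ref. 126), QuantPsyc (v1.5), ggplot2 (v3.3.3), ggpubr (v0.4.0) and corrplot (v0.84). Where appropriate, corrections for multiple testing were applied (Benjamini–Hochberg procedure, Padj), and significance was defined at Padj < 0.05. All tests were two-sided unless otherwise stated. Associations between metadata variables and distance matrices were assessed with the adonis function in vegan. Differences between two groups were assessed with Wilcoxon rank-sum tests. For more than two groups, Kruskal–Wallis tests with post hoc Dunn tests were used. Correlations were assessed with Spearman's tests. To assess correlations between variables while accounting for potential confounders, generalized linear models (GLMs) were fitted with the glm R function (family Gaussian, link = identity). Standardized GLM regression coefficients were calculated with the lm.beta R function (QuantPsyc R package). Significance was assessed by log-likelihood (chi-squared) tests on nested GLMs.
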
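The Benjamini–Hochberg procedure named in the statistical analysis paragraph can be written in a few lines. The study performed its statistics in R; this Python sketch is provided purely as an illustration of the adjustment itself, and the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity of the adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A test would then be called significant when its adjusted value is below 0.05, matching the Padj < 0.05 criterion stated in the text.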
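  All study procedures complied with all relevant ethical regulations. Procedures were conducted in accordance with the Declaration of Helsinki. Ethical approval for the Argentina cohort was granted by the Ethics and Safety Committee (CEYSTE) of CCT Santa Fe, Argentina (29112019). The Colombia cohort was approved by the Research Bioethics Committee of the Metropolitan University, Colombia (NIT 890105361-5). The China dataset study protocol was approved by the Ethics Committee of Shanghai Tenth Hospital, Tongji University School of Medicine (SHSY-EIEC-PAP-18-1), and that of China_2 by the Ethics Committee of the Xi'an Jiaotong University Health Science Center, China (2016-114). The Guinea-Bissau study was approved by the National Committee for Health Ethics (Comitê Nacional da Ética na Saúde) of Guinea-Bissau (076/CNES/INASA/2017) and by the ethics committee of the London School of Hygiene and Tropical Medicine (reference number 22898). The Italy_1 dataset study protocol was approved by the Ethics Committee of Santa Chiara Hospital, Trento, Italy (30 July 2014) and by the Ethics Committee of the University of Trento, Italy, and the second Italian dataset by the Regional Ethics Committee of Liguria, Italy (006/2019). Ethical approval for the USA dataset was granted by Western IRB (https://www.wcgirb.com/) with protocol tracking number WIRB20151664. Written informed consent was obtained from all adult participants and from the parents of non-adult participants.

  Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.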
