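To study the molecular evolution of protein-coding genes, we used PAML (ref. 75) to estimate rates of synonymous and non-synonymous substitution in 8,510 single-copy orthologues within the six melanogaster group species (Supplementary Information section 11.1); synonymous site saturation prevents analysis of more divergent comparisons. We investigated only single-copy orthologues because alignments become increasingly problematic when paralogues are included. Rates of amino acid divergence for single-copy orthologues in all 12 species were also calculated; these results are largely consistent with the analysis of non-synonymous divergence in the melanogaster group, and are not discussed further.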
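To understand global patterns of divergence and constraint across functional classes of genes, we examined the distribution of ω (= dN/dS, the ratio of non-synonymous to synonymous divergence) across Gene Ontology (GO) categories76, excluding annotations based solely on electronic support (Supplementary Information section 11.2). Most functional categories of genes are strongly constrained, with median estimates of ω much less than one. In general, functionally similar genes are similarly constrained: 31.8% of GO categories have significantly lower variance in ω than expected (q-value true-positive tests77). Only 11% of GO categories had statistically significantly elevated ω (relative to the median of all genes with GO annotations) at a 5% false-discovery rate (FDR), suggesting either positive selection or a reduction in selective constraint. The GO categories with elevated ω include the biological process terms ‘defence response’, ‘proteolysis’, ‘DNA metabolic process’ and ‘response to biotic stimulus’; the molecular function terms ‘transcription factor activity’, ‘peptidase activity’, ‘receptor binding’, ‘odorant binding’, ‘DNA binding’, ‘receptor activity’ and ‘G-protein-coupled receptor activity’; and the cellular location term ‘extracellular’ (Fig. 4 and Supplementary Table 12). Similar results are obtained when dN is compared across GO categories, suggesting that in most cases differences in ω among GO categories are driven by amino acid rather than synonymous site substitutions. The two exceptions are the molecular function terms ‘transcription factor activity’ and ‘DNA binding activity’, for which we observed significantly decelerated dS (FDR = 7.2 × 10⁻⁴ for both; Supplementary Information section 11.2) and no significant differences in dN.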
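As a rough illustration of how a functional category can be tested for elevated ω relative to all annotated genes, the sketch below runs a one-sided permutation test on the category median. It is a minimal sketch under stated assumptions, not the procedure used in the study: the function name, the choice of the median as the test statistic, and the resampling scheme are all hypothetical, and the multiple-testing (FDR) correction applied in the text is not reproduced here.

```python
import random
from statistics import median


def permutation_p_elevated(category_omegas, all_omegas, n_perm=10000, seed=0):
    """One-sided permutation p-value that a category's median omega is elevated.

    Hypothetical helper: random gene sets of the same size as the category are
    drawn without replacement from all annotated genes, and the p-value is the
    fraction of draws whose median is at least as large as the observed one
    (with the usual +1 correction so the p-value is never exactly zero).
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    k = len(category_omegas)
    observed = median(category_omegas)
    hits = 0
    for _ in range(n_perm):
        if median(rng.sample(all_omegas, k)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Made-up omega values, for illustration only.
    background = [0.01 * i for i in range(1, 101)]
    fast_category = background[-6:]    # the six largest values
    print(permutation_p_elevated(fast_category, background, n_perm=2000))
```

The p-values obtained for many categories would then need a false-discovery-rate correction before any category is called significant.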
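To distinguish possible positive selection from relaxed constraint, we tested explicitly for genes that have a subset of codons with signatures of positive selection, using codon-based likelihood models of molecular evolution implemented in PAML (refs 78, 79) (Supplementary Information section 11.1). Although this test is typically regarded as a conservative test for positive selection, it may be confounded by selection at synonymous sites. However, selection at synonymous sites (that is, codon bias; see below) is quite weak. Moreover, the variability in ω presented here tends to reflect variability in dN. We therefore believe that it is appropriate to treat synonymous sites as nearly neutral and sites with ω > 1 as consistent with positive selection. Despite a number of functional categories with evidence for elevated ω, ‘helicase activity’ is the only functional category significantly more likely to be positively selected (permutation test, P = 2 × 10⁻⁴, FDR = 0.007; Supplementary Table 12); the biological significance of this finding merits further investigation. Furthermore, within each GO class, there is greater dispersion among genes in their probability of positive selection than in their estimate of ω (MWU one-tailed, P = 0.011; Supplementary Information section 11.1), suggesting that although functionally similar genes share patterns of constraint, they do not necessarily show similar patterns of positive selection (Fig. 4).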
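Tests of this kind are commonly carried out as a likelihood-ratio test between two nested codon models, one that does not allow a class of codons with ω > 1 and one that does. The sketch below shows only that final comparison step. It assumes, hypothetically, that the two models differ by two free parameters; the text does not state which models were compared, and the log-likelihood values in the usage example are invented.

```python
import math


def lrt_pvalue_df2(lnl_null, lnl_alt):
    """P-value of a likelihood-ratio test with 2 degrees of freedom.

    lnl_null: maximised log-likelihood of the null model (no omega > 1 class).
    lnl_alt:  maximised log-likelihood of the alternative model.
    Assumption: the models are nested and differ by two parameters, so the
    statistic 2 * (lnl_alt - lnl_null) is compared with a chi-square
    distribution with 2 degrees of freedom, whose survival function is
    exactly exp(-x / 2).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0.0:
        return 1.0                     # the alternative fits no better than the null
    return math.exp(-stat / 2.0)


if __name__ == "__main__":
    # Invented log-likelihoods for a single gene.
    print(lrt_pvalue_df2(-1000.0, -995.0))
```

Because one such test is run per gene, the resulting p-values still require correction for multiple testing, as described in the text.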
Interestingly, protein-coding genes with no annotated (‘unknown’) function in the GO database seem to be less constrained (permutation test, P < 1 × 10⁻⁴, FDR = 0.006)80 and to have on average lower P-values for the test of positive selection than genes with annotated functions (permutation test, P = 0.001, FDR = 0.058). It is unlikely that this observation results entirely from an over-representation of mis-annotated or non-protein-coding genes in the ‘unknown’ functional class, because this finding is robust to the removal of all D. melanogaster genes predicted to be non-protein-coding in ref. 8. The bias in the way biological function is ascribed to genes (to laboratory-induced, easily scorable functions) leaves open the possibility that unannotated biological functions may have an important role in evolution. Indeed, genes with characterized mutant alleles in FlyBase evolve significantly more slowly than other genes (median ω = 0.0525 for genes with alleles and 0.0701 for genes without alleles; MWU, P < 1 × 10⁻¹⁶).
Previous work has suggested that a substantial fraction of non-synonymous substitutions in Drosophila were fixed through positive selection81,82,83,84,85. We estimate that 33.1% of single-copy orthologues in the melanogaster group have experienced positive selection on at least a subset of codons (q-value true-positive tests77) (Supplementary Information section 11.1). This may be an underestimate, because we have only examined single-copy orthologues, owing to difficulties in producing accurate alignments of paralogues by automated methods. On the basis of the 878 genes inferred to have experienced positive selection with high confidence (FDR < 10%), we estimated that an average of 2% of codons in positively selected genes have ω > 1. Thus, several lines of evidence, based on different methodologies, suggest that patterns of amino acid fixation in Drosophila genomes have been shaped extensively by positive selection.
The presence of functional domains within a protein may lead to heterogeneity in patterns of constraint and adaptation along its length. Among genes inferred to be evolving by positive selection at a 10% FDR, 63.7% (q-value true-positive tests77) show evidence for spatial clustering of positively selected codons (Supplementary Information section 11.2). Spatial heterogeneity in constraint is further supported by contrasting ω for codons inside versus outside defined InterPro domains (genes lacking InterPro domains are treated as ‘outside’ a defined InterPro domain). Codons within InterPro domains were significantly more conserved than codons outside InterPro domains (median ω: 0.062 inside InterPro domains, 0.084 outside InterPro domains; MWU, P < 2.2 × 10⁻¹⁶; Supplementary Information section 11.2). Similarly, there were significantly more positively selected codons outside of InterPro domains than inside domains (FET, P < 2.2 × 10⁻¹⁶), suggesting that in addition to being more constrained, codons in protein domains are less likely to be targets of positive selection (Supplementary Fig. 6).
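The inside-versus-outside comparison of positively selected codons is reported as a Fisher's exact test (FET) on a 2 × 2 table. A minimal sketch of that test is given below, using only the standard library. The function name is hypothetical and the counts in the usage example are invented for illustration; the actual codon counts are not given in the text.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Illustrative layout (an assumption, not taken from the study):
        rows    = codon inside / outside an annotated domain
        columns = codon positively selected / not positively selected
    The p-value sums the hypergeometric probabilities of every table with the
    same margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))   # tolerance for float ties
    return min(1.0, total)


if __name__ == "__main__":
    # Invented counts: 12 of 1,000 domain codons versus 40 of 1,000 other codons.
    print(fisher_exact_two_sided(12, 988, 40, 960))
```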
The sequenced genomes of the melanogaster group provide unprecedented statistical power to identify factors affecting rates of protein evolution. Previous analyses have suggested that although the level of gene expression consistently seems to be a major determinant of variation in rates of evolution among proteins86,87, other factors probably play a significant, if perhaps minor, part88,89,90,91. In Drosophila, although highly expressed genes do evolve more slowly, breadth of expression across tissues, gene essentiality and intron number all also independently correlate with rates of protein evolution, suggesting that the additional complexities of multicellular organisms are important factors in modulating rates of protein evolution78. The presence of repetitive amino acid sequences has a role as well: non-repeat regions in proteins containing repeats evolve faster and show more evidence for positive selection than genes lacking repeats92.
These data also provide a unique opportunity to examine the impact of chromosomal location on evolutionary rates. Population genetic theory predicts that for new recessive mutations, both purifying and positive selection will be more efficient on the X chromosome given its hemizygosity in males93. In contrast, the lack of recombination on the small, mainly heterochromatic dot chromosome94,95 is expected to reduce the efficacy of selection96. Because codon bias, or the unequal usage of synonymous codons in protein-coding sequences, reflects weak but pervasive selection, it is a sensitive metric for evaluating the efficacy of purifying selection. Consistent with expectation, in all 12 species, we find significantly elevated levels of codon bias on the X chromosome and significantly reduced levels of codon bias on the dot chromosome97. Furthermore, X-chromosome-linked genes are marginally over-represented within the set of positively selected genes in the melanogaster group (FET, P = 0.055), which is consistent with increased rates of adaptive substitution on this chromosome. This analysis suggests that chromosomal context also serves to modulate rates of molecular evolution in protein-coding genes.
To examine further the impact of genomic location on protein evolution, we examined the subset of genes that have moved within or between chromosome arms32,98. Genes inferred to have moved between Muller elements have a significantly higher rate of protein evolution than genes inferred to have moved within a Muller element (MWU, P = 1.32 × 10⁻¹⁴) and genes that have maintained their genomic position (MWU, P = 0.008) (Supplementary Fig. 7). Interestingly, genes that move within Muller elements have a significantly lower rate of protein evolution than those for which genomic locations have been maintained (MWU, P = 3.85 × 10⁻¹⁴). It remains unclear whether these differences reflect underlying biases in the types of genes that move inter- versus intra-chromosomally, or whether they are due to in situ patterns of evolution in novel genomic contexts.
Codon bias is thought to enhance the efficiency and/or accuracy of translation99,100,101 and seems to be maintained by mutation–selection–drift balance101,102,103,104. Across the 12 Drosophila genomes, there is more codon bias in the Sophophora subgenus than in the Drosophila subgenus, and a previously noted105,106,107,108,109 striking reduction in codon bias in D. willistoni110,111 (Fig. 5). However, with only minor exceptions, codon preferences for each amino acid seem to be conserved across 11 of the 12 species. The striking exception is D. willistoni, in which codon usage for 6 of 18 redundant amino acids has diverged (Fig. 5). Mutation alone is not sufficient to explain codon-usage bias in D. willistoni, which is suggestive of a lineage-specific shift in codon preferences111,112. We found evidence for a lineage-specific genomic reduction in codon bias in D. melanogaster (Fig. 5), as has been suggested previously113,114,115,116,117,118,119. In addition, maximum-likelihood estimation of the strength of selection on synonymous sites in 8,510 melanogaster group single-copy orthologues revealed a marked reduction in the number of genes under selection for increased codon bias in D. melanogaster relative to its sister species D. sechellia120.
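To make the notion of codon preference concrete, the sketch below computes relative synonymous codon usage (RSCU) for a coding sequence. RSCU is one standard descriptive measure of codon bias and is used here purely as an illustration; the text does not say which statistic underlies Fig. 5, and the function name and example sequence are hypothetical. The standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Relative synonymous codon usage for one in-frame coding sequence.

    For each amino acid encoded by more than one codon, RSCU of a codon is its
    observed count divided by the count expected if all synonymous codons were
    used equally. Values above 1 indicate a codon used more than expected.
    Amino acids absent from the sequence are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3            # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:       # skip stops, Met and Trp
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            result[codon] = counts[codon] * len(family) / total
    return result


if __name__ == "__main__":
    # Toy sequence: four leucine codons, three of them CTG.
    print(rscu("CTGCTGCTGCTA")["CTG"])
```

Comparing the codon with the highest RSCU for each of the 18 redundant amino acids across species is one simple way to see whether preferences are shared or have diverged.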
Given the ecological and environmental diversity encompassed by the 12 Drosophila species, we examined the evolution of genes and gene families associated with ecology and reproduction. Specifically, we selected genes with roles in chemoreception, detoxification/metabolism, immunity/defence, and sex/reproduction for more detailed study.
Drosophila species have complex olfactory and gustatory systems used to identify food sources, hazards and mates, which depend on odorant-binding proteins, and olfactory/odorant and gustatory receptors (Ors and Grs). The D. melanogaster genome has approximately 60 Ors, 60 Grs and 50 odorant-binding protein genes. Despite overall conservation of gene number across the 12 species and widespread evidence for purifying selection within the melanogaster group, there is evidence that a subset of Or and Gr genes experiences positive selection121,122,123. Furthermore, clear lineage-specific differences are detectable between generalist and specialist species within the melanogaster subgroup. First, the two independently evolved specialists (D. sechellia and D. erecta) are losing Gr genes approximately five times more rapidly than the generalist species121,124. We believe this result is robust to sequence quality, because all pseudogenes and deletions were verified by direct re-sequencing and synteny-based orthologue searches, respectively. Generalists are expected to encounter the most diverse set of tastants and seem to have maintained the greatest diversity of gustatory receptors. Second, Or and Gr genes that remain intact in D. sechellia and D. erecta evolve significantly more rapidly along these two lineages (ω = 0.1556 for Ors and 0.1874 for Grs) than along the generalist lineages (ω = 0.1049 for Ors and 0.1658 for Grs; paired Wilcoxon, P = 0.0003 and 0.003, respectively124). There is some evidence that odorant-binding protein genes also evolve significantly faster in specialists compared to generalists122. This elevated ω reflects a trend observed throughout the genomes of the two specialists and is likely to result, at least in part, from demographic phenomena. 
However, the difference between specialist and generalist ω for Or/Gr genes (0.0292) is significantly greater than the difference for genes across the genome (0.0091; MWU, P = 0.0052)121, suggesting a change in selective regime. Moreover, the observation that elevated ω as well as accelerated gene loss disproportionately affect groups of Or and Gr genes that respond to specific chemical ligands and/or are expressed during specific life stages suggests that rapid evolution at Or/Gr loci in specialists is related to the ecological shifts these species have sustained121.
The larval food sources for many Drosophila species contain a cocktail of toxic compounds, and consequently Drosophila genomes encode a wide variety of detoxification proteins. These include members of the cytochrome P450 (P450), carboxyl/choline-esterase (CCE) and glutathione S-transferase (GST) multigene families, all of which also have critical roles in resistance to insecticides125,126,127. Among the P450s, the five enzymes associated with insecticide resistance are highly dynamic across the phylogeny, with 24 duplication events and 4 loss events since the last common ancestor of the genus, which is in striking contrast to genes with known developmental roles, eight of which are present as a single copy in all 12 species (C. Robin, personal communication). As with chemoreceptors, specialists seem to lose detoxification genes at a faster rate than generalists. For instance, D. sechellia has lost the most P450 genes; these 14 losses comprise almost one-third of all P450 loss events (Supplementary Table 13) (C. Robin, personal communication). Positive selection has been implicated in detoxification-gene evolution as well, because a search for positive selection among GSTs identified the parallel evolution of a radical glycine to lysine amino acid change in GSTD1, an enzyme known to degrade DDT128. Finally, although metabolic enzymes in general are highly constrained (median ω = 0.045 for enzymes, 0.066 for non-enzymes; MWU, P = 5.7 × 10⁻²⁴), enzymes involved in xenobiotic metabolism evolve significantly faster than other enzymes (median ω = 0.05 for the xenobiotic group versus ω = 0.045 overall, two-tailed permutation test, P = 0.0110; A. J. Greenberg, personal communication).
Metazoans deal with excess selenium in the diet by sequestration in selenoproteins, which incorporate the rare amino acid selenocysteine (Sec) at sites specified by the TGA codon. The recoding of the normally terminating signal TGA as a Sec codon is mediated by the selenocysteine insertion sequence (SECIS), a secondary structure in the 3′ UTR of selenoprotein messenger RNAs. All animals examined so far have selenoproteins; three have been identified in D. melanogaster (SELG, SELM and SPS2; refs 129, 130). Interestingly, although the three known melanogaster selenoproteins are all present in the genomes of the other Drosophila species, in D. willistoni the TGA Sec codons have been substituted by cysteine codons (TGT/TGC). Consistent with this finding, analysis of the seven genes implicated to date in selenoprotein synthesis, including the Sec-specific tRNA, suggests that most of these genes are absent in D. willistoni (R. Guigo, personal communication). D. willistoni thus seems to be the first animal known to lack selenoproteins. If correct, this observation is all the more remarkable given the ubiquity of selenoproteins and the selenoprotein biosynthesis machinery in metazoans, the toxicity of excess selenium, and the protection from oxidative stress mediated by selenoproteins. However, it remains possible that this species encodes selenoproteins in a different way, and this represents an exciting avenue of future research.
Drosophila, like all insects, possesses an innate immune system with many components analogous to the innate immune pathways of mammals, although it lacks an antibody-mediated adaptive immune system131. Immune system genes often evolve rapidly and adaptively, driven by selection pressures from pathogens and parasites132,133,134. The genus Drosophila is no exception: immune system genes evolve more rapidly than non-immune genes, showing both high total divergence rates and specific signs of positive selection135. In particular, 29% of receptor genes involved in phagocytosis seem to evolve under positive selection, suggesting that molecular co-evolution between Drosophila pattern recognition receptors and pathogen antigens is driving adaptation in the immune system135. Somewhat surprisingly, genes encoding effector proteins such as antimicrobial peptides are far less likely to exhibit adaptive sequence evolution. Only 5% of effector genes (and no antimicrobial peptides) show evidence of adaptive evolution, compared to 10% of genes genome-wide. Instead, effector genes seem to evolve by rapid duplication and deletion. Whereas 49% of genes genome-wide, 63% of genes involved in pathogen recognition and 81% of genes implicated in immune-related signal transduction can be found as single-copy orthologues in all 12 species, only 40% of effector genes exist as single-copy orthologues across the genus (χ² = 41.13, P = 2.53 × 10⁻⁸), suggesting rapid radiation of effector protein classes along particular lineages135. Thus, much of the Drosophila immune system seems to evolve rapidly, although the mode of evolution varies across immune-gene functional classes.
Genes encoding sex- and reproduction-related proteins are subject to a wide array of selective forces, including sexual conflict, sperm competition and cryptic female choice, and to the extent that these selective forces are of evolutionary consequence, this should lead to rapid evolution in these genes136 (for an overview see refs 137, 138). The analysis of 2,505 sex- and reproduction-related genes within the melanogaster group indicated that male sex- and reproduction-related genes evolve more rapidly at the protein level than genes not involved in sex or reproduction or than female sex- and reproduction-related genes (Supplementary Fig. 8). Positive selection seems to be at least partially responsible for these patterns, because genes involved in spermatogenesis have significantly stronger evidence for positive selection than do non-spermatogenesis genes (permutation test, P = 0.0053). Similarly, genes that encode components of seminal fluid have significantly stronger evidence for positive selection than ‘non-sex’ genes139. Moreover, protein-coding genes involved in male reproduction, especially seminal fluid and testis genes, are particularly likely to be lost or gained across Drosophila species29,139.
Functional elements in mtDNA are strongly conserved, as expected: tRNAs are relatively more conserved than the mtDNA overall (average pairwise nucleotide distance = 0.055 substitutions per site for tRNAs versus 0.125 substitutions per site overall). We observe a deficit of substitutions occurring in the stem regions of the stem-loop structure in tRNAs, consistent with strong selective pressure to maintain RNA secondary structure, and there is a strong signature of purifying selection in protein-coding genes13. However, despite their shared role in aerobic respiration, there is marked heterogeneity in the rates of amino acid divergence between the oxidative phosphorylation enzyme complexes across the 12 species (NADH dehydrogenase, 0.059 > ATPase, 0.042 > CytB, 0.037 > cytochrome oxidase, 0.020; mean pairwise dN), which contrasts with the relative homogeneity in synonymous substitution rates. A model with distinct substitution rates for each enzyme complex, rather than a single rate, provides a significantly better fit to the data (P < 0.0001), suggesting complex-specific selective effects of mitochondrial mutations13.