The free energy profile for the translocation of the compound through the membrane

We further showed that, while most high-order combinations are trivial extensions of their subsets, real datasets do contain high-order combinations that have stronger associations with some disease phenotypes than single SNPs and low-order SNP combinations. We also evaluated the effect of two strategies for enhancing the statistical power of high-order SNP combination search: filtering out SNP combinations whose discriminative power is lower than or similar to that of their subsets, and constraining the search space with known biological gene sets. Leveraging the improved statistical power of this framework, we explored the functional interactions within the SNP combinations discovered from three real case-control datasets and revealed a positive connection between the increase in discriminative power of a SNP combination over its subsets and the functional coherence among the genes covered by the combination. Finally, we investigated two representative high-order SNP combinations, one discovered from a lung cancer case-control dataset and one from a kidney transplant-rejection case-control dataset, and showed that the genes covered by the two patterns are enriched with molecular interaction networks that are highly relevant to the risk of lung cancer and the risk of rejection after kidney transplant, respectively. These results demonstrate the ability of our approach to find statistically significant and biologically relevant high-order patterns, but we likely find only a subset of all possible SNP patterns of interest. In particular, some interesting patterns could be eliminated during the discriminative pattern mining step or in the χ² jump filtering step. Other existing approaches may discover some of these missed patterns, but are likely to miss many of the high-order patterns we find. Thus, what we provide is a well-founded and efficient approach to pattern discovery in SNP datasets.
Given that there has been a lack of tools for higher-order combination analysis due to computational and statistical challenges, the proposed framework is expected to help discover novel genotype-phenotype associations missed by existing approaches, which mostly rely on univariate analysis, pathway/network enrichment analyses based on univariate statistics, or epistasis analysis of low-order SNP combinations. In addition to the proposed framework itself, some general observations made in this study could also help the development of other computational techniques that search for high-order SNP combinations and exploit functional insights. In particular, the two strategies for enhancing statistical power to cope with multiple hypothesis testing in the combinatorial search could be leveraged by other approaches.
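
The first of the two strategies, discarding combinations that add little discriminative power over their subsets, can be illustrated with a minimal sketch. This is not the framework's actual implementation: it assumes binary SNP indicators, uses the Pearson χ² statistic of a 2x2 table as the measure of discriminative power, and defines the "jump" as the gain over the best proper subset. The function names, the toy data, and the `min_jump` threshold are all hypothetical.

```python
from itertools import combinations


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: one margin is empty
    return n * (a * d - b * c) ** 2 / denom


def pattern_chi2(snps, samples, labels):
    """Chi-square of 'every SNP in the pattern is present' against case/control status."""
    a = b = c = d = 0
    for genotype, is_case in zip(samples, labels):
        present = all(genotype[s] for s in snps)
        if present and is_case:
            a += 1
        elif present:
            b += 1
        elif is_case:
            c += 1
        else:
            d += 1
    return chi2_2x2(a, b, c, d)


def chi2_jump(snps, samples, labels):
    """Gain in chi-square of a pattern over its best proper, non-empty subset (assumed definition)."""
    full = pattern_chi2(snps, samples, labels)
    best_subset = max(
        (
            pattern_chi2(sub, samples, labels)
            for k in range(1, len(snps))
            for sub in combinations(snps, k)
        ),
        default=0.0,  # a single SNP has no proper non-empty subset
    )
    return full - best_subset


def filter_by_jump(candidates, samples, labels, min_jump):
    """Keep only the patterns whose jump exceeds the (hypothetical) threshold."""
    return [p for p in candidates if chi2_jump(p, samples, labels) > min_jump]


# Toy data: each sample is a tuple of 0/1 indicators for SNP 0, SNP 1, SNP 2.
SAMPLES = [
    (1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 1, 1),  # cases carry SNP 0 and SNP 1 together
    (1, 0, 0), (0, 1, 1), (1, 0, 1), (0, 1, 0),  # controls carry one or the other
]
LABELS = [True] * 4 + [False] * 4  # True = case, False = control

CANDIDATES = [(0, 1), (0, 1, 2), (0, 2)]
KEPT = filter_by_jump(CANDIDATES, SAMPLES, LABELS, min_jump=1.0)
```

In this toy dataset the pair (0, 1) separates cases from controls far better than either SNP alone, so it survives the filter, whereas (0, 1, 2) is a trivial extension of (0, 1) and is discarded. Enumerating all subsets is exponential in pattern size, so a practical search would need pruning of the kind the framework relies on.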
