In the analysis of gene expression data with two or more disease conditions/groups (e.g., diseased and normal, responder and nonresponder, or multiple stages/subtypes), differential analysis has been extensively conducted to identify key differences and has important implications. Network analysis takes a system perspective and can be more informative than analysis limited to simple statistics such as the mean and variance. In differential network analysis, a common practice is to first estimate a gene expression network for each condition/group and then apply spectral clustering to the network difference(s) to identify the key genes and biological mechanisms that lead to the differences. Compared to “simple” analysis such as regression, differential network analysis can be more challenging because of the significantly larger number of parameters. In this study, taking advantage of the increasing popularity of multidimensional profiling data, we develop an assisted analysis strategy and propose incorporating regulator information to improve the identification of key genes (those that lead to the differences in gene expression networks). An effective computational algorithm is developed. Comprehensive simulation shows that the proposed approach can outperform benchmark alternatives in identification accuracy. Using The Cancer Genome Atlas lung adenocarcinoma data, we analyze the expressions of genes in the KEGG cell cycle pathway, assisted by copy number variation data. The proposed assisted analysis leads to identification results similar to those of the alternatives but to different estimates. Overall, this study can deliver an efficient and cost-effective way of improving differential network analysis.