Screening of Key Genes in Pre-eclampsia and Construction of a Risk-Assessment Model Based on Machine-Learning Algorithms
Abstract

Objective: To identify potential key genes associated with pre-eclampsia through bioinformatics analysis, construct predictive models using machine-learning algorithms, and evaluate the models' performance in predicting pre-eclampsia.

Methods: The pre-eclampsia-related gene-expression microarray datasets GSE10588, GSE66273, and GSE30186 were downloaded from the Gene Expression Omnibus (GEO). Data were normalized in R, and differentially expressed genes (DEGs) were identified. LASSO regression was applied to further filter the DEGs. Based on the selected DEGs, six machine-learning models were built in R and their performance was validated: logistic regression (LR), random forest (RF), support vector machine (SVM), K-nearest neighbors (KNN), neural network (NN), and eXtreme gradient boosting (XGBoost).

Results: A total of 1,363 genes were extracted from the three datasets. LASSO regression narrowed these to 265 candidate key genes. Multivariate analysis ultimately identified four genes closely associated with pre-eclampsia: EVI5, GCLM, LEP, and SYNPO2L. Six machine-learning models were constructed using these four key genes. Receiver operating characteristic (ROC) analysis showed that all models achieved AUC > 0.9: LR (AUC = 0.983, 95% CI 0.942-0.998), RF (AUC = 0.961, 95% CI 0.912-0.987), SVM (AUC = 0.936, 95% CI 0.879-0.972), KNN (AUC = 0.970, 95% CI 0.924-0.992), NN (AUC = 0.916, 95% CI 0.854-0.958), and XGBoost (AUC = 0.952, 95% CI 0.900-0.982). The differences among the models' AUCs were not statistically significant (P > 0.05).

Conclusion: This study identified four key genes linked to pre-eclampsia through integrated bioinformatics analysis. Predictive models built on these genes can accurately predict the occurrence of pre-eclampsia, suggesting that the four genes may serve as potential biomarkers for early diagnosis and as therapeutic targets in pre-eclampsia.
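To make the reported metric concrete, the following is a minimal sketch of how an AUC and a 95% confidence interval of the kind listed in the Results can be computed from a model's predicted scores. The abstract does not say how the intervals were derived, so the percentile bootstrap used here is an assumption, not the authors' method. The function names `auc_score` and `bootstrap_ci` are hypothetical, and the labels and scores are synthetic values chosen only for illustration.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed CI method)."""
    rng = random.Random(seed)
    pairs = list(zip(labels, scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(pairs) for _ in pairs]
        ys = [y for y, _ in sample]
        # Skip resamples that contain only cases or only controls.
        if 0 < sum(ys) < len(ys):
            stats.append(auc_score(ys, [s for _, s in sample]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi


# Synthetic example: 1 = pre-eclampsia, 0 = control (illustrative only).
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.90, 0.80, 0.70, 0.60, 0.35, 0.40, 0.30, 0.20, 0.10, 0.05]

auc = auc_score(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = {lo:.3f}-{hi:.3f}")
```

With so few samples the bootstrap interval is wide. Comparing AUCs between models, as in the P > 0.05 statement above, would need a separate paired test that this sketch does not cover.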