Identification of breast cancer and its molecular subtypes via Raman spectroscopy combined with machine learning algorithms
Abstract

Objective: To develop a simple, rapid, and convenient method for identifying breast cancer and its molecular subtypes.

Methods: Raman spectra of normal breast cells and of breast cancer cells of different molecular subtypes were collected with a laser confocal Raman spectrometer, and the Raman peaks were assigned to their corresponding substances. The spectra were first smoothed and denoised with a Savitzky-Golay filter (window size 9), then baseline-corrected by adaptive iteratively reweighted penalized least squares, and outliers were removed by principal component analysis. Three algorithms based on different principles, partial least squares discriminant analysis (PLS-DA), K-nearest neighbors (KNN), and support vector machine (SVM), were used to build one model that distinguishes normal breast cells from breast cancer cells and another that distinguishes breast cancer cells of different molecular subtypes.

Results: The spectral shapes and Raman peak positions of normal breast cells and breast cancer cells were similar, but their intensities differed markedly. PLS-DA and SVM distinguished normal breast cells from breast cancer cells with accuracies above 92.03% and 90.67%, respectively. For the different molecular subtypes of breast cancer cells, the accuracies of PLS-DA and SVM were (83.66 ± 2.77)% and (90.55 ± 0.06)%, respectively.

Conclusions: Raman spectroscopy combined with machine learning algorithms can accurately distinguish normal breast cells from breast cancer cells and identify breast cancer cells of different molecular subtypes.