Journal of Northeastern University (Natural Science), 2024, Vol. 45, Issue 12: 1706-1716. DOI: 10.12068/j.issn.1005-3026.2024.12.005
• Information & Control •
Jie LIU, Wen-jing TAN, Zhan-shan LI
Received: 2023-06-09
Online: 2024-12-10
Published: 2025-03-18
Contact: Zhan-shan LI
Jie LIU, Wen-jing TAN, Zhan-shan LI. Unsupervised Feature Selection Based on Sparse Self-representation with Manifold Regularization[J]. Journal of Northeastern University(Natural Science), 2024, 45(12): 1706-1716.
Dataset | Samples | Features | Classes |
---|---|---|---|
DBworld | 64 | 4 702 | 2 |
PCMAC | 1 943 | 3 289 | 2 |
TOX-171 | 171 | 5 748 | 4 |
lung | 203 | 3 312 | 5 |
lymphoma | 96 | 4 026 | 9 |
nci9 | 60 | 9 712 | 9 |
JAFFE | 213 | 1 024 | 10 |
warpPIE10P | 210 | 2 420 | 10 |
Isolet | 1 560 | 617 | 26 |
Table 1 Description of the experimental datasets
Dataset | Baseline | LS | MCFS | SPEC | UDFS | RSR | SCFS | NOVRSR | SSRMR |
---|---|---|---|---|---|---|---|---|---|
DBworld | 66.953 | 82.891 | 88.047 | 57.344 | 90.703 | 93.516 | 90.938 | 92.188 | |
±12.503 | ±4.514 | ±1.584 | ±1.116 | ±0.341 | ±0.558 | ±1.683 | ±0 | ||
(4702) | (180) | (120) | (200) | (200) | (200) | (200) | (200) | (200) | |
PCMAC | 50.54 | 50.466 | 50.525 | 55.288 | 50.587 | 50.592 | 50.849 | 62.198 | |
±0.033 | ±0.164 | ±0.024 | ±1.077 | ±0.022 | ±0 | ±0 | ±0.104 | ||
(3289) | (200) | (200) | (60) | (20) | (60) | (200) | (60) | (200) | |
TOX171 | 42.69 | 40.029 | 41.316 | 42.456 | 41.345 | 41.55 | 46.345 | 53.509 | |
±2.661 | ±2.071 | ±0.425 | ±0.726 | ±1.373 | ±1.414 | ±3.537 | ±1.649 | ||
(5748) | (200) | (20) | (20) | (160) | (100) | (200) | (200) | (100) | |
lung | 76.626 | 67.4631 | 85.813 | 57.512 | 62.192 | 76.502 | 87.094 | 78.202 | |
±6.922 | ±0.802 | ±1.13 | ±3.384 | ±4.975 | ±6.893 | ±0.251 | ±8.175 | ||
(3312) | (140) | (200) | (180) | (200) | (200) | (180) | (200) | (200) | |
lymphoma | 59.375 | 53.333 | 61.615 | 46.25 | 58.333 | 54.063 | 65.521 | 60.365 | |
±4.576 | ±5.586 | ±3.642 | ±2.244 | ±5.322 | ±2.256 | ±5.511 | ±5.869 | ||
(4026) | (200) | (60) | (120) | (200) | (180) | (200) | (200) | (160) | |
nci9 | 43.083 | 40.917 | 43.833 | 40 | 47.333 | 39.417 | 43.833 | 55.333 | |
±3.13 | ±3.183 | ±3.617 | ±3.456 | ±2.759 | ±4.058 | ±3.578 | ±3.317 | ||
(9712) | (140) | (40) | (180) | (20) | (200) | (120) | (200) | (200) | |
JAFFE | 85.962 | 70.493 | 82.324 | 66.737 | 80.329 | 85.164 | 88.568 | 95.328 | |
±2.729 | ±4.223 | ±4.839 | ±4.467 | ±5.596 | ±2.229 | ±2.970 | ±1.201 | ||
(1024) | (180) | (180) | (200) | (180) | (100) | (160) | (200) | (100) | |
warpPIE10P | 26.738 | 35.31 | 44.69 | 30.69 | 41.262 | 30.071 | 26.595 | 48.31 | |
±2.016 | ±2.627 | ±1.912 | ±2.354 | ±1.834 | ±2.462 | ±1.349 | ±4.223 | ||
(2420) | (80) | (80) | (200) | (80) | (60) | (20) | (200) | (160) | |
Isolet | 62.083 | 57.205 | 52.375 | 56.644 | 55.705 | 67.631 | 72.356 | 70.737 | |
±2.678 | ±1.759 | ±1.818 | ±2.125 | ±2.385 | ±2.165 | ±2.0 | ±2.132 | ||
(617) | (200) | (180) | (200) | (200) | (200) | (180) | (200) | (200) |
Table 2 Clustering accuracy (ACC, %) on different datasets: mean ± standard deviation, with the number of selected features in parentheses
Dataset | Baseline | LS | MCFS | SPEC | UDFS | RSR | SCFS | NOVRSR | SSRMR |
---|---|---|---|---|---|---|---|---|---|
DBworld | 17.375 | 44.727 | 47.326 | 2.479 | 63.825 | 56.353 | |||
±15.971 | ±6.811 | ±4.008 | ±0 | ±0.882 | ±5.407 | ||||
(4 702) | (120) | (120) | (40) | (200) | (160) | (200) | (160) | (200) | |
PCMAC | 0.008 | 1.381 | 0.471 | 2.033 | 6.926 | 0.03 | 0.104 | 1.592 | |
±0.011 | ±0.236 | ±0 | ±0 | ±1.382 | ±0.016 | ±0 | ±0 | ||
(3 289) | (80) | (80) | (60) | (20) | (60) | (200) | (20) | (40) | |
TOX171 | 14.688 | 12.98 | 9.874 | 10.148 | 13.272 | 12.177 | 23.294 | 31.578 | |
±2.74 | ±0.921 | ±0.034 | ±0.971 | ±0.762 | ±0.987 | ±7.503 | ±0.512 | ||
(5 748) | (200) | (20) | (20) | (180) | (140) | (200) | (200) | (200) | |
lung | 65.168 | 53.709 | 67.189 | 45.702 | 50.832 | 60.323 | 70.218 | 63.707 | |
±2.776 | ±0.9 | ±1.793 | ±4.042 | ±2.446 | ±2.805 | ±0.351 | ±2.07 | ||
(3 312) | (140) | (200) | (200) | (200) | (180) | (180) | (200) | (200) | |
lymphoma | 69.043 | 58.015 | 47.799 | 65.767 | 66.604 | 70.126 | 69.171 | 71.931 | |
±3.416 | ±3.49 | ±3.515 | ±4.057 | ±1.985 | ±3.126 | ±3.523 | ±2.616 | ||
(4 026) | (120) | (200) | (120) | (160) | (180) | (180) | (200) | (200) | |
nci9 | 44.395 | 41.921 | 45.858 | 41.222 | 39.115 | 50.2 | 44.908 | 56.22 | |
±3.41 | ±3.811 | ±3.146 | ±2.96 | ±2.84 | ±2.0 | ±3.086 | ±2.949 | ||
(9 712) | (160) | (60) | (180) | (20) | (200) | (120) | (200) | (200) | |
JAFFE | 86.014 | 69.179 | 85.84 | 69.808 | 85.426 | 83.381 | 88.24 | 93.202 | |
±1.560 | ±2.074 | ±1.601 | ±2.818 | ±1.707 | ±2.019 | ±1.486 | ±1.238 | ||
(1 024) | (180) | (40) | (200) | (160) | (180) | (160) | (200) | (100) | |
warpPIE10P | 26.221 | 33.778 | 31.614 | 41.078 | 27.508 | 26.595 | 47.854 | 58.906 | |
±3.363 | ±1.906 | ±1.814 | ±2.233 | ±2.584 | ±1.392 | ±1.267 | ±3.268 | ||
(2 420) | (180) | (80) | (200) | (80) | (180) | (200) | (200) | (160) | |
Isolet | 77.268 | 73.573 | 69.561 | 66.886 | 71.741 | 79.497 | 79.362 | 80.156 | |
±1.288 | ±0.704 | ±1.122 | ±0.703 | ±1.321 | ±1.056 | ±0.807 | ±1.052 | ||
(617) | (200) | (180) | (200) | (200) | (200) | (180) | (200) | (200) |
Table 3 Normalized mutual information (NMI, %) on different datasets: mean ± standard deviation, with the number of selected features in parentheses
Method | DBworld | PCMAC | TOX171 | lung | lymphoma | nci9 | JAFFE | warpPIE10P | Isolet |
---|---|---|---|---|---|---|---|---|---|
SCFS | 44.337 | 37.518 | 332.559 | 14.911 | 357.831 | 1 413.197 | 1.520 | 15.725 | 3.805 |
NOVRSR | 13.204 | 234.375 | 722.079 | 180.455 | 291.301 | 3 033.359 | 11.20 | 80.941 | 2.963 |
SSRMR | 124.265 | 23.767 | 8 707.735 | 80.199 | 10.971 | 148.519 | 2.937 | 3.268 | 45.972 |
Table 4 Running time comparison of different methods