This paper presents results for a dimensionality reduction technique based on radial basis function (RBF) theory. The technique uses RBFs to map multidimensional data points into a low-dimensional space by interpolating the previously calculated positions of so-called control points. We analyse several ways of selecting the control points: the regularized orthogonal least squares method, random selection, and stratified selection. The positions of the control points in the low-dimensional space are found by principal component analysis. Experiments were carried out on 8 real and artificial data sets. On all data sets, the RBF technique combined with random or stratified selection outperformed RBF with the regularized orthogonal least squares algorithm in terms of computation time. We demonstrate that random and stratified selection of control points are efficient and offer an acceptable balance between projection error (stress) and computation time.
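The pipeline described above (select control points, place them with PCA, interpolate all other points with RBFs) can be illustrated with a minimal sketch. It is not the authors' implementation. The Gaussian kernel, the width `sigma`, the small `ridge` term, the normalized form of the stress, and all function names are assumptions made for illustration. Only random selection is shown.

```python
import numpy as np

def pca_project(X, d):
    """Project the rows of X onto their top-d principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centred data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def gaussian_kernel(A, B, sigma):
    """Gaussian RBF values between every row of A and every row of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def rbf_embed(X, n_control, d=2, sigma=1.0, ridge=1e-8, seed=0):
    """Embed X in d dimensions by RBF interpolation of control points."""
    rng = np.random.default_rng(seed)
    # Random selection of control points (one of the strategies compared).
    idx = rng.choice(len(X), size=n_control, replace=False)
    C = X[idx]
    # Low-dimensional positions of the control points, found by PCA.
    Yc = pca_project(C, d)
    # Solve for RBF weights so the map reproduces Yc at the control points.
    # The ridge term is an assumption added for numerical stability.
    Phi = gaussian_kernel(C, C, sigma)
    W = np.linalg.solve(Phi + ridge * np.eye(n_control), Yc)
    # Map every data point through the fitted RBF interpolant.
    Y = gaussian_kernel(X, C, sigma) @ W
    return Y, idx, Yc

def stress(X, Y):
    """Normalized stress between pairwise distances in X and in Y
    (assumed form; the paper's exact definition may differ)."""
    iu = np.triu_indices(len(X), k=1)
    dx = np.sqrt(((X[:, None] - X[None]) ** 2).sum(-1))[iu]
    dy = np.sqrt(((Y[:, None] - Y[None]) ** 2).sum(-1))[iu]
    return np.sqrt(((dx - dy) ** 2).sum() / (dx ** 2).sum())
```

Calling `rbf_embed(X, n_control=20, sigma=2.0)` on an `(n, m)` array returns the embedding, the indices of the control points, and their PCA positions. The linear system has only `n_control` unknowns per output dimension, which is why the cost of selecting the control points can dominate the total running time.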
Breast cancer is the most common cancer in women and one of the leading causes of death among women worldwide. Patients with a pathological mutation of a BRCA gene have a 65% lifetime probability of developing breast cancer. It is known that the illness has a different cause in such patients. In this study, we propose a new approach for predicting BRCA mutation carriers by methodically applying knowledge discovery steps and data mining methods. An alternative BRCA risk assessment model was created using a decision tree classifier. The biggest challenge was the very small size and imbalanced nature of the initial dataset, which was collected by clinicians during a 4-year clinical trial. Iterative optimization of the initial dataset, selection of suitable algorithms, and tuning of their parameters resulted in higher classifier performance, with prediction accuracy acceptable for clinical use. In total, three data mining problems were analyzed using eleven data mining algorithms.