Title: | A new fuzzy clustering-based imputation method |
Author(s): | Le T. |
Keywords: | Distribution based imputation; Fuzzy c-means; Gene expression analysis; Missing data imputation |
Abstract: | Fuzzy clustering has been used in numerous research disciplines and commercial applications to identify groups of real-world objects. Most fuzzy clustering algorithms require complete datasets; however, real-world datasets may have missing values due to technical limitations. To address this problem, we present a new algorithm in which data are clustered using the Fuzzy C-Means algorithm, and the fuzzy partition is then approximated by a probabilistic data distribution model, which is used for missing value imputation as well as for defuzzification. Because it uses a distribution-based approach, our method is most appropriate for datasets where the data are non-uniform. We show that our method outperforms seven popular imputation algorithms on uniform and non-uniform artificial datasets as well as on real datasets with an unknown data distribution model. |
Issue Date: | 2018 |
Publisher: | Institute of Electrical and Electronics Engineers Inc. |
URI: | http://digital.lib.ueh.edu.vn/handle/UEH/62273 |
DOI: | https://doi.org/10.1109/CSCI46756.2018.00265 |
ISBN: | 9781728113609 |
Appears in Collections: | Conference Papers |
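The abstract describes clustering with Fuzzy C-Means and then using the resulting partition to impute missing values. The sketch below illustrates only that general idea and is not the paper's algorithm: it replaces the probabilistic distribution model with a simpler, commonly used stand-in, namely a membership-weighted average of cluster centers. The function names `fuzzy_c_means` and `impute_with_fcm`, the fuzzifier `m=2.0`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means on a complete matrix X of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each sample sum to 1
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def impute_with_fcm(X, c, m=2.0):
    """Fill NaNs using centers learned from complete rows (simplified stand-in)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    complete = ~missing.any(axis=1)
    centers, _ = fuzzy_c_means(X[complete], c, m=m)
    X_filled = X.copy()
    for i in np.where(~complete)[0]:
        obs = ~missing[i]
        # Memberships computed from the observed features only.
        d = np.linalg.norm(X[i, obs] - centers[:, obs], axis=1)
        d = np.fmax(d, 1e-12)
        inv = d ** (-2.0 / (m - 1.0))
        u = inv / inv.sum()
        # Missing entries become the membership-weighted average of the centers.
        X_filled[i, ~obs] = u @ centers[:, ~obs]
    return X_filled


# Toy data (assumed): two well-separated groups with two missing entries.
rng = np.random.default_rng(1)
A = rng.normal([0.0, 0.0], 0.1, size=(20, 2))
B = rng.normal([5.0, 5.0], 0.1, size=(20, 2))
X = np.vstack([A, B])
X[0, 1] = np.nan
X[25, 0] = np.nan
filled = impute_with_fcm(X, c=2)
```

In this simplified version, centers are learned from complete rows only, so it would need adjusting for datasets in which most rows contain at least one missing value.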