A central problem in customer relationship management (CRM) is clustering customers into meaningful groups. This problem, often called customer segmentation, has become especially important in the twenty-first century because the rapid development of e-commerce generates databases containing millions of customers. Recent machine learning algorithms have successfully clustered a wide range of datasets, including images, text documents, and news. Inspired by these accomplishments, we design a new segmentation model that combines a deep neural network with a self-supervised probabilistic clustering technique. The new model is more flexible and adapts better to the diversity of customer datasets than the heuristic algorithms currently used in CRM. The model relies on feature engineering, the process of cleaning, preparing, and transforming raw data into features that are then fed into a model to produce clusters. To perform feature engineering, we combine a novel categorical encoding method from economics with an autoencoder, a recent machine learning data transformation method, to extract useful patterns from the original data. Our experiments with the full model on retail transaction data from a supermarket chain in Ho Chi Minh City, Vietnam, show that our algorithm can produce useful, explainable customer clusters.
Springer Science and Business Media Deutschland GmbH