ANALYSIS OF CUSTOMER REVIEWS USING NLP AND MACHINE LEARNING METHODS

Tynysbekov P., Kultyshev E.
Cite as:
Tynysbekov P., Kultyshev E. ANALYSIS OF CUSTOMER REVIEWS USING NLP AND MACHINE LEARNING METHODS // Universum: технические науки: electronic scientific journal. 2025. 5(134). URL: https://7universum.com/ru/tech/archive/item/20095 (accessed: 05.12.2025).
DOI - 10.32743/UniTech.2025.134.5.20095

 

ABSTRACT

In the modern digital economy, online product reviews have become a critical source of information for customers and businesses alike. This study focuses on analyzing customer reviews for the product "Miracle Noodle Zero Carb, Gluten Free Shirataki Pasta, Spinach Angel Hair" available on Amazon. Using Natural Language Processing (NLP) techniques, including sentiment analysis with the VADER tool and topic modeling through Latent Dirichlet Allocation (LDA), the research aims to uncover key factors influencing customer satisfaction and dissatisfaction. The dataset was preprocessed using tokenization, lemmatization, and stopword removal to ensure data quality. Sentiment analysis results demonstrated a strong correlation between textual sentiment and star ratings, while topic modeling revealed critical aspects such as taste, texture, and smell influencing customer opinions. Furthermore, machine learning models, particularly logistic regression, achieved high accuracy in classifying review sentiment. The findings offer actionable insights for product improvement and marketing strategies. This study confirms that systematic NLP-based review analysis is a powerful tool for product optimization and strategic decision-making. 


 

Keywords: Natural Language Processing; Sentiment Analysis; Topic Modeling; Customer Reviews; Machine Learning; Product Improvement; Text Mining; VADER; LDA.


 

I. INTRODUCTION

In the modern digital economy, consumer-generated content, particularly online product reviews, serves as a critical source of information for both customers and businesses. These reviews reflect customers' experiences and perceptions, offering valuable insights into product quality, usability, and overall satisfaction. However, manually analyzing large volumes of such unstructured text is impractical. Natural Language Processing (NLP) and machine learning techniques address this challenge by providing effective tools for extracting meaningful patterns from textual data.

This paper focuses on the analysis of customer reviews for the product "Miracle Noodle Zero Carb, Gluten Free Shirataki Pasta, Spinach Angel Hair" available on Amazon. Despite having a significant number of positive reviews, the product also receives a considerable amount of negative feedback, particularly concerning its texture, odor, and preparation process. Understanding these contrasting opinions is vital for product improvement, customer satisfaction, and market competitiveness.

The objective of this study is to apply NLP methods to identify key themes in customer feedback, assess sentiment trends, and provide data-driven recommendations for product enhancement. By applying sentiment analysis using the VADER tool and topic modeling with Latent Dirichlet Allocation (LDA), we aim to uncover both the strengths and weaknesses of the product as perceived by customers. These insights are crucial for guiding decisions in product development, marketing, and customer experience strategies.

II. MATERIALS AND METHODS

2.1 Dataset and Product Selection

The data used in this study was sourced from the publicly available Amazon Reviews dataset provided by Jianmo Ni et al. [1], which contains over five million reviews in the "Grocery and Gourmet Food" category. In order to ensure both relevance and analytical depth, the dataset was filtered to include only reviews for the product Miracle Noodle Zero Carb, Gluten Free Shirataki Pasta, Spinach Angel Hair. This particular product was selected due to its substantial volume of consumer feedback, characterized by a notably high number of both five-star and one-star reviews. Such a distribution enabled a meaningful exploration of opposing customer opinions. Additionally, to enhance the validity of the analysis, only reviews marked as verified purchases were retained. The distribution of star ratings for this product revealed a large cluster of extremely positive reviews, contrasted with a significant number of highly negative ones, as shown in Figure 1. This disparity signaled the presence of polarizing attributes that warranted deeper investigation through natural language processing techniques. The trend of average ratings over time is illustrated in Figure 2.
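The filtering step described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the reviews are stored as one JSON object per line with the fields `asin`, `overall`, `verified`, and `reviewText`, and the identifier `B00EXAMPLE` is a placeholder, since the paper does not report the product's ASIN.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

# Placeholder identifier: the paper does not state the product's ASIN.
TARGET_ASIN = "B00EXAMPLE"


def filter_reviews(lines: Iterable[str], asin: str = TARGET_ASIN) -> Iterator[dict]:
    """Yield verified-purchase reviews of one product from JSON-lines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        review = json.loads(line)
        # Field names are assumed to follow the public Amazon review dump.
        if review.get("asin") == asin and review.get("verified") is True:
            yield review


def rating_distribution(reviews: Iterable[dict]) -> Counter:
    """Count reviews per star rating (the quantity plotted in Figure 1)."""
    return Counter(int(review["overall"]) for review in reviews)


# Tiny synthetic sample, for illustration only.
sample = [
    '{"asin": "B00EXAMPLE", "overall": 5.0, "verified": true, "reviewText": "Love it"}',
    '{"asin": "B00EXAMPLE", "overall": 1.0, "verified": true, "reviewText": "Smells fishy"}',
    '{"asin": "B00EXAMPLE", "overall": 1.0, "verified": false, "reviewText": "Awful"}',
    '{"asin": "B00OTHER", "overall": 4.0, "verified": true, "reviewText": "Fine"}',
]
kept = list(filter_reviews(sample))
distribution = rating_distribution(kept)
print(distribution)
```

In practice the same generator could be fed an open file handle, so the full category file never has to be held in memory.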

 


Figure 1. Rating distribution for the selected product

 


Figure 2. Average rating per year

 

2.1.1 Rationale for Product and Dataset Selection

The selection of Miracle Noodle Zero Carb, Gluten Free Shirataki Pasta, Spinach Angel Hair was guided by several considerations. First, the product belongs to a niche segment targeting health-conscious consumers, which typically generates strongly polarized feedback due to specific dietary needs and sensory expectations. Second, the product amassed a large number of both highly positive and highly negative reviews, offering a rich source of data for sentiment and topic analysis. Products with predominantly neutral or homogeneous reviews would not provide sufficient variability for robust NLP modeling. Additionally, customer feedback on grocery items tends to contain vivid language related to sensory attributes such as taste, smell, and texture, making it particularly suitable for sentiment analysis tools like VADER. By focusing on a product with strong emotional engagement and substantial review volume, we ensured that the analysis would yield meaningful, generalizable insights into consumer behavior in the specialized food market.

2.2 Text Preprocessing

Before applying machine learning and NLP algorithms, the review texts were preprocessed to ensure consistency and data quality. The steps were tokenization, lemmatization, stopword removal, and lowercasing. Tokenization split reviews into words, while lemmatization, using NLTK’s WordNetLemmatizer [8], reduced words to their base forms (e.g., "running" to "run"). Stopwords were removed using NLTK’s built-in list, with minor adjustments to retain food-related terms like "noodle" and "dish." Non-alphabetic characters were eliminated, and all text was converted to lowercase. This standardized preprocessing was crucial for improving the performance and accuracy of the subsequent sentiment and topic modeling tasks.
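The preprocessing steps listed above can be illustrated with a short sketch. It is an approximation under stated assumptions, not the authors' pipeline: tokenization is done with a simple regular expression, `DEMO_STOPWORDS` is a tiny stand-in for NLTK's English list, and the `lemmatize` argument is a hypothetical hook that defaults to leaving words unchanged.

```python
import re
from typing import AbstractSet, Callable, List

# Small illustrative stopword list; the study used NLTK's English list.
DEMO_STOPWORDS: AbstractSet[str] = frozenset(
    {"a", "an", "and", "i", "is", "it", "of", "the", "this", "to", "was"}
)


def preprocess(
    text: str,
    stopwords: AbstractSet[str] = DEMO_STOPWORDS,
    lemmatize: Callable[[str], str] = lambda word: word,
) -> List[str]:
    """Lowercase, keep alphabetic tokens, drop stopwords, then lemmatize."""
    # The pattern keeps runs of letters only, which discards digits and punctuation.
    tokens = re.findall(r"[a-z]+", text.lower())
    return [lemmatize(token) for token in tokens if token not in stopwords]


print(preprocess("This is the BEST noodle dish, 10/10!"))
# -> ['best', 'noodle', 'dish']
```

With NLTK installed and its `stopwords` and `wordnet` corpora downloaded, `set(stopwords.words("english"))` and `WordNetLemmatizer().lemmatize` can be passed in place of the defaults. Note that `lemmatize` treats words as nouns unless a part of speech is given, so mapping "running" to "run" requires `pos="v"`.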

2.3 Sentiment Analysis

To assess the emotional tone of the reviews, we employed the VADER (Valence Aware Dictionary and sEntiment Reasoner) algorithm, which is particularly well-suited for social media and product review texts. VADER computes sentiment scores across three dimensions—positive, neutral, and negative—and also provides a compound score that summarizes the overall polarity. The relationship between sentiment scores and star ratings is demonstrated in Figure 3. The impact of lemmatization on sentiment categories is shown in Figure 4, while the changes in average compound sentiment scores depending on preprocessing techniques are presented in Figure 5.
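As a rough sketch of this scoring step, the helper below attaches VADER's four scores to each review and derives a sentiment group from the compound score. The ±0.05 cut-offs are the thresholds conventionally suggested in the VADER documentation; the paper does not state which thresholds were used, so they are an assumption here, as are the function names.

```python
from typing import Iterable, List, Protocol


class Analyzer(Protocol):
    """Any object exposing VADER's polarity_scores(text) -> dict interface."""

    def polarity_scores(self, text: str) -> dict: ...


def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a sentiment group."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_reviews(texts: Iterable[str], analyzer: Analyzer) -> List[dict]:
    """Attach VADER scores and a sentiment group to each review text."""
    rows = []
    for text in texts:
        scores = analyzer.polarity_scores(text)  # keys: neg, neu, pos, compound
        group = label_from_compound(scores["compound"])
        rows.append({"text": text, **scores, "group": group})
    return rows
```

With the `vaderSentiment` package installed, an instance of `SentimentIntensityAnalyzer` from `vaderSentiment.vaderSentiment` can be passed as `analyzer`; NLTK provides an equivalent class in `nltk.sentiment.vader`.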

 


Figure 3. Distribution of sentiment scores across different star ratings

 


Figure 4. Sentiment group distribution before and after lemmatization

 


Figure 5. Comparison of average compound sentiment scores

 

2.4 Topic Modeling

In order to uncover recurring themes and concerns in customer feedback, we conducted topic modeling using the Latent Dirichlet Allocation (LDA) algorithm. LDA was applied separately to high-rating (4–5 stars) and low-rating (1–2 stars) subsets to identify the distinct topics that emerge in each sentiment group.

The structure of topics discovered in highly positive reviews is visualized in Figure 6, while the structure for low-rated reviews is presented in Figure 7.
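The split-then-model procedure can be sketched as below. The rating boundaries follow the text (4–5 stars versus 1–2 stars, with three-star reviews left out of both subsets); the number of topics, the number of passes, and the function names are assumptions, because the paper does not report them. Gensim is imported inside `fit_lda` so that the splitting helper can be used without it.

```python
from typing import Iterable, List, Sequence, Tuple


def split_by_rating(
    reviews: Iterable[Tuple[int, List[str]]],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Split (stars, tokens) pairs into high (4-5) and low (1-2) subsets.

    Three-star reviews belong to neither subset and are dropped.
    """
    high, low = [], []
    for stars, tokens in reviews:
        if stars >= 4:
            high.append(tokens)
        elif stars <= 2:
            low.append(tokens)
    return high, low


def fit_lda(token_lists: Sequence[List[str]], num_topics: int = 5, num_words: int = 8):
    """Fit a Gensim LDA model and return the top words of each topic."""
    # Imported here so split_by_rating stays usable without Gensim installed.
    from gensim import corpora, models

    dictionary = corpora.Dictionary(token_lists)
    corpus = [dictionary.doc2bow(tokens) for tokens in token_lists]
    lda = models.LdaModel(
        corpus=corpus,
        id2word=dictionary,
        num_topics=num_topics,  # assumed value; not reported in the paper
        passes=10,              # assumed value
        random_state=0,         # fixed seed for repeatable topics
    )
    return lda.show_topics(num_topics=num_topics, num_words=num_words, formatted=False)


# Synthetic token lists, for illustration only.
demo = [
    (5, ["love", "pasta"]),
    (3, ["ok"]),
    (1, ["slimy"]),
    (4, ["great"]),
    (2, ["rubbery"]),
]
high, low = split_by_rating(demo)
```

Calling `fit_lda(high)` and `fit_lda(low)` separately then yields one set of topics per sentiment group, which is the comparison the figures visualize.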

 


Figure 6. LDA visualization for high-rated reviews

 


Figure 7. LDA visualization for low-rated reviews

 

2.5 Sentiment Classification

To complement the unsupervised approaches, we developed machine learning models for binary classification of sentiment polarity. The text data was vectorized using the Term Frequency-Inverse Document Frequency (TF-IDF) method [9], which transforms words into numerical features. Multiple classification algorithms were tested, including logistic regression, naïve Bayes, random forest, and support vector classifiers. Among these, logistic regression achieved the best results. Classification metrics for the training and testing datasets are summarized in Figures 8 and 9, respectively.
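A comparison of this kind might be set up as in the following sketch. Two points are assumptions rather than details from the paper: the binary labels are derived from star ratings (4–5 as positive, 1–2 as negative, three-star reviews excluded), and all estimators use scikit-learn defaults apart from `max_iter` and `random_state`. scikit-learn is imported inside `candidate_pipelines` so the labelling helpers run without it.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def binary_label(stars: int) -> Optional[int]:
    """Return 1 for 4-5 stars, 0 for 1-2 stars, None for 3 stars (assumed rule)."""
    if stars >= 4:
        return 1
    if stars <= 2:
        return 0
    return None


def to_dataset(reviews: Iterable[Tuple[int, str]]) -> Tuple[List[str], List[int]]:
    """Turn (stars, text) pairs into parallel lists of texts and binary labels."""
    texts, labels = [], []
    for stars, text in reviews:
        label = binary_label(stars)
        if label is not None:
            texts.append(text)
            labels.append(label)
    return texts, labels


def candidate_pipelines() -> Dict[str, object]:
    """Build one TF-IDF pipeline per classifier family named in the text."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    classifiers = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "naive_bayes": MultinomialNB(),
        "random_forest": RandomForestClassifier(random_state=0),
        "support_vector": LinearSVC(),
    }
    # Each pipeline refits its own vectorizer, so no test data leaks into TF-IDF.
    return {name: make_pipeline(TfidfVectorizer(), clf) for name, clf in classifiers.items()}


# Synthetic reviews, for illustration only.
demo = [(5, "love this pasta"), (3, "it is ok"), (1, "slimy and fishy"), (4, "great low carb")]
texts, labels = to_dataset(demo)
```

Each pipeline can then be fitted on a training split and scored on a held-out split, for example with `sklearn.metrics.classification_report`.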

 


Figure 8. Classification metrics for training data

 


Figure 9. Classification metrics for test data

 

All steps of the data analysis and modeling pipeline were implemented in Python using widely adopted libraries such as Scikit-learn, NLTK, and Gensim. Visualization was performed with Matplotlib and Seaborn, and the overall results are discussed in the next section.

III. RESULTS AND DISCUSSION

3.1 Sentiment Analysis Results

The application of VADER sentiment analysis [2] demonstrated a strong and consistent correlation between computed sentiment scores and customer-assigned star ratings. Positive sentiment predominated in five-star reviews, while negative sentiment was characteristic of one-star evaluations, validating the method’s effectiveness. Preprocessing, particularly lemmatization, significantly improved sentiment differentiation by reducing the proportion of neutral texts and clarifying emotional signals. Initially, many reviews clustered around neutrality; however, after lemmatization, a clearer separation into positive and negative sentiments was observed. These results underscore the critical importance of robust preprocessing in enhancing the accuracy and interpretability of sentiment analysis outcomes.
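One simple way to inspect the relationship between sentiment scores and star ratings is to average the compound score within each rating level, as in the sketch below. The function name and the numbers in `demo` are purely illustrative and are not results from the study.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Tuple


def mean_compound_by_rating(pairs: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average the VADER compound score within each star rating."""
    grouped = defaultdict(list)
    for stars, compound in pairs:
        grouped[stars].append(compound)
    # Sorting by rating makes any upward or downward trend easy to read.
    return {stars: mean(values) for stars, values in sorted(grouped.items())}


# Synthetic (stars, compound) pairs, for illustration only.
demo = [(1, -0.6), (1, -0.2), (3, 0.1), (5, 0.7), (5, 0.9)]
averages = mean_compound_by_rating(demo)
print(averages)
```

If the averages rise steadily from one star to five stars, the text-based scores and the ratings move together; a rank correlation coefficient would give a single-number summary of the same pattern.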

3.2 Topic Modeling Interpretation

Latent Dirichlet Allocation (LDA) [3] was applied to identify dominant themes in positive and negative reviews. Positive feedback centered on health benefits, compatibility with low-carb diets, and taste satisfaction, with frequent mentions of terms like “love,” “great,” “carb,” and “pasta.” In contrast, negative reviews highlighted sensory issues, such as unpleasant texture, odor, and preparation difficulty, with keywords like “rubbery,” “fish,” and “slimy.” This thematic contrast confirms that customer evaluations are driven largely by sensory perceptions, where the same product attributes are praised by some and criticized by others.

3.3 Sentiment Classification Performance

In the sentiment classification task, logistic regression achieved high predictive performance. As shown in Figures 8 and 9, the model exhibited strong precision, recall, and F1-scores on both the training and testing datasets [5]. These results support the suitability of logistic regression for binary sentiment classification of customer reviews.

Overall, the results from both unsupervised and supervised approaches converge toward the same insights: consumer satisfaction with the product is highly polarized, driven largely by texture, smell, and ease of preparation. These factors consistently emerge as the primary differentiators between positive and negative experiences, suggesting concrete avenues for product reformulation and customer communication.

3.4 Evaluation Metrics and Research Limitations

In addition to overall accuracy, the performance of the classification models was assessed using precision, recall, and F1-score. Precision is the proportion of true positives among all positive predictions, reflecting the model's ability to avoid false positives. Recall is the proportion of true positives among all actual positive instances, reflecting the model's sensitivity. The F1-score, the harmonic mean of precision and recall, provides a balanced assessment of model performance and is particularly valuable for imbalanced datasets. In this study, high F1-scores across both the positive and negative sentiment classes demonstrated the robustness of the logistic regression model.

However, several limitations must be acknowledged. First, the dataset, although sufficiently large for initial modeling, may not capture the full diversity of customer experiences, particularly for niche markets. Second, the reliance on English-language reviews may introduce cultural biases in sentiment interpretation. Third, while VADER and LDA are powerful analytical tools, more sophisticated models such as BERT-based sentiment analysis [7] could provide deeper semantic understanding in future studies. These limitations suggest avenues for expanding and refining the approach in subsequent research.
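The definitions of precision, recall, and F1-score given in this section can be made concrete with a short computation. The sketch below uses the standard formulas; the function name and the label vectors are illustrative and are not taken from the study's data.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute precision, recall and F1 for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
positive_scores = binary_metrics(y_true, y_pred, positive=1)
negative_scores = binary_metrics(y_true, y_pred, positive=0)
print(positive_scores, negative_scores)
```

Reporting the metrics once per class, as done here, is what makes it possible to check that performance holds for both the positive and the negative sentiment group.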

3.5 Potential Applications and Future Work

The insights derived from customer review analysis using NLP techniques can be applied in several domains. For product development teams, systematically extracted themes and sentiments enable targeted improvements in product features, such as addressing recurring complaints about texture and smell. In marketing, understanding positive themes, such as health benefits and dietary compatibility, allows companies to highlight product strengths in promotional campaigns [7]. Furthermore, the same methodology can be scaled to monitor customer feedback across different product lines or brands, enabling continuous quality improvement. Future work may involve the integration of deep learning models, such as transformers (e.g., BERT, RoBERTa), to capture more nuanced emotional expressions and contextual information. Expanding the analysis to multilingual datasets could also offer insights into regional and cultural variations in customer perceptions, providing a more comprehensive view of consumer sentiment at a global scale.

IV. CONCLUSION

This study demonstrated the effectiveness of Natural Language Processing (NLP) techniques in extracting meaningful insights from customer reviews of the product Miracle Noodle Zero Carb, Gluten Free Shirataki Pasta, Spinach Angel Hair. By combining sentiment analysis using the VADER algorithm and topic modeling via Latent Dirichlet Allocation (LDA), we were able to uncover both the emotional polarity and thematic focus of consumer feedback.

The results confirmed a strong correlation between sentiment scores and user-assigned star ratings, indicating that textual sentiment analysis can serve as a reliable proxy for customer satisfaction. The topic modeling further revealed distinct patterns in positive and negative reviews: satisfied customers emphasized health benefits and dietary alignment, while dissatisfied customers primarily complained about texture, smell, and preparation complexity. Supervised machine learning models, particularly logistic regression with TF-IDF vectorization, classified review sentiment with high accuracy. This finding supports the hypothesis that textual data from online reviews contains sufficient signal to predict customer satisfaction and dissatisfaction automatically.

The key practical implications of this research are clear. Product teams should focus on improving the texture and reducing the odor of the noodles, as these are the most frequently cited causes of negative sentiment. Additionally, simplifying preparation instructions or offering ready-to-eat options may further enhance user satisfaction. From a marketing perspective, emphasizing the product’s compatibility with low-carb and ketogenic diets, as well as its health-conscious positioning, can reinforce its appeal to its primary audience.

More broadly, this research highlights the potential of data-driven analysis in guiding product improvement and customer engagement strategies. By systematically processing and interpreting user-generated content, businesses can align their offerings more closely with customer expectations and address critical issues proactively. The approach outlined here can be readily extended to other product categories, offering a scalable solution for continuous feedback monitoring and consumer insight generation. Companies operating in diverse industries, from food and beverage to consumer electronics, can adopt similar approaches to systematically monitor and respond to customer sentiment in real time. Implementing automated feedback systems based on machine learning models can significantly enhance product development cycles and customer relationship management.

In future studies, combining NLP techniques with structured customer survey data and sales metrics could create a holistic framework for predicting market success and optimizing business strategies. Overall, this study lays the groundwork for integrating AI-driven customer voice analysis into strategic decision-making processes across industries.

 

References:

  1. Ni, J., Li, J., & McAuley, J. (2019). Justifying Recommendations using Distantly-Labeled Reviews and Fine-Grained Aspects. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing (EMNLP).
  2. Hutto, C. J., & Gilbert, E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media.
  3. Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet Allocation. Journal of Machine Learning Research, 3, 993–1022.
  4. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv preprint arXiv:1301.3781.
  5. Liu, B. (2012). Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167.
  6. Pang, B., & Lee, L. (2008). Opinion Mining and Sentiment Analysis. Foundations and Trends in Information Retrieval, 2(1–2), 1–135.
  7. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of NAACL-HLT.
  8. Loria, S. (2018). TextBlob: Simplified Text Processing. TextBlob Documentation. https://textblob.readthedocs.io/en/dev/
  9. Rajaraman, A., & Ullman, J. D. (2011). Mining of Massive Datasets. Cambridge University Press.
  10. Cambria, E., Schuller, B., Xia, Y., & Havasi, C. (2013). New Avenues in Opinion Mining and Sentiment Analysis. IEEE Intelligent Systems, 28(2), 15–21.

Information about the authors

Master's student, Kazakh-British Technical University, Kazakhstan, Almaty

PhD Candidate, Kazakh-British Technical University, Kazakhstan, Almaty
