PERSONALIZED RECOMMENDATIONS USING NEIGHBORHOOD-BASED ITEM COLLABORATIVE FILTERING

Tamabayeva K.

27.11.2024 290

11(128)

10. Computer science, computer engineering, and control

Cite as:

Tamabayeva K. PERSONALIZED RECOMMENDATIONS USING NEIGHBORHOOD-BASED ITEM COLLABORATIVE FILTERING // Universum: технические науки : electronic scientific journal. 2024. 11(128). URL: https://7universum.com/ru/tech/archive/item/18745 (accessed: 05.12.2025).

DOI - 10.32743/UniTech.2024.128.11.18745

ABSTRACT

This article discusses the use of machine learning methods to develop an information system that provides users with recommendations for choosing products and services. Recommender systems are now an essential tool in e-commerce, helping users explore the products offered on a website. In this paper, we present a practical method for item-based collaborative filtering that evaluates the relevance of various items. The suggested approach can be useful for various e-commerce purposes. We tested the method on a sizable real-world dataset with a complete cold-start scenario, and it produced high-quality results.

Keywords: user recommendations, collaborative filtering, machine learning, features, neighborhood-based models, data processing.

Introduction: Nowadays, most e-commerce projects include recommendation algorithms to improve user satisfaction. Because increasing annual sales and revenue is one of a company’s primary goals, recommendation system approaches are developing rapidly year over year, and as e-commerce has grown in recent years, recommendation algorithms have improved considerably. A common goal of recommendation systems is to improve the customer experience with individualized recommendations based on previous feedback. Collaborative filtering (CF) is a widely used method in recommendation systems. We focus on item-based CF and analyze its results when applied with machine learning. The key problem in personalized recommendation is to design an efficient algorithm that can satisfy users’ requirements [1].

Personalized user experiences in online solutions have boosted revenue by 6% to 10% in the past year [2]. This personalized experience is made possible by data on customer preferences, including likes and ratings of products on these sites. This is why services like Netflix, Facebook, and Amazon, among others, have built recommendation algorithms that use the ratings of users as a score to provide recommended items to similar users. Combined effects of customization and recommendation have helped Netflix save over $1 billion annually and content recommendation has an impact on more than 75% of content viewers.

However, the relationship between recommendation performance and customer satisfaction has not received much attention in research. To account for these causal relationships, more sophisticated research methodologies are required, as recommendation performance and customer satisfaction are likely to form intricate causal relationships [17].

Both item-based and user-based methods are used in collaborative filtering. User-based CF predicts the rating that the target user would likely give to a certain item. The algorithm represents users as a graph of their preferences, so that an item is recommended based on the history of similar users. Similarly, item-based CF suggests a specific item based on data from related items.

Collaborative filtering systems recommend products based on metrics of item and/or user similarity. Collaborative filtering can be neighborhood-based or model-based. Among the neighborhood-based collaborative filtering examples that have been researched the most are user-based and item-based approaches [3] [4].

Collaborative filtering is frequently used in early work on recommendation methods to predict users’ preferences from their interaction histories [5] [6]. Matrix factorization (MF), the most popular of the numerous CF techniques, projects users and items into a shared vector space and estimates a user’s preference for a particular item using the inner product of the user’s and the item’s vectors [5] [6]. Matrix factorization algorithms decompose the user-item interaction matrix into the product of two smaller rectangular matrices [7]. This family of techniques gained widespread recognition during the Netflix Prize challenge, when Simon Funk reported his findings in a 2006 blog post and shared them with the research community [8].
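As a minimal sketch of the inner-product prediction described above, the following Python snippet uses NumPy with randomly generated factors. The matrix names P and Q, the toy sizes, and the random initialization are assumptions made for illustration; in practice the factors are learned from observed ratings, and this is not the configuration used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_factors = 4, 5, 2  # toy sizes, chosen only for illustration

# Hypothetical latent factors; a real system learns these from the ratings.
P = rng.normal(size=(n_users, n_factors))  # one row per user
Q = rng.normal(size=(n_items, n_factors))  # one row per item


def predict(u: int, i: int) -> float:
    """Predicted preference of user u for item i: inner product of their vectors."""
    return float(P[u] @ Q[i])


# The full matrix of predictions is the product of the two smaller matrices.
R_hat = P @ Q.T
```

Each entry of R_hat equals the corresponding call to predict, which is the sense in which the interaction matrix is approximated by a product of two rectangular matrices.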

Another line of work is item-based neighborhood methods, for example Kabbur et al. [9] [7] [10]. Using a pre-computed item-to-item similarity matrix, these methods evaluate how similar a candidate item is to the items in the user’s interaction history.
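The idea of scoring candidates against a user's history can be sketched as follows. The similarity matrix S, the function name score_candidates, and the choice of summing similarities are illustrative assumptions, not the exact scoring rule of any cited method.

```python
from typing import Dict, List

import numpy as np

# Hypothetical pre-computed item-to-item similarity matrix (symmetric, 4 items).
S = np.array([
    [1.0, 0.8, 0.1, 0.0],
    [0.8, 1.0, 0.3, 0.2],
    [0.1, 0.3, 1.0, 0.9],
    [0.0, 0.2, 0.9, 1.0],
])


def score_candidates(history: List[int], sim: np.ndarray) -> Dict[int, float]:
    """Score each unseen item by its summed similarity to the items in the history."""
    seen = set(history)
    return {
        i: float(sim[i, history].sum())
        for i in range(sim.shape[0])
        if i not in seen
    }


# A user who interacted with items 0 and 1.
scores = score_candidates([0, 1], S)
best = max(scores, key=scores.get)  # candidate with the highest score
```

Because the similarity matrix is computed ahead of time, scoring at recommendation time only requires looking up and summing a few rows.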

The collaborative filtering method has several benefits, including simple implementation and strong adaptability.

Methodology: The most common approach to collaborative filtering is based on neighborhood models. Its original form, shared by virtually all earlier CF systems, is user-user based [11]. User-user techniques estimate unknown ratings from the ratings of like-minded users. Later, an analogous item-item approach [10] [12] gained popularity. In those methods, a rating is estimated from the known ratings that the same user gave to similar items. The item-item technique is frequently preferable because of its better accuracy and scalability [12]. Additionally, item-item approaches make it easier to explain how predictions were made. We focus mostly on item-item approaches.

Figure 1. Methodology Figure

The dataset is an updated version of the 2014 release of the Amazon review dataset. Like the previous version, it contains reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image attributes), and links (also viewed/also bought graphs) [13]. We used a slice of the reviews data containing user, rating, and review information.

Ratings are provided for $n$ products from $m$ customers (users). To distinguish between users and items, we reserve the indexing letter $u$ for users and $i$ for items. A rating $r_{ui}$ shows user $u$’s preference for item $i$, with higher ratings indicating a stronger preference. For example, values could be integers ranging from 1, which indicates little interest, to 5, which indicates a strong interest. We distinguish predicted ratings from known ones by using the notation $\hat{r}_{ui}$ for the predicted value of $r_{ui}$.

CF analyzes relationships between users and interdependencies among items, in order to identify new user-item associations [14] [15].

For instance, some CF systems identify pairs of items that tend to be rated similarly, or like-minded users with similar rating or purchase histories, in order to infer unknown relationships between users and items. The only information needed is historical user activity, such as previous transactions or product ratings. One of the main benefits of CF is that it is domain-free, yet it can handle aspects of the data that are frequently challenging to profile using content-based techniques [14].

At the core of most item-oriented techniques is a similarity measure between items, where $s_{ij}$ denotes the similarity of items $i$ and $j$. It is frequently based on the Pearson correlation coefficient. Our objective is to predict $r_{ui}$, the unobserved rating of user $u$ for item $i$. Using the similarity measure, we identify the $k$ items rated by $u$ that are most similar to $i$; we denote this set of neighbors by $S^k(i;u)$. The predicted value $\hat{r}_{ui}$ is a weighted average of the ratings of the neighboring items:

$$\hat{r}_{ui} = \frac{\sum_{j \in S^k(i;u)} s_{ij}\, r_{uj}}{\sum_{j \in S^k(i;u)} s_{ij}}$$
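The weighted-average prediction described above can be sketched in a few lines of Python. The function name predict_rating, the dictionary inputs, and the choice to keep only neighbors with positive similarity are assumptions made for this illustration, not details taken from the paper.

```python
from typing import Dict, Optional


def predict_rating(
    user_ratings: Dict[int, float],   # item id -> rating given by the user
    sim_to_target: Dict[int, float],  # item id -> similarity to the target item
    k: int = 2,
) -> Optional[float]:
    """Similarity-weighted average of the user's ratings on the k nearest items."""
    # Assumption: only neighbors with positive similarity contribute.
    neighbors = sorted(
        (
            (sim_to_target[j], rating)
            for j, rating in user_ratings.items()
            if sim_to_target.get(j, 0.0) > 0.0
        ),
        reverse=True,
    )[:k]
    total_similarity = sum(s for s, _ in neighbors)
    if total_similarity == 0:
        return None  # no usable neighbors, so no prediction
    return sum(s * r for s, r in neighbors) / total_similarity


# The user rated items 1, 2, and 3; the similarities are to the target item.
prediction = predict_rating({1: 5.0, 2: 3.0, 3: 1.0}, {1: 0.9, 2: 0.6, 3: 0.1}, k=2)
```

With k = 2 only items 1 and 2 are used, so the prediction is (0.9 · 5 + 0.6 · 3) / (0.9 + 0.6) = 4.2.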

Collaborative filtering relies on measuring the similarity between users and between items, so an important step is choosing the similarity metric. Our method uses a Pearson correlation-based similarity metric. To compute the Pearson similarity of items i and j, we first identify the users who rated both items and then compute the correlation over those users’ ratings [16].
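A minimal sketch of this computation is shown below. It assumes that each item's ratings are stored as a dictionary from user id to rating and that the means are taken over the co-rating users only, which is one common variant; the function name and the fallback value of 0.0 are likewise illustrative choices.

```python
import math
from typing import Dict


def pearson_item_similarity(ratings_i: Dict[str, float], ratings_j: Dict[str, float]) -> float:
    """Pearson correlation of two items over the users who rated both."""
    common = sorted(set(ratings_i) & set(ratings_j))
    if len(common) < 2:
        return 0.0  # assumption: too few co-ratings means no evidence of similarity
    xi = [ratings_i[u] for u in common]
    xj = [ratings_j[u] for u in common]
    mean_i = sum(xi) / len(xi)
    mean_j = sum(xj) / len(xj)
    cov = sum((a - mean_i) * (b - mean_j) for a, b in zip(xi, xj))
    var_i = sum((a - mean_i) ** 2 for a in xi)
    var_j = sum((b - mean_j) ** 2 for b in xj)
    if var_i == 0 or var_j == 0:
        return 0.0  # constant ratings: the correlation is undefined
    return cov / math.sqrt(var_i * var_j)


item_i = {"u1": 5, "u2": 3, "u3": 4, "u4": 1}
item_j = {"u1": 4, "u2": 2, "u3": 3, "u5": 5}
similarity = pearson_item_similarity(item_i, item_j)
```

Here users u1, u2, and u3 rated both items, and their ratings move together exactly, so the similarity is 1.0; the ratings from u4 and u5 are ignored.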

The main focus of neighborhood approaches is on calculating the relationships between users or, alternatively, between items. An item-oriented approach assesses a user’s preference for an item based on that user’s ratings of similar items. These techniques view users as baskets of rated items and thereby transform users into the item space. By doing this, we can relate items directly to items rather than having to compare users to items. Neighborhood models are most effective at detecting very localized relationships. They rely on a few significant neighborhood relations and often ignore the vast majority of a user’s ratings [7].

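One of the metrics that we used to evaluate the recommendation system is the mean absolute error (MAE), the average absolute difference between the actual rating $r_{ui}$ and the predicted rating $\hat{r}_{ui}$ over the set $T$ of test ratings:

$$\mathrm{MAE} = \frac{1}{|T|} \sum_{(u,i) \in T} \left| r_{ui} - \hat{r}_{ui} \right|$$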

The recommendation system predicts user ratings more accurately when the MAE is lower. Other statistical accuracy metrics include correlation and root mean squared error (RMSE).

RMSE is a quadratic scoring metric: it also measures the average magnitude of the error, but it squares the errors before averaging them and then takes the square root, so large errors are penalized more heavily. We evaluate the collaborative filtering models on both RMSE and MAE.
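Both metrics can be computed directly from paired lists of actual and predicted ratings. The sketch below uses hypothetical toy values, not data from the experiments.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: square root of the average squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical ratings; the last prediction is off by 2 stars.
actual = [5.0, 3.0, 4.0, 1.0]
predicted = [4.5, 3.5, 4.0, 3.0]

mae_value = mae(actual, predicted)    # (0.5 + 0.5 + 0 + 2) / 4 = 0.75
rmse_value = rmse(actual, predicted)  # sqrt((0.25 + 0.25 + 0 + 4) / 4), about 1.06
```

The single large error raises RMSE well above MAE, which illustrates why the two metrics are reported together.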

Comparing the results of the neighborhood-based CF models (Table 1), we can see that the KNN-based models do not differ much: KNNBaseline has the lowest RMSE and MAE, and KNNWithMeans has the second-lowest MAE. Note that the results may vary with the size of the dataset.

Table 1.

Comparison of KNN models

Model	RMSE	MAE
KNNBasic	1.0013	0.6875
KNNWithMeans	1.0055	0.6672
KNNWithZScore	1.0167	0.6728
KNNBaseline	0.9663	0.6618

Precision, recall, and F1 score are commonly employed metrics for evaluating recommendation systems. These metrics assess the accuracy, completeness, and balance of the recommendations provided by the system.

Precision measures the proportion of correct recommendations out of the total recommendations made, focusing on the relevancy of the suggestions. Recall, on the other hand, measures the proportion of correctly recommended items out of all the relevant items, focusing on the coverage of the recommendations. F1 score is a balanced metric that combines precision and recall into a single value, providing an overall measure of the system’s performance.
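These definitions translate directly into code. The sketch below computes the three metrics from confusion-matrix counts; the function name and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration. The example call uses the counts reported for threshold 3 in Table 2.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 score from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Counts taken from the threshold-3 row of Table 2.
p, r, f1 = precision_recall_f1(tp=2318, fp=119, fn=50)
```

Rounded to two decimals, the values are 0.95, 0.98, and 0.96, in agreement with that row of the table.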

These metrics are calculated using a test dataset where the relevance of items is known. By comparing the recommended items with the known relevant items, precision, recall, and F1 score provide a quantitative assessment of how well the recommendation system performs in terms of providing relevant and comprehensive recommendations.

It’s important to note that these metrics are just a few examples among many others used in evaluating recommendation systems. Depending on the specific requirements and context, additional metrics like mean average precision, normalized discounted cumulative gain, and hit rate may be utilized. The selection of evaluation metrics depends on the goals and characteristics of the recommendation system being evaluated.

We set a threshold to decide whether a recommended item is relevant for the user. If the predicted rating is above the threshold of 3.5, we consider the item relevant for the user and recommend it. Otherwise, it is not recommended.
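The thresholding step can be sketched as follows. The sketch assumes that an item counts as truly relevant when its actual rating is also above the threshold, which the text does not state explicitly; the function name and the toy ratings are hypothetical.

```python
from typing import Sequence, Tuple


def confusion_counts(
    actual: Sequence[float],
    predicted: Sequence[float],
    threshold: float = 3.5,
) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for threshold-based relevance."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        relevant = a > threshold      # assumption: ground truth uses the same threshold
        recommended = p > threshold   # the item is recommended if predicted above threshold
        if recommended and relevant:
            tp += 1
        elif recommended and not relevant:
            fp += 1
        elif not recommended and relevant:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


# Hypothetical actual and predicted ratings for five user-item pairs.
counts = confusion_counts([5, 4, 2, 3, 5], [4.6, 3.2, 3.9, 2.8, 4.1])
```

The resulting counts are the inputs needed to compute precision, recall, and F1 score.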

The precision at k of the CF-based algorithm depends strongly on the neighborhood size k [4]. Prediction accuracy and recommendation precision reach an extremum at around k = 4, as Figure 2 shows.

Figure 2 shows the evaluation of the metrics. Note that the dataset contains around ten thousand rows and that the ratings are not evenly distributed: the majority of ratings are 4 or 5.

Figure 2. Diagram of precision at k and recall

Table 2.

Results

Threshold	TP	FP	TN	FN	Precision	Recall	F1-score
1	2500	0	0	0	1.0	1.0	1.0
2	2390	75	6	21	0.96	0.99	0.98
3	2318	119	13	50	0.95	0.98	0.96
4	2042	263	54	141	0.88	0.93	0.91
5	237	39	720	1504	0.86	0.13	0.23

The results are shown in Table 2.

Results and discussion: In this paper, we introduced a practical approach to item-based collaborative filtering for e-commerce recommendation systems. The proposed method achieved good accuracy and performance. By effectively evaluating the relevance of various items, the method can provide personalized product recommendations to users, enhancing their browsing experience and increasing sales.

The evaluation metrics, including MAE, RMSE, and correlation, indicated the accuracy and effectiveness of the recommendation system. The low MAE and RMSE values and high correlation value showed that the method accurately predicted user preferences and established a strong linear relationship between predicted and actual ratings.

The suggested method can be applied in various e-commerce domains to improve recommendation systems and enhance user experience. Future research can focus on further optimizing the method, exploring different similarity metrics, and considering additional factors such as item popularity and user demographics to enhance the personalization of recommendations.

Overall, the presented method contributes to the field of recommendation systems and provides a practical approach for item-based collaborative filtering in e-commerce. Its potential impact on enhancing user satisfaction and increasing sales makes it a valuable tool for e-commerce companies.

References:

1. Lin Z. An empirical investigation of user and system recommendations in e-commerce // Decision Support Systems. – 2014. – Vol. 68. – P. 111–124.
2. Gräßer F. Neighborhood-based collaborative filtering for therapy decision support // 2017.
3. Cai Y. Typicality-based collaborative filtering recommendation // IEEE Transactions on Knowledge and Data Engineering. – 2014. – Vol. 26. – P. 766–779.
4. Parameswaran S., Luo E., Nguyen T. Q. Patch matching for image denoising using neighborhood-based collaborative filtering // IEEE Transactions on Circuits and Systems for Video Technology. – 2018. – Vol. 28. – P. 392–401.
5. Koren Y., Bell R. Advances in collaborative filtering // Recommender Systems Handbook. – 2011. – P. 145–186. [Electronic resource]. – Access mode: https://link.springer.com/chapter/10.1007/978-0-387-85820-3_5 (accessed: 20.11.2024).
6. Koren Y., Bell R., Volinsky C. Matrix factorization techniques for recommender systems // Computer. – 2009. – Vol. 42. – P. 30–37.
7. Koren Y. Factorization meets the neighborhood: A multifaceted collaborative filtering model // Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. – 2008. – P. 426–434.
8. Netflix update: Try this at home / [Electronic resource]. – Access mode: https://sifter.org/simon/journal/20061211.html (accessed: 20.11.2024).
9. Kabbur S., Ning X., Karypis G. FISM: Factored item similarity models for top-n recommender systems // Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. – 2013. – Vol. Part F128815. – P. 659–667.
10. Linden G., Smith B., York J. Amazon.com recommendations: Item-to-item collaborative filtering // IEEE Internet Computing. – 2003. – Vol. 7. – P. 76–80.
11. Herlocker J. L., Konstan J. A., Borchers A., Riedl J. An algorithmic framework for performing collaborative filtering // Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. – 1999. – P. 230–237. [Electronic resource]. – Access mode: https://dl.acm.org/doi/10.1145/312624.312682 (accessed: 20.11.2024).
12. Sarwar B., Karypis G., Konstan J., Riedl J. Item-based collaborative filtering recommendation algorithms // Proceedings of the 10th International Conference on World Wide Web. – 2001. – P. 285–295.
13. Ni J., Li J., McAuley J. Justifying recommendations using distantly-labeled reviews and fine-grained aspects // 2017.
14. Hu Y., Volinsky C., Koren Y. Collaborative filtering for implicit feedback datasets // Proceedings - IEEE International Conference on Data Mining, ICDM. – 2008. – P. 263–272.
15. Wang X. ItemSilkRoad: Recommending items from information domains to social users // SIGIR 2017 – Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. – 2017. – P. 185–194.
16. Xia J. E-commerce product recommendation method based on collaborative filtering technology // Proceedings - 2016 International Conference on Smart Grid and Electrical Automation, ICSGEA 2016. – 2016. – P. 90–93.
17. Kim J., Choi I., Li Q. Customer satisfaction of recommender systems: Examining accuracy and diversity in several types of recommendation approaches // 2017.