Supervised by: Ujjayanta Bhaumik MSC, B.Tech (Hons). Ujjayanta Bhaumik is currently pursuing his PhD in Computer Science and Physics with a focus on Virtual Reality in the Light and Lighting Lab, KU Leuven. He worked as a creative software developer for over 2 years. He has a masters in Computer Graphics, Vision and Imaging from University College London.
With increasing choices available to consumers on the internet, recommendation systems have become prevalent because they significantly shorten the distance between a customer’s needs and their satisfaction. These systems are used on a wide variety of platforms, and new methods have been designed to identify and predict users’ content preferences, most notably content-based and collaborative filtering.
This paper assesses the strengths and weaknesses of both recommendation system algorithms. Using a combination of metrics, including efficiency, cost, and data accessibility, we discuss their relative advantages and disadvantages in different settings. Finally, we assess the viability of both models from a business perspective, evaluating whether one recommendation system outperforms the other.
Recommendation systems are a subclass of information filtering systems that attempt to assist a user in making decisions without adequate personal experience of the alternatives. These systems help businesses increase profits because they give the users of a platform personalized content and a better product search experience, which results in users spending more time on the platform. However, each type of recommendation system has its own strengths and weaknesses, so it is essential for a business to decide which one to use. This paper assesses the two main types of recommendation system algorithms: content-based filtering and collaborative filtering.
The purpose of this paper is to define the different business settings in which a particular recommendation system is most useful. In a modern context, this process proves paramount because of the efficiency that can be derived from optimizing the recommendation system that a company uses. Practically, choosing a system that drives efficiency with a minimal cost almost always results in positive business outcomes. Specifically, it was found that “constructing accurate recommendation systems … is important for assisting users as well as increasing business profitability” (Sun and Lebanon 471).
We begin by explaining each algorithm’s underlying principles and methods for making predictions. After defining a variety of business settings and their respective characteristics, we discuss both algorithms’ caveats and pitfalls and determine how accurate each algorithm’s predictions are in each setting. Two measures are taken into consideration: the efficiency of the algorithm in predicting data, which addresses how close its estimated predictions are to the genuine user ratings, and the business revenue generated in a given period of time, which reflects its ability to satisfy the businesses that implement it (Lendave, 2021). On this basis, we evaluate whether content-based filtering or collaborative filtering should be favored in each business setting.
An Introduction to Content-Based and Collaborative Filtering Recommendation Systems
Many businesses have employed a wide range of techniques in order to enhance their services. In particular, a satisfying customer experience is critical to generating much of their revenue. Recommendation systems, also referred to as recommender systems, are algorithms present in many information technologies that businesses commonly use to increase satisfaction (and, in some cases, retention) rates, consequently gaining larger profits.
Although there are many recommendation systems in existence, the two main types are content-based and collaborative filtering. While both are designed to support consumers in their decision-making process, whether deciding on a movie or purchasing a new product online, they use different methods and data-gathering processes to generate their respective recommendations. Because such systems combine different methods of data collection, information delivery, and analysis of the user’s needs, a system must satisfy the following criteria to be considered ideal in any business setting (Knotzer, 2008):
- Consider and apply a number of different types of information in its predictive analysis (gathered through both explicit and implicit methods).
- Contain valid reasons behind recommendations.
- Provide a rough estimate of their accuracy (in real life, these systems are unable to predict every single decision correctly due to their inability to estimate/take into account everything as well as the irrationality of human behavior).
- Show adequate response times in respect of the delivery of recommendations.
With this, we now look into the basic principles and methods of each recommendation system.
Content-Based Filtering Recommendation Systems
The main idea behind content-based filtering recommendation systems is to recommend to a particular user items similar to those the same user has previously rated highly. Take a movie platform, for example: if this system is implemented, it will target movies with the same (or a similar) genre (which can be measured on a scale), the same director or actors, or even a similar length. A profile is then built for each user from the items they have rated, as in Figure 1 below:
Figure 1. A sample user profile in a movie platform.
Notice that for each item (in this case, movies 1 to X), the user has provided ‘ratings’ for each feature, whether through explicit or implicit methods. Many methods are available for constructing a user profile, that is, a summary of what these ratings reveal about the user’s preferences, which is then used to make new predictions for the user. The simplest is to take the weighted average of all the rated item profiles, placing more significance on the features that matter more (e.g., genre may be considered a better indicator than duration, so it may be given a higher weight). However, one can also normalize the weights using the user’s average rating, or use other methods, including the construction of a utility matrix (which is not the primary focus of this paper).
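The weighted-average construction described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the three features (genre, director, duration), the feature scores, the ratings, and the function names are all hypothetical and are not taken from Figure 1.

```python
import numpy as np

# Hypothetical item profiles: one row per rated movie, one column per feature
# (genre score, director score, normalised duration). All values are invented.
item_profiles = np.array([
    [0.9, 0.2, 0.5],   # movie 1
    [0.8, 0.4, 0.7],   # movie 2
    [0.1, 0.9, 0.3],   # movie 3
])
ratings = np.array([5.0, 4.0, 1.0])  # the user's ratings of movies 1 to 3


def build_user_profile(item_profiles, ratings, feature_weights=None):
    """Rating-weighted average of item profiles, optionally scaled per feature."""
    ratings = np.asarray(ratings, dtype=float)
    profile = ratings @ item_profiles / ratings.sum()
    if feature_weights is not None:
        # e.g. weight genre more heavily than duration
        profile = profile * np.asarray(feature_weights, dtype=float)
    return profile


def build_centered_profile(item_profiles, ratings):
    """Variant that first subtracts the user's average rating, so that
    below-average ratings push a feature's score down instead of up."""
    centered = np.asarray(ratings, dtype=float) - np.mean(ratings)
    return centered @ item_profiles / np.abs(centered).sum()


user_profile = build_user_profile(item_profiles, ratings)
centered_profile = build_centered_profile(item_profiles, ratings)
```

With these invented numbers, the plain weighted average scores the genre feature highest, while the mean-centered variant gives the director feature a negative score because the one movie that scored high on it was rated well below the user's average.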
We then make corresponding predictions for a new item, for example, a movie (X+1), based on its feature values. Here, the content-based recommendation system applies a similarity measure, a mathematical measure of the degree of similarity between one item (commonly represented as a vector) and another. Many similarity metrics are in use. The main ones include cosine similarity (the cosine of the angle between two vectors, which indicates how closely one points in the direction of the other) and Euclidean distance (the square root of the sum of squared differences between the two vectors’ components; squaring eliminates negative values). These range from simple to much more complex methods, but all have the same goal of constructing a model that consistently predicts and recommends to the user the items with the highest predicted ratings (and therefore the highest expected user satisfaction) (Wang, Liang et al., 2015).
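As a rough illustration of the two measures named above, the following Python sketch computes cosine similarity and the standard Euclidean distance (the square root of the sum of squared component differences), then ranks candidate items against a user profile. The profile vector, the candidate movies, and their feature values are hypothetical.

```python
import numpy as np


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1 = same direction, 0 = orthogonal)."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


def euclidean_distance(u, v):
    """Square root of the sum of squared component differences."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.linalg.norm(diff))


# Hypothetical user profile and candidate item profiles (same feature order).
user_profile = [0.78, 0.35, 0.56]
candidates = {
    "movie_a": [0.9, 0.3, 0.6],
    "movie_b": [0.1, 0.9, 0.2],
}

# Rank candidates from most to least similar to the user's profile.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(user_profile, candidates[name]),
    reverse=True,
)
```

Note that the two measures point in opposite directions: a higher cosine similarity means a closer match, whereas a smaller Euclidean distance does.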
Collaborative Filtering Recommendation Systems
Unlike content-based methods, which use information about a user and the items themselves, collaborative filtering relies on user-item interactions: it uses the interactions between different users and items to find similarities between users, and these similarities produce new recommendations (Roy, 2020).
Figure 2. Principal method of collaborative filtering
Formally speaking (refer to Figure 2), to make recommendations to a user X using this algorithm, we first find a group of users in the platform’s database whose ratings are similar to X’s ratings (that is, users who share similar rating patterns with X). This requires a definition of ‘similarity’ between users, for which, as in content-based filtering, there are many options to choose from. Then, we can use the other preferences of this set of users (that is, for Items A, B, and C) to estimate whether user X would like a particular item or not. In a real-world business setting, there will be many users and items to analyze; therefore, it is very common to use a user-item interaction matrix.
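A minimal sketch of this user-based approach is given below in Python. It is one possible instance rather than the method shown in Figure 2: the rating matrix is invented, a rating of 0 is assumed to mean "not rated", cosine similarity over co-rated items is chosen as the similarity notion, and the neighborhood size k is an arbitrary choice.

```python
import numpy as np

# Hypothetical user-item interaction matrix: rows are users, columns are items.
# A value of 0 is assumed to mean "not rated".
R = np.array([
    [5, 4, 0, 1],   # user X (index 0) has not rated the item at index 2
    [5, 5, 4, 1],
    [4, 5, 5, 2],
    [1, 2, 1, 5],
], dtype=float)


def user_similarity(u, v):
    """Cosine similarity computed only over items both users have rated."""
    both = (u > 0) & (v > 0)
    if not both.any():
        return 0.0
    a, b = u[both], v[both]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def predict_rating(R, user, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings of item."""
    raters = [o for o in range(R.shape[0]) if o != user and R[o, item] > 0]
    neighbours = sorted(
        ((user_similarity(R[user], R[o]), o) for o in raters), reverse=True
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total <= 0:
        return None  # no usable neighbours, so no prediction can be made
    return sum(sim * R[o, item] for sim, o in neighbours) / total


predicted = predict_rating(R, user=0, item=2)
```

Here the two users whose rating patterns most resemble user X's both rated the missing item highly, so the predicted rating is high as well. Cosine similarity on raw ratings is used only for brevity; other similarity notions would slot into `user_similarity` in the same way.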
Examples of explicit data collection include a user’s ratings of a particular item, a user’s ranking of a collection of items from most to least preferred, and written comments. Implicit data collection draws on signals such as page clicks or purchase records. Because collaborative filtering predicts items for users without requiring an understanding of the items’ features, its accuracy depends on how much of this data it gathers on public platforms: the more accurate its predictions are expected to be, the more information it needs to consider. This may lead to efficiency issues as well as a potential invasion of privacy, since building user profiles relies on collecting personal data. This will be covered later in detail with regard to viability.
It is also worth noting that the types of data collection vary from one business setting to another. The next section therefore defines the types of business settings clearly, so that each recommendation system’s performance in each setting can be judged comprehensively.
Types of Business Settings and their Characteristics
At present, numerous business settings leverage the power of recommendation systems to fulfill their consumers’ wants and needs. Companies like Amazon, Facebook, and Netflix are leaders in this field, as they harness these technologies to add value to their platforms. In line with these organizations, three segments of the business world make particular use of recommendation systems: e-commerce markets, social media platforms, and streaming services. Each of these segments plays a massive role in the global economy and has its roots in recommendation systems, and they will consequently be the focus of this section.
E-Commerce Businesses
Electronic commerce, also known as e-commerce, refers to a business model that involves transactions between a seller and a consumer via the internet, often through websites or apps. The U.S. Census Bureau estimates that from the first to the second quarter of 2022 alone, e-commerce sales grew 2.7%, indicating the sector’s large presence today and its clear future potential (U.S. Census Bureau, 2022). On these platforms, recommendation systems serve mainly to connect customers to products that might match their interests, based on buying habits observed in other purchases (Schafer et al., 1999). In a market dominated by few-click transactions, building customer loyalty through the recommendation of good products is integral to e-commerce success.
Social Media Businesses
Intuitively, a key component of all social media platforms is the feed, because of the massive amount of time consumers spend on it. The effectiveness of these feeds rests on recommendation systems, which must suggest content related to the user’s other activity. Furthermore, these systems are also used to recommend other users to connect with, expanding a user’s network on sites like LinkedIn, Facebook, and Instagram (Fayyaz et al., 2020). The contribution of recommendation systems to social media businesses is therefore clear: they drive both the content feeds and the connections between platform customers.
Streaming Service Businesses
Streaming services like Netflix, Hulu, and Spotify frequently have sections on their platforms that recommend movies or shows based on what has already been watched on a user’s account. Accordingly, these companies utilize collaborative filtering, because they take user-supplied data as input and produce recommendations from it. Although many have recently taken issue with the processes behind recommendation systems, because they are controlled by humans who may be subject to bias, these systems are nonetheless useful to streaming services, which captivate their audiences by understanding user preferences (O’Dair & Fry, 2019).
Advantages and Disadvantages of Content-Based Recommendation & Collaborative Filtering on Business Settings
Content-based filtering does not need data from other users; all that is required is a user’s own browsing history and recent purchases. This makes it preferable for businesses that may not have access to many users, such as start-ups and businesses in a niche. Because content-based recommenders are highly tailored to a specific user’s preferences and tastes, they are also able to serve users with unique, not-so-popular tastes. There is also no first-rater problem: new and unpopular items can be recommended as soon as their features are known, since recommendations depend on item features rather than on ratings from other users. In addition, building a content-based filtering system is relatively straightforward. Because the algorithm works from item features, it is better suited than collaborative filtering to explaining why an item was recommended (Rajaraman, 2016). When a user questions a recommendation or the use of their personal information, a content-based recommendation system can respond by listing the content features that caused that particular item to be recommended.
On the other hand, content-based recommendation engines are extremely unlikely to produce diverse and unexpected results (Rajaraman, 2016). They do not take into consideration the fact that users have multiple interests, so items outside the user’s content profile are never recommended. In addition, finding and gathering appropriate features is difficult, particularly for items such as movies and restaurants, whose appeal is not easily captured by intrinsic content features. In the case of streaming service businesses especially, the features that attract a particular user to an item may be extremely difficult to analyze, which affects prediction quality.
Because it draws on multiple users, collaborative filtering gives users broader exposure to many different items, which encourages continued purchases, as recommendations are not confined to one user’s own history. Unlike content-based filtering, it is also adaptive: the system is able to detect changes in users’ interests, which makes it well suited to streaming service businesses, where users are constantly exposed to different categories and genres. User profiles also depend on ratings rather than on probability figures for every word (Rajaraman, 2016). Additionally, because these algorithms work from ratings alone, they typically ignore other contextual information and features when providing item recommendations (Ricci, 2015). This may be the major reason why streaming service businesses, including YouTube and Netflix, claim to have used a myriad of collaborative filtering algorithms: they do not need any feature selection.
However, collaborative filtering suffers from a cold-start problem: the system cannot recommend products to new users who have not had any interactions yet (Shahbazi & Byun, 2020). This results from data sparsity, as recommendations are based on historical interactions between users and items on the site. Consequently, it is difficult to give new customers a positive impression of the company, as they are unlikely to receive a well-personalized shopping experience. Nearly all businesses that use collaborative filtering experience this issue at the beginning, and the predictive accuracy of smaller businesses will be significantly affected by the lack of users over a potentially long period of time. This is the item cold-start problem: for an item to be recommended with reasonable accuracy, enough users must have rated it. Moreover, while collaborative filtering is sometimes thought to increase the range of items a user is exposed to, it is not uncommon for it to do the opposite because of popularity bias: it recommends items based on past ratings from other users (Rajaraman, 2016), so the more popular an item is, the more likely it is to be recommended. Unless dedicated methods are designed to get around this, it will significantly affect businesses, especially e-commerce and social media platforms, where a few well-known items (e.g., movies, brands, or artists) may see a huge surge in popularity.
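To make the sparsity and cold-start ideas concrete, the short Python sketch below measures how sparse a user-item interaction matrix is and lists the users and items with too few interactions to support a recommendation. The matrix, the convention that 0 means "no interaction", and the `min_interactions` threshold are all illustrative assumptions.

```python
import numpy as np

# Hypothetical interaction matrix: rows are users, columns are items,
# and 0 is assumed to mean "no interaction recorded".
R = np.array([
    [5, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],   # a brand-new user with no history
    [4, 0, 0, 0, 0],
    [0, 3, 0, 5, 0],
], dtype=float)


def sparsity(R):
    """Fraction of user-item pairs that have no recorded interaction."""
    return 1.0 - np.count_nonzero(R) / R.size


def cold_start(R, min_interactions=1):
    """Indices of users and items with fewer than min_interactions entries."""
    user_counts = np.count_nonzero(R, axis=1)
    item_counts = np.count_nonzero(R, axis=0)
    cold_users = np.flatnonzero(user_counts < min_interactions).tolist()
    cold_items = np.flatnonzero(item_counts < min_interactions).tolist()
    return cold_users, cold_items


matrix_sparsity = sparsity(R)
cold_users, cold_items = cold_start(R)
```

In this toy matrix three quarters of the entries are empty, one user has no history at all, and two items have never been rated, so a purely collaborative system has nothing to base recommendations on for any of them. Raising `min_interactions` widens the set of users and items treated as cold.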
A Broad Overview of the Performance of Content-Based and Collaborative Filtering in Business
Content-Based Filtering Through E-Commerce and Streaming
In the business world, content-based filtering is present in companies that focus on individual users and their preferences, as opposed to organizations that aim to foster connections between multiple users at a time. This makes content-based filtering effective in the e-commerce and streaming industries, which focus on one consumer at a time. Companies like Netflix, Amazon, and Pandora all harness its power to drive sales by keeping users active on their platforms for longer periods of time, whether they are browsing for compelling items, content, or music. However, although it may seem a perfect solution for these corporations, certain drawbacks persist, primarily the difficulty of representing a user’s diverse interests. For example, just because an individual enjoys horror movies does not mean they do not also want comedies or sports films, yet factoring in these other categories could be difficult if the user’s content profile is composed only of horror movies. This implies that the user must do the hard work of finding a few movies they enjoy outside of horror, which could discourage customer engagement. As a result, these platforms, especially streaming services, can rely on a combination of content-based and collaborative filtering.
Collaborative Filtering Through Social Media
In contrast to content-based filtering, collaborative filtering derives its value from recommendations based on what similar people have liked. This network-like feel has made it a natural candidate for the social media realm, where its most prominent usage occurs. The main advantage of this method in the business world is its ability to work for any selection of items, regardless of their content. By not having to spend time analyzing content, a company could drive efficiency, which could reduce costs and increase profits. However, this system’s limitations make it unsuitable for some companies for a few reasons. Primarily, scalability is an issue: the method is highly dependent on large user communities, because a massive amount of data must be present for effective recommendations to be made. Additionally, outliers always exist, and without proper representation in the data they will never be given strong recommendations that would incentivize them to stay on the platform. This illustrates the innate business challenge of depending on the customer base for recommendations.
Overall, recommendation systems are an integral part of the success of businesses across numerous industries. In accordance with our findings, we conclude that content-based filtering is more effective in business settings that cater to an individual and rarely rely on a group, making it ideal for companies looking to personalize a service or sell a product. This can be seen in industries like e-commerce, wherein an individual needs to be marketed items that fit their personalized profile. Conversely, in a collaborative environment, the network formed by interconnecting the preferences of various users is nearly ideal for social media platforms, which can leverage this connectedness to build community and advertise effectively. Lastly, we discussed various business services that combine content-based and collaborative filtering to enhance user experience, emphasizing that larger companies will typically implement both methods in tandem, even if one is more prevalent than the other. As technology becomes an ever more integrated part of day-to-day life, new and more innovative recommendation systems that can optimize user experience, drive efficiency, and reduce cost will ultimately be synonymous with business success, which is why companies must adopt the most useful ones available today.
Aggarwal, C. C. (2016). Recommender Systems. Springer Publishing.
Collaborative Filtering Advantages & Disadvantages | Machine Learning. (2022, July 18). Google Developers. https://developers.google.com/machine-learning/recommendation/collaborative/summary
Chen, R., Hua, Q., Chang, Y. S., Wang, B., Zhang, L., & Kong, X. (2018). A Survey of Collaborative Filtering-Based Recommender Systems: From Traditional Methods to Hybrid Methods Based on Social Networks. IEEE Access, 6, 64301–64320. https://doi.org/10.1109/access.2018.2877208
Fayyaz, Z., Ebrahimian, M., Nawara, D., Ibrahim, A., & Kashef, R. (2020). Recommendation Systems: Algorithms, Challenges, Metrics, and Business Opportunities. Applied Sciences, 10(21), 7748. https://doi.org/10.3390/app10217748
Holewa, K. (2022, July 6). We know what you like! Perks of recommendation systems in business. Miquido. https://www.miquido.com/blog/perks-of-recommendation-systems-in-business/
Knotzer, N. (2008). Product Recommendations in E-commerce Retailing Applications. Frankfurt am Main.
Lendave, V. (2021, October 23). How to Measure the Success of a Recommendation System? Analytics India Magazine. Retrieved August 7, 2022, from https://analyticsindiamag.com/how-to-measure-the-success-of-a-recommendation-system/#:%7E:text=your%20business%20goal.-,Common%20Metrics%20Used,evaluation%20metrics%20for%20recommender%20systems.
O’Dair, M., & Fry, A. (2019). Beyond the black box in music streaming: the impact of recommendation systems upon artists. Popular Communication, 18(1), 65–77. https://doi.org/10.1080/15405702.2019.1627548
Rajaraman, A. (2016, April 13). Lecture 42 — Content-Based Recommendations | Stanford University [Video]. YouTube. https://www.youtube.com/watch?v=2uxXPzm-7FY&list=PLLssT5z_DsK9JDLcT8T62VtzwyW9LNepV&index=42
Rajaraman, A. (2016b, April 13). Lecture 44 — Implementing Collaborative Filtering (Advanced) | Stanford University [Video]. YouTube. https://www.youtube.com/watch?v=6BTLobS7AU8&list=PLLssT5z_DsK9JDLcT8T62VtzwyW9LNepV&index=44
Ricci, F., Rokach, L., & Shapira, B. (2015). Recommender Systems Handbook (2nd ed.). Springer.
Rocca, B. (2021, December 10). Introduction to recommender systems – Towards Data Science. Medium. https://towardsdatascience.com/introduction-to-recommender-systems-6c66cf15ada
Roy, A. (2021, December 15). Introduction To Recommender Systems- 1: Content-Based Filtering And Collaborative Filtering. Medium. https://towardsdatascience.com/introduction-to-recommender-systems-1-971bd274f421#:%7E:text=Content%2Dbased%20filtering%20does%20not,and%20items%20on%20its%20own
Schafer, J. B., Konstan, J., & Riedl, J. (1999). Recommender systems in e-commerce. Proceedings of the 1st ACM Conference on Electronic Commerce – EC ’99. https://doi.org/10.1145/336992.337035
Shahbazi, Z., & Byun, Y. (2020, September 22). Toward Improving the Prediction Accuracy of Product Recommendation System Using Extreme Gradient Boosting and Encoding Approaches. MDPI. Retrieved August 21, 2022, from https://www.mdpi.com/2073-8994/12/9/1566
Skovhøj, F. Z. (2022, February 9). Using Collaborative Filtering in E-Commerce: Advantages & Disadvantages. Clerk.Io. https://blog.clerk.io/collaborative-filtering
U.S. Census Bureau (2022). Quarterly Retail E-Commerce Sales. Retrieved from https://www.census.gov/retail/index.html.
(2022, May 20). Recommendation Systems Explained – Towards Data Science. Medium. https://towardsdatascience.com/recommendation-systems-explained-a42fc60591ed
Wang, D., Liang, Y., & Xu, D. (2015, May 25). Online Recommender Systems – How Does a Website Know What I Want? American Mathematical Society. Retrieved August 21, 2022, from https://blogs.ams.org/mathgradblog/2015/05/25/online-recommender-systems-website-want/