
The Hitchhiker’s Guide to Recommendation Engines

From the next show you might like on Netflix to the next audiobook you listen to on Audible, Recommendation Engines are embedded in most of the applications that we use every single day. The Recommendation Engine market is much bigger than you would imagine. Since the creation of the first Recommendation Engine in 1992 (as part of a mailing system), the technology has never been out of the spotlight, driving growth and success for one major brand after another.

Whether you already have one, are debating whether one is right for your business, or are considering getting one, the sheer quantity of information and judgment calls (such as the type of Recommendation Engine, the approach, and the use case) can be overwhelming. It’s easy to feel that the odds of figuring out what will work for your firm are slim, which raises the question: “How do I choose the best recommendation engine for my company?”

Luckily, thanks to the contributions of countless hitchhikers and researchers, this wholly remarkable guide to Recommendation Engines was born. We, fellow hitchhikers, would like to take you on a journey and give you a comprehensive yet simple explanation of all the common recommendation engines and their recommended use cases (no pun intended). We promise you can follow the whole guide without reading any code or mathematical equations! So, grab your bags and hop on a journey of fun and understanding!

Type I – Collaborative Filtering System

The Collaborative Filtering system is the most sought-after and popular type. It is probably how most of us came to know about Recommendation Engines in the first place, through the success stories of major tech giants: Amazon, Netflix, and Spotify, just to name a few.

Collaborative Filtering taps into the very nature of human beings: we are social animals! And whilst each person is unique, we are not that different from one another! This type of Recommendation Engine assumes that people who agreed in the past will likely agree in the future, and that they will like items similar to the ones they liked before. In addition, it does not require an understanding of item attributes. It works well for complex items where tastes are diverse and hard to pinpoint (e.g. what is it about Bohemian Rhapsody that makes us so fond of it?!).

Collaborative Filtering performs best when you have explicit ratings (e.g. a 6/10 rating on a movie) or implicit ratings (e.g. the number of times a song is played). The items you are trying to recommend usually compete against one another because they satisfy the same need (e.g. two different movies satisfy the same entertainment need for a 2-hour window). These items also tend to be almost single-use in nature (e.g. you will likely buy the same book only once). They are consumables (e.g. once you finish the book, you will need a new one).
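For readers who like to see the idea in code, here is a minimal sketch of a memory-based, user-to-user Collaborative Filtering step. It is purely optional and only illustrative: the users, titles, and ratings are made up, and cosine similarity is just one of several similarity measures that could be used.

```python
from math import sqrt

# Hypothetical explicit ratings: user -> {item: rating out of 10}.
ratings = {
    "ann": {"Alien": 9, "Heat": 8, "Up": 2},
    "bob": {"Alien": 8, "Heat": 9, "Up": 3, "Se7en": 9},
    "cat": {"Alien": 2, "Heat": 1, "Up": 9, "Coco": 8},
}

def cosine(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, k=1):
    """Rank items the user has not rated by similarity-weighted ratings."""
    scores = {}
    for other, their_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their_ratings)
        for item, rating in their_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend("ann", ratings))  # ['Se7en']
```

Ann's ratings look much more like Bob's than Cat's, so the item Bob rated highly is ranked first. Note that no item attributes are needed at any point.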

An overview of Collaborative Filtering:

Information Required: explicit or implicit ratings

Product Characteristics: near-single-use, consumables

Inter-item Relationship: competing

Suitable Use Cases: music, movies, books, games, articles, groceries, etc.

Techniques: memory-based, model-based

Type II – The Content-based System

Unlike the Collaborative Filtering system, Content-based systems are not about finding look-alike customers. The recommendation for a particular user is based solely on their own preferences, regardless of what others might think. The system breaks down a given product or service into a set of attributes and recommends other products that share a similar set of attributes.

For the same reason, a Content-based system requires item attributes to work. In the context of a movie, it may be necessary to provide the cast members, the director, the year it was released, and the genre. For example, if I had watched Forrest Gump, Cast Away, and Saving Private Ryan, my next recommendation may be The Green Mile, because all of them are Hollywood blockbusters with Tom Hanks in a leading role! Items with few or no attributes will never be suggested; if you are dealing with such items, you should use Collaborative Filtering instead!
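As an optional illustration, the Tom Hanks example can be sketched in a few lines. This is only one possible approach, assuming each item is described by a set of attribute tags and that similarity is measured with the Jaccard index; the tags and the extra title in the catalogue are illustrative, not a real metadata feed.

```python
# Illustrative attribute tags; a real catalogue would carry richer metadata
# (cast, director, release year, genre, and so on).
catalogue = {
    "Forrest Gump": {"Tom Hanks", "drama"},
    "Cast Away": {"Tom Hanks", "drama", "adventure"},
    "Saving Private Ryan": {"Tom Hanks", "drama", "war"},
    "The Green Mile": {"Tom Hanks", "drama", "crime"},
    "The Matrix": {"Keanu Reeves", "sci-fi", "action"},
}

def jaccard(a, b):
    """Share of attributes two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(watched, catalogue, k=1):
    """Rank unwatched items by how closely they match the user's profile."""
    # The user's profile is the union of attributes of everything watched.
    profile = set().union(*(catalogue[title] for title in watched))
    candidates = [title for title in catalogue if title not in watched]
    candidates.sort(key=lambda t: jaccard(profile, catalogue[t]), reverse=True)
    return candidates[:k]

watched = ["Forrest Gump", "Cast Away", "Saving Private Ryan"]
print(recommend(watched, catalogue))  # ['The Green Mile']
```

Only this one user's history is consulted; what other viewers watched plays no part. An item with an empty attribute set would score 0.0 against every profile, which is why sparsely described items never surface.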

While most Collaborative Filtering use cases will also work for Content-based systems (provided they have a solid description), ultimately the most suitable type depends heavily on the user experience. As an example, consider online dating services. Assume you “swiped right” on a couple of folks who had “outgoing,” “love music,” and “love pets” on their profiles. It’s far more natural (and less creepy) to see recommendations like “find individuals who also adore dogs…” rather than “users who swiped these people also swiped…”

Content-based systems, like Collaborative Filtering, are ideally suited for competing consumables, and they assume you will go through a lot of similar products and services.

An overview of Content-based System:

Information Required: item attributes, user attributes/interactions

Product Characteristics: consumables

Inter-item Relationship: competing

Suitable Use Cases: movies, online dating, candidate-job matching, articles, etc.

Techniques: vector space neighbour, vector space model, label only deterministic rank, label only probabilistic rank

Type III – The Segment-based System

So far, both Type I and Type II Recommendation Engines perform one-to-one recommendations, which are personalised but challenging to execute or scale. If you have a simpler recommendation requirement or want to get started quickly, then Segment-based, one-to-many Recommendation Engines are the quickest to take to market. One common example is targeting different market segments to improve the effectiveness of your marketing.

As it doesn’t require any user interactions, this type of system is suitable for capital goods or items that are not purchased often. It is also well suited to general cold-start scenarios where you have very little data. Furthermore, because we are providing the same treatment to an entire group of customers, the compute resources required can be quite minimal, which is advantageous for businesses with less advanced infrastructure.
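To show how little machinery this needs, here is an optional sketch of a Segment-based recommender. Everything in it is hypothetical: the segment names, the profile fields (`age`, `has_newborn`, `is_student`), and the offers stand in for whatever expert judgement would define in a real business.

```python
# Hypothetical segments and offers, e.g. chosen by expert judgement.
SEGMENT_OFFERS = {
    "student": ["budget laptop", "backpack"],
    "new_parent": ["pram", "baby monitor"],
    "retiree": ["retirement home brochure", "travel insurance"],
}
DEFAULT_OFFERS = ["bestsellers newsletter"]

def assign_segment(profile):
    """Derive a segment from profile fields alone (no interaction history)."""
    if profile.get("age", 0) >= 65:
        return "retiree"
    if profile.get("has_newborn"):
        return "new_parent"
    if profile.get("is_student"):
        return "student"
    return None  # no segment matched

def recommend(profile):
    """Everyone in the same segment receives the same recommendations."""
    return SEGMENT_OFFERS.get(assign_segment(profile), DEFAULT_OFFERS)

print(recommend({"age": 70}))          # ['retirement home brochure', 'travel insurance']
print(recommend({"is_student": True}))  # ['budget laptop', 'backpack']
```

Because the work is a lookup per customer, it runs comfortably on modest infrastructure and works from day one, before any behavioural data exists.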

An overview of Segment-based System:

Information Required: user profile/preference, expert judgement  

Product Characteristics: consumables, capital goods

Inter-item Relationship: independent

Suitable Use Cases: fashion items, retirement homes, email subscriptions

Techniques: organic segment, derived segment, self-subscribed segment

Type IV – The Knowledge-based System

The Knowledge-based system makes recommendations based on information about a user’s needs and preferences, as well as product assortment and recommendation criteria. In such a system, users are asked to express their wants and requirements directly, after which the system employs domain knowledge about the items (such as their features and pricing) to suggest appropriate choices. The recommendation’s accuracy is determined by how relevant the recommended item is to the user, which can be assessed by asking the question, “Is this knowledge useful?”

The Knowledge-based system necessitates a thorough understanding of why a given product would meet a specific demand; this knowledge must be built up around the items before the system can be applied. Assume you’re on a travel website and type “island holiday” into the search field. The search results may be “Fiji 5 days package” or “Hawaii 7 days package” – the machine knows that both Fiji and Hawaii can meet your requirements!
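The travel search can be sketched as a simple constraint query, shown here as an optional illustration. The catalogue, the feature tags, and the prices are all hypothetical; the point is only that the "knowledge" lives in curated item features that the user's stated requirements are matched against.

```python
# Hypothetical catalogue; "features" holds the curated domain knowledge.
packages = [
    {"name": "Fiji 5 days package", "features": {"island", "beach"}, "price": 1800},
    {"name": "Hawaii 7 days package", "features": {"island", "beach", "hiking"}, "price": 2600},
    {"name": "Alps 6 days package", "features": {"mountain", "skiing"}, "price": 2200},
]

def constraint_query(items, required, max_price=None):
    """Keep items that have every required feature and fit the budget."""
    matches = [
        item for item in items
        if required <= item["features"]  # all required features are present
        and (max_price is None or item["price"] <= max_price)
    ]
    return sorted(matches, key=lambda item: item["price"])

for item in constraint_query(packages, required={"island"}):
    print(item["name"])
# Fiji 5 days package
# Hawaii 7 days package
```

Adding `max_price=2000` to the call would narrow the result to the Fiji package. No purchase history is involved, which is why this style suits items people rarely buy.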

Provided we have domain knowledge of the items and how they relate to various demands, we can apply knowledge-based systems to capital products or commodities people do not often purchase. These might be real estate, household equipment with a life span of 10 to 20 years, electronics, or insurance (things you buy and forget). Knowledge-based systems can not only recommend products that compete with each other (like the aforementioned travel example), but can also leverage our current product expertise to suggest items that complement one another. For example, suppose someone wants to purchase a DSLR camera. We may utilise our expertise to propose several lenses.

An overview of Knowledge-based System:

Information Required: user preference, expert judgement  

Product Characteristics: consumables, capital goods

Inter-item Relationship: competing, complementing

Suitable Use Cases: real estate search, home appliances, electronics, insurance

Techniques: constraint query, case-based reasoning, similarity metrics

Type V – The Utility-based System

A Utility-based system is rarely used in industry since it requires a utility function to work. The notion of utility is used in economics to describe worth or value – you can think of it as a measure of happiness, if you will. Humans, for the most part, make judgments that maximise their utility. When we are confronted with two options, we tend to assess the pros and cons of both, choose the one that has better value (whatever that means to us), and feel happier with that choice.

Let’s look at an example to understand the “utility” concept. Assume you are purchasing a DSLR camera; you are concerned with a few aspects of the camera: price, brand, and sensor size. Each of these factors would play a role in the ultimate selection. Still, you will eventually have to choose one over the others (a better brand, a bigger sensor size, or a lower price). Utility functions (whether linear or exponential) attempt to capture this trade-off. Participants in a focus group are tested on how sensitive they are to each attribute, i.e. how ready they are to trade it off. The relative weight of each attribute is then decided. We can then use the utility function to determine the utility of every item and provide suggestions that maximise the customers’ utility.
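For the curious, a linear utility function for the camera example might look like the optional sketch below. The weights and camera scores are invented for illustration; in practice the weights would come from something like the focus-group exercise described above, and each attribute is assumed to be pre-scaled to the range 0 to 1 with higher meaning better.

```python
# Hypothetical weights, e.g. estimated from a focus group; they sum to 1.
weights = {"price": 0.5, "brand": 0.2, "sensor": 0.3}

# Hypothetical cameras with each attribute already scaled to 0..1, where
# higher is better (so a cheaper camera gets a higher "price" score).
cameras = {
    "Camera A": {"price": 0.9, "brand": 0.4, "sensor": 0.5},
    "Camera B": {"price": 0.4, "brand": 0.9, "sensor": 0.9},
    "Camera C": {"price": 0.6, "brand": 0.6, "sensor": 0.6},
}

def utility(attributes, weights):
    """Linear utility: the weighted sum of the attribute scores."""
    return sum(weights[name] * attributes[name] for name in weights)

def rank(items, weights):
    """Order items from highest to lowest utility."""
    return sorted(items, key=lambda name: utility(items[name], weights), reverse=True)

print(rank(cameras, weights))  # ['Camera A', 'Camera B', 'Camera C']
```

With price weighted most heavily, the cheapest camera wins even though Camera B has the better brand and sensor. Shifting weight from price to sensor size would flip that order, which is exactly the trade-off the utility function is meant to capture.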

Now you might be thinking – this utility concept seems realistic and doable, so why isn’t this method used more frequently? The reason is that a utility function is difficult to obtain in most circumstances, and the weights obtained from a small sample may not generalise to the entire population. Furthermore, humans are not always rational! While we have objectives and criteria for decision-making, we are often influenced by biases such as the halo effect or prejudice.

An overview of Utility-based System:

Information Required: utility function

Product Characteristics: consumables, capital goods

Inter-item Relationship: competing

Suitable Use Cases: electronics, real estate

Techniques: Multi-Attribute Utility Theory

Type VI – The Basket-based System

Ever heard of the “beer” and “diaper” anecdote from Walmart? As the story goes, on Friday nights Walmart displays beer and diapers on the same shelf. Why? Because on Friday nights, dads tend to come out to buy diapers and will pick up a carton of beer while they’re at it, thus increasing sales! That, believe it or not, is a type of Recommendation System. This technology digs deep into a customer’s shopping cart, singles out each line item, and looks for things frequently purchased together. Such connections are then used to provide recommendations.

While the legend might border on urban myth (the original study was done at a day-and-time aggregated level, not basket level, so it only proves beer and diapers are purchased around the same time on Friday night; it does not indicate that the same person purchases them), the idea still has a lot of merit. Association Rules are an effective technique for analysing market baskets. They are a basic frequency-based strategy that tells us how correlated the occurrences of two products are. In other words, they tell us whether the purchase of one is frequently accompanied by the purchase of another. This might help us create physical and digital product assortments that improve complementary product exposure (i.e. temptation), leading to increased sales.
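As an optional aside, the three measures usually reported for an association rule (support, confidence, and lift) can be computed directly from basket data. The five baskets below are made up for illustration, and the sketch assumes the antecedent and consequent each appear in at least one basket.

```python
# Hypothetical shopping baskets, one set of line items per transaction.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "wipes"},
    {"beer", "chips"},
    {"milk", "bread"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    both = support(antecedent | consequent, baskets)
    confidence = both / support(antecedent, baskets)
    lift = confidence / support(consequent, baskets)
    return both, confidence, lift

s, c, l = rule_metrics({"diapers"}, {"beer"}, baskets)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
# support=0.40 confidence=0.67 lift=1.11
```

A lift above 1 suggests the two products appear together more often than they would if purchases were independent; with only five baskets, of course, the number here means very little.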

An overview of Basket-based System:

Information Required: shopping basket line items

Product Characteristics: consumables

Inter-item Relationship: complementing

Suitable Use Cases: furniture, grocery, fashion

Techniques: Association Rules

Type VII – The Hybrid System

In practice, the basic types of Recommender systems are typically not used in isolation. Building a hybrid system allows us to combine the strengths of two or more models while minimising the weaknesses that arise when only one recommender system is employed. For example, because a Content-based system may suffer from tunnel vision, we could pair it with Collaborative Filtering to suggest a broader range of products to customers.
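One of the simplest ways to combine recommenders is a weighted hybrid, sketched below for readers who want to see it. The item names, scores, and the 50/50 weighting are hypothetical, and the sketch assumes both recommenders already produce scores on a comparable 0 to 1 scale.

```python
def weighted_hybrid(score_sets, weights):
    """Blend per-item scores from several recommenders into one ranking."""
    combined = {}
    for scores, weight in zip(score_sets, weights):
        for item, score in scores.items():
            # Items missing from a recommender simply contribute 0 for it.
            combined[item] = combined.get(item, 0.0) + weight * score
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical scores from two recommenders.
content_scores = {"item_a": 0.9, "item_b": 0.8}
collaborative_scores = {"item_a": 0.5, "item_c": 0.9}

print(weighted_hybrid([content_scores, collaborative_scores], [0.5, 0.5]))
# ['item_a', 'item_c', 'item_b']
```

Here `item_c`, which the Content-based recommender never proposed, still makes the list thanks to the Collaborative Filtering scores, which is the broadening effect described above.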

An overview of Hybrid System:

Information Required: anything applicable

Product Characteristics: consumables, capital goods

Inter-item Relationship: competing, complementing

Suitable Use Cases: anything applicable

Techniques: weighted, sequential, switching, union, intersection

TL;DR

A quick summary table of all the Recommendation System types we discussed today for those TLDR folks:

Link to spreadsheet to see a larger version


Meet the authors

Vivian Mai

Data Scientist, Melbourne

Versatile data scientist with a background in business. Vivian is passionate about and experienced in helping businesses identify problems and the metrics that matter, and build robust, relevant, data-driven solutions. She loves good times with friends, travelling, and curiously trying out local cuisine.

Reean Liao

Data Scientist