“Recommender systems” are engines that capture relationships between users and items, with the aim of predicting or suggesting new connections. This information is valuable for a number of business reasons, not least because it helps customers towards their next purchase. Your purchase history is full of bike parts? Perhaps you’d like a workstand, or maybe a track pump. Our records show you’re a keen gardener: how about a lovely gnome?
From a Machine Learning viewpoint, the recommendation problem looks very similar to building a search engine; a ‘query’ returns a list of ranked results. The queries and results look different depending on the purpose:
In this blog post we’ll explore one solution to the last problem: forming bundles for a given item that are personalized for the user receiving the recommendation. Many people buy strawberry jam and clotted cream with their scones; some would prefer black cherry jam and mascarpone (in the author’s opinion they would be wrong).
Behind the scenes, these systems are all learning from historical data connecting users and items, which form an implicit ‘rating’. Traditionally, users’ ratings on unseen items are inherited from users who have seen these items, in a process called collaborative filtering: when two users rate the items they have both seen similarly, an item that only one of them has seen is given a similar rating for the other. This can be achieved through simple matrix factorization: compressing the ratings matrix through a low-dimensional space forces ‘phantom’ ratings to appear where there were previously none.
The matrix factorization method requires as input only the ratings for some user-item combinations – enough so that most users and items have at least a few ratings. But any underrepresented entries will get poor predictions (the “cold start” problem).
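To make the idea of ‘phantom’ ratings concrete, here is a minimal sketch of matrix factorization trained by stochastic gradient descent, using only the Python standard library. The toy ratings, the function names (factorize, predict) and the hyperparameters are our own illustrative assumptions, not the setup used in any particular production system.

```python
import random

def factorize(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit user factors P and item factors Q to observed (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating is the dot product of the user and item factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

# Hypothetical toy data: users 0 and 1 agree on items 0 and 1,
# but user 1 has never rated item 2.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 2, 1.0),
    (1, 0, 5.0), (1, 1, 4.0),
    (2, 0, 1.0), (2, 1, 2.0), (2, 2, 5.0),
]
P, Q = factorize(ratings, n_users=3, n_items=3)
phantom = predict(P, Q, 1, 2)  # a rating that was never observed
print(f"Predicted rating of user 1 for item 2: {phantom:.2f}")
```

Because user 1 agrees with user 0 on the items they have both rated, the phantom rating for item 2 should land close to user 0's low rating rather than user 2's high one. A user with only one or two ratings would give the factorization far less to work with, which is the cold start problem in miniature.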
A further step is to predict the ratings from users’ behavioural features (e.g. time spent on site, number of pages visited) and items’ characteristic features (e.g. price, brand, colour). Now the problem is expressed as a regression – in the general sense – ready for tackling with neural networks, random forests and all the trendy algorithms you can bear.
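As a rough sketch of this framing, the snippet below joins user features and item features into one input row and fits a plain linear regression by gradient descent. Everything here is an assumption made for illustration: the feature values are invented and pre-scaled to the range 0 to 1, the ratings come from a made-up linear rule so that the fit can be checked, and a linear model stands in for the neural networks or random forests you would more likely reach for (it cannot capture user-item interactions).

```python
def make_row(user_feats, item_feats):
    """Concatenate a bias term, user features and item features into one input vector."""
    return [1.0] + list(user_feats) + list(item_feats)

def fit_linear(rows, targets, lr=0.1, epochs=2000):
    """Least-squares fit by per-example gradient descent."""
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        for x, y in zip(rows, targets):
            err = y - sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
    return w

def predict_rating(w, user_feats, item_feats):
    return sum(wi * xi for wi, xi in zip(w, make_row(user_feats, item_feats)))

# Hypothetical, pre-scaled features.
users = [(0.1, 0.2), (0.9, 0.3), (0.2, 0.8), (0.8, 0.9)]  # (time on site, pages visited)
items = [(0.1, 0.0), (0.9, 1.0), (0.5, 0.0), (0.4, 1.0)]  # (price, premium brand flag)

def toy_rating(u, i):
    """Made-up rule that generates the ratings, so the fit can be verified."""
    return 1.0 + 2.0 * u[0] + 1.0 * u[1] - 1.0 * i[0] + 0.5 * i[1]

# Train on every user-item pair except one, which we hold out.
held_out = (users[3], items[3])
pairs = [(u, i) for u in users for i in items if (u, i) != held_out]
rows = [make_row(u, i) for u, i in pairs]
targets = [toy_rating(u, i) for u, i in pairs]

w = fit_linear(rows, targets)
estimate = predict_rating(w, *held_out)
print(f"Held-out rating: predicted {estimate:.2f}, actual {toy_rating(*held_out):.2f}")
```

The point is the shape of the problem rather than the model: once each user-item pair is a feature row with a target, any regressor can be dropped in, and a pair that has never been observed can still be scored from its features.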
It may sound as though the user-user and item-item similarity information has been discarded, but fortunately the loss is not permanent. With the correct loss function and feature processing, the model becomes a “Deep” Matrix Factorization, mimicking the process above. Likewise, it is fair to call the general case “Deep Collaborative Filtering”. There is a lot more room for model customization in this setup, and more information to train on, so one should expect to outperform the traditional approach with these methods.
So far, so good: personalized recommenders are just predictive models that depend on the user, and the recommendations are just the top predictions. The predicted user-item ratings give some indication of the likelihood that a user will associate with an item:
𝑝( item | user )
The recommendations here are independent of any contextual information; they are bespoke for the user, but do nothing to predict bundles. For that part of the problem we’d need item-item ratings. With a minor adjustment, the model can be made to generate the target probability for personalized bundle recommendation:
𝑝( item 𝑖 | user 𝑢, item 𝑗 )
By modifying the output probability, both bundling and personalization are achieved in a single solution! From a user’s perspective, this means they will see suggested items that are tailored to them, while remaining appropriate to the items they are viewing at the time of serving recommendations.
As an example, imagine two users doing some online grocery shopping. They both add the same item to their cart – say, baking potatoes. Baking potatoes are often bought with baked beans, cheese and tuna. However, one of our users (“Alice”) tends to buy high-value produce, and another (“Bob”) is vegetarian, as can be seen in their historical profile. A personalized bundle, therefore, might suggest premium imported cheeses instead of store-brand cheddar for Alice, and hummus or red peppers instead of tuna for Bob.
Without taking context items (potatoes) into account, the recommendations would have been relevant to the user, but perhaps inappropriate for the basket in question. Conversely, recommending the standard bundle of beans, cheese and tuna would be suboptimal for these particular users.
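The potato example can be played out on a toy dataset. The sketch below is not the model described in this post; it is a deliberately simple counting approach that assumes the user and the context item are conditionally independent given the candidate item, so that 𝑝( item 𝑖 | user 𝑢, item 𝑗 ) is proportional to 𝑝( 𝑖 | 𝑢 ) · 𝑝( 𝑖 | 𝑗 ) / 𝑝( 𝑖 ). The baskets, item names and helper functions are all invented for illustration, and the scores are unnormalized.

```python
# Hypothetical purchase history: (user, basket contents).
baskets = [
    ("alice", {"potatoes", "premium_cheese"}),
    ("alice", {"premium_cheese", "olives"}),
    ("alice", {"premium_cheese", "wine"}),
    ("bob",   {"potatoes", "hummus"}),
    ("bob",   {"hummus", "red_peppers"}),
    ("bob",   {"red_peppers", "beans"}),
    ("carol", {"potatoes", "cheddar", "tuna", "beans"}),
    ("dave",  {"potatoes", "cheddar", "tuna"}),
    ("erin",  {"potatoes", "tuna", "beans"}),
    ("frank", {"potatoes", "beans", "cheddar"}),
]

def contains_rate(item, basket_list, alpha=1.0):
    """Smoothed fraction of the given baskets that contain the item."""
    count = sum(item in b for b in basket_list)
    return (count + alpha) / (len(basket_list) + 2 * alpha)

def score(item, user, context):
    """Unnormalized p(item | user, context) under the independence assumption."""
    all_baskets = [b for _, b in baskets]
    user_baskets = [b for u, b in baskets if u == user]
    context_baskets = [b for b in all_baskets if context in b]
    return (contains_rate(item, user_baskets)
            * contains_rate(item, context_baskets)
            / contains_rate(item, all_baskets))

def recommend(user, context, k=3):
    """Top-k candidate items for this user given the item already in the cart."""
    candidates = set().union(*(b for _, b in baskets)) - {context}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(candidates, key=lambda i: (-score(i, user, context), i))[:k]

print("Alice:", recommend("alice", "potatoes"))
print("Bob:  ", recommend("bob", "potatoes"))
```

On this toy data the top suggestion for Alice is premium_cheese and for Bob it is hummus, whereas ranking by co-purchase with potatoes alone would put beans, cheddar and tuna first for everyone.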
The lesson here is that if both personalization and bundling are worthwhile for your data, then you might consider offering personalized bundles to keep your recommendations as relevant as possible.
With thousands of items and potentially millions of users, recommendation problems can quickly scale into terabytes of data. In a recent project, we used the on-demand capacity of Google Cloud to process and train a recommender using the following architecture.
Data handling and feature engineering are done in BigQuery (a serverless SQL data warehouse built for very large datasets). After this, we run our model training through ML Engine, which removes any need for managing our own servers. The beauty of running models on ML Engine is that it gives us access to superfast computation when we need it, while only paying for as much as we use.
At this point the models’ outputs can be evaluated on demand to find which one is best by a variety of offline metrics. For this we can simply use a small Virtual Machine running on Compute Engine.
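The post does not name the offline metrics used, so as an illustration here are two common ones, precision@k and recall@k, computed against held-out purchases. The model outputs, user names and held-out sets below are invented for the example.

```python
def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that the user actually went on to buy."""
    return sum(item in relevant for item in recommended[:k]) / k

def recall_at_k(recommended, relevant, k):
    """Fraction of the user's held-out purchases that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(item in relevant for item in recommended[:k]) / len(relevant)

def mean_metric(metric, model_output, held_out, k):
    """Average a per-user metric over every user with held-out data."""
    return sum(metric(model_output[u], held_out[u], k) for u in held_out) / len(held_out)

# Hypothetical ranked recommendations from two candidate models.
model_a = {"alice": ["premium_cheese", "olives", "tuna"],
           "bob": ["hummus", "beans", "tuna"]}
model_b = {"alice": ["beans", "cheddar", "tuna"],
           "bob": ["beans", "cheddar", "tuna"]}

# Hypothetical purchases hidden from the models during training.
held_out = {"alice": {"premium_cheese", "wine"},
            "bob": {"hummus", "red_peppers"}}

for name, output in [("model_a", model_a), ("model_b", model_b)]:
    p = mean_metric(precision_at_k, output, held_out, k=2)
    r = mean_metric(recall_at_k, output, held_out, k=2)
    print(f"{name}: precision@2={p:.2f} recall@2={r:.2f}")
```

Metrics like these are cheap to compute once the predictions exist, which is why a small machine is enough for this step.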
Once a suitable model has been found and trained, the model state sits in storage (central hexagon) for a prediction job to access when needed. Under this setup, making recommendations is as simple as an API call. No need to boot up any machines yourself!
If you would like to know more about what we can do for your business, or you are an AI enthusiast looking for an opportunity to unleash your passion, do not hesitate to Contact Us!