
I’ve been expanding my knowledge of data and ML technologies, experimenting with data analytics, machine learning models, and cloud services along with my buddy Sanskriti Vohra. On April 9th, 2024, I had the opportunity to present our POC at the AWS Summit in Amsterdam on the Community Builders stage. I’m eager to share what I learned from that experience so you can see just how influential recommender systems have become in your life. Every song you select, movie you watch, or product you buy online shapes what pops up in your feed next. While it might seem like magic or manifestation, it’s actually the work of recommender systems. Understanding how these systems operate, both generally and technically, is crucial. So, let’s dive in and get started.

What is a recommendation & a recommender system?

  • Dictionary meaning — The word “recommendation” refers to the act of suggesting something as a good option or the advice given to guide someone’s choice.
  • Technology Meaning — In the context of AI/ML — “recommendations” refer to personalized suggestions generated by recommendation systems.
  • Examples — Movie recommendations, Product & Features recommendations, Personalized Ads (though no one likes them)

What is the Science & Maths behind it?

Three major things that you should understand — Neural networks, Filtering Techniques and Matrix Factorization. Succinctly defined below:

Neural Networks

Neural networks are a crucial technology for building advanced recommendation systems. Loosely inspired by the structure and function of the human brain, they can recognize complex patterns and connections between users and items. This leads to highly accurate predictions of user preferences, e.g. the next product suggested to you on Amazon.

A common design is the two-tower encoder: two separate neural networks (the “towers”) independently encode users and items, and their outputs are then combined to score how well each item matches a user. This approach captures user-specific and item-specific details for more precise and personalized suggestions.
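To make the two-tower idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the feature values, the embedding size, and the single-layer towers with random, untrained weights. A real system would learn these weights from interaction data.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4  # size of the shared embedding space (arbitrary choice)

def make_tower(input_dim, output_dim):
    """Create a one-layer 'tower': a weight matrix of small random values."""
    return [[random.uniform(-1, 1) for _ in range(input_dim)]
            for _ in range(output_dim)]

def encode(tower, features):
    """Project a feature vector into the embedding space (linear layer + tanh)."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in tower]

def score(user_vec, item_vec):
    """Dot product of the two embeddings: higher means a closer match."""
    return sum(u * i for u, i in zip(user_vec, item_vec))

# Hypothetical inputs: users are described by 3 numbers, items by 2 numbers.
user_tower = make_tower(input_dim=3, output_dim=EMBED_DIM)
item_tower = make_tower(input_dim=2, output_dim=EMBED_DIM)

user_features = [0.9, 0.1, 0.5]
items = {"item_a": [1.0, 0.0], "item_b": [0.0, 1.0], "item_c": [0.5, 0.5]}

# Each side is encoded independently; only the final embeddings are compared.
user_vec = encode(user_tower, user_features)
scores = {name: score(user_vec, encode(item_tower, feats))
          for name, feats in items.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Because the towers never see each other's inputs, item embeddings can be computed once and reused for every user, which is one reason this layout is popular for large catalogues.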

Filtering Techniques (Collaborative and Content based)

Collaborative Filtering: This technique recommends items based on the preferences and behavior of similar users. Example: If Alice likes a specific book that Bob also enjoyed, a collaborative filtering system might recommend another book that Bob liked to Alice.
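The Alice and Bob example can be sketched in a few lines of Python. This is a toy user-based variant with made-up ratings and book titles, not how any particular service implements it: users are compared by the cosine similarity of their ratings, and unseen books are scored by the similarity-weighted ratings of other users.

```python
import math

# Hypothetical ratings (1-5); a missing key means the user has not rated the book.
ratings = {
    "alice": {"dune": 5, "neuromancer": 4},
    "bob":   {"dune": 5, "neuromancer": 5, "foundation": 5},
    "carol": {"dune": 1, "emma": 5},
}

def cosine(a, b):
    """Cosine similarity between two rating dicts (missing ratings count as 0)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend_collaborative(user, k=1):
    """Score books the user has not rated by similarity-weighted ratings of others."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend_collaborative("alice"))  # ['foundation']
```

Alice's tastes are much closer to Bob's than to Carol's, so Bob's other favourite outranks Carol's.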

Content-Based Filtering: This method recommends items by analyzing the features of the items themselves and matching them to user preferences. Example: If Alice frequently watches science fiction movies, a content-based system might suggest other science fiction movies she hasn’t seen yet.

PC: https://www.datacamp.com/tutorial/recommender-systems-python
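Content-based filtering, as described above, needs only item features and one user's history. The sketch below uses a made-up catalogue with genre tags as the only feature, which is an assumption for illustration; real systems use much richer item descriptions.

```python
# Hypothetical catalogue: each movie is described by a set of genre tags.
movies = {
    "Arrival": {"sci-fi", "drama"},
    "Interstellar": {"sci-fi", "adventure"},
    "Blade Runner": {"sci-fi", "thriller"},
    "Notting Hill": {"romance", "comedy"},
}
watched = ["Arrival", "Interstellar"]

def build_profile(titles):
    """Count how often each genre appears in the user's watch history."""
    profile = {}
    for title in titles:
        for genre in movies[title]:
            profile[genre] = profile.get(genre, 0) + 1
    return profile

def recommend_by_content(titles, k=1):
    """Rank unwatched movies by how strongly their genres match the profile."""
    profile = build_profile(titles)
    scores = {
        movie: sum(profile.get(genre, 0) for genre in genres)
        for movie, genres in movies.items()
        if movie not in titles
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend_by_content(watched))  # ['Blade Runner']
```

No other users are involved here, which is the key difference from collaborative filtering.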

Matrix Factorization

Matrix factorization decomposes a large user-item interaction matrix into smaller, more manageable matrices. This helps uncover the latent factors that influence user preferences and can significantly improve the accuracy of recommendations, both of which are essential for parsing through data and identifying relevant patterns.

Picture Source: google images
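As a rough illustration of matrix factorization, the sketch below learns two small factor matrices, P (users) and Q (items), by stochastic gradient descent so that their product approximates the observed ratings. The rating matrix, the number of latent factors, and the learning settings are all made-up values chosen for the example.

```python
import random

random.seed(42)

# Hypothetical 4 users x 3 items rating matrix; None marks an unobserved rating.
R = [
    [5, 3, None],
    [4, None, 1],
    [1, 1, 5],
    [None, 1, 4],
]
K = 2       # number of latent factors
LR = 0.01   # learning rate
REG = 0.02  # L2 regularisation strength

n_users, n_items = len(R), len(R[0])
P = [[random.uniform(0, 1) for _ in range(K)] for _ in range(n_users)]
Q = [[random.uniform(0, 1) for _ in range(K)] for _ in range(n_items)]

def predict(u, i):
    """Predicted rating: dot product of user u's and item i's latent factors."""
    return sum(P[u][f] * Q[i][f] for f in range(K))

def rmse():
    """Root mean squared error over the observed ratings only."""
    errs = [(R[u][i] - predict(u, i)) ** 2
            for u in range(n_users) for i in range(n_items)
            if R[u][i] is not None]
    return (sum(errs) / len(errs)) ** 0.5

initial_rmse = rmse()
for _ in range(2000):
    for u in range(n_users):
        for i in range(n_items):
            if R[u][i] is None:
                continue
            err = R[u][i] - predict(u, i)
            for f in range(K):
                p, q = P[u][f], Q[i][f]
                P[u][f] += LR * (err * q - REG * p)
                Q[i][f] += LR * (err * p - REG * q)
final_rmse = rmse()

print("RMSE before/after training:", round(initial_rmse, 3), round(final_rmse, 3))
print("Predicted rating for user 0, item 2:", round(predict(0, 2), 2))
```

The last line fills in a cell that was never observed, which is exactly what a recommender needs: a guess at how a user would rate something they have not tried.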

A few more techniques are used in various combinations by service providers to build the best recommendation systems:

  1. Graph-Based Models: Explores relationships in network structures, such as user interactions or social networks.
  2. Reinforcement Learning: Adapts recommendations based on user responses.
  3. Rule-Based Association and Filtering: Finds correlations between frequently co-occurring items and filters them according to user needs.
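Of the three, rule-based association is the easiest to sketch without any libraries. The example below counts how often items appear together in made-up shopping baskets and keeps the "also bought" rules whose confidence clears a threshold. The baskets and the 0.5 threshold are assumptions for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets.
baskets = [
    {"laptop", "mouse", "bag"},
    {"laptop", "mouse"},
    {"laptop", "bag"},
    {"phone", "case"},
    {"phone", "case", "charger"},
]

item_counts = Counter(item for basket in baskets for item in basket)
pair_counts = Counter(
    frozenset(pair)
    for basket in baskets
    for pair in combinations(sorted(basket), 2)
)

def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    return pair_counts[frozenset((antecedent, consequent))] / item_counts[antecedent]

def also_bought(item, min_confidence=0.5):
    """Items whose rule 'item -> other' meets the confidence threshold."""
    rules = {other: confidence(item, other)
             for other in item_counts if other != item}
    kept = [other for other, conf in rules.items() if conf >= min_confidence]
    # Highest confidence first; ties are broken alphabetically.
    return sorted(kept, key=lambda other: (-rules[other], other))

print(also_bought("laptop"))  # ['bag', 'mouse']
print(also_bought("phone"))   # ['case', 'charger']
```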

Alright, let’s set aside all the science and math for now and wrap this up with the cloud services we checked out. They are super easy to use and really quick to adapt to your specific needs and area of focus.

Popular Recommendation Systems/ Services

Machine learning models can uncover patterns and insights that are invisible to the human eye, enabling the delivery of personalized recommendations with unprecedented accuracy.

Some popular recommendation services I explored are the following:

  • Amazon Personalize: AWS offers personalized recommendations based on user data and behaviour, built from a set of pre-available recipes and models.
  • Google Cloud Recommendations AI: GCP provides tailored product recommendations and search results.
  • Microsoft Azure Personalizer: Azure uses reinforcement learning for real-time, personalized experiences.

AWS Personalize vs. GCP Recommendations AI vs. Azure Personalizer

Here is a quick overview of these three popular recommender systems.

Case Study with Amazon Personalize

The problem statement is depicted in the image below. It highlights the common challenges students face when searching for universities and career paths. With so much information available, students often find themselves unsure whether to focus on selecting universities or on improving their academic profiles and extracurricular activities. This very dilemma struck Sanskriti and me one evening over dinner, inspiring us to tackle it head-on. We decided to build a solution on a cloud machine learning service and, after some initial juggling between options, went with Amazon Personalize. Our aim was to create a University Selection Personalization Engine, starting with a proof of concept (POC) to identify the most suitable service for our needs. Here’s how our journey began.

University Selection Personalization Engine:

PC: Dall-E

Our use case was designed as highlighted below, and we used Amazon Personalize to bring this setup to life. By leveraging its capabilities, we tailored our recommendation system to align with our specific requirements, from data ingestion to delivering personalized results. This solution helped us customize the user experience effectively while addressing our design goals.

PC: Self — @Shweta and @Sanskriti Vohra Creation

Amazon Personalize

We used Amazon Personalize to develop our use case, and this is how it works overall:

PC: AWS Docs

Sample Output of POC

The sample output for this POC came in two formats as described below:

  1. We listed all universities ranked for a specific student and selected the one with the highest score to recommend.
  2. When given a user’s name, the system identified and listed the top three universities that best met the user’s criteria. (This test case is shown in the snapshot below.)

Personalized University Ranking for Students

This output can then be served through the Amazon Personalize APIs, and the user interface can be refined to present the results to your users (students, in this case) however you like. Here is the sample code snippet and readable output:

PC: Self Personalized Ranking Served Using APIs
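For readers who cannot see the snapshot, here is a hedged sketch of what serving a personalized ranking can look like. It is not our exact POC code. The function expects a client that behaves like `boto3.client("personalize-runtime")` and calls its `get_personalized_ranking` operation; the campaign ARN, student ID, university IDs, and scores are all placeholders, and the `FakeRuntimeClient` class exists only so the sketch runs without AWS credentials.

```python
def top_universities(client, campaign_arn, user_id, candidate_ids, k=3):
    """Rerank candidate universities for one student and keep the top k.

    `client` is expected to behave like boto3.client("personalize-runtime").
    """
    response = client.get_personalized_ranking(
        campaignArn=campaign_arn,
        userId=user_id,
        inputList=candidate_ids,
    )
    ranked = response["personalizedRanking"]  # list of {"itemId": ..., "score": ...}
    return [(entry["itemId"], entry.get("score")) for entry in ranked[:k]]


class FakeRuntimeClient:
    """Stand-in for the real client; returns made-up scores for made-up IDs."""

    _scores = {"UNI_001": 0.41, "UNI_002": 0.33, "UNI_003": 0.19, "UNI_004": 0.07}

    def get_personalized_ranking(self, campaignArn, userId, inputList):
        ranked = sorted(inputList, key=lambda i: self._scores.get(i, 0.0), reverse=True)
        return {
            "personalizedRanking": [
                {"itemId": i, "score": self._scores.get(i, 0.0)} for i in ranked
            ]
        }


# With real AWS access you would pass boto3.client("personalize-runtime") instead.
result = top_universities(
    FakeRuntimeClient(),
    campaign_arn="arn:aws:personalize:REGION:ACCOUNT_ID:campaign/PLACEHOLDER",
    user_id="student_42",
    candidate_ids=["UNI_004", "UNI_002", "UNI_001", "UNI_003"],
)
for item_id, score in result:
    print(item_id, score)
```

Keeping the client as a parameter makes the ranking logic easy to test locally before pointing it at a deployed campaign.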

Essential Takeaways

We were able to build a good recommendation system with Amazon Personalize, even though we lacked the extensive, genuine data that we are still trying to obtain. There are always takeaways from running such experimental use cases, and here are three essential ones. They should help you get started and anticipate some of the pains and preparation involved in understanding or using recommender systems.

  • Secure Interplay Between Domain Problem and Data

From my experience, the most challenging part was preparing the data. Representing the domain or problem statement in data was tricky. It’s an iterative process that can introduce many mistakes and biases, so it is crucial to understand how the data relates to the problem. This connection is essential for the recommendation engine to provide accurate suggestions. I realized first-hand that it also poses risks: biases, data manipulation, and unethical practices can affect the models, leading to ethical concerns in AI/ML. With machine learning coming to life in this decade, ethical and unbiased data preparation and model building matter more than ever.

  • Data Preparation: Crucial & Challenging

Preparing data for Amazon Personalize was challenging. It required many iterations, each iteration needed new datasets and schemas, and each new dataset needed new resources. I couldn’t succeed with the Personalize service alone. SageMaker Data Wrangler (an AWS data preparation service) was very important for collecting, cleaning, and structuring data. It reduced the complexity, and I found this step unavoidable.

  • Resource Usage and Cost

In my experience, implementing an effective recommendation system can be costly due to the high computational resources and storage required. The expenses quickly add up as you process vast amounts of data, train models, and run inference to provide accurate, real-time recommendations. Although I had not planned for it, I found that using SageMaker for data preparation and cleaning was essential, and it increased the costs further. Managing and optimizing resources is critical to ensure performance without breaking the budget.
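One cheap way to cut down on failed iterations (and the cost that comes with them) is to check your CSV against the dataset schema locally before importing. The sketch below is an assumption about how you might do that, not part of our POC: it defines a minimal Avro-style interactions schema with the USER_ID, ITEM_ID, and TIMESTAMP fields that Amazon Personalize expects, then validates some made-up rows with a hypothetical `validate_csv` helper.

```python
import csv
import io
import json

# Minimal interactions schema in the Avro-style JSON that Amazon Personalize uses.
interactions_schema = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}

def validate_csv(text, schema):
    """Return a list of problems found when checking CSV text against the schema."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    expected = [field["name"] for field in schema["fields"]]
    if set(reader.fieldnames or []) != set(expected):
        problems.append(f"header {reader.fieldnames} does not match {expected}")
        return problems
    long_fields = [f["name"] for f in schema["fields"] if f["type"] == "long"]
    for row_number, row in enumerate(reader, start=1):
        for name in long_fields:
            try:
                int(row[name])
            except ValueError:
                problems.append(f"row {row_number}: {name} is not an integer")
    return problems

# Made-up sample data: the second data row has a broken timestamp.
sample_csv = "\n".join([
    "USER_ID,ITEM_ID,TIMESTAMP",
    "student_1,UNI_001,1712620800",
    "student_2,UNI_002,not-a-timestamp",
])

problems = validate_csv(sample_csv, interactions_schema)
print(problems)

# The schema is registered as a JSON string, e.g. via the `schema` argument
# of create_schema on boto3.client("personalize").
schema_json = json.dumps(interactions_schema)
```

A check like this does not replace a proper data preparation tool, but it catches the most common mismatches before they turn into a billed import job.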

Now it’s your turn to work out whether the recommendations in your online experience or on your gadgets are manifested by you, or whether recommender systems are hiding behind the scenes and helping you. 😊

If you picked up a thing or two from this article and want to dive deeper into machine learning the easy and practical way, then don’t forget to follow 🔔 and hit the Clap icon 👏.

Happy reading!
