Exclusive | Recommender Systems, Not Just Recommender Models

Author: Data School THU  Time: 2022.07.30

Author: Even Oldridge, Karl Byleen-Higley

Translation: Chen Zhiyan

Proofreading: ZRX

This article is about 2,500 words; recommended reading time is 10 minutes.

This article shares a design pattern that covers the entire process of deploying a recommender system.

Tags: Recommender Systems

The biggest challenge a newcomer faces when building a recommender system is a lack of practical understanding of what such a system looks like. Most online content about recommender systems concentrates on the model, and is usually limited to a simple collaborative filtering example. For new practitioners, there is a huge gap between these simple model examples and a real production system.

This post shares a pattern that covers the entire process of deploying a recommender system, with examples from Meta, Netflix and Pinterest. This pattern is central to how the NVIDIA Merlin team thinks about building end-to-end systems. We are glad to share and promote it in the community to help readers build a common understanding of how to deploy recommender systems (not just models). If you are interested in this topic, you can also attend the keynote talk at KDD's Industrial Recommender Systems Workshop.

Recommender models

Whether it is a simple collaborative filtering example or a deep learning model such as DLRM, a recommender model is in essence a ranker or, more accurately, a scoring system: it assigns a score to each item in a set according to how interested the user is likely to be in it. In the real world, however, these scores alone are often not enough to give users reasonable recommendations. Before exploring solutions and constructing the final recommender system, let us look closely at the reasons why.

More items, more problems

The first problem we run into is the number of items in the catalog. In extreme cases the catalog can contain millions, hundreds of millions, or even billions of items. In most cases it is not feasible to score every item, because the compute cost of scoring is too high. In practice, you first need to quickly select a relevant subset of these items, for example one thousand or ten thousand, and score only those.

This introduces a second stage: before scoring, we need a reasonably relevant set of items that includes the ones the user will eventually engage with. This stage is usually called candidate retrieval, and is also known as candidate generation. Retrieval models take many forms, including matrix factorization models, two-tower models, linear models, approximate nearest neighbor models and graph traversal models. In general, retrieval models are much cheaper to compute than scoring models.

YouTube published an excellent paper in 2016 that is one of the first public references to this architecture. The approach has since been widely adopted and is now common in industry. Eugene Yan has an excellent blog post on this topic, and his two-stage diagram was the inspiration for our four-stage diagram, which is described in detail below. It is worth noting that it is also common to use several candidate sources within the same recommender system to present different candidates to the user; we will save that topic for another post.
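As a minimal sketch of candidate retrieval, the example below selects the top-k items by dot product between a user embedding and item embeddings. The embeddings, the item ids and the function name retrieve_candidates are hypothetical and chosen only for illustration; a production system would typically replace the brute-force scan with an approximate nearest neighbor index.

```python
import heapq


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_candidates(user_vec, item_vecs, k):
    """Return the ids of the k items whose embeddings best match user_vec.

    This is a brute-force scan over every item, which is only workable for
    a toy catalog (assumption: illustration only).
    """
    return heapq.nlargest(
        k, item_vecs, key=lambda item_id: dot(user_vec, item_vecs[item_id])
    )


# Hypothetical 2-dimensional item embeddings.
ITEM_VECS = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
}

candidates = retrieve_candidates([1.0, 0.2], ITEM_VECS, k=2)
print(candidates)
```

Only the retrieved candidates are then passed to the more expensive scoring model.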

Beyond two stages!

Although the two-stage architecture solves the scale problem, a recommender system also has to support other constraints. In some scenarios there are items we do not want to show to the user: items that are out of stock, items that are not age-appropriate, content the user has already consumed, or content that is not licensed for display in the user's country.

Rather than relying on the scoring or retrieval model to infer business logic and avoid recommending such items, we add a filtering stage to the recommender system. Filtering usually happens after retrieval and can be integrated with it (to make sure enough candidates remain once filtering is applied); in some cases it can even happen after scoring. The filtering stage applies business logic rules that a model alone cannot enforce, or can enforce only with great difficulty. In some cases filtering is a simple exclusion query; in others it is more complex, such as a Bloom filter used to remove items the user has already interacted with.
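The filtering stage can be sketched as follows, assuming a small hand-rolled Bloom filter for "already seen" items and a plain set for stock status. The class BloomFilter, the function filter_candidates, the bit-array size and the item ids are all hypothetical and not taken from any of the systems discussed here. A Bloom filter never misses an item that was added, but it can occasionally report an unseen item as seen, which is usually an acceptable trade-off for this use.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def filter_candidates(candidates, seen, in_stock):
    """Drop candidates the user has (probably) seen or that are out of stock."""
    return [c for c in candidates if c not in seen and c in in_stock]


seen = BloomFilter()
for item_id in ("item_1", "item_3"):
    seen.add(item_id)

kept = filter_candidates(
    ["item_1", "item_2", "item_3", "item_4"],
    seen,
    in_stock={"item_1", "item_2"},
)
print(kept)
```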

Ordering!

So far we have three stages: retrieval, filtering and scoring. Together they produce a list of items and their corresponding scores, where each score is the scoring model's estimate of how interested the user is in the item. Recommendations are usually presented to users as a list, which raises an interesting problem: the ideal list often does not follow the item scores exactly. Instead, we may want to show the user a diverse set of items, or to include items that fall outside the top-scoring candidates, so that users can explore parts of the catalog they have not seen before and avoid filter bubbles.

In some papers and examples the third stage of a recommender system is called ranking, but the final order (or position) of the recommendations shown to the user rarely matches the model output directly. The ordering stage aligns the model output with the other needs or constraints of the business.
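One possible way to implement the ordering stage is a greedy re-ranking that penalizes categories already present in the list. This is a sketch under stated assumptions: the penalty value, the category labels and the function name reorder_for_diversity are hypothetical, and real systems use many other ordering rules.

```python
def reorder_for_diversity(scored_items, penalty=0.3):
    """Greedily build the final list from (item_id, category, score) tuples.

    At each step, pick the remaining item with the highest score after
    subtracting `penalty` for every already-picked item in its category.
    """
    remaining = list(scored_items)
    picked_per_category = {}
    ordered = []
    while remaining:
        best = max(
            remaining,
            key=lambda it: it[2] - penalty * picked_per_category.get(it[1], 0),
        )
        remaining.remove(best)
        picked_per_category[best[1]] = picked_per_category.get(best[1], 0) + 1
        ordered.append(best[0])
    return ordered


# Hypothetical scoring-stage output: three thrillers dominate the raw scores.
SCORED = [
    ("thriller_1", "thriller", 0.95),
    ("thriller_2", "thriller", 0.90),
    ("thriller_3", "thriller", 0.85),
    ("comedy_1", "comedy", 0.70),
    ("docu_1", "documentary", 0.50),
]

final_order = reorder_for_diversity(SCORED)
print(final_order)
```

With these inputs the final list interleaves categories instead of showing the three thrillers first, even though they have the three highest raw scores.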

Four-stage recommender systems

Retrieval, filtering, scoring and ordering: these four stages make up a design pattern that covers almost every recommender system. The following figure shows the four stages and gives examples of how each one can be built. The pattern is much more complicated than a basic recommender model, but once the practical details of deployment are taken into account, it more accurately represents the architecture of most production recommender systems today.
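The four stages can be chained into a single function, as in the toy sketch below. Everything in it is an assumption made for illustration: the catalog, the tag-overlap retrieval and scoring rules, and the "one item per brand first" ordering rule are hypothetical stand-ins for real retrieval models, business rules, scoring models and ordering logic.

```python
def retrieve(user, catalog, k):
    """Stage 1: cheap candidate retrieval (any tag overlap with interests)."""
    matches = [item for item in catalog if user["interests"] & item["tags"]]
    return matches[:k]


def filter_items(user, items):
    """Stage 2: business rules (must be in stock and not already seen)."""
    return [it for it in items if it["in_stock"] and it["id"] not in user["seen"]]


def score(user, items):
    """Stage 3: toy scoring model (fraction of item tags the user likes)."""
    return [(it, len(user["interests"] & it["tags"]) / len(it["tags"])) for it in items]


def order(scored, n):
    """Stage 4: best item of each brand first, then the remaining items."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    seen_brands, first, deferred = set(), [], []
    for item, _ in ranked:
        if item["brand"] in seen_brands:
            deferred.append(item["id"])
        else:
            seen_brands.add(item["brand"])
            first.append(item["id"])
    return (first + deferred)[:n]


def recommend(user, catalog, k=100, n=3):
    candidates = retrieve(user, catalog, k)
    allowed = filter_items(user, candidates)
    return order(score(user, allowed), n)


CATALOG = [
    {"id": 1, "tags": {"running", "shoes"}, "brand": "A", "in_stock": True},
    {"id": 2, "tags": {"running", "watch"}, "brand": "C", "in_stock": False},
    {"id": 3, "tags": {"cooking"}, "brand": "D", "in_stock": True},
    {"id": 4, "tags": {"running"}, "brand": "A", "in_stock": True},
    {"id": 5, "tags": {"shoes", "hiking"}, "brand": "B", "in_stock": True},
    {"id": 6, "tags": {"running", "shoes", "trail"}, "brand": "A", "in_stock": True},
]
USER = {"interests": {"running", "shoes"}, "seen": {4}}

recommendations = recommend(USER, CATALOG)
print(recommendations)
```

Here item 6 scores higher than item 5, but the ordering stage places item 5 first because brand A is already represented, so the final list differs from the raw score order.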

Examples

Now that the pattern has been described, let's see how recommender systems are built with it. First, consider some common RecSys tasks. At a high level, each of these use cases can be covered by the four stages, which shows how general the pattern is.

Next, we can look at real-world recommender systems and see whether the four stages can be identified in them.

Meta's Instagram has a good article on its AI-powered recommender system, which describes IGQL, the query language the team developed. The example they provide shows that this query language maps directly onto the four stages of the pattern.

Pinterest has published a series of papers (on the evolution of a real-world recommender system for related content, on a system for recommending 3+ billion items to users in real time, and on applying deep learning to related content). A figure in the first paper shows how the recommender system architecture evolved over time. Here we see the same pattern again, with the subtle difference that retrieval and filtering are treated as a single stage.

Instacart shared its architecture in 2016, and it follows the four stages directly: first retrieve candidates, then filter out previously ordered items, then score the top candidates, and finally reorder the results to improve the diversity of what is presented to the user.

Complex systems

Across the four stages described in this article there are components that have to be trained, deployed, and supported at query time for every stage. Such a system is far more complicated than a single model. People who search online for information about recommender systems and find only collaborative filtering models will be at a loss when they actually try to build a complex recommender system.

In upcoming blog posts we will discuss the details of this complexity in depth and present some of the solutions offered by the Merlin recommender system framework. In the meantime, feel free to reach out to us! We will keep iterating on and improving our ideas and libraries, and strive to provide the best solutions for the RecSys space. We would greatly appreciate your input.

Finally, if you are passionate about building open source libraries that simplify the construction and deployment of recommender systems, you are welcome to get in touch with us.

Original title:

Recommender Systems, Not Just Recommender Models

Original link:

https://medium.com/nvidia-merlin/recommender-systems-not-just-recommender-models-485c161c755e

Editor: Huang Jiyan

- END -
