Learning vector representations (a.k.a. embeddings) of users and items lies at the core of modern recommender systems. Ranging from early matrix factorization to recently emerged deep-learning-based methods, existing efforts typically obtain a user's (or an item's) embedding by mapping from pre-existing features that describe the user (or the item), such as ID and attributes.
This work proposes to integrate the user-item interactions, more specifically the bipartite graph structure, into the embedding process.
Question: what exactly does the bipartite graph structure refer to here?
The authors develop a new recommendation framework, Neural Graph Collaborative Filtering (NGCF), which exploits the user-item graph structure by propagating embeddings on it. This leads to an expressive modeling of high-order connectivity in the user-item graph, effectively injecting the collaborative signal into the embedding process in an explicit manner.
Question: how are embeddings propagated on the graph? What is the difference between this and propagation over a knowledge graph (KG)?
Collaborative filtering (CF) estimates a user's preference on an item by assuming that behaviorally similar users exhibit similar preferences on items. The common paradigm is to parameterize users and items for reconstructing historical interactions, and to predict user preference based on these parameters.
Two key components in learnable CF models:
- embedding: transforms users and items into vectorized representations;
- interaction modeling: reconstructs historical interactions based on the embeddings.

| model | explanation |
| --- | --- |
| MF | directly embeds user/item ID as a vector and models the user-item interaction with an inner product |
| collaborative deep learning | extends the MF embedding function by integrating deep representations learned from rich side information |
| neural collaborative filtering models | replace the MF inner-product interaction function with nonlinear neural networks |
| translation-based CF models | use a Euclidean distance metric as the interaction function |

Most existing methods build the embedding function with descriptive features only (e.g., ID and attributes), without considering the user-item interactions, which are only used to define the objective function for model training.
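To make the MF row concrete, here is a minimal matrix-factorization sketch (my own illustrative code, not from the paper; names such as `MF`, `n_users`, and `n_items` are assumptions) that embeds user/item IDs and scores an interaction with the inner product:

```python
import torch
import torch.nn as nn

class MF(nn.Module):
    """Minimal matrix factorization: ID -> embedding, inner product as interaction."""
    def __init__(self, n_users, n_items, d=64):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, d)  # one d-dimensional vector per user ID
        self.item_emb = nn.Embedding(n_items, d)  # one d-dimensional vector per item ID

    def forward(self, users, items):
        # users, items: LongTensors of IDs; score = <e_u, e_i>
        return (self.user_emb(users) * self.item_emb(items)).sum(dim=-1)

# usage: scores for a small batch of (user, item) pairs
model = MF(n_users=1000, n_items=2000, d=64)
scores = model(torch.tensor([0, 1]), torch.tensor([10, 20]))
```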
Moreover, the scale of interactions can easily reach millions or even more in real applications, making it difficult to distill the desired collaborative signal.
Their method: exploit the high-order connectivity from user-item interactions, a natural way to encode the collaborative signal in the interaction graph structure.
Embedding layer: each user $u$ and item $i$ is associated with an embedding vector $e_u, e_i \in \mathbb{R}^d$, where $d$ denotes the embedding size.
Theoretical basis: the users that consume an item can be treated as the item's features and used to measure the collaborative similarity of two items.
Two main processes: message construction and message aggregation.

Message construction: the message from $i$ to $u$ is defined as
$$m_{u\leftarrow i} = f(e_i, e_u, p_{ui}),$$
where $m_{u\leftarrow i}$ is the message embedding (i.e., the information to be propagated), and $f(\cdot)$ is the message encoding function, which takes the embeddings $e_i$ and $e_u$ as input and uses the coefficient $p_{ui}$ to control the decay factor on each propagation on edge $(u, i)$. Concretely,
$$m_{u\leftarrow i} = \frac{1}{\sqrt{|N_u||N_i|}}\big(W_1 e_i + W_2 (e_i \odot e_u)\big),$$
where $W_1, W_2 \in \mathbb{R}^{d'\times d}$ are trainable weight matrices that distill useful information for propagation, $d'$ is the transformation size, and $\odot$ denotes the element-wise product. Here $p_{ui} = 1/\sqrt{|N_u||N_i|}$, where $N_u$ and $N_i$ denote the first-hop neighbor sets of user $u$ and item $i$ ($|N_u|$ and $|N_i|$ are their sizes, not the sets themselves). From the viewpoint of representation learning, $p_{ui}$ reflects how much the historical item contributes to the user preference. From the viewpoint of message passing, $p_{ui}$ can be interpreted as a discount factor, since messages being propagated should decay with the path length.

Message aggregation: aggregate the messages propagated from $u$'s neighborhood to refine $u$'s representation,
$$e_u^{(1)} = \mathrm{LeakyReLU}\Big(m_{u\leftarrow u} + \sum_{i\in N_u} m_{u\leftarrow i}\Big),$$
where $m_{u\leftarrow u} = W_1 e_u$ retains the information of the original features ($W_1$ is the weight matrix shared with the equation above), and $e_u^{(1)}$ denotes the representation of user $u$ obtained after the first embedding propagation layer. LeakyReLU is the activation function here; it allows messages to encode both positive and small negative signals. Analogously, we can obtain the representation $e_i^{(1)}$ for item $i$ by propagating information from its connected users.
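A minimal sketch of this first propagation step for a single user, written in PyTorch (illustrative only; the function name `propagate_user` and its tensor arguments are my assumptions, not the paper's code):

```python
import torch
import torch.nn.functional as F

# illustrative sizes; W1, W2 would be trainable nn.Parameters in a real model
d, d_prime = 64, 64
W1 = torch.randn(d_prime, d) * 0.01
W2 = torch.randn(d_prime, d) * 0.01

def propagate_user(e_u, e_items, n_u, n_i_counts):
    """First-layer propagation for one user u.
    e_u: (d,) user embedding; e_items: (k, d) embeddings of u's first-hop items;
    n_u: |N_u| as a float; n_i_counts: (k,) tensor holding |N_i| for each neighbor item."""
    p_ui = 1.0 / torch.sqrt(n_u * n_i_counts)                             # decay coefficients
    msgs = p_ui.unsqueeze(1) * (e_items @ W1.T + (e_items * e_u) @ W2.T)  # m_{u<-i}
    self_msg = e_u @ W1.T                                                 # m_{u<-u} = W1 e_u
    return F.leaky_relu(self_msg + msgs.sum(dim=0))                       # e_u^{(1)}
```

In practice the same computation is vectorized over all users and items at once, which is what the matrix-form rule below achieves.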
Stacking more embedding propagation layers explores the high-order connectivity information. At layer $l$,
$$e_u^{(l)} = \mathrm{LeakyReLU}\Big(m_{u\leftarrow u}^{(l)} + \sum_{i\in N_u} m_{u\leftarrow i}^{(l)}\Big),$$
with
$$m_{u\leftarrow i}^{(l)} = p_{ui}\big(W_1^{(l)} e_i^{(l-1)} + W_2^{(l)} (e_i^{(l-1)} \odot e_u^{(l-1)})\big), \qquad m_{u\leftarrow u}^{(l)} = W_1^{(l)} e_u^{(l-1)},$$
where $W_1^{(l)}, W_2^{(l)} \in \mathbb{R}^{d_l\times d_{l-1}}$ are the trainable transformation matrices and $d_l$ is the transformation size; $e_i^{(l-1)}$ is the item representation generated from the previous message-passing step, memorizing the messages from its $(l-1)$-hop neighbors. Analogously, we can obtain the representation for item $i$ at layer $l$.
Propagation Rule in Matrix Form: refer to the original paper for the layer-wise matrix-form equation, which propagates the representations of all users and items simultaneously via the Laplacian matrix of the user-item graph.

After $L$ propagation layers, the representations obtained at different layers are concatenated:
$$e_u^{*} = e_u^{(0)} \,\Vert\, \cdots \,\Vert\, e_u^{(L)}, \qquad e_i^{*} = e_i^{(0)} \,\Vert\, \cdots \,\Vert\, e_i^{(L)},$$
where $\Vert$ is the concatenation operation. Other aggregation schemes could be used here; concatenation is chosen because it is simple and quite effective in graph neural networks, corresponding to the layer-aggregation mechanism. Finally, the inner product estimates the user's preference towards the target item:
$$\hat{y}_{\mathrm{NGCF}}(u, i) = {e_u^{*}}^{\top} e_i^{*}.$$
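A compact sketch of the multi-layer propagation in matrix form, assuming a normalized adjacency/Laplacian matrix `L_hat` of the user-item graph has already been built (class and variable names here are illustrative, not taken from the released implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NGCFSketch(nn.Module):
    def __init__(self, n_users, n_items, d=64, n_layers=3):
        super().__init__()
        self.E0 = nn.Parameter(torch.randn(n_users + n_items, d) * 0.01)  # E^(0): all IDs
        self.W1 = nn.ModuleList([nn.Linear(d, d, bias=False) for _ in range(n_layers)])
        self.W2 = nn.ModuleList([nn.Linear(d, d, bias=False) for _ in range(n_layers)])

    def forward(self, L_hat):
        """L_hat: (M+N, M+N) sparse normalized adjacency of the user-item graph."""
        E, layers = self.E0, [self.E0]
        for W1, W2 in zip(self.W1, self.W2):
            side = torch.sparse.mm(L_hat, E)               # sum of decayed neighbor embeddings
            E = F.leaky_relu(W1(side + E) + W2(side * E))  # (L+I)E W1  +  (LE (.) E) W2
            layers.append(E)
        return torch.cat(layers, dim=1)                    # e* = e^(0) || ... || e^(L)
```

Scoring a pair $(u, i)$ then amounts to slicing the user's and the item's rows out of the returned table and taking their inner product.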
To prevent overfitting, NGCF adopts two dropout techniques: message dropout and node dropout.
Message dropout: randomly drop the messages being propagated in Equation (6) with probability $p_1$. As such, in the $l$-th propagation layer, only partial messages contribute to the refined representations. Message dropout makes the representations more robust against the presence or absence of single connections between users and items.

Node dropout: randomly block particular nodes and discard all their outgoing messages. For the $l$-th propagation layer, $(M + N)p_2$ nodes of the Laplacian matrix are randomly dropped, where $p_2$ is the dropout ratio. Node dropout focuses on reducing the influence of particular users or items. (A small sketch of both schemes follows the note below.)

On prior embedding functions: to enhance the embedding function, much effort has been devoted to incorporating side information. While the inner product can force the user and item embeddings of an observed interaction to be close to each other, its linearity makes it insufficient to reveal the complex and nonlinear relationships between users and items. Recent efforts [11, 14, 15, 35] therefore exploit deep learning techniques to enhance the interaction function, so as to capture the nonlinear feature interactions between users and items. For instance, neural CF models such as NeuMF [14] employ nonlinear neural networks as the interaction function, while translation-based CF models such as LRML [28] instead model the interaction strength with Euclidean distance metrics. Still, the design of the embedding function alone is insufficient to yield optimal embeddings for CF, since the CF signals are only implicitly captured: there is no guarantee that indirectly connected users and items are close in the embedding space. Without an explicit encoding of the CF signal, it is hard to obtain embeddings that meet the desired properties.
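A minimal sketch of the two dropout schemes (my own illustrative helpers; `messages` is an arbitrary tensor of propagated messages and `L_hat` is assumed to be a dense Laplacian here, while the actual implementation may operate on sparse matrices):

```python
import torch
import torch.nn.functional as F

def message_dropout(messages, p1=0.1, training=True):
    """Randomly zero out propagated messages with probability p1 (plain dropout)."""
    return F.dropout(messages, p=p1, training=training)

def node_dropout(L_hat, p2=0.1):
    """Randomly block about (M+N)*p2 nodes by discarding their outgoing messages,
    i.e., zeroing the corresponding columns of the (dense) Laplacian."""
    n = L_hat.size(0)
    keep = (torch.rand(n) >= p2).float()  # 1 = keep node, 0 = drop node
    return L_hat * keep.unsqueeze(0)      # broadcast over rows: dropped columns become 0
```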
【Knowledge gaps】
Other methods either apply GCN to other aspects of the recommendation task or do not make full use of high-order connectivities.
Evaluation metrics: recall@K and ndcg@K, with K = 20.
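A small sketch of how these metrics are typically computed for one user's top-K list (my own illustrative helper functions, not the paper's evaluation code):

```python
import numpy as np

def recall_at_k(ranked_items, relevant_items, k=20):
    """Fraction of the user's relevant items that appear in the top-k ranking."""
    hits = len(set(ranked_items[:k]) & set(relevant_items))
    return hits / len(relevant_items)

def ndcg_at_k(ranked_items, relevant_items, k=20):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(1.0 / np.log2(i + 2) for i, it in enumerate(ranked_items[:k])
              if it in relevant_items)
    idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(relevant_items), k)))
    return dcg / idcg if idcg > 0 else 0.0
```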
Baseline models:
- MF, NeuMF, CMN: mainly make recommendations by computing similarity between user and item representations.
- HOP-Rec: a state-of-the-art graph-based model, where the high-order neighbors derived from random walks are exploited to enrich the user-item interaction data.
- PinSage, GC-MC: GCN-based models.
Question: does exploiting connectivity information help to alleviate the sparsity issue (recommendation for inactive users)?
Exploiting high-order connectivity greatly facilitates representation learning for inactive users, as the collaborative signal can be effectively captured. The authors randomly selected six users from the Gowalla dataset, as well as their relevant items, and observe how their representations are influenced w.r.t. the depth of NGCF.
The connectivities of users and items are well reflected in the embedding space, that is, connected users and items are embedded into nearby regions of the space. In particular, the representations of NGCF-3 exhibit discernible clustering, meaning that points with the same color (i.e., the items consumed by the same user) tend to form clusters. When stacking three embedding propagation layers, the embeddings of a user's historical items tend to be closer to each other. This qualitatively verifies that the proposed embedding propagation layer is capable of injecting explicit collaborative signal (via NGCF-3) into the representations.
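A sketch of how such a visualization could be reproduced with t-SNE (assuming trained item representations and a mapping from each sampled user to the items they consumed; all names below are illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_item_space(item_embs, item_owner, user_ids):
    """item_embs: (n, d) final item representations; item_owner[j]: the sampled user
    who consumed item j; user_ids: the handful of users to visualize."""
    pts = TSNE(n_components=2, random_state=0).fit_transform(item_embs)
    for u in user_ids:  # one color per sampled user, matching the description above
        idx = np.array([j for j, owner in enumerate(item_owner) if owner == u])
        plt.scatter(pts[idx, 0], pts[idx, 1], s=10, label=f"user {u}")
    plt.legend()
    plt.show()
```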