Weekly Paper Digest [0715-0721] | Latest Research Progress in Recommender Systems (15 Papers)

Tech · 2024-07-22 08:00 · Singapore

This post curates 15 recommender-system papers released last week (0715-0721). The main research directions include explainable recommendation, graph-based cross-domain recommendation, long-tail multimodal recommendation, short-video recommendation, cooperative sequential recommendation, lifelong sequential recommendation, robust recommendation via graph contrastive learning, recommender-system surveys, reinforcement-learning-based recommendation, travel recommendation, next-generation deep cross networks for CTR prediction, and cold-start CTR prediction.

1.  Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information

2.  Graph Signal Processing for Cross-Domain Recommendation
3.  GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation
4.  Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation
5.  Pacer and Runner: Cooperative Learning Framework between Single- and Cross-Domain Sequential Recommendation
6.  SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation
7.  Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning
8.  A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice
9.  ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems
10.  On Causally Disentangled State Representation Learning for Reinforcement Learning based Recommender Systems
11.  On the Need for Configurable Travel Recommender Systems: A Systematic Mapping Study
12.  DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction
13.  Warming Up Cold-Start CTR Prediction by Learning Item-Specific Feature Interactions
14.  Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation
15.  MLSA4Rec: Mamba Combined with Low-Rank Decomposed Self-Attention for Sequential Recommendation

1.  Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information

Yurou Zhao, Yiding Sun, Ruidong Han, Fei Jiang, Lu Guan, Xiang Li, Wei Lin, Jiaxin Mao

https://arxiv.org/abs/2407.13274

Providing natural language-based explanations to justify recommendations helps to improve users' satisfaction and gain users' trust. However, as current explanation generation methods are commonly trained with an objective to mimic existing user reviews, the generated explanations are often not aligned with the predicted ratings or some important features of the recommended items, and thus are suboptimal in helping users make informed decisions on the recommendation platform. To tackle this problem, we propose a flexible, model-agnostic framework named MMI (Maximizing Mutual Information) to enhance the alignment between the generated natural language explanations and the predicted rating/important item features. Specifically, we propose to use mutual information (MI) as a measure for the alignment and train a neural MI estimator. Then, we treat a well-trained explanation generation model as the backbone model and further fine-tune it through reinforcement learning with guidance from the MI estimator, which rewards a generated explanation that is more aligned with the predicted rating or a pre-defined feature of the recommended item. Experiments on three datasets demonstrate that our MMI framework can boost different backbone models, enabling them to outperform existing baselines in terms of alignment with predicted ratings and item features. Additionally, user studies verify that MI-enhanced explanations indeed facilitate users' decisions and are favored over other baselines due to their better alignment properties.
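
To make the central mechanism concrete, here is a minimal sketch of a Donsker-Varadhan (MINE-style) neural MI estimator of the kind such a framework relies on. The explanation/rating encoders, dimensions, and reward wiring are assumptions for illustration, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn

class MIEstimator(nn.Module):
    """MINE-style lower bound on I(X; Y). In the MMI setting, x could be an
    embedding of the generated explanation and y an embedding of the predicted
    rating or a target item feature (both encoders are assumed here); the bound
    value can then serve as the alignment reward during RL fine-tuning."""
    def __init__(self, dim_x, dim_y, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_x + dim_y, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, x, y):
        joint = self.net(torch.cat([x, y], dim=-1))               # aligned pairs
        y_perm = y[torch.randperm(y.size(0), device=y.device)]    # broken pairs
        marginal = self.net(torch.cat([x, y_perm], dim=-1))
        # I(X; Y) >= E_joint[T] - log E_marginal[exp(T)]
        bound = joint.mean() - (torch.logsumexp(marginal, dim=0) - math.log(y.size(0)))
        return bound.squeeze()
```

Training maximizes this bound with respect to the estimator's parameters; during fine-tuning, its value on (explanation, rating/feature) pairs plays the role of the reward signal.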

2.  Graph Signal Processing for Cross-Domain Recommendation

Jeongeun Lee, Seongku Kang, Won-Yong Shin, Jeongwhan Choi, Noseong Park, Dongha Lee

https://arxiv.org/abs/2407.12374

Cross-domain recommendation (CDR) extends conventional recommender systems by leveraging user-item interactions from dense domains to mitigate data sparsity and the cold start problem. While CDR offers substantial potential for enhancing recommendation performance, most existing CDR methods suffer from sensitivity to the ratio of overlapping users and intrinsic discrepancy between source and target domains. To overcome these limitations, in this work, we explore the application of graph signal processing (GSP) in CDR scenarios. We propose CGSP, a unified CDR framework based on GSP, which employs a cross-domain similarity graph constructed by flexibly combining target-only similarity and source-bridged similarity. By processing personalized graph signals computed for users from either the source or target domain, our framework effectively supports both inter-domain and intra-domain recommendations. Our empirical evaluation demonstrates that CGSP consistently outperforms various encoder-based CDR approaches in both intra-domain and inter-domain recommendation scenarios, especially when the ratio of overlapping users is low, highlighting its significant practical implication in real-world applications. https://github.com/ocryrtv/CGSP
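
For intuition about the graph-signal-processing angle, the sketch below filters each user's interaction signal on an item-item similarity graph and blends a target-only similarity with a source-bridged one. The blend, the bridge construction, and the toy data are illustrative assumptions, not CGSP's exact graph design.

```python
import numpy as np

def gsp_recommend(R_t, R_s=None, alpha=0.3):
    """Minimal graph-signal-processing recommender sketch.
    R_t: (n_users, n_items_t) binary interactions in the target domain.
    R_s: (n_users, n_items_s) interactions of the same (overlapping) users in
         the source domain, used only to build an extra user-user path.
    Returns a (n_users, n_items_t) score matrix (higher = recommend)."""
    def norm(R):
        du = np.sqrt(np.clip(R.sum(1, keepdims=True), 1, None))   # user degrees
        di = np.sqrt(np.clip(R.sum(0, keepdims=True), 1, None))   # item degrees
        return R / du / di

    Rt = norm(R_t)
    S = Rt.T @ Rt                                  # target-only item-item similarity
    if R_s is not None:
        Rs = norm(R_s)
        S_bridge = Rt.T @ (Rs @ Rs.T) @ Rt         # items linked via source co-behavior
        S = (1 - alpha) * S + alpha * S_bridge
    return R_t @ S                                 # filter each user's signal on the graph

# toy usage
R_t = np.random.binomial(1, 0.1, size=(50, 30)).astype(float)
R_s = np.random.binomial(1, 0.2, size=(50, 40)).astype(float)
scores = gsp_recommend(R_t, R_s)
```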

3.  GUME: Graphs and User Modalities Enhancement for Long-Tail Multimodal Recommendation

Guojiao Lin, Zhen Meng, Dongjie Wang, Qingqing Long, Yuanchun Zhou, Meng Xiao

https://arxiv.org/abs/2407.12338

Multimodal recommendation systems (MMRS) have received considerable attention from the research community due to their ability to jointly utilize information from user behavior and product images and text. Previous research has two main issues. First, many long-tail items in recommendation systems have limited interaction data, making it difficult to learn comprehensive and informative representations; past MMRS studies have overlooked this issue. Second, users' modality preferences are crucial to their behavior, yet previous research has primarily focused on learning item modality representations, while user modality representations have remained relatively simplistic. To address these challenges, we propose a novel Graphs and User Modalities Enhancement (GUME) framework for long-tail multimodal recommendation. Specifically, we first enhance the user-item graph using multimodal similarity between items. This improves the connectivity of long-tail items and helps them learn high-quality representations through graph propagation. Then, we construct two types of user modalities: explicit interaction features and extended interest features. By using the user modality enhancement strategy to maximize mutual information between these two features, we improve the generalization ability of user modality representations. Additionally, we design an alignment strategy for modality data to remove noise from both internal and external perspectives. Extensive experiments on four publicly available datasets demonstrate the effectiveness of our approach. https://github.com/NanGongNingYi/GUME
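
The mutual-information maximization between the two user-modality views is commonly approximated with an InfoNCE-style contrastive objective. The sketch below is a generic version of that idea, not GUME's exact loss; the encoders that produce the two feature sets are assumed.

```python
import torch
import torch.nn.functional as F

def infonce_mi_loss(z_explicit, z_extended, temperature=0.2):
    """InfoNCE lower bound on the MI between two views of the same user's
    modality representation: explicit interaction features vs. extended
    interest features, both of shape (B, d). Matching rows are positives,
    all other rows in the batch serve as negatives."""
    z1 = F.normalize(z_explicit, dim=-1)
    z2 = F.normalize(z_extended, dim=-1)
    logits = z1 @ z2.t() / temperature          # (B, B) cosine similarities
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)      # minimize => maximize MI bound
```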

4.  Conditional Quantile Estimation for Uncertain Watch Time in Short-Video Recommendation

Chengzhi Lin, Shuchang Liu, Chuyuan Wang, Yongqi Liu

https://arxiv.org/abs/2407.12223

Within the domain of short video recommendation, predicting users' watch time is a critical but challenging task. Prevailing deterministic solutions obtain accurate debiased statistical models, yet they neglect the intrinsic uncertainty inherent in user environments. We observe that this uncertainty can limit the accuracy of watch-time prediction on our online platform, even though we employ numerous features and complex network architectures. Consequently, we believe that a better solution is to model the conditional distribution of this uncertain watch time.

In this paper, we introduce a novel estimation technique -- Conditional Quantile Estimation (CQE), which utilizes quantile regression to capture the nuanced distribution of watch time. The learned distribution accounts for the stochastic nature of users, thereby providing a more accurate and robust estimation. In addition, we design several strategies to enhance the quantile prediction, including conditional expectation, conservative estimation, and dynamic quantile combination. We verify the effectiveness of our method through extensive offline evaluations using public datasets as well as deployment in a real-world video application with over 300 million daily active users.
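
The backbone of quantile regression is the pinball (quantile) loss. Below is a minimal, framework-agnostic sketch of that loss for watch time; the tensor shapes and the point-estimate note are illustrative, and CQE itself adds the further strategies listed above.

```python
import torch

def pinball_loss(pred_quantiles, watch_time, quantiles):
    """Quantile-regression (pinball) loss for watch-time prediction.
    pred_quantiles: (B, Q) predicted watch-time quantiles per user-video pair.
    watch_time:     (B,)   observed watch time.
    quantiles:      (Q,)   e.g. torch.linspace(0.1, 0.9, 9)."""
    diff = watch_time.unsqueeze(1) - pred_quantiles              # (B, Q)
    return torch.max(quantiles * diff, (quantiles - 1) * diff).mean()

# A point estimate can then be read off, e.g. as the mean of the predicted
# quantiles (a discrete approximation of the conditional expectation).
```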

5.  Pacer and Runner: Cooperative Learning Framework between Single- and Cross-Domain Sequential Recommendation

Chung Park, Taesan Kim, Hyungjun Yoon, Junui Hong, Yelim Yu, Mincheol Cho, Minsung Choi, Jaegul Choo

https://arxiv.org/abs/2407.11245

Cross-Domain Sequential Recommendation (CDSR) improves recommendation performance by utilizing information from multiple domains, in contrast to Single-Domain Sequential Recommendation (SDSR), which relies on historical interactions within a specific domain. However, CDSR may underperform the SDSR approach in certain domains due to negative transfer, which occurs when there is little relation between domains or different levels of data sparsity. To address the issue of negative transfer, our proposed CDSR model estimates the degree of negative transfer of each domain and adaptively assigns it as a weight factor to the prediction loss, to control gradient flows through domains with significant negative transfer. To this end, our model compares the performance of a model trained on multiple domains (CDSR) with a model trained solely on the specific domain (SDSR) to evaluate the negative transfer of each domain using our asymmetric cooperative network. In addition, to facilitate the transfer of valuable cues between the SDSR and CDSR tasks, we developed an auxiliary loss that maximizes the mutual information between the representation pairs from both tasks on a per-domain basis. This cooperative learning between SDSR and CDSR tasks is similar to the collaborative dynamics between pacers and runners in a marathon. Our model outperformed numerous previous works in extensive experiments on two real-world industrial datasets across ten service domains. We have also deployed our model in the recommendation system of our personal assistant app service, resulting in a 21.4% increase in click-through rate compared to existing models, which is valuable to real-world business. https://github.com/cpark88/SyNCRec
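
The core weighting idea can be pictured with a small sketch: down-weight the per-domain CDSR loss wherever the single-domain model beats the cross-domain one. The metric inputs and the softmax weighting are illustrative assumptions; the paper estimates this signal with its asymmetric cooperative network during training.

```python
import torch

def negative_transfer_weights(sdsr_metric, cdsr_metric, tau=1.0):
    """Per-domain loss weights from the estimated degree of negative transfer.
    sdsr_metric / cdsr_metric: (D,) validation metrics (e.g. NDCG) of the
    single-domain and cross-domain models for each of D domains. Domains where
    CDSR underperforms SDSR (negative transfer) are down-weighted."""
    gap = sdsr_metric - cdsr_metric                 # > 0 means negative transfer
    return torch.softmax(-gap / tau, dim=0) * gap.numel()   # mean weight ~ 1

# total_loss = sum_d weights[d] * cdsr_loss[d]  (+ the MI auxiliary loss per domain)
```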

6.  SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation

Kaiming Shen, Xichen Ding, Zixiang Zheng, Yuqi Gong, Qianqian Li, Zhongyi Liu, Guannan Zhang

https://arxiv.org/abs/2407.10714

The modeling of users' behaviors is crucial in modern recommendation systems. A lot of research focuses on modeling users' lifelong sequences, which can be extremely long and sometimes exceed thousands of items. These models use the target item to search for the most relevant items from the historical sequence. However, training on lifelong sequences in click-through rate (CTR) prediction or personalized search ranking (PSR) is extremely difficult due to insufficient learning of ID embeddings, especially when the IDs in the lifelong sequence features do not appear in the training samples. Additionally, existing target attention mechanisms struggle to learn the multi-modal representations of items in the sequence well. The distributions of the multi-modal embeddings (text, image, and attributes) of users' interacted items are not properly aligned, and divergence exists across modalities.

We also observe that users' search query sequences and item browsing sequences can fully depict users' intents and benefit from each other. To address these challenges, we propose a unified lifelong multi-modal sequence model called SEMINAR (Search Enhanced Multi-Modal Interest Network and Approximate Retrieval). Specifically, a network called the Pretraining Search Unit (PSU) learns the lifelong sequences of multi-modal query-item pairs in a pretraining-finetuning manner with multiple objectives: multi-modal alignment, next query-item pair prediction, query-item relevance prediction, etc. After pretraining, the downstream model restores the pretrained embeddings as initialization and finetunes the network. To accelerate the online retrieval speed of multi-modal embeddings, we propose a multi-modal codebook-based product quantization strategy to approximate the exact attention calculation and significantly reduce the time complexity. https://github.com/paper-submission-coder/SEMINAR
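
To illustrate the codebook-based approximation, here is a generic product-quantization sketch: keys are split into subvectors, each subvector is mapped to a K-means codebook, and query-key dot products become table lookups. This is textbook PQ under assumed shapes, not SEMINAR's exact codebook design.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_pq_codebooks(keys, n_subspaces=4, n_codes=256):
    """Product quantization of sequence key vectors for approximate attention.
    keys: (N, d) multi-modal item embeddings (N >= n_codes, d divisible by
    n_subspaces). Each d/n_subspaces slice gets its own K-means codebook, so a
    key is stored as n_subspaces small integer codes instead of d floats."""
    N, d = keys.shape
    sub = d // n_subspaces
    codebooks, codes = [], np.empty((N, n_subspaces), dtype=np.int64)
    for s in range(n_subspaces):
        km = KMeans(n_clusters=n_codes, n_init=4, random_state=0)
        codes[:, s] = km.fit_predict(keys[:, s * sub:(s + 1) * sub])
        codebooks.append(km.cluster_centers_)
    return codebooks, codes

def approx_attention_scores(query, codebooks, codes):
    """Approximate query-key dot products via lookup tables: one table build of
    size n_codes per subspace, then n_subspaces additions per key."""
    n_sub = len(codebooks)
    sub = query.shape[0] // n_sub
    tables = [codebooks[s] @ query[s * sub:(s + 1) * sub] for s in range(n_sub)]
    return sum(tables[s][codes[:, s]] for s in range(n_sub))   # (N,) scores
```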

7.  Towards Robust Recommendation via Decision Boundary-aware Graph Contrastive Learning

Jiakai Tang, Sunhao Dai, Zexu Sun, Xu Chen, Jun Xu, Wenhui Yu, Lantao Hu, Peng Jiang, Han Li

https://arxiv.org/abs/2407.10184

In recent years, graph contrastive learning (GCL) has received increasing attention in recommender systems due to its effectiveness in reducing bias caused by data sparsity. However, most existing GCL models rely on heuristic approaches and usually assume entity independence when constructing contrastive views. We argue that these methods struggle to strike a balance between semantic invariance and view hardness across the dynamic training process, both of which are critical factors in graph contrastive learning.

To address the above issues, we propose a novel GCL-based recommendation framework, RGCL, which effectively maintains the semantic invariance of contrastive pairs and dynamically adapts as the model capability evolves through the training process. Specifically, RGCL first introduces decision boundary-aware adversarial perturbations to constrain the exploration space of contrastive augmented views, avoiding the loss of task-specific information. Furthermore, to incorporate global user-user and item-item collaboration relationships to guide the generation of hard contrastive views, we propose an adversarial-contrastive learning objective to construct a relation-aware view generator. Besides, considering that unsupervised GCL could narrow the margins between data points and the decision boundary, resulting in decreased model robustness, we introduce adversarial examples based on maximum perturbations to achieve margin maximization. We also provide theoretical analyses on the effectiveness of our designs. Through extensive experiments on five public datasets, we demonstrate the superiority of RGCL compared against twelve baseline models. https://cl4rec.github.io/RGCL/
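
As a rough picture of adversarially generated contrastive views, the sketch below perturbs node embeddings along the gradient of a task loss (FGSM-style). It is only a generic stand-in: RGCL additionally constrains the perturbation with decision-boundary awareness and a relation-aware view generator.

```python
import torch
import torch.nn.functional as F

def adversarial_contrastive_view(emb, loss_fn, epsilon=0.1):
    """Build a hard contrastive view by perturbing embeddings along the task-loss
    gradient, keeping each node's perturbation bounded by epsilon.
    emb:     (N, d) node embeddings.
    loss_fn: callable mapping perturbed embeddings to a scalar recommendation loss."""
    delta = torch.zeros_like(emb, requires_grad=True)
    loss = loss_fn(emb + delta)
    grad = torch.autograd.grad(loss, delta)[0]
    perturbation = epsilon * F.normalize(grad, dim=-1)   # bounded step per node
    return (emb + perturbation).detach()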

8.  A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice

Shaina Raza, Mizanur Rahman, Safiullah Kamawal, Armin Toroghi, Ananya Raval, Farshad Navah, Amirmohammad Kazemeini

https://arxiv.org/abs/2407.13699

Recommender Systems (RS) play an integral role in enhancing user experiences by providing personalized item suggestions. This survey reviews the progress in RS inclusively from 2017 to 2024, effectively connecting theoretical advances with practical applications. We explore the development from traditional RS techniques like content-based and collaborative filtering to advanced methods involving deep learning, graph-based models, reinforcement learning, and large language models. We also discuss specialized systems such as context-aware, review-based, and fairness-aware RS. The primary goal of this survey is to bridge theory with practice. It addresses challenges across various sectors, including e-commerce, healthcare, and finance, emphasizing the need for scalable, real-time, and trustworthy solutions. Through this survey, we promote stronger partnerships between academic research and industry practices. The insights offered by this survey aim to guide industry professionals in optimizing RS deployment and to inspire future research directions, especially in addressing emerging technological and societal trends.

9.  ROLeR: Effective Reward Shaping in Offline Reinforcement Learning for Recommender Systems

Yi Zhang, Ruihong Qiu, Jiajun Liu, Sen Wang

https://arxiv.org/abs/2407.13163

Offline reinforcement learning (RL) is an effective tool for real-world recommender systems with its capacity to model the dynamic interest of users and its interactive nature. Most existing offline RL recommender systems focus on model-based RL through learning a world model from offline data and building the recommendation policy by interacting with this model. Although these methods have made progress in the recommendation performance, the effectiveness of model-based offline RL methods is often constrained by the accuracy of the estimation of the reward model and the model uncertainties, primarily due to the extreme discrepancy between offline logged data and real-world data in user interactions with online platforms. To fill this gap, a more accurate reward model and uncertainty estimation are needed for the model-based RL methods. In this paper, a novel model-based Reward Shaping in Offline Reinforcement Learning for Recommender Systems, ROLeR, is proposed for reward and uncertainty estimation in recommendation systems. Specifically, a non-parametric reward shaping method is designed to refine the reward model. In addition, a flexible and more representative uncertainty penalty is designed to fit the needs of recommendation systems. Extensive experiments conducted on four benchmark datasets showcase that ROLeR achieves state-of-the-art performance compared with existing baselines. The source code can be downloaded at https://github.com/ArronDZhang/ROLeR
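
The non-parametric flavor of the reward shaping can be pictured with a simple k-nearest-neighbor estimate over logged interactions plus a distance-based uncertainty penalty. This is an illustrative stand-in in the spirit of the paper, not ROLeR's actual estimator.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_reward_shaping(user_states, logged_rewards, query_state, k=10):
    """Non-parametric reward estimate for a recommendation state: average the
    logged rewards of the k most similar states in the offline dataset, and use
    the spread of neighbor distances as an uncertainty penalty.
    user_states: (N, d) logged state vectors; logged_rewards: (N,)."""
    nn = NearestNeighbors(n_neighbors=k).fit(user_states)
    dist, idx = nn.kneighbors(query_state.reshape(1, -1))
    reward = logged_rewards[idx[0]].mean()
    uncertainty = dist[0].std()                  # wider neighborhoods -> less trust
    return reward - uncertainty
```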

10.  On Causally Disentangled State Representation Learning for Reinforcement Learning based Recommender Systems

Siyu Wang, Xiaocong Chen, Lina Yao

https://arxiv.org/abs/2407.13091

In Reinforcement Learning-based Recommender Systems (RLRS), the complexity and dynamism of user interactions often result in high-dimensional and noisy state spaces, making it challenging to discern which aspects of the state are truly influential in driving the decision-making process. This issue is exacerbated by the evolving nature of user preferences and behaviors, requiring the recommender system to adaptively focus on the most relevant information for decision-making while preserving generalizability. To tackle this problem, we introduce an innovative causal approach for decomposing the state and extracting Causal-InDispensable State Representations (CIDS) in RLRS.

Our method concentrates on identifying the Directly Action-Influenced State Variables (DAIS) and Action-Influence Ancestors (AIA), which are essential for making effective recommendations. By leveraging conditional mutual information, we develop a framework that not only discerns the causal relationships within the generative process but also isolates critical state variables from the typically dense and high-dimensional state representations. We provide theoretical evidence for the identifiability of these variables. Then, by making use of the identified causal relationships, we construct causal-indispensable state representations, enabling the training of policies over a more advantageous subset of the agent's state space. We demonstrate the efficacy of our approach through extensive experiments, showing that our method outperforms state-of-the-art methods.
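
As a loose illustration of the kind of dependence measure being exploited, the sketch below screens state dimensions by their mutual information with the logged action. It is a plain MI screen under assumed inputs, not the paper's conditional-MI and causal identification procedure.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def screen_action_influenced_dims(states, actions, top_k=8):
    """Rank state dimensions by mutual information with the chosen action and
    keep the top_k as candidate 'action-influenced' variables.
    states: (N, d) logged state vectors; actions: (N,) logged action ids."""
    scores = mutual_info_regression(states, actions.astype(float), random_state=0)
    keep = np.argsort(scores)[::-1][:top_k]
    return keep, scores
```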

11.  On the Need for Configurable Travel Recommender Systems: A Systematic Mapping Study

Rickson Simioni Pereira, Claudio Di Sipio, Martina De Sanctis, Ludovico Iovino

https://arxiv.org/abs/2407.11575

Travel Recommender Systems (TRSs) have been proposed to ease the burden of choice in the travel domain by providing valuable suggestions based on user preferences. Despite the broad similarities in functionalities and data provided by TRSs, these systems are significantly influenced by the diverse and heterogeneous contexts in which they operate. This plays a crucial role in determining the accuracy and appropriateness of the travel recommendations they deliver. For instance, in contexts like smart cities and natural parks, diverse runtime information, such as traffic conditions and trail status respectively, should be utilized to ensure the delivery of pertinent recommendations aligned with user preferences within the specific context. However, there is a trend to build TRSs from scratch for different contexts rather than supporting developers with configuration approaches that promote reuse, minimize errors, and accelerate time-to-market. To illustrate this gap, in this paper we conduct a systematic mapping study to examine the extent to which existing TRSs are configurable for different contexts. The conducted analysis reveals the lack of configuration support assisting TRS providers in developing TRSs closely tied to their operational context. Our findings shed light on uncovered challenges in the domain, thus fostering future research focused on providing new methodologies that enable providers to handle TRS configurations.

12.  DCNv3: Towards Next Generation Deep Cross Network for CTR Prediction

Honghao Li, Yiwen Zhang, Yi Zhang, Hanwei Li, Lei Sang

https://arxiv.org/abs/2407.13349

Deep & Cross Network and its derivative models have become an important paradigm in click-through rate (CTR) prediction due to their effective balance between computational cost and performance. However, these models face four major limitations: (1) while most models claim to capture high-order feature interactions, they often do so implicitly and non-interpretably through deep neural networks (DNN), which limits the trustworthiness of the model's predictions; (2) the performance of existing explicit feature interaction methods is often weaker than that of implicit DNN, undermining their necessity; (3) many models fail to adaptively filter noise while enhancing the order of feature interactions; (4) the fusion methods of most models cannot provide suitable supervision signals for their different interaction methods.

To address the identified limitations, this paper proposes the next generation Deep Cross Network (DCNv3) and Shallow & Deep Cross Network (SDCNv3). These models ensure interpretability in feature interaction modeling while exponentially increasing the order of feature interactions to achieve genuine Deep Crossing rather than just Deep & Cross. Additionally, we employ a Self-Mask operation to filter noise and reduce the number of parameters in the cross network by half. In the fusion layer, we use a simple yet effective loss weight calculation method called Tri-BCE to provide appropriate supervision signals. Comprehensive experiments on six datasets demonstrate the effectiveness, efficiency, and interpretability of DCNv3 and SDCNv3. The code, running logs, and detailed hyperparameter configurations are available at: https://anonymous.4open.science/r/DCNv3-E352.
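
For context, the sketch below shows the DCNv2-style explicit cross layer that this line of work starts from: x_{l+1} = x_0 * (W x_l + b) + x_l, where stacking L layers yields interactions up to order L+1. DCNv3 departs from this baseline (exponential-order crossing, Self-Mask, Tri-BCE supervision), so treat this as the reference point the paper improves on, not DCNv3 itself.

```python
import torch
import torch.nn as nn

class CrossLayerV2(nn.Module):
    """Reference DCNv2-style cross layer: explicit, interpretable feature crossing."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, x0, xl):
        # x0: flattened field embeddings of the input; xl: output of layer l
        return x0 * self.linear(xl) + xl

# usage sketch: x = x0 = embedding_layer(features).flatten(1)
# for layer in cross_layers: x = layer(x0, x)
```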

13.  Warming Up Cold-Start CTR Prediction by Learning Item-Specific Feature Interactions

Yaqing Wang, Hongming Piao, Daxiang Dong, Quanming Yao, Jingbo Zhou

https://arxiv.org/abs/2407.10112

In recommendation systems, new items are continuously introduced, initially lacking interaction records but gradually accumulating them over time. Accurately predicting the click-through rate (CTR) for these items is crucial for enhancing both revenue and user experience. While existing methods focus on enhancing item ID embeddings for new items within general CTR models, they tend to adopt a global feature interaction approach, often overshadowing new items with sparse data by those with abundant interactions. Addressing this, our work introduces EmerG, a novel approach that warms up cold-start CTR prediction by learning item-specific feature interaction patterns. EmerG utilizes hypernetworks to generate an item-specific feature graph based on item characteristics, which is then processed by a Graph Neural Network (GNN). This GNN is specially tailored to provably capture feature interactions at any order through a customized message passing mechanism. We further design a meta learning strategy that optimizes parameters of hypernetworks and GNN across various item CTR prediction tasks, while only adjusting a minimal set of item-specific parameters within each task. This strategy effectively reduces the risk of overfitting when dealing with limited data. Extensive experiments on benchmark datasets validate that EmerG consistently performs the best given no, a few and sufficient instances of new items. https://github.com/LARS-group/EmerG
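
A minimal sketch of the hypernetwork idea: map item content features to an item-specific graph over the feature fields, then run one message-passing step over the field embeddings. Names, shapes, and the single GNN step are assumptions; EmerG's GNN and meta-learning procedure are more elaborate.

```python
import torch
import torch.nn as nn

class ItemGraphHypernet(nn.Module):
    """Hypernetwork that generates an item-specific feature graph (adjacency over
    the F feature fields) from item content features, followed by one
    message-passing step over the field embeddings."""
    def __init__(self, item_feat_dim, n_fields, emb_dim, hidden=64):
        super().__init__()
        self.n_fields = n_fields
        self.hyper = nn.Sequential(
            nn.Linear(item_feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_fields * n_fields),
        )
        self.msg = nn.Linear(emb_dim, emb_dim)

    def forward(self, item_feats, field_embs):
        # item_feats: (B, item_feat_dim); field_embs: (B, n_fields, emb_dim)
        adj = self.hyper(item_feats).view(-1, self.n_fields, self.n_fields)
        adj = torch.softmax(adj, dim=-1)             # row-normalized feature graph
        return field_embs + torch.bmm(adj, self.msg(field_embs))   # one GNN step
```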

14.  Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Damien Sileo

https://arxiv.org/abs/2407.13481

Large language models (LLMs) can suggest missing elements from items listed in a prompt, which can be used for list completion or recommendations based on users' history. However, their performance degrades when presented with too many items, as they start to suggest items already included in the input list. This occurs at around 100 items for mid-2024 flagship LLMs. We evaluate this phenomenon on both synthetic problems (e.g., finding missing numbers in a given range of shuffled integers) and realistic movie recommendation scenarios. We refer to this issue as attention overflow, as preventing repetition requires attending to all items simultaneously. Although iterative loops can mitigate this problem, their costs increase with the repetition rate, affecting the language models' ability to derive novelty from lengthy inputs. https://huggingface.co/datasets/sileod/missing-item-prediction
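
The synthetic probe is easy to reproduce. The sketch below builds one shuffled-integers prompt with a missing element and checks whether the model's answer repeats an item already in the list (the overflow symptom); function names are made up for illustration, and you plug the prompt into whatever LLM call you use.

```python
import random

def missing_number_prompt(n_items=100, seed=0):
    """Build one 'attention overflow' probe: a shuffled integer range with one
    element removed, plus the ground truth and the shown set."""
    rng = random.Random(seed)
    items = list(range(n_items))
    missing = rng.choice(items)
    shown = [i for i in items if i != missing]
    rng.shuffle(shown)
    prompt = (f"The numbers 0..{n_items - 1} in random order, one is missing: "
              f"{', '.join(map(str, shown))}. Which number is missing?")
    return prompt, missing, set(shown)

def is_overflow(answer, missing, shown):
    """Overflow symptom: the model 'suggests' an item already present in the input."""
    try:
        value = int(answer.strip().rstrip('.'))
    except ValueError:
        return True
    return value in shown
```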

15.  MLSA4Rec: Mamba Combined with Low-Rank Decomposed Self-Attention for Sequential Recommendation

Jinzhao Su, Zhenhua Huang

https://arxiv.org/abs/2407.13135

In applications such as e-commerce, online education, and streaming services, sequential recommendation systems play a critical role. Despite the excellent performance of self-attention-based sequential recommendation models in capturing dependencies between items in user interaction history, their quadratic complexity and lack of structural bias limit their applicability. Recently, some works have replaced the self-attention module in sequential recommenders with Mamba, which has linear complexity and structural bias. However, these works have not noted the complementarity between the two approaches. To address this issue, this paper proposes a new hybrid recommendation framework, Mamba combined with Low-Rank decomposed Self-Attention for Sequential Recommendation (MLSA4Rec), whose complexity is linear with respect to the length of the user's historical interaction sequence. Specifically, MLSA4Rec designs an efficient Mamba-LSA interaction module. This module introduces a low-rank decomposed self-attention (LSA) module with linear complexity and injects structural bias into it through Mamba. The LSA module analyzes user preferences from a different perspective and dynamically guides Mamba to focus on important information in user historical interactions through a gated information transmission mechanism. Finally, MLSA4Rec combines user preference information refined by the Mamba and LSA modules to accurately predict the user's next possible interaction. To our knowledge, this is the first study to combine Mamba and self-attention in sequential recommendation systems. Experimental results show that MLSA4Rec outperforms existing self-attention and Mamba-based sequential recommendation models in recommendation accuracy on three real-world datasets, demonstrating the great potential of Mamba and self-attention working together.
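
To picture what "low-rank decomposed self-attention" buys, here is a Linformer-style sketch in which keys and values are projected from sequence length n down to a fixed rank r, so attention costs O(n·r) instead of O(n²). This is a generic stand-in for the LSA module under assumed shapes; MLSA4Rec further gates it with Mamba-derived structural bias.

```python
import torch
import torch.nn as nn

class LowRankSelfAttention(nn.Module):
    """Low-rank (length-projected) self-attention with linear cost in n."""
    def __init__(self, dim, max_len, rank=16):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.proj_k = nn.Linear(max_len, rank, bias=False)  # project along length
        self.proj_v = nn.Linear(max_len, rank, bias=False)
        self.scale = dim ** -0.5

    def forward(self, x):                      # x: (B, n, dim) with n == max_len
        q, k, v = self.q(x), self.k(x), self.v(x)
        k = self.proj_k(k.transpose(1, 2)).transpose(1, 2)   # (B, r, dim)
        v = self.proj_v(v.transpose(1, 2)).transpose(1, 2)   # (B, r, dim)
        attn = torch.softmax(q @ k.transpose(1, 2) * self.scale, dim=-1)  # (B, n, r)
        return attn @ v                                      # (B, n, dim)
```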

