论文合集|| 推荐系统论文推荐

发表时间:2023-05-31 16:16作者:沃恩智慧

今天,沃恩智慧小编和大家整理了一些推荐系统论文,希望对各位在科研路上砥砺前行的小伙伴有所帮助。

When Search Meets Recommendation: Learning Disentangled Search Representation for Recommendation

论文‍:http://arxiv.org/abs/2305.10822v1

Modern online service providers such as online shopping platforms often provide both search and recommendation (S&R) services to meet different user needs. Rarely has there been any effective means of incorporating user behavior data from both S&R services. Most existing approaches either simply treat S&R behaviors separately, or jointly optimize them by aggregating data from both services, ignoring the fact that user intents in S&R can be distinctively different. In our paper, we propose a Search-Enhanced framework for the Sequential Recommendation (SESRec) that leverages users' search interests for recommendation, by disentangling similar and dissimilar representations within S&R behaviors. Specifically, SESRec first aligns query and item embeddings based on users' query-item interactions for the computations of their similarities. Two transformer encoders are used to learn the contextual representations of S&R behaviors independently. Then a contrastive learning task is designed to supervise the disentanglement of similar and dissimilar representations from behavior sequences of S&R. Finally, we extract user interests by the attention mechanism from three perspectives, i.e., the contextual representations, the two separated behaviors containing similar and dissimilar interests. Extensive experiments on both industrial and public datasets demonstrate that SESRec consistently outperforms state-of-the-art models. Empirical studies further validate that SESRec successfully disentangle similar and dissimilar user interests from their S&R behaviors.

FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning

论文:http://arxiv.org/abs/2305.08328v1

Conversion rate (CVR) estimation aims to predict the probability of conversion event after a user has clicked an ad. Typically, online publisher has user browsing interests and click feedbacks, while demand-side advertising platform collects users' post-click behaviors such as dwell time and conversion decisions. To estimate CVR accurately and protect data privacy better, vertical federated learning (vFL) is a natural solution to combine two sides' advantages for training models, without exchanging raw data. Both CVR estimation and applied vFL algorithms have attracted increasing research attentions. However, standardized and systematical evaluations are missing: due to the lack of standardized datasets, existing studies adopt public datasets to simulate a vFL setting via hand-crafted feature partition, which brings challenges to fair comparison. We introduce FedAds, the first benchmark for CVR estimation with vFL, to facilitate standardized and systematical evaluations for vFL algorithms. It contains a large-scale real world dataset collected from Alibaba's advertising platform, as well as systematical evaluations for both effectiveness and privacy aspects of various vFL algorithms. Besides, we also explore to incorporate unaligned data in vFL to improve effectiveness, and develop perturbation operations to protect privacy well. We hope that future research work in vFL and CVR estimation benefits from the FedAds benchmark.

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

论文:http://arxiv.org/abs/2305.05848v1

Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendations normally can only recommend items existing in users' historical sessions. As a result, these GNNs have difficulty recommending items that users have never interacted with (new items), which leads to a phenomenon of information cocoon. Therefore, it is necessary to recommend new items to users. As there is no interaction between new items and users, we cannot include new items when building session graphs for GNN session-based recommender systems. Thus, it is challenging to recommend new items for users when using GNN-based methods. We regard this challenge as '\textbf{G}NN \textbf{S}ession-based \textbf{N}ew \textbf{I}tem \textbf{R}ecommendation (GSNIR)'. To solve this problem, we propose a dual-intent enhanced graph neural network for it. Due to the fact that new items are not tied to historical sessions, the users' intent is difficult to predict. We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item. To solve the challenge that new items cannot be learned by GNNs, inspired by zero-shot learning (ZSL), we infer the new item representation in GNN space by using their attributes. By outputting new item probabilities, which contain recommendation scores of the corresponding items, the new items with higher scores are recommended to users. Experiments on two representative real-world datasets show the superiority of our proposed method. The case study from the real-world verifies interpretability benefits brought by the dual-intent module and the new item reasoning module. The code is available at Github: https://github.com/Ee1s/NirGNN

Popularity Debiasing from Exposure to Interaction in Collaborative Filtering

论文:http://arxiv.org/abs/2305.05204v1

Recommender systems often suffer from popularity bias, where popular items are overly recommended while sacrificing unpopular items. Existing researches generally focus on ensuring the number of recommendations exposure of each item is equal or proportional, using inverse propensity weighting, causal intervention, or adversarial training. However, increasing the exposure of unpopular items may not bring more clicks or interactions, resulting in skewed benefits and failing in achieving real reasonable popularity debiasing. In this paper, we propose a new criterion for popularity debiasing, i.e., in an unbiased recommender system, both popular and unpopular items should receive Interactions Proportional to the number of users who Like it, namely IPL criterion. Under the guidance of the criterion, we then propose a debiasing framework with IPL regularization term which is theoretically shown to achieve a win-win situation of both popularity debiasing and recommendation performance. Experiments conducted on four public datasets demonstrate that when equipping two representative collaborative filtering models with our framework, the popularity bias is effectively alleviated while maintaining the recommendation performance.

Graph Masked Autoencoder for Sequential Recommendation

论文:http://arxiv.org/abs/2305.04619v2

While some powerful neural network architectures (e.g., Transformer, Graph Neural Networks) have achieved improved performance in sequential recommendation with high-order item dependency modeling, they may suffer from poor representation capability in label scarcity scenarios. To address the issue of insufficient labels, Contrastive Learning (CL) has attracted much attention in recent methods to perform data augmentation through embedding contrasting for self-supervision. However, due to the hand-crafted property of their contrastive view generation strategies, existing CL-enhanced models i) can hardly yield consistent performance on diverse sequential recommendation tasks; ii) may not be immune to user behavior data noise. In light of this, we propose a simple yet effective Graph Masked AutoEncoder-enhanced sequential Recommender system (MAERec) that adaptively and dynamically distills global item transitional information for self-supervised augmentation. It naturally avoids the above issue of heavy reliance on constructing high-quality embedding contrastive views. Instead, an adaptive data reconstruction paradigm is designed to be integrated with the long-range item dependency modeling, for informative augmentation in sequential recommendation. Extensive experiments demonstrate that our method significantly outperforms state-of-the-art baseline models and can learn more accurate representations against data noise and sparsity. Our implemented model code is available at https://github.com/HKUDS/MAERec.

Attacking Pre-trained Recommendation

论文:http://arxiv.org/abs/2305.03995v1

Recently, a series of pioneer studies have shown the potency of pre-trained models in sequential recommendation, illuminating the path of building an omniscient unified pre-trained recommendation model for different downstream recommendation tasks. Despite these advancements, the vulnerabilities of classical recommender systems also exist in pre-trained recommendation in a new form, while the security of pre-trained recommendation model is still unexplored, which may threaten its widely practical applications. In this study, we propose a novel framework for backdoor attacking in pre-trained recommendation. We demonstrate the provider of the pre-trained model can easily insert a backdoor in pre-training, thereby increasing the exposure rates of target items to target user groups. Specifically, we design two novel and effective backdoor attacks: basic replacement and prompt-enhanced, under various recommendation pre-training usage scenarios. Experimental results on real-world datasets show that our proposed attack strategies significantly improve the exposure rates of target items to target users by hundreds of times in comparison to the clean model.

NewsQuote: A Dataset Built on Quote Extraction and Attribution for Expert Recommendation in Fact-Checking

论文:http://arxiv.org/abs/2305.04825v1

To enhance the ability to find credible evidence in news articles, we propose a novel task of expert recommendation, which aims to identify trustworthy experts on a specific news topic. To achieve the aim, we describe the construction of a novel NewsQuote dataset consisting of 24,031 quote-speaker pairs that appeared on a COVID-19 news corpus. We demonstrate an automatic pipeline for speaker and quote extraction via a BERT-based Question Answering model. Then, we formulate expert recommendations as document retrieval task by retrieving relevant quotes first as an intermediate step for expert identification, and expert retrieval by directly retrieving sources based on the probability of a query conditional on a candidate expert. Experimental results on NewsQuote show that document retrieval is more effective in identifying relevant experts for a given news topic compared to expert retrieval


分享到: