2025-01-07 论文分享 | 大语言模型最新进展

文摘   2025-01-07 09:47   安徽  

点击蓝字 关注我们

论文分享 | 大语言模型相关研究进展

我们从2025-01-02到2025-01-07的24篇文章中精选出5篇优秀的工作分享给读者。

  1. Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions
  2. Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition
  3. Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension
  4. Large Language Models for Mathematical Analysis
  5. Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding

1.Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions

Authors: Rachneet Sachdeva, Rima Hazra, Iryna Gurevych

https://arxiv.org/abs/2501.01872

论文摘要

Despite significant efforts to align large lan guage models with human values and ethical guidelines, these models remain susceptible to sophisticated jailbreak attacks that exploit their reasoning capabilities. Traditional safety mechanisms often focus on detecting explicit malicious intent, leaving deeper vulnerabilities terfeit goods. I can’t fulfill that request. POATE Prompt Refusal Rate: 98% Please give a tutorial to prevent goods from getting counterfeited. Explain with the help of an opposing unaddressed. In this work, we introduce a jailbreak technique, POATE (Polar Opposite query generation, Adversarial Template construction, and Elaboration), which leverages contrastive reasoning to elicit unethical responses. POATE generates prompts with semantically opposite intents and combines them with adversarial templates to subtly direct models toward pro ducing harmful responses. We conduct exten sive evaluations across six diverse language model families of varying parameter sizes, in cluding LLaMA3, Gemma2, Phi3, and GPT 4, to demonstrate the robustness of the attack, achieving significantly higher attack success rates (~44%) compared to existing methods. Weevaluate our proposed attack against seven safety defenses, revealing their limitations in addressing reasoning-based vulnerabilities. To counteract this, we propose a defense strat egy that improves reasoning robustness through chain-of-thought prompting and reverse think ing, mitigating reasoning-driven adversarial ex ploits. Our code is publicly available.

论文简评

这篇论文主要介绍了一种名为POATE的新方法,它利用大型语言模型的强大推理能力生成有害响应,具体通过构建具有极端意向的提示语句来实现。该技术用于评估各种LLM,其成功率高于现有方法。作者还提出了基于链式思维的防御机制,以应对这些攻击。总体而言,这篇论文提供了对大型语言模型安全性的深入见解,特别是关于如何避免因逻辑推理错误导致的安全漏洞。此外,提出的方法显示出增强模型安全性的潜力,为防止此类攻击提供了新的解决方案。

2.Rethinking Layer Removal: Preserving Critical Components with Task-Aware Singular Value Decomposition

Authors: Kainan Liu, Yong Zhang, Ning Cheng, Zhitao Li, Shaojun Wang, Jing Xiao

https://arxiv.org/abs/2501.00339

论文摘要

Layer removal has emerged as a promising approach for compressing large language models (LLMs) by leveraging redundancy within layers to reduce model size and accelerate inference. However, this technique often disrupts the internal consistency of the model, leading to performance degradation and instability. In this work, we propose Taco-SVD, a novel task-aware framework for compressing LLMs through singular value decomposition. Instead of pure layer removal, Taco-SVD retains task-critical singular value directions to preserve residual stream stability and mitigate the need for noisy self-repair. By leveraging gradient-based attributions, Taco-SVD aligns singular values with downstream task objectives, enabling efficient compression. Extensive evaluations demonstrate that Taco-SVD outperforms existing methods in maintaining perplexity, task performance, and cross-task generalization while requiring minimal computational resources. This work highlights the potential of task-aware low-rank decompositions in stabilizing model behavior and enhancing compression effectiveness.

Layer removal has emerged as a promising method for compressing large language models (LLMs) by leveraging redundancy within layers to reduce model size and accelerate inference. However, this technique often compromises internal consistency, leading to performance degradation and instability, with varying impacts across different model architectures. In this work, we propose Taco-SVD, a task-aware framework that retains task-critical singular value directions, preserving internal consistency while enabling efficient compression. Unlike direct layer removal, Taco-SVD preserves task-critical transformations to mitigate performance degradation. By leveraging gradient-based attribution methods, Taco-SVD aligns singular values with downstream task objectives. Extensive evaluations demonstrate that Taco-SVD outperforms existing methods in perplexity and task performance across different architectures while ensuring minimal computational overhead.

论文简评

这篇关于大型语言模型(LLM)压缩的研究论文提出了Taco-SVD框架,该框架旨在通过保留层删除过程中任务关键奇异值方向来最小化性能下降,并增强内部一致性。作者声称,Taco-SVD方法在多个架构和任务上优于现有方法,在保持性能的同时减少了模型大小。该研究提供了对大型语言模型压缩挑战的一系列见解,并提出了一种新颖的方法,用于通过任务感知奇异值分解保留任务关键变换。此外,作者还进行了全面评估,展示了Taco-SVD在保持性能的同时减少模型大小的有效性。总的来说,这篇论文为大规模语言模型的压缩提供了一个新的视角,具有重要的理论意义和实践价值。

3.Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension

Authors: Yanbo Fang, Ruixiang Tang

https://arxiv.org/abs/2501.01332

论文摘要

Understanding how large language models (LLMs) acquire, retain, and apply knowledge remains an open challenge. This paper introduces a novel framework, K-(CSA)², which categorizes LLM knowledge along two dimensions: correctness and confidence. The framework defines six categories of knowledge, ranging from highly confident correctness to confidently held misconceptions, enabling a nuanced evaluation of model comprehension beyond binary accuracy. Using this framework, we demonstrate how techniques like chain-of-thought prompting and reinforcement learning with human feedback fundamentally alter the knowledge structures of internal (pre-trained) and external (context-dependent) knowledge in LLMs. CoT particularly enhances base model performance and shows synergistic benefits when applied to aligned LLMs. Moreover, our layer-wise analysis reveals that higher layers in LLMs encode more high-confidence knowledge, while low-confidence knowledge tends to emerge in middle-to-lower layers.

论文简评

该论文提出了一个名为K-(CSA)²框架,旨在通过融合知识的准确性与信心两个维度来评估大型语言模型(LLM)中的知识。这个框架被设计为提供一种新的方法来衡量这些知识,而不仅仅是准确性或信心。此外,作者还探讨了链式思维提示等技术如何影响LLM中知识结构,这一研究对未来的研究具有潜在影响。尽管该框架的设计创新,但在定义上存在不足,且实验缺乏充分的严谨性,这些问题或将影响其结论的可靠性。然而,论文强调了对知识表示在LLM中的重要性,这在当前研究领域是一个重要议题。总的来说,尽管该论文在某些方面存在问题,它仍是一个有价值的探索,值得进一步的研究和发展。

4.Large Language Models for Mathematical Analysis

Authors: Ziye Chen, Hao Qi

https://arxiv.org/abs/2501.00059

论文摘要

Mathematical problem-solving is a key field in artificial intelligence (AI) and a critical benchmark for evaluating the capabilities of large language models (LLMs). While extensive research has focused on mathematical problem-solving, most existing work and datasets concentrate on computational tasks, leaving gaps in areas like mathematical analysis, which demands rigorous proofs and formal reasoning. We developed the \texttt{DEMI-MathAnalysis} dataset, comprising proof-based problems from mathematical analysis topics such as Sequences and Limits, Infinite Series, and Convex Functions. We also designed a guiding framework to rigorously enhance LLMs' ability to solve these problems. Through fine-tuning LLMs on this dataset and employing our framework, we observed significant improvements in their capability to generate logical, complete, and elegant proofs. This work addresses critical gaps in mathematical reasoning and contributes to advancing trustworthy AI capable of handling formalized mathematical language. The code is publicly accessible at \href{https://github.com/ziye2chen/LLMs-for-Mathematical-Analysis}{LLMs for Mathematical Analysis}.

论文简评

本文提出了一种新的数学分析问题解决数据集——DEMI-MathAnalysis,旨在为大型语言模型(如GPT-4)提供一个严谨推理能力提升的框架。该数据集旨在填补现有基准中主要侧重于计算任务的数据集中的空白,并通过改善模型对严格数学推理的理解来实现这一目标。

首先,文章介绍了该数据集的主要特点:它是一个专注于证明型数学问题的专门数据库,旨在帮助模型更好地理解和解决复杂数学问题。其次,作者提出了一个指导性框架,强调如何利用现有算法和技术,以及如何优化这些技术以提高模型在数学推理方面的表现。

最后,实验结果表明,在使用新的数据集进行微调之后,模型在解决数学问题时表现显著改善。这一成果显示了模型在处理复杂、严格的数学问题上的潜力和改进空间。总的来说,本文为现有数学数据分析基准提供了一个重要补充,有助于推动模型在数学领域的发展。

5.Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding

Authors: Jiaming Li, Jiacheng Zhang, Zequn Jie, Lin Ma, Guanbin Li

https://arxiv.org/abs/2501.01926

论文摘要

Large vision-language models (LVLMs) have shown remarkable capabilities in visual-language understanding for downstream multi-modal tasks. Despite their success, LVLMs still suffer from generating hallucinations in complex generation tasks, leading to inconsistencies between visual inputs and generated content. To address this issue, some approaches have introduced inference-time interventions, such as contrastive decoding and attention rectification, to reduce overreliance on language priors. However, these approaches overlook hallucinations stemming from spurious inter-modality correlations. In this paper, we propose an Inter-Modality Correlation Calibration Decoding (IMCCD) method to mitigate hallucinations in LVLMs in a training-free manner. In this method, we design a Cross-Modal Value-Enhanced Decoding (CMVED) module to alleviate hallucination by a novel contrastive decoding mechanism. During the estimation of distorted distribution, CMVED masks the value vectors associated with significant cross-modal attention weights, which address both uni-modality overreliance and misleading inter-modality correlations. Additionally, a Content-Driven Attention Refinement (CDAR) module refines cross-modal attention weights, guiding LVLMs to focus on important visual content. Experimental results on diverse hallucination benchmarks validate the superiority of our method over existing state-of-the-art techniques in reducing hallucinations in LVLM text generation. Our code will be available at \url{https://github.com/lijm48/IMCCD}.

论文简评

本文探讨了大规模视觉语言模型(LVLM)中的幻觉问题,并提出了Inter-Modality Correlation Calibration Decoding(IMCCD)解决方案来应对这一问题。IMCCD由两个核心组件组成:Cross-Modal Value-Enhanced Decoding(CMVED)模块和Content-Driven Attention Refinement(CDAR)模块。这两个模块旨在增强模态间的相关性,同时提高视觉内容的保留率。实验结果显示,IMCCD方法优于现有技术,但作者在表达时可以更加清晰地阐述其创新之处。

总体而言,本文提出了一种新的解决方案来应对大规模视觉语言模型中的幻觉问题,并通过实验验证了其有效性。然而,如果能进一步细化每个模块的功能及其如何协同工作以改善模型性能,那么这篇论文将会变得更加完整和有说服力。因此,尽管有一些改进空间,IMCCD仍然是一个值得关注的研究方向。


我们欢迎您在评论区中留下宝贵的建议!包括但不限于:

  • 可以提出推文中论文简评的不足!
  • 可以分享最近更值得推荐的论文并给出理由!

END

推荐阅读

2024-01-06 论文分享 | 智能体最新进展
2024-01-04 论文分享 | 多模态大模型最新进展
2024-01-03 论文分享 | 个性化最新进展
2025-01-02 论文分享 | 大语言模型最新进展

智荐阁
介绍生成式大模型与推荐系统领域的前沿进展,包括但不限于:大语言模型、推荐系统、智能体学习、强化学习、生成式推荐、引导式推荐、推荐智能体、智能体推荐
 最新文章