Daily Paper Machine

Tag: Layer Analysis

All the papers with the tag "Layer Analysis".

XBreaking: Explainable Artificial Intelligence for Jailbreaking LLMs
grok-3-latest
Score: 0.41
Published: April 30, 2025 at 14:44
#LLM, #Explainable AI, #Jailbreaking, #Layer Analysis, #Noise Injection
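This paper proposes XBreaking, a method that uses explainable AI techniques to identify the key layers of censored models and inject noise into them, successfully bypassing the safety restrictions of large language models and significantly increasing their ability to generate harmful content.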