Paper-Conference

Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay

Reinforcement learning (RL) has become an effective approach for fine-tuning large language models (LLMs), particularly to enhance their reasoning capabilities. However, RL …

yifan-sun

Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

Advances in Neural Information Processing Systems (NeurIPS), 2025

haizhou-shi

BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models

Advances in Neural Information Processing Systems (NeurIPS), 2024

Yibin Wang