Tag: SFT | Zhangzhe's Blog

SFT Tag

2025

08-10

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

2024

12-20

BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models

12-20

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

12-13

Hands-On Training of Large Models (2): The Full LLM Fine-Tuning Workflow from the LlamaFactory Perspective

12-13

Hands-On Training of Large Models (1): Instruction Fine-Tuning Llama-3.2-3B with Alpaca-cleaned