Tag: LLM | Zhangzhe's Blog

LLM Tag

2025

02-02

DeepSeek-V3 Technical Report

01-14

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

01-07

An Introduction to DPO for Large Models

01-06

An Introduction to RLHF for Large Models

01-04

An Introduction to RAG for Large Models

2024

12-31

GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding

12-27

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

12-27

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

12-26

AdaLomo: Low-memory Optimization with Adaptive Learning Rate

12-26

LOMO: Full Parameter Fine-tuning for Large Language Models with Limited Resources