- Tsinghua University
- Beijing, China
- 17:33 (UTC +08:00)
- ubecwang@gmail.com
- @UbecWang
Highlights
- Pro
Pinned
- OpenRLHF/OpenRLHF (Public): An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & LoRA & vLLM & RFT)
- THUDM/SWE-Dev (Public): SWE-Dev is an open-source SWE agent with a scalable test case construction pipeline. This pipeline synthesizes test cases through a two-step process: generating Gherkin descriptions and correspondi…
Python 18
- Generalization-of-Transformers (Public): [ICLR'25] Understanding the Generalization of In-Context Learning in Transformers: An Empirical Study
Python 3
- Shape-Control-of-DLO (Public): Deep Reinforcement Learning spring 24, Tsinghua Univ.
Python 4