Jingze Shi
e450a6fbc4
Recipes for optimizing training scripts ( #120 )
...
* Add recipe configs to optimize scripts (#73 )
* remove small models
* Add README for recipes
* Add README for recipes
* Attempt to resolve conflicts
* Optimize src scripts
* Update recipe of DeepSeek-R1-Distill-Qwen-7B
* Update recipe of Qwen2.5-1.5B
* Updated recipe readme for qwen
* Update training command for recipes
* Update README.md
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
* Update preprocessing_num_workers from 36 to 8
* Add small language model recipes to quickly verify R1
* Fix src code quality
* Add back the Slurm job command
* Remove recipe of doge
* Fix torch_dtype is not used
* fix grpo yaml
* fix grpo yaml
* fix deprecation warning
* fix config folder location
* Remove duplicate variables in grpo.py
* Update README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Update recipes/qwen/Qwen2.5-1.5B-Instruct/grpo/confg_full.yaml
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2025-01-31 12:41:53 +01:00
Dongwei Jiang
22512e62bc
Update README.md ( #132 )
2025-01-31 11:27:17 +01:00
Sam Schorb
356f6a5c4f
Add Table of Contents to README for easier navigation ( #125 )
...
* Update README.md
* Update README.md
2025-01-30 16:32:13 +01:00
Kashif Rasul
c0b53fae29
Grpo slurm scripts ( #112 )
...
* initial grpo.slurm script
* initial zero3 yaml using 1 less gpu
* add completion and prompt length
* initial doc
* use main
* fix typo
* remove num_processes
* use vllm 0.7.0
* remove double module load
* update math-verify
* Update README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* overwrite num_procs in the slurm script
* add vllm args to readme
* update readme
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2025-01-30 10:22:45 +01:00
Lewis
fb1b4c4e3f
docs(README): note about CUDA 12.1 ( #121 )
...
will segfault for CUDA 12.4 under certain conditions; instructions are specific to 12.1
- fixes #106
- fixes #117
2025-01-30 08:42:43 +01:00
Edward Beeching
bd0e15bfb5
Update README.md ( #93 )
2025-01-30 00:42:29 +01:00
Mayur Pagote
7a7682b6a4
Corrected Typos in README.md ( #110 )
2025-01-30 00:38:47 +01:00
Deborah Shekinah Jacob
971294b018
Modified: pip install --upgrade pip ( #99 )
2025-01-29 19:55:54 +01:00
María Grandury
401a219575
fix typos ( #40 )
...
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-01-29 12:37:11 +01:00
Andrés Marafioti
c4fdb69940
Change conda for uv ( #91 )
...
* Change conda for uv
* quentin's magical path
2025-01-28 21:16:48 +01:00
Gabriel Martín Blázquez
b03480d868
Add --input-batch-size, --client-replicas args and download Ray logs ( #71 )
...
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-01-27 14:58:13 +01:00
Quentin Gallouédec
67764bc6ae
Update README.md ( #72 )
2025-01-27 13:24:25 +01:00
María Grandury
d6f1a179a5
Implement make evaluate command ( #41 )
...
* implement evaluate make command
* add example usage of make evaluate to readme
2025-01-27 10:45:56 +01:00
CharlesCNorton
8d37c5c27f
docs: fix grammar and phrasing issues (1, 2, 3) ( #62 )
...
1. Insert missing article "a":
- Original: "This repo is work in progress..."
- Revised: "This repo is a work in progress..."
Rationale:
The article "a" is needed before "work in progress" to make the sentence grammatically correct.
2. Add "as well as" for parallelism:
- Original: "...scripts to train and evaluate models as well generate synthetic data..."
- Revised: "...scripts to train and evaluate models as well as generate synthetic data..."
Rationale:
"As well as" is the correct conjunction to link multiple verbs or verb phrases, improving clarity.
3. Clarify GPU resource phrasing:
- Original: "we used 2 nodes of 8xH100 each one..."
- Revised: "we used 2 nodes, each with 8×H100 GPUs..."
Rationale:
This rewording removes redundant language ("each one") and more clearly states that each node has eight H100 GPUs.
2025-01-27 10:45:34 +01:00
Ikko Eltociear Ashimine
7c01b59c44
docs: update README.md ( #54 )
...
scipts -> scripts
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
2025-01-26 18:48:07 +01:00
Anton Lozhkov
15df4fb134
vllm speed tweaks ( #43 )
2025-01-26 01:59:50 +01:00
Agus
d98862a5c8
Add example of generating data with deepseek r1 and distilled models ( #29 )
2025-01-25 17:34:21 +01:00
Manuel Romero
c27a974b99
Fix typo ( #25 )
2025-01-25 15:16:56 +01:00
Quentin Gallouédec
f844eac629
Update README.md
2025-01-25 14:49:49 +01:00
Quentin Gallouédec
ff34d8f651
Handle error in verifier + deepspeed command ( #17 )
...
* handle error in verification
* command with zero2 and catch more error in verifier
* Update README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* deepseek distill and remove grad checkpoint
* drop grad checkpoint
* except
---------
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2025-01-25 13:58:04 +01:00
lewtun
2580fd8c1b
Fix Slurm SFT and gather Slurm scripts ( #19 )
...
* Fix slurm
* Fix generate
* Fix install
* Fix c
2025-01-25 13:47:52 +01:00
lewtun
64b4927a33
Update README.md
2025-01-25 13:10:13 +01:00
lewtun
13d8392b78
Fix eval commands ( #18 )
2025-01-25 12:31:40 +01:00
Lewis Tunstall
5ecc11b50a
Scale image
2025-01-25 10:22:38 +00:00
lewtun
7564de2c24
Add diagram ( #16 )
2025-01-25 11:20:17 +01:00
Loubna Ben Allal
2ceba252a3
Add SFT command to the readme ( #15 )
2025-01-25 10:56:33 +01:00
Gabriel Martín Blázquez
02bed5308c
Add synthetic data generation script ( #9 )
...
* Add synthetic data generation script
Co-authored-by: Anton <anton-l@users.noreply.github.com>
Co-authored-by: Agustin <plaguss@users.noreply.github.com>
* Fix format
* Fix imports sorting
---------
Co-authored-by: Anton <anton-l@users.noreply.github.com>
Co-authored-by: Agustin <plaguss@users.noreply.github.com>
2025-01-25 01:42:24 +01:00
Quentin Gallouédec
05496dcdab
System prompt; Fix readme command
2025-01-25 00:21:21 +00:00
Quentin Gallouédec
b47b1d058b
GRPO script ( #3 )
...
* initial commit
* with reward func
* fix box extract
* example line
* don't break when answer malformed
* command and logging
* holly simplicity
* move grpo
* reverse readme
* instructions
2025-01-25 00:19:38 +01:00
lewtun
ca8f35c143
REFACTOR TO THE MAX ( #7 )
2025-01-25 00:12:25 +01:00
lewtun
26184f71ae
Refactor evaluation ( #6 )
2025-01-24 23:46:34 +01:00
Edward Beeching
9c398973e8
Adds Math-500 and AIME24 evals ( #4 )
...
* adds evals
* up max model len
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2025-01-24 23:09:07 +01:00
Leandro von Werra
52aefc29e2
Update README.md
2025-01-24 22:06:22 +01:00
lewtun
6acc9a0aa0
Add configs and stuff ( #2 )
2025-01-24 20:05:18 +01:00
lewtun
83f9c6c8da
Initial commit
2025-01-24 16:44:12 +01:00