lewtun
|
7564de2c24
|
Add diagram (#16)
|
2025-01-25 11:20:17 +01:00 |
|
Quentin Gallouédec
|
742cc008b2
|
Pin main for transformers and trl
|
2025-01-25 11:07:17 +01:00 |
|
Agus
|
33795e1b5a
|
Add math-verify to check accuracy of completions on GRPO (#14)
* Add math-verify to check accuracy of completions on GRPO
* Handle make_conversation
* Update src/open_r1/grpo.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Update src/open_r1/grpo.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* Update src/open_r1/grpo.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
* fix quality
* Remove unnecessary item access in parsed answer
---------
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
|
2025-01-25 11:03:58 +01:00 |
|
Loubna Ben Allal
|
2ceba252a3
|
Add SFT command to the readme (#15)
|
2025-01-25 10:56:33 +01:00 |
|
Gabriel Martín Blázquez
|
692e075715
|
Fix generate.slurm (#10)
|
2025-01-25 02:11:53 +01:00 |
|
elie
|
9987bb8995
|
use liger kernel
|
2025-01-25 01:51:39 +01:00 |
|
Gabriel Martín Blázquez
|
02bed5308c
|
Add synthetic data generation script (#9)
* Add synthetic data generation script
Co-authored-by: Anton <anton-l@users.noreply.github.com>
Co-authored-by: Agustin <plaguss@users.noreply.github.com>
* Fix format
* Fix imports sorting
---------
Co-authored-by: Anton <anton-l@users.noreply.github.com>
Co-authored-by: Agustin <plaguss@users.noreply.github.com>
|
2025-01-25 01:42:24 +01:00 |
|
Quentin Gallouédec
|
05496dcdab
|
System prompt; Fix readme command
|
2025-01-25 00:21:21 +00:00 |
|
elie
|
9ae671a75e
|
fix slurm (#8)
|
2025-01-25 01:02:26 +01:00 |
|
Quentin Gallouédec
|
b47b1d058b
|
GRPO script (#3)
* initial commit
* with reward func
* fix box extract
* example line
* don't break when answer malformed
* command and logging
* holly simplicity
* move grpo
* reverse readme
* instructions
|
2025-01-25 00:19:38 +01:00 |
|
Lewis Tunstall
|
e660e43610
|
Fix configs
|
2025-01-24 23:13:20 +00:00 |
|
lewtun
|
ca8f35c143
|
REFACTOR TO THE MAX (#7)
|
2025-01-25 00:12:25 +01:00 |
|
lewtun
|
26184f71ae
|
Refactor evaluation (#6)
|
2025-01-24 23:46:34 +01:00 |
|
Edward Beeching
|
9c398973e8
|
Adds Math-500 and AIME24 evals (#4)
* adds evals
* up max model len
---------
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
|
2025-01-24 23:09:07 +01:00 |
|
elie
|
c421bc893b
|
Improve sft (#5)
* first commit
* working training
* change model_id
* Update scripts/training/sft.py
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
---------
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
|
2025-01-24 22:23:49 +01:00 |
|
Leandro von Werra
|
52aefc29e2
|
Update README.md
|
2025-01-24 22:06:22 +01:00 |
|
lewtun
|
6acc9a0aa0
|
Add configs and stuff (#2)
|
2025-01-24 20:05:18 +01:00 |
|
Quentin Gallouédec
|
a4bf90465f
|
Update setup.py (#1)
|
2025-01-24 19:13:04 +01:00 |
|
Lewis Tunstall
|
697c119dd8
|
Add data
|
2025-01-24 16:51:03 +00:00 |
|
Lewis Tunstall
|
2ff66e6cde
|
Add skeleton
|
2025-01-24 16:50:13 +00:00 |
|
lewtun
|
83f9c6c8da
|
Initial commit
|
2025-01-24 16:44:12 +01:00 |
|