1 parent 895cec7 commit cde7cfe
1 file changed
template.yaml
@@ -21,7 +21,7 @@ description: >
Distillation-Guided Policy Optimization (DGPO), a two-phase training framework combining
cold-start knowledge distillation and selective teacher-guided reinforcement learning.

-image: neurips_ws_teaser_top.png
+image: https://omron-sinicx.github.io/dgpo/acl26_fig2_v3.png

url: https://omron-sinicx.github.io/DGPO