DPPO with Summed Likelihood (QSM-PG) by jlidard · Pull Request #13 · jlidard/dppo-dev

jlidard · 2024-11-23T00:53:50Z

Take expectation over env steps and sum over all denoising steps.
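
One way to read this description, sketched under assumptions rather than taken from the PR's code: the per-denoising-step log-probabilities are summed within each environment step to form one likelihood ratio per env step, and the surrogate is then averaged over env steps. The function name `summed_logprob_ratio` and the flat layout (index = env_step * K + denoising_step) are hypothetical.

```python
import torch


def summed_logprob_ratio(new_logprobs, old_logprobs, num_env_steps, num_denoising_steps):
    """Return one likelihood ratio per env step from per-denoising-step log-probs.

    Assumed layout (hypothetical): both inputs are flat tensors of length
    num_env_steps * num_denoising_steps, ordered env step first.
    """
    new_sum = new_logprobs.view(num_env_steps, num_denoising_steps).sum(dim=1)
    old_sum = old_logprobs.view(num_env_steps, num_denoising_steps).sum(dim=1)
    # exp(sum of log-prob differences) is the ratio of the chain likelihoods
    return torch.exp(new_sum - old_sum)


B, K = 2, 3  # illustrative sizes: 2 env steps, 3 denoising steps
old = torch.zeros(B * K)
new = torch.tensor([0.1, 0.1, 0.1, -0.2, -0.2, -0.2])
ratio = summed_logprob_ratio(new, old, B, K)

advantages = torch.tensor([1.0, -1.0])  # made-up values, one per env step
surrogate = (ratio * advantages).mean()  # expectation over env steps
```

Because the sum runs over a reshaped axis, the result is only meaningful if every flattened tensor in the batch uses the same ordering.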

allenzren · 2024-11-25T17:09:37Z

model/diffusion/diffusion_ppo_sumlikelihood.py

+        obs_repeat = {
+            "state": obs["state"].repeat_interleave(self.ft_denoising_steps, dim=0)
+        }
+        denoising_inds = denoising_inds.repeat_interleave(chains_prev.shape[0])


Is this wrong? obs_repeat is now ordered obs_1, obs_1, obs_1, ..., obs_2, obs_2, ..., but denoising_inds is ordered 0, 0, 0, ..., 1, 1, 1, ..., so the two no longer line up row by row.

I think you want repeat instead of repeat_interleave for denoising_inds.
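
The mismatch can be reproduced on a toy case. This is a minimal sketch, assuming denoising_inds starts as a 1-D index tensor 0..K-1 and the state has shape (B, obs_dim); the sizes and values are made up for illustration.

```python
import torch

B, K = 2, 3  # illustrative: 2 env steps, 3 fine-tuned denoising steps
state = torch.tensor([[10.0], [20.0]])  # shape (B, obs_dim), toy values
denoising_inds = torch.arange(K)        # assumed starting form: [0, 1, 2]

# Same call as in the diff: each observation is repeated K times in a row.
obs_repeat = state.repeat_interleave(K, dim=0)  # 10, 10, 10, 20, 20, 20

# Version in the diff: each index is repeated B times in a row.
inds_interleaved = denoising_inds.repeat_interleave(B)  # 0, 0, 1, 1, 2, 2

# Reviewer's suggestion: tile the whole index vector B times.
inds_tiled = denoising_inds.repeat(B)  # 0, 1, 2, 0, 1, 2


def pairs(obs, inds):
    """Pair each repeated observation with the index on the same row."""
    return list(zip(obs.squeeze(-1).tolist(), inds.tolist()))


# Every observation should meet every denoising index exactly once.
expected = [(s, k) for s in (10.0, 20.0) for k in range(K)]
tiled_ok = pairs(obs_repeat, inds_tiled) == expected
interleaved_ok = pairs(obs_repeat, inds_interleaved) == expected
```

With repeat, each observation is paired with indices 0..K-1; with repeat_interleave, obs_1 is paired with 0, 0, 1 and obs_2 with 1, 2, 2. Whether the chains tensor is flattened in the same order is not visible in this diff and would need to be checked separately.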

allenzren · 2024-11-25T17:13:18Z

model/diffusion/diffusion_ppo_sumlikelihood.py

+
+        # exponentially interpolate between the base and the current clipping value over denoising steps and repeat
+        t = (denoising_inds.float() / (self.ft_denoising_steps - 1)).to(self.device)
+        t = t[


I think this needs to be modified too, then, since t is computed from denoising_inds.
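
A sketch of why the ordering matters here, under stated assumptions: the diff is truncated, so the interpolation formula below is a guess at what "exponentially interpolate" means, and `interpolated_clip`, `clip_base`, `clip_final`, and `rate` are hypothetical names, not the PR's. The point is only that a per-row clipping value inherits whatever ordering denoising_inds has.

```python
import math

import torch


def interpolated_clip(denoising_inds, num_denoising_steps, clip_base, clip_final, rate):
    """Per-row clipping value, growing exponentially from clip_base to clip_final.

    Hypothetical formula; `rate` must be nonzero.
    """
    # Normalise the step index to [0, 1]; guard against a single denoising step.
    t = denoising_inds.float() / max(num_denoising_steps - 1, 1)
    weight = (torch.exp(rate * t) - 1) / (math.exp(rate) - 1)
    return clip_base + (clip_final - clip_base) * weight


B, K = 2, 3  # illustrative sizes
inds = torch.arange(K).repeat(B)  # tiled ordering: 0, 1, 2, 0, 1, 2
clip = interpolated_clip(inds, K, clip_base=0.01, clip_final=0.1, rate=3.0)
```

With tiled indices, each env step's block of K rows gets the same schedule from clip_base up to clip_final; with interleaved indices, the schedule would be attached to the wrong rows.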

Justin M. Lidard and others added 7 commits November 14, 2024 21:01

add diffusion versions of rlpd/ibrl/cal-ql

a99b828

fix typo:

4888a17

config fix

92d2dac

make forward pass differentiable

b0caba9

update configs

c367feb

add ppo with summed likelihood

583e191

minor

2938fdd

allenzren reviewed Nov 25, 2024

jlidard wants to merge 7 commits into main from dppo_sumlikelihood