How to use custom network? · Issue #301 · Denys88/rl_games · GitHub

Open
ErcBunny opened this issue Aug 9, 2024 · 8 comments

Comments

@ErcBunny
Contributor

ErcBunny commented Aug 9, 2024

I would like to use the following network for my project, but I am not sure how exactly to do it.

                                                     actor    
    ┌─────┐      ┌─────────┐   ┌────┐  ┌─────────┐   ┌───┐    
x──►│ CNN ├─────►│torch.cat│──►│LSTM├─►│torch.cat├┬─►│MLP├──►a
    └─────┘      └─────────┘   └────┘  └─────────┘│  └───┘    
                      ▲                   ▲  ▲    │  ┌───┐    
                      │                   │  │    └─►│MLP├──►v 
                      y───────────────────┘  z       └───┘    
                                                     value    

In the diagram, x, y, and z come from the observation dictionary, a represents the action, and v is the value.

Thank you very much for considering my question; I look forward to your guidance.

@Denys88
Owner

Denys88 commented Aug 11, 2024

https://github.com/Denys88/IsaacGymEnvs/blob/main/isaacgymenvs/learning/networks/ig_networks.py here is a good example of how I tested fairly complex networks with IsaacGym.
Let me know if it is enough for you.

@ViktorM
Collaborator

ViktorM commented Aug 11, 2024

Not exactly your example, but here is a very similar Resnet network builder with RNN (LSTM) layer support.

@ErcBunny
Contributor Author

ErcBunny commented Aug 11, 2024

Thank you @Denys88 and @ViktorM for providing the examples and the pointer to the A2CResnetBuilder.

While waiting for the answer, I was also looking at the code in the network builder and found A2CBuilder and A2CResnetBuilder, both of which provide blocks to create the CNN/Resnet + LSTM + MLP network.

Both seem to accept only obs_dict['obs'] as the single input to the forward function, but in my project I have not only the image tensor obs_dict['x'] but also other state tensors obs_dict['y'] and obs_dict['z'] to be consumed by different blocks of the net.

So, I am planning to create a derived class of NetworkBuilder mimicking either A2CBuilder or A2CResnetBuilder (btw, which one is better for my single-channel, normalized depth image of size (256, 192)?) and modify the forward function (and perhaps other necessary initialization parts) to adapt it to my obs_dict. I guess I'll also need a new model derived from ModelA2CContinuousLogStd to make it work. Is this approach feasible, and could it cause problems?

Please correct me if I've misunderstood anything. Looking forward to hearing your thoughts on this approach and any recommendations you might have!
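For concreteness, here is the rough shape I have in mind as a plain-PyTorch sketch of the diagram's wiring. All sizes, names, and the module layout are placeholders of mine, not rl_games API; the real builder would read these from the config.

```python
import torch
import torch.nn as nn

class CustomActorCritic(nn.Module):
    """Sketch of the diagram: CNN(x) cat y -> LSTM -> cat z -> actor/value MLPs."""

    def __init__(self, y_dim=8, z_dim=4, lstm_hidden=64, action_dim=6):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # infer the flattened CNN feature size from a dummy (256, 192) depth image
        with torch.no_grad():
            cnn_out = self.cnn(torch.zeros(1, 1, 256, 192)).shape[1]
        self.lstm = nn.LSTM(cnn_out + y_dim, lstm_hidden, batch_first=True)
        self.actor = nn.Sequential(
            nn.Linear(lstm_hidden + z_dim, 64), nn.ReLU(), nn.Linear(64, action_dim))
        self.critic = nn.Sequential(
            nn.Linear(lstm_hidden + z_dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, obs_dict, hidden=None):
        x, y, z = obs_dict["x"], obs_dict["y"], obs_dict["z"]
        feat = torch.cat([self.cnn(x), y], dim=-1)          # CNN features + y
        out, hidden = self.lstm(feat.unsqueeze(1), hidden)  # sequence length 1 per step
        core = torch.cat([out.squeeze(1), z], dim=-1)       # LSTM output + z
        return self.actor(core), self.critic(core), hidden
```

In the actual builder I would of course have to handle the LSTM hidden state the way rl_games expects (get_default_rnn_state etc.), but this shows the data flow I am after.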

@ankurhanda
Collaborator

ankurhanda commented Aug 11, 2024

@ViktorM @Denys88 the example above assumes that you are using a frozen network. You can't optimise the weights of this network because rl_games has a torch.inference_mode() context for doing running mean and std normalisation, which breaks the compute graph for the vision network.

So, this is only suitable for pre-trained networks and not end-to-end visual RL.

@ErcBunny
Contributor Author

ErcBunny commented Aug 11, 2024

> @ViktorM @Denys88 the example above assumes that you are using a frozen network. You can't optimise the weights of this network because rl_games has a torch.inference_mode() context for doing running mean and std normalisation, which breaks the compute graph for the vision network.
>
> So, this is only suitable for pre-trained networks and not end-to-end visual RL.

Thanks for your comment @ankurhanda. I have a question about standardization breaking the compute graph for the vision net.

I decided to first implement a simpler version of my network illustrated like this:

                                                     actor    
    ┌─────┐      ┌─────────┐   ┌────┐  ┌─────────┐   ┌───┐    
x──►│ CNN ├─────►│torch.cat│──►│LSTM├─►│torch.cat├┬─►│MLP├──►a
    └─────┘      └─────────┘   └────┘  └─────────┘│  └───┘    
                      ▲                           │  ┌───┐    
                      │                           └─►│MLP├──►v 
                      y                              └───┘    
                                                     value    

where x is retrieved from input_dict["obs"]["image"] and y from input_dict["obs"]["state"].

And my question is: if I only use running statistics to standardize y and manually normalize x to [0, 1] inside my env step, is it possible to do e2e learning with the CNN?
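By manual normalization I just mean something like the following inside the env step, before the image is put into the obs dict. The maximum sensor range here is a placeholder of mine:

```python
import numpy as np

MAX_DEPTH = 10.0  # assumed maximum depth-sensor range in meters (placeholder)

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Clip and scale a raw depth image into [0, 1] inside the env step,
    so the image never passes through the running-mean-std normalizer."""
    return np.clip(depth, 0.0, MAX_DEPTH) / MAX_DEPTH
```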

@ankurhanda
Collaborator

ankurhanda commented Aug 11, 2024

As long as you don't do anything to the CNN, you should be fine. Normalizing x should be OK.

My main concern is if you want to optimise the CNN weights end to end. The current settings don't allow that because the compute graph is broken during normalisation inside the rl_games code:

def norm_obs(self, observation):

@ErcBunny
Contributor Author

ErcBunny commented Aug 11, 2024

I am trying to do e2e learning to also optimize the CNN weights. Why does normalizing the input to a network under no-grad break the compute graph? Could you share more details?

I assume that if the concatenated tensor of (x, y) is normalized under no-grad, then the CNN params will not be updated. But in my case normalization happens at the inputs, so I guess it is probably fine? Please correct me if I am wrong...
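To check my intuition with a toy example (a Linear standing in for the CNN): normalizing a tensor that the CNN *produced* inside a no-grad context cuts the graph, while normalizing the raw *input*, which is a leaf tensor anyway, does not.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
cnn = nn.Linear(4, 4)   # stand-in for the CNN
x = torch.randn(2, 4)   # stand-in for the image observation (a leaf tensor)

# Case 1: normalizing the CNN output inside a no-grad context cuts the graph
feat = cnn(x)
with torch.no_grad():
    feat_n = (feat - feat.mean()) / (feat.std() + 1e-6)
assert not feat_n.requires_grad  # a loss on feat_n can no longer update cnn

# Case 2: normalizing the raw input (a leaf) under no-grad is harmless
with torch.no_grad():
    x_n = (x - x.mean()) / (x.std() + 1e-6)
cnn(x_n).sum().backward()
assert cnn.weight.grad is not None  # the CNN still receives gradients
```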

@ViktorM
Collaborator

ViktorM commented Aug 13, 2024

> @ViktorM @Denys88 the example above assumes that you are using a frozen network. You can't optimise the weights of this network because rl_games has a torch.inference_mode() context for doing running mean and std normalisation, which breaks the compute graph for the vision network.
>
> So, this is only suitable for pre-trained networks and not end-to-end visual RL.

@ankurhanda I don't think we use torch.inference_mode() in the code; can you point to the exact place? The example above, https://github.com/Denys88/rl_games/blob/master/rl_games/algos_torch/network_builder.py#L623, is for end2end training, and we have a config for Atari training from scratch: https://github.com/Denys88/rl_games/blob/master/rl_games/configs/atari/ppo_breakout_torch_impala.yaml

It can easily be modified to load pre-trained weights and freeze them (or not), but the default variant is exactly for e2e training.
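Freezing pre-trained weights is generic PyTorch, not a specific rl_games option; a minimal sketch, where the `cnn` attribute and checkpoint path are hypothetical names:

```python
import torch.nn as nn

def freeze_module(module: nn.Module) -> None:
    """Disable gradient updates for a pre-trained sub-network."""
    for p in module.parameters():
        p.requires_grad_(False)

# usage sketch, given some builder-created network `net` with a `cnn` attribute:
# net.cnn.load_state_dict(torch.load("pretrained_cnn.pt"))  # hypothetical path
# freeze_module(net.cnn)
```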
