Comparing changes

base repository: openai/baselines
base: master
head repository: openai/baselines
compare: stateful_rnn
  • 1 commit
  • 13 files changed
  • 1 contributor

Commits on Apr 26, 2019

  1. RNN support for PPO2 (#859)

    * initial implementation of ppo2_rnn.
    
    * set lstm memory as tf.GraphKeys.LOCAL_VARIABLES.
    
    * replace dones with tf.placeholder_with_default.
    
    * improvements for the 'play' option.
    
    * removed unnecessary TODO.
    
    * improve lstm code.
    
    * move learning rate placeholder to optimizer scope.
    
    * support the microbatched model.
    
    * sync cnn lstm layer with originals.
    
    * add cnn_lnlstm layer.
    
    * fix a case when `states` is None.
    
    * add initial_state variable to help test.
    
    * make ppo2 rnn test available.
    
    * rename 'obs' to 'observations'.
    rename 'transition' to 'transitions'.
    fix forgetting `dones` in the replay buffer.
    fix a misuse of `states` and `next_states` in the replay buffer.
    
    * make initialization once.
    make `test_fixed_sequence` compatible with ppo2.
    
    * adjust input shape.
    
    * fix checking of a model input args in `simple_test` function.
    
    * disable warning on purpose.
    
    * support the play.
    
    * improve scopes to be compatible with multiple models (i.e., other TensorFlow global/local variables)
    
    * clean the scope of ppo2 policy model.
    
    * name the memory variable of PPO RNNs more descriptively.
    
    * wrap the initializations in ppo2.
    
    * remove redundant lines.
    
    * update `README.md`.
    
    * add RNN layers.
    
    * add the result of the HalfCheetah-v2 env experiment.
    
    * correct a typo.
    
    * add RNN class.
    
    * rename `nlstm` to `num_units` in RNN builder functions.
    
    * remove state saving.
    
    * reuse RNNs in a2c.utils.
    
    * revert baselines/run.py.
    
    * replace `ppo2.step()` with original interface.
    
    * revert `baselines/common/tests/util.py`.
    
    * remove redundant lines.
    
    * revert `baselines/common/tests/util.py` to b875fb7.
    
    * remove `states` variable.
    
    * move RNN class to `baselines/ppo2/layers.py` and revert `baselines/common/models.py` to 858afa8.
    
    * rename `model.step_as_dict` to `model.step_with_dict`.
    
    * removed `ppo_lstm_mlp`.
    
    * fix 02e26fd.
    gyunt authored and pzhokhov committed Apr 26, 2019
    Commit fc0c43b
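
Several items in the commit message above concern how recurrent memory and `dones` flags are handled across a rollout. The general idea can be sketched as follows. This is a framework-free illustration in plain Python and not the code from this commit: `mask_state`, `run_sequence`, and `toy_cell` are hypothetical names, and the toy cell is a running sum standing in for an LSTM. It assumes a `done` flag at step t means the state carried into step t must be reset, and it treats a missing `dones` argument as "no resets", which is presumably similar in spirit to giving `dones` a default value.

```python
from typing import Callable, List, Optional, Sequence, Tuple

State = List[float]
Cell = Callable[[float, State], Tuple[float, State]]


def mask_state(state: State, done: bool) -> State:
    """Return a zeroed state if an episode boundary was hit, else a copy."""
    return [0.0] * len(state) if done else list(state)


def run_sequence(
    cell: Cell,
    inputs: Sequence[float],
    initial_state: State,
    dones: Optional[Sequence[bool]] = None,
) -> Tuple[List[float], State]:
    """Unroll `cell` over `inputs`, resetting the state where `dones` is True.

    If `dones` is omitted, no resets happen (an assumed default for this sketch).
    """
    if dones is None:
        dones = [False] * len(inputs)
    state = list(initial_state)
    outputs = []
    for x, done in zip(inputs, dones):
        state = mask_state(state, done)  # reset before consuming this step
        out, state = cell(x, state)
        outputs.append(out)
    return outputs, state


def toy_cell(x: float, state: State) -> Tuple[float, State]:
    """Hypothetical stand-in for an RNN cell: accumulates a running sum."""
    new_state = [s + x for s in state]
    return new_state[0], new_state


if __name__ == "__main__":
    outs, final = run_sequence(
        toy_cell, [1.0, 2.0, 3.0, 4.0], [0.0], dones=[False, False, True, False]
    )
    print(outs, final)  # [1.0, 3.0, 3.0, 7.0] [7.0]
```

The third step starts from a zeroed state, so the running sum restarts at 3.0 rather than continuing to 6.0. A stateful policy needs the same kind of reset so that memory from a finished episode does not leak into the next one.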