async_deep_reinforce

Asynchronous deep reinforcement learning

About

An attempt to reproduce Google DeepMind's paper "Asynchronous Methods for Deep Reinforcement Learning."

The Asynchronous Advantage Actor-Critic (A3C) method for playing "Atari Pong" is implemented with TensorFlow. Both the A3C-FF and A3C-LSTM variants are implemented.
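To illustrate the "asynchronous" part of A3C, here is a minimal toy sketch of several worker threads updating one shared parameter vector. It is not code from this repository: the quadratic toy loss, TARGET, and run_async_training are hypothetical names chosen for illustration, and a real A3C worker would compute policy and value gradients from game play instead.

```python
import threading

# Hypothetical toy optimum; stands in for "good" network weights.
TARGET = [1.0, -2.0, 0.5]


def run_async_training(num_workers=4, steps_per_worker=200, lr=0.1):
    """Run several workers that asynchronously update shared parameters."""
    shared = [0.0] * len(TARGET)  # global parameters shared by all workers
    lock = threading.Lock()

    def worker():
        for _ in range(steps_per_worker):
            # 1. Sync: copy the current global parameters into a local snapshot.
            with lock:
                local = list(shared)
            # 2. Compute a gradient locally (gradient of 0.5 * (x - target)^2).
            grad = [x - t for x, t in zip(local, TARGET)]
            # 3. Apply the (possibly stale) gradient to the global parameters.
            with lock:
                for i, g in enumerate(grad):
                    shared[i] -= lr * g

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return shared


if __name__ == "__main__":
    print(run_async_training())
```

Each worker may apply a gradient computed from a snapshot that other workers have since changed. Tolerating that staleness is what lets the workers run without waiting for each other.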

The learned game play after 26 hours of training (A3C-FF) looks like this.

Any advice or suggestions are very welcome in the issues thread.

#1

How to build

First, we need to build a multi-thread-ready version of the Arcade Learning Environment. I made some modifications to it so that it runs in a multi-threaded environment.

$ git clone https://github.com/miyosuda/Arcade-Learning-Environment.git
$ cd Arcade-Learning-Environment
$ cmake -DUSE_SDL=ON -DUSE_RLGLUE=OFF -DBUILD_EXAMPLES=OFF .
$ make -j 4

$ pip install .

I recommend installing it in a virtualenv environment.

How to run

To train,

$ python a3c.py

To display the result with game play,

$ python a3c_display.py

Using GPU

To enable the GPU, change the "USE_GPU" flag in "constants.py".
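As a rough sketch of how such a flag is typically used, the snippet below maps a boolean to a TensorFlow device string. It assumes USE_GPU is a plain boolean; select_device is a hypothetical helper for illustration and may not match how this repository reads the flag.

```python
# Hypothetical stand-in for the flag defined in constants.py.
USE_GPU = False


def select_device(use_gpu):
    """Return a TensorFlow-style device string for the given flag."""
    return "/gpu:0" if use_gpu else "/cpu:0"


device = select_device(USE_GPU)
# The string can then be passed to tf.device(device) when building the graph.
print(device)
```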

When running with 8 parallel game environments, the speeds of the GPU (GTX980Ti) and the CPU (Core i7 6700) were as follows. (Recorded with the LOCAL_T_MAX=20 setting.)

type	A3C-FF	A3C-LSTM
GPU	1722 steps per sec	864 steps per sec
CPU	1077 steps per sec	540 steps per sec
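The sketch below shows what a setting like LOCAL_T_MAX usually controls in A3C, assuming it corresponds to the paper's t_max: the maximum number of steps a thread collects before computing an update. The function names and the gamma default are illustrative assumptions, not taken from this repository.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute discounted returns for one rollout of at most t_max steps.

    rewards: rewards collected in the rollout, oldest first.
    bootstrap_value: value estimate of the state after the last step
                     (0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage = n-step return minus the critic's value estimate."""
    return [ret - value for ret, value in zip(returns, values)]


if __name__ == "__main__":
    rollout_returns = n_step_returns([0.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5)
    print(rollout_returns)  # [0.25, 0.5, 1.0]
    print(advantages(rollout_returns, [0.0, 0.0, 0.0]))
```

A larger rollout length means fewer but longer updates per thread, which is one plausible reason the step rate and learning curves differ between settings.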

Result

Score plots of the local threads playing Pong looked like this (trained on a GTX980Ti).

A3C-LSTM LOCAL_T_MAX = 5

A3C-LSTM LOCAL_T_MAX = 20

Unlike in the original paper, the scores are not averaged using the global network.

Requirements

TensorFlow r1.0
numpy
cv2
matplotlib
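A quick way to check that the listed packages can be found in the current environment is sketched below. It assumes Python 3.4 or later and only tests whether each module can be located; it does not verify versions such as TensorFlow r1.0. The helper name check_requirements is hypothetical.

```python
import importlib.util

# Import names taken from the Requirements list.
REQUIRED_MODULES = ["tensorflow", "numpy", "cv2", "matplotlib"]


def check_requirements(modules=REQUIRED_MODULES):
    """Map each module name to True if it can be located, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_requirements().items():
        print("{:<12} {}".format(name, "ok" if found else "MISSING"))
```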

References

This project uses the settings described in muupan's wiki: muupan/async-rl (https://github.com/muupan/async-rl/wiki)

Acknowledgements

Thanks to @aravindsrinivas for providing information on some of the hyperparameters.
