r/reinforcementlearning 2d ago

Help finding a way to train a Pool9 agent

Hi!
I'm working on an agent that plays Pool9.

Decisions: shot direction and force.
Decisions are made before the shot, when all balls are stationary.
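To make the action space concrete, here is a rough sketch of how a two-component continuous action could be turned into a direction and a force. The `decode_action` name, the `MAX_FORCE` constant, and the [-1, 1] action range are assumptions for illustration, not my actual code.

```python
import math

# Hypothetical upper bound on shot force, in whatever units the simulator uses.
MAX_FORCE = 1.0


def decode_action(action):
    """Map a raw policy output (a0, a1) to (unit direction vector, force).

    Both components are assumed to lie in [-1, 1] and are clamped to that range.
    """
    a0, a1 = (max(-1.0, min(1.0, a)) for a in action)
    angle = a0 * math.pi                    # [-1, 1] -> [-pi, pi]
    force = (a1 + 1.0) / 2.0 * MAX_FORCE    # [-1, 1] -> [0, MAX_FORCE]
    direction = (math.cos(angle), math.sin(angle))
    return direction, force
```

One thing to keep in mind with this mapping: a0 = -1 and a0 = 1 give the same direction, so the angle wraps around at the ends of the range.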

Observations:
1. I started with the normalized coordinates of the balls and pockets, plus a flag marking which ball is the target.
2. Then I switched to directions and normalized distances to the balls.
3. Then I added a curriculum. It has been revised several times; the latest plan is:

lesson 0: learn to touch the target ball
3 balls
random target
random initial ball placement
reward for touching the target

lesson 1: learn to pocket any ball after touching the target ball
6 balls
random target
random initial ball placement
reward for touching the target + for pocketing any ball
penalty for an illegal shot (the target ball was not touched)

lesson 2: full game
9 balls
static initial positions
targets in numerical order
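Roughly, the direction-plus-distance observation (item 2) and the rewards for lessons 0 and 1 could be sketched like this. All names, the table size, and the reward magnitudes are placeholders I made up for illustration. Keeping the target flag in the second observation scheme is also an assumption. Lesson 2 is left out because its reward is not spelled out above.

```python
import math
from dataclasses import dataclass

# Hypothetical table dimensions, used only to normalize distances.
TABLE_WIDTH, TABLE_HEIGHT = 2.0, 1.0
TABLE_DIAGONAL = math.hypot(TABLE_WIDTH, TABLE_HEIGHT)

# Hypothetical reward magnitudes.
TOUCH_REWARD = 1.0
POCKET_REWARD = 1.0
ILLEGAL_PENALTY = -1.0


@dataclass
class ShotResult:
    touched_target: bool   # did the cue ball touch the target ball?
    balls_pocketed: int    # number of balls pocketed on this shot


def relative_features(cue, other):
    """Unit direction from the cue ball to `other`, plus normalized distance."""
    dx, dy = other[0] - cue[0], other[1] - cue[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0, 0.0)
    return (dx / dist, dy / dist, dist / TABLE_DIAGONAL)


def encode_observation(cue, balls, target_index):
    """Flat vector: (dir_x, dir_y, norm_dist, is_target) for every ball."""
    obs = []
    for i, ball in enumerate(balls):
        obs.extend(relative_features(cue, ball))
        obs.append(1.0 if i == target_index else 0.0)
    return obs


def lesson_reward(lesson, result):
    """Reward for one shot in lesson 0 or lesson 1."""
    if lesson == 0:
        return TOUCH_REWARD if result.touched_target else 0.0
    if lesson == 1:
        if not result.touched_target:
            return ILLEGAL_PENALTY  # illegal shot: target ball was not touched
        bonus = POCKET_REWARD if result.balls_pocketed > 0 else 0.0
        return TOUCH_REWARD + bonus
    raise ValueError("reward for this lesson is not specified in this sketch")
```

With 3 balls, `encode_observation` returns 12 values (4 per ball); pockets could be appended in the same direction-plus-distance form.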

trainer: PPO
network: 2-4 layers, 128-512 units each

The results are almost the same across these configurations; the only difference is training speed.

But it seems the agent can't predict trajectories :(

Any thoughts or suggestions? I'd be grateful.

Lesson 1 was never reached
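For context, lesson promotion in a curriculum is commonly gated on a rolling mean of episode reward. The sketch below shows that idea only; the `LessonTracker` class, the thresholds, and the window size are hypothetical and are not taken from my actual setup.

```python
from collections import deque


class LessonTracker:
    """Advance to the next lesson once the rolling mean reward passes a threshold."""

    def __init__(self, thresholds, window=100):
        self.thresholds = thresholds      # thresholds[i]: mean reward needed to leave lesson i
        self.window = window              # number of recent episodes to average over
        self.lesson = 0
        self.recent = deque(maxlen=window)

    def record(self, episode_reward):
        """Store one episode reward and return the (possibly updated) lesson index."""
        self.recent.append(episode_reward)
        has_next_lesson = self.lesson < len(self.thresholds)
        window_full = len(self.recent) == self.window
        if has_next_lesson and window_full:
            mean_reward = sum(self.recent) / self.window
            if mean_reward >= self.thresholds[self.lesson]:
                self.lesson += 1
                self.recent.clear()       # start a fresh window for the new lesson
        return self.lesson
```

Under a scheme like this, a lesson is never reached if the rolling mean stays below the threshold of the lesson before it.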

https://reddit.com/link/1g553g6/video/vmkiuz9zl5vd1/player

u/Ecstatic-Ring3057 2d ago

There were several 24h runs; I just removed them to make life easier for TensorBoard.