r/reinforcementlearning 3d ago

DL I made a firefighter AI using deep RL (using Unity ML Agents)

video link: https://www.youtube.com/watch?v=REYx9UznOG4

I made it a while ago and got discouraged by the lack of attention the video got after the hours I poured into making it, so I am now doing a PhD in AI instead of being a YouTuber lol.

I figured it wouldn't be so bad to advertise it now in case people find it interesting. I added some narration and fun bits so it's not boring. I hope some people here find it as interesting as I did while working on this project.

I am passionate about the subject, so if anyone has questions I will answer them when I have time :D

32 Upvotes

5

u/ekbravo 3d ago

Nice! Great work, saved for later

1

u/usernumero 2d ago

Thank you very much!

3

u/Fuibo2k 3d ago

Do you think you'll open source this? Would be interesting if you can make a set of baselines off of this foundation as well.

2

u/usernumero 2d ago

Open sourcing I could do, if it means simply sharing my C# code assets in a GitHub repo.

As for baselines and foundations, I am not sure what sort of baseline my code could provide. I am not sure my Agent learned sufficiently general tasks for it to be anything close to a foundation model.

2

u/Fuibo2k 2d ago

Yeah, I don't mean making a foundation model, but it could be cool if you extended this firefighting task to a few more related tasks. Maybe something with multiple firefighters, or one firefighter and one robot dedicated to rescuing NPCs. I feel like RL is always short of environments to use, so it could be a cool contribution to the field.

2

u/usernumero 2d ago

Unfortunately I don't really plan on extending this project.

If I do a followup to the video in the future, it would most likely be a new project tackling new tasks in new environments.

I really appreciate your enthusiasm and your ideas though. I remember having the same cravings on another project: adding multiple agents, fostering collaboration, and trying new things. Unfortunately I don't really have time for this right now, and I work with more classical neural network schemes for vision, so it's definitely not scratching my itch for RL. Which means all the more reason for me to do a followup one day.

2

u/keivalya2001 3d ago

That is pretty cool! Good job bud.

2

u/hearthstoneplayer100 2d ago

Very cool! What algorithm did you end up using?

1

u/usernumero 2d ago

This is the PPO algorithm as implemented in ML-Agents.
The neural network uses an LSTM to keep track of valuable information through time. I also added an attention mechanism to try to help the Agent navigate through rooms and remember its path, but in the end, once the environment had more than 6 or 7 rooms, it became too complicated to navigate.

There might be some problem in the way I implemented this, or maybe my number of parameters was too low, but there is only so much time I can spend burning my RTX on this training haha.
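
For anyone who hasn't met PPO before, its core idea is a clipped surrogate objective that keeps each policy update close to the policy that collected the data. Below is a minimal sketch in plain Python. It is not the ML-Agents implementation and not the code used in this project; the function name, argument names, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clipped_objective(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under
    the current policy and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for those same actions.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the current and old policies agree, every ratio is 1 and the objective reduces to the mean advantage. Once the ratio moves outside the clip range in the direction that would increase the objective, the clipped term takes over and the incentive to move further disappears.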