Pleb rewards:

* Fixed amount of points for picking up food.
* Higher fixed amount of points for dropping off food.
* Points in inverse proportion to distance from current objective (either food source or nest).

Noticed lots of hesitation as Plebs got near the food source and the nest. Tried to compensate: added a noose so that Plebs must get closer to the objective in order to receive any more rewards.

Spy rewards:

* If they contact a Pleb, they steal its food and gain a reward.
* Rewarded in inverse proportion to distance to all Plebs (up to a bound).

Observed behavior:

* The Spy ambushes the Plebs near the food source.
* The Plebs do not develop any compensating behavior. They are happy enough to score rewards by only picking up food. Because the Spy's reward for taking the food does not undo all of the Plebs' reward for picking up food, this is a beneficial symbiosis.

* Penalize all Plebs when a Spy steals from a Pleb.
* Reward all Plebs when any Pleb picks up food or drops it off.
* Watchpoint: Plebs may rely upon the position of an agent in the observation vector in order to assign reputation, i.e. they learn that certain slots holding observations of agents will always hold spies. They do not learn to adapt when a spy shows up in a different slot.
* A pathological reward function was found. It seems to imply that Plebs learn to count steps, so if they are too far away from their objective, they know they can gain more points by procrastinating near the objective than by picking up or dropping off and accelerating toward the next objective.

$ python3 train.py -scenario=spies-like-us # Train a new policy.

$ python3 train.py -scenario=spies-like-us -restore # Do more training on an existing policy.

Observing the results of the trained experiments:

$ python3 train.py -scenario=spies-like-us -restore -display
Maddpg/spies-like-us/experiments/ - The saved results of experiments.
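To make the reward rules described above concrete, here is a minimal sketch of how the Pleb and Spy rewards might be computed. It is not taken from the scenario code. Every name and constant (`PICKUP_REWARD`, `SHAPING_SCALE`, `PlebState`, and so on) is a hypothetical placeholder, since the notes give no numeric values. The noose is modeled as "closest distance to the current objective so far", and the Spy's bound is assumed to apply per Pleb. Both are guesses at the intent.

```python
import math
from dataclasses import dataclass

# Hypothetical constants: the notes do not state any actual values.
PICKUP_REWARD = 1.0      # fixed points for picking up food
DROPOFF_REWARD = 2.0     # "higher fixed amount" for dropping off food
STEAL_REWARD = 1.0       # Spy's reward for stealing food on contact
SHAPING_SCALE = 0.1      # scale of the inverse-distance term
SHAPING_CAP = 1.0        # upper bound on any single inverse-distance term


def distance(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def inverse_distance(d, scale=SHAPING_SCALE, cap=SHAPING_CAP):
    """Reward that grows as distance shrinks, capped at `cap`."""
    if d <= 0.0:
        return cap
    return min(scale / d, cap)


@dataclass
class PlebState:
    # Assumed model of the noose: the closest the Pleb has been to its
    # current objective. Shaping is only paid when it gets closer than this.
    noose: float = math.inf


def pleb_reward(state, pos, objective, picked_up=False, dropped_off=False):
    """Reward for one Pleb on one step; tightens or resets the noose."""
    if picked_up or dropped_off:
        # The objective switches (food source <-> nest), so reset the noose.
        state.noose = math.inf
        return PICKUP_REWARD if picked_up else DROPOFF_REWARD
    d = distance(pos, objective)
    if d < state.noose:
        state.noose = d              # tighten: hovering in place pays nothing
        return inverse_distance(d)
    return 0.0


def spy_reward(spy_pos, pleb_positions, stole_food=False):
    """Spy reward: steal bonus plus a capped proximity term for each Pleb."""
    reward = STEAL_REWARD if stole_food else 0.0
    for p in pleb_positions:
        reward += inverse_distance(distance(spy_pos, p))
    return reward
```

With this noose model, a Pleb that loiters at a fixed distance from the food source is paid once and then earns nothing until it moves closer, which is the effect the noose was meant to have on the hesitation seen near the food source and nest.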