This is some fun Facebook research on learning to behave. Mimicking and learning social conventions within the framework of game theory: such a nice idea.
Social conventions (arbitrary ways to organize group behavior) are an important part of social life. Any agent that wants to enter an existing society must be able to learn its conventions (e.g. which side of the road to drive on, which language to speak) from relatively few observations, or risk being unable to coordinate with everyone else. The article considers the game-theoretic framework of David Lewis, which views the selection of a social convention as the selection of an equilibrium in a coordination game. The authors ask how to construct reinforcement-learning-based agents that can solve the convention learning task in the self-play paradigm: at training time the agent has access to a good model of the environment and a small number of observations of how individuals in society act. The agent then has to construct a policy that is compatible with the test-time social convention. They study three environments from the literature that have multiple conventions: traffic, communication, and risky coordination. In each of these they observe that adding a small amount of imitation learning during self-play training greatly increases the probability that the strategy found by self-play fits the social convention the agent will face at test time. This works even in an environment where standard independent multi-agent RL very rarely finds the correct test-time equilibrium.
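To make the idea concrete, here is a minimal toy sketch in Python. It is not the paper's code or its environments: it assumes a two-action "which side of the road" coordination game with payoff 1 when both drivers pick the same side, a single shared policy parameter `theta`, and exact expected-payoff gradients in place of sampled RL updates. The function name `train`, the imitation weight, the learning rate, and the starting point are all made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def train(imitation_weight, observations, theta0=-0.5, lr=0.5, steps=2000):
    """Toy self-play with an optional imitation term (illustrative only).

    The policy plays action 1 ("drive right") with probability
    p = sigmoid(theta). `observations` is a small list of 0/1 actions seen
    in the existing society. Returns the final probability of action 1.
    """
    theta = theta0
    frac_right = sum(observations) / len(observations) if observations else 0.0
    for _ in range(steps):
        p = sigmoid(theta)
        # Self-play: both drivers use the same policy, so the expected
        # payoff is J = p^2 + (1 - p)^2 and dJ/dtheta = (4p - 2) * p * (1 - p).
        selfplay_grad = (4.0 * p - 2.0) * p * (1.0 - p)
        # Imitation: gradient of the mean log-likelihood of the observed
        # actions under the policy, which simplifies to frac_right - p.
        imitation_grad = (frac_right - p) if observations else 0.0
        theta += lr * (selfplay_grad + imitation_weight * imitation_grad)
    return sigmoid(theta)


if __name__ == "__main__":
    society = [1, 1, 1, 1, 1]  # a handful of observed "drive right" actions
    # The start (theta0 = -0.5) leans towards "left" in both runs.
    print("self-play only:       ", train(0.0, society))
    print("self-play + imitation:", train(0.5, society))
```

Both "everyone left" and "everyone right" are equilibria of this game, so self-play alone simply follows its initial lean and ends up on the left. With the imitation term added, the same starting point is pulled over to the observed "right" convention, and self-play then sharpens the policy towards it.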