Importance Sampling for Fair Policy Selection / Shayan Doroudi, Philip S. Thomas and Emma Brunskill.
We consider the problem of off-policy policy selection in reinforcement learning: using historical data generated from running one policy to compare two or more policies. We show that approaches based on importance sampling can be "unfair"--they can select the worse of two policies more of...
Online Access: | Full Text (via ERIC) |
---|---|
Main Authors: | Shayan Doroudi, Philip S. Thomas, Emma Brunskill |
Format: | eBook |
Language: | English |
Published: | [Place of publication not identified] : Distributed by ERIC Clearinghouse, 2017. |

Internet

Call Number: | ED586042 |
---|---|
ED586042 | Available |