arXiv:1208.0984 [cs.LG]

Keywords: reinforcement learning, approximate policy return, active preference-learning, cancer treatment testbeds witness, achieve direct policy search

Tags: journal article