ICML 2019: A Roundup of Deep Reinforcement Learning Papers
- November 21, 2019
- Notes
Deep Reinforcement Learning Report
Source: ICML 2019 conference
Editor: DeepRL
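Reinforcement learning (RL) is a general paradigm for learning, prediction, and decision making. RL provides solution methods for sequential decision-making problems, as well as for problems that can be transformed into sequential ones. RL has deep connections with optimization, statistics, game theory, causal inference, and sequential experimentation, overlaps substantially with approximate dynamic programming and optimal control, and has broad applications in science, engineering, and the arts.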
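RL has recently made steady progress in academia, with results such as Atari games, AlphaGo, and visuomotor policies for robots. RL has also been applied in real-world settings such as recommender systems and neural architecture search. See a recent collection of RL applications. The hope is that RL systems will work in the real world and deliver practical benefits. However, RL still faces many open problems, such as generalization, sample efficiency, and the exploration-exploitation dilemma. As a result, RL is far from being widely deployed. For the RL community, the common, critical, and pressing questions are: Will RL be widely deployed? What are the obstacles? How can they be overcome?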
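The International Conference on Machine Learning (ICML) is an international academic conference on machine learning. It is one of the two major high-impact conferences in machine learning and artificial intelligence research. Every year ICML features a large number of reinforcement learning papers; in 2019 it accepted 46 reinforcement learning papers in total (already a high share, approaching 10%). Below is a summary of the papers from this year's conference. A download link for the collected PDFs is at the end of the article.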
Methods
- Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
- Quantifying Generalization in Reinforcement Learning
- Policy Certificates: Towards Accountable Reinforcement Learning
- Neural Logic Reinforcement Learning
- Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning
- Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning
- Calibrated Model-Based Deep Reinforcement Learning
- Information-Theoretic Considerations in Batch Reinforcement Learning
- Taming MAML: Control variates for unbiased meta-reinforcement learning gradient estimation
- Option Discovery for Solving Sparse Reward Reinforcement Learning Problems
Optimization
- Fingerprint Policy Optimisation for Robust Reinforcement Learning
- Collaborative Evolutionary Reinforcement Learning
- Composing Value Functions in Reinforcement Learning
- Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
- Policy Consolidation for Continual Reinforcement Learning
Exploration-Exploitation and Model Parameters
- Exploration Conscious Reinforcement Learning Revisited
- Dynamic Weights in Multi-Objective Deep Reinforcement Learning
- Control Regularization for Reduced Variance Reinforcement Learning
- Dead-ends and Secure Exploration in Reinforcement Learning
- Off-Policy Deep Reinforcement Learning without Exploration
- Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
- Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
- On the Generalization Gap in Reparameterizable Reinforcement Learning
Multi-Agent
- Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
- CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning
- Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
- Multi-Agent Adversarial Inverse Reinforcement Learning
- Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
- QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
- Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Graphical-Model Reinforcement Learning
- TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
- SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
Distributional Reinforcement Learning
- Statistics and Samples in Distributional Reinforcement Learning
- Distributional Reinforcement Learning for Efficient Exploration
Applications
- Action Robust Reinforcement Learning and Applications in Continuous Control
- Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
- Learning Action Representations for Reinforcement Learning
- The Value Function Polytope in Reinforcement Learning
- Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
Other
- Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
- A Deep Reinforcement Learning Perspective on Internet Congestion Control
- Reinforcement Learning in Configurable Continuous Environments
- Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Note: some of these papers are not yet on arXiv; for any you cannot find there, please search on Google.