ICML 2019: A Roundup of Deep Reinforcement Learning Papers
- November 21, 2019
- Notes
Deep Reinforcement Learning: Report
Source: ICML 2019 conference
Editor: DeepRL
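Reinforcement learning (RL) is a general paradigm for learning, prediction, and decision making. RL provides solution methods for sequential decision-making problems, as well as for problems that can be transformed into sequential ones. RL has deep connections with optimization, statistics, game theory, causal inference, and sequential experimentation, overlaps heavily with approximate dynamic programming and optimal control, and has a wide range of applications in science, engineering, and the arts.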
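RL has recently made steady progress in academia, with examples such as Atari games, AlphaGo, and visuomotor policies for robots. RL has also been applied in real-world settings such as recommender systems and neural architecture search. See recent collections of RL applications. The hope is that RL systems will work in the real world and deliver practical benefits. However, RL still faces many problems, including generalization, sample efficiency, and the exploration-exploitation dilemma. As a result, RL is far from being widely deployed. For the RL community, the common, critical, and pressing questions are: Will RL be widely deployed? What are the obstacles? How can they be addressed?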
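The International Conference on Machine Learning (ICML) is an international academic conference on machine learning. It is one of the two main high-impact conferences in machine learning and artificial intelligence research. Many reinforcement learning papers appear at ICML every year; in 2019 a total of 46 reinforcement learning papers were accepted (already a high proportion, approaching 10%). Below is a summary of the papers from this conference; a download link for the collected PDF versions is at the end of the post.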
Methods papers
- Efficient Off-Policy Meta-Reinforcement Learning via Probabilistic Context Variables
- Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning
- Quantifying Generalization in Reinforcement Learning
- Policy Certificates: Towards Accountable Reinforcement Learning
- Neural Logic Reinforcement Learning
- Probability Functional Descent: A Unifying Perspective on GANs, Variational Inference, and Reinforcement Learning
- Few-Shot Intent Inference via Meta-Inverse Reinforcement Learning
- Calibrated Model-Based Deep Reinforcement Learning
- Information-Theoretic Considerations in Batch Reinforcement Learning
- Taming MAML: Control variates for unbiased meta-reinforcement learning gradient estimation
- Option Discovery for Solving Sparse Reward Reinforcement Learning Problems
Optimization papers
- Fingerprint Policy Optimisation for Robust Reinforcement Learning
- Collaborative Evolutionary Reinforcement Learning
- Composing Value Functions in Reinforcement Learning
- Task-Agnostic Dynamics Priors for Deep Reinforcement Learning
- Policy Consolidation for Continual Reinforcement Learning
Exploration-exploitation and model parameters
- Exploration Conscious Reinforcement Learning Revisited
- Dynamic Weights in Multi-Objective Deep Reinforcement Learning
- Control Regularization for Reduced Variance Reinforcement Learning
- Dead-ends and Secure Exploration in Reinforcement Learning
- Off-Policy Deep Reinforcement Learning without Exploration
- Dimension-Wise Importance Sampling Weight Clipping for Sample-Efficient Reinforcement Learning
- Extrapolating Beyond Suboptimal Demonstrations via Inverse Reinforcement Learning from Observations
- On the Generalization Gap in Reparameterizable Reinforcement Learning
Multi-agent
- Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
- CURIOUS: Intrinsically Motivated Multi-Task, Multi-Goal Reinforcement Learning
- Finite-Time Analysis of Distributed TD(0) with Linear Function Approximation on Multi-Agent Reinforcement Learning
- Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
- Multi-Agent Adversarial Inverse Reinforcement Learning
- Grid-Wise Control for Multi-Agent Reinforcement Learning in Video Game AI
- QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
- Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Graphical-model reinforcement learning
- TibGM: A Transferable and Information-Based Graphical Model Approach for Reinforcement Learning
- SOLAR: Deep Structured Representations for Model-Based Reinforcement Learning
Distributional reinforcement learning
- Statistics and Samples in Distributional Reinforcement Learning
- Distributional Reinforcement Learning for Efficient Exploration
Applications
- Action Robust Reinforcement Learning and Applications in Continuous Control
- Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation
- Learning Action Representations for Reinforcement Learning
- The Value Function Polytope in Reinforcement Learning
- Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
Other
- Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
- A Deep Reinforcement Learning Perspective on Internet Congestion Control
- Reinforcement Learning in Configurable Continuous Environments
- Tighter Problem-Dependent Regret Bounds in Reinforcement Learning without Domain Knowledge using Value Function Bounds
Note: some of these papers are not yet on arXiv; for any you cannot find there, please search for them on Google.