Richard S. Sutton

List of publications from the DBLP Bibliography Server

2009
50 Richard S. Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora: Fast gradient-descent methods for temporal-difference learning with linear function approximation. ICML 2009: 125
2008
49 Maria Cutumisu, Duane Szafron, Michael H. Bowling, Richard S. Sutton: Agent Learning using Action-Dependent Learning Rates in Computer Role-Playing Games. AIIDE 2008
48 David Silver, Richard S. Sutton, Martin Müller: Sample-based learning and search with permanent and transient memories. ICML 2008: 968-975
47 Richard S. Sutton, Csaba Szepesvári, Hamid Reza Maei: A Convergent O(n) Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation. NIPS 2008: 1609-1616
46 Elliot A. Ludvig, Richard S. Sutton, Eric Verbeek, E. James Kehoe: A computational model of hippocampal function in trace conditioning. NIPS 2008: 993-1000
45 Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling: Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping. UAI 2008: 528-536
44 Elliot A. Ludvig, Richard S. Sutton, E. James Kehoe: Stimulus Representation and the Timing of Reward-Prediction Errors in Models of the Dopamine System. Neural Computation 20(12): 3034-3054 (2008)
2007
43 Richard S. Sutton, Anna Koop, David Silver: On the role of tracking in stationary environments. ICML 2007: 871-878
42 David Silver, Richard S. Sutton, Martin Müller: Reinforcement Learning of Local Shape in the Game of Go. IJCAI 2007: 1053-1058
41 Shalabh Bhatnagar, Richard S. Sutton, Mohammad Ghavamzadeh, Mark Lee: Incremental Natural Actor-Critic Algorithms. NIPS 2007
2006
40 Alborz Geramifard, Michael H. Bowling, Richard S. Sutton: Incremental Least-Squares Temporal Difference Learning. AAAI 2006
39 Alborz Geramifard, Michael H. Bowling, Martin Zinkevich, Richard S. Sutton: iLSTD: Eligibility Traces and Convergence Analysis. NIPS 2006: 441-448
2005
38 Brian Tanner, Richard S. Sutton: TD(lambda) networks: temporal-difference networks with eligibility traces. ICML 2005: 888-895
37 Eddie J. Rafols, Mark B. Ring, Richard S. Sutton, Brian Tanner: Using Predictive Representations to Improve Generalization in Reinforcement Learning. IJCAI 2005: 835-840
36 Brian Tanner, Richard S. Sutton: Temporal-Difference Networks with History. IJCAI 2005: 865-870
35 Doina Precup, Richard S. Sutton, Cosmin Paduraru, Anna Koop, Satinder P. Singh: Off-policy Learning with Options and Recognizers. NIPS 2005
34 Richard S. Sutton, Eddie J. Rafols, Anna Koop: Temporal Abstraction in Temporal-difference Networks. NIPS 2005
2004
33 Richard S. Sutton, Brian Tanner: Temporal-Difference Networks. NIPS 2004
2001
32 Doina Precup, Richard S. Sutton, Sanjoy Dasgupta: Off-Policy Temporal Difference Learning with Function Approximation. ICML 2001: 417-424
31 Peter Stone, Richard S. Sutton: Scaling Reinforcement Learning toward RoboCup Soccer. ICML 2001: 537-544
30 Michael L. Littman, Richard S. Sutton, Satinder P. Singh: Predictive Representations of State. NIPS 2001: 1555-1561
29 Peter Stone, Richard S. Sutton: Keepaway Soccer: A Machine Learning Testbed. RoboCup 2001: 214-223
2000
28 Doina Precup, Richard S. Sutton, Satinder P. Singh: Eligibility Traces for Off-Policy Policy Evaluation. ICML 2000: 759-766
27 Peter Stone, Richard S. Sutton, Satinder P. Singh: Reinforcement Learning for 3 vs. 2 Keepaway. RoboCup 2000: 249-258
1999
26 Richard S. Sutton: Open Theoretical Questions in Reinforcement Learning. EuroCOLT 1999: 11-17
25 Richard S. Sutton, David A. McAllester, Satinder P. Singh, Yishay Mansour: Policy Gradient Methods for Reinforcement Learning with Function Approximation. NIPS 1999: 1057-1063
24 Richard S. Sutton, Doina Precup, Satinder P. Singh: Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Artif. Intell. 112(1-2): 181-211 (1999)
1998
23 Doina Precup, Richard S. Sutton, Satinder P. Singh: Theoretical Results on Reinforcement Learning with Temporally Abstract Options. ECML 1998: 382-393
22 Richard S. Sutton, Doina Precup, Satinder P. Singh: Intra-Option Learning about Temporally Abstract Actions. ICML 1998: 556-564
21 Robert Moll, Andrew G. Barto, Theodore J. Perkins, Richard S. Sutton: Learning Instance-Independent Value Functions to Enhance Local Search. NIPS 1998: 1017-1023
20 Richard S. Sutton, Satinder P. Singh, Doina Precup, Balaraman Ravindran: Improved Switching among Temporally Abstract Actions. NIPS 1998: 1066-1072
19 Richard S. Sutton: Reinforcement Learning: Past, Present and Future. SEAL 1998: 195-197
18 Richard S. Sutton, Andrew G. Barto: Reinforcement Learning: An Introduction. IEEE Transactions on Neural Networks 9(5): 1054-1054 (1998)
1997
17 Richard S. Sutton: On the Significance of Markov Decision Processes. ICANN 1997: 273-282
16 Doina Precup, Richard S. Sutton: Exponentiated Gradient Methods for Reinforcement Learning. ICML 1997: 272-277
15 Doina Precup, Richard S. Sutton: Multi-time Models for Temporally Abstract Planning. NIPS 1997
1996
14 Satinder P. Singh, Richard S. Sutton: Reinforcement Learning with Replacing Eligibility Traces. Machine Learning 22(1-3): 123-158 (1996)
1995
13 Richard S. Sutton: TD Models: Modeling the World at a Mixture of Time Scales. ICML 1995: 531-539
12 Richard S. Sutton: Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding. NIPS 1995: 1038-1044
1993
11 Richard S. Sutton, Steven D. Whitehead: Online Learning with Random Representations. ICML 1993: 314-321
1992
10 Richard S. Sutton: Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta. AAAI 1992: 171-176
1991
9 Richard S. Sutton, Christopher J. Matheus: Learning Polynomial Functions by Feature Construction. ML 1991: 208-212
8 Richard S. Sutton: Planning by Incremental Dynamic Programming. ML 1991: 353-357
7 Terence D. Sanger, Richard S. Sutton, Christopher J. Matheus: Iterative Construction of Sparse Polynomial Approximations. NIPS 1991: 1064-1071
6 Richard S. Sutton: Dyna, an Integrated Architecture for Learning, Planning, and Reacting. SIGART Bulletin 2(4): 160-163 (1991)
1990
5 Richard S. Sutton: Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. ML 1990: 216-224
4 Richard S. Sutton: Integrated Modeling and Control Based on Reinforcement Learning. NIPS 1990: 471-478
1989
3 Andrew G. Barto, Richard S. Sutton, Christopher J. C. H. Watkins: Sequential Decision Problems and Neural Networks. NIPS 1989: 686-693
1988
2 Richard S. Sutton: Learning to Predict by the Methods of Temporal Differences. Machine Learning 3: 9-44 (1988)
1985
1 Oliver G. Selfridge, Richard S. Sutton, Andrew G. Barto: Training and Tracking in Robotics. IJCAI 1985: 670-672

Coauthor Index

1 Andrew G. Barto [1] [3] [18] [21]
2 Shalabh Bhatnagar [41] [50]
3 Michael H. Bowling [39] [40] [45] [49]
4 Maria Cutumisu [49]
5 Sanjoy Dasgupta [32]
6 Alborz Geramifard [39] [40] [45]
7 Mohammad Ghavamzadeh [41]
8 E. James Kehoe [44] [46]
9 Anna Koop [34] [35] [43]
10 Mark Lee [41]
11 Michael L. Littman [30]
12 Elliot A. Ludvig [44] [46]
13 Hamid Reza Maei [47] [50]
14 Yishay Mansour [25]
15 Christopher J. Matheus [7] [9]
16 David A. McAllester [25]
17 Robert Moll (Robert N. Moll) [21]
18 Martin Müller [42] [48]
19 Cosmin Paduraru [35]
20 Theodore J. Perkins [21]
21 Doina Precup [15] [16] [20] [22] [23] [24] [28] [32] [35] [50]
22 Eddie J. Rafols [34] [37]
23 Balaraman Ravindran [20]
24 Mark B. Ring [37]
25 Terence D. Sanger [7]
26 Oliver G. Selfridge [1]
27 David Silver [42] [43] [48] [50]
28 Satinder P. Singh [14] [20] [22] [23] [24] [25] [27] [28] [30] [35]
29 Peter Stone [27] [29] [31]
30 Duane Szafron [49]
31 Csaba Szepesvári [45] [47] [50]
32 Brian Tanner [33] [36] [37] [38]
33 H. M. W. (Eric) Verbeek (H. M. W. Verbeek, Eric Verbeek) [46]
34 Christopher J. C. H. Watkins [3]
35 Steven D. Whitehead [11]
36 Eric Wiewiora [50]
37 Martin Zinkevich [39]

Copyright © Tue Nov 3 08:52:44 2009 by Michael Ley (ley@uni-trier.de)