dc.contributor.author |
Rosman, Benjamin S
|
|
dc.contributor.author |
Ramamoorthy, S
|
|
dc.date.accessioned |
2015-08-31T06:51:56Z |
|
dc.date.available |
2015-08-31T06:51:56Z |
|
dc.date.issued |
2015-04 |
|
dc.identifier.citation |
Rosman, B.S. and Ramamoorthy, S. 2015. Action priors for learning domain invariances. IEEE Transactions on Autonomous Mental Development, vol. 7(2), pp. 107-118 |
en_US |
dc.identifier.issn |
1943-0604 |
|
dc.identifier.uri |
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=7079487
|
|
dc.identifier.uri |
http://hdl.handle.net/10204/8114
|
|
dc.description |
Copyright: 2015 IEEE Xplore. Due to copyright restrictions, the attached PDF file contains only the abstract of the full text item. For access to the full text item, please consult the publisher's website. The definitive version of the work is published in IEEE Transactions on Autonomous Mental Development, vol. 7(2), pp. 107-118 |
en_US |
dc.description.abstract |
An agent tasked with solving a number of different decision making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioural invariances in the domain, by identifying actions to be prioritised in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context based pruning of the available actions, thus reducing the complexity of look ahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalisability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed ups in learning new tasks. Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
IEEE Xplore |
en_US |
dc.relation.ispartofseries |
Workflow;15282 |
|
dc.subject |
Search pruning |
en_US |
dc.subject |
Action selection |
en_US |
dc.subject |
Action ordering |
en_US |
dc.subject |
Transfer learning |
en_US |
dc.subject |
Reinforcement learning |
en_US |
dc.title |
Action priors for learning domain invariances |
en_US |
dc.type |
Article |
en_US |
dc.identifier.apacitation |
Rosman, B. S., & Ramamoorthy, S. (2015). Action priors for learning domain invariances. http://hdl.handle.net/10204/8114 |
en_ZA |
dc.identifier.chicagocitation |
Rosman, Benjamin S, and S Ramamoorthy. "Action priors for learning domain invariances." (2015). http://hdl.handle.net/10204/8114 |
en_ZA |
dc.identifier.vancouvercitation |
Rosman BS, Ramamoorthy S. Action priors for learning domain invariances. 2015; http://hdl.handle.net/10204/8114. |
en_ZA |
dc.identifier.ris |
TY - Article
AU - Rosman, Benjamin S
AU - Ramamoorthy, S
AB - An agent tasked with solving a number of different decision making problems in similar environments has an opportunity to learn over a longer timescale than each individual task. Through examining solutions to different tasks, it can uncover behavioural invariances in the domain, by identifying actions to be prioritised in local contexts, invariant to task details. This information has the effect of greatly increasing the speed of solving new problems. We formalise this notion as action priors, defined as distributions over the action space, conditioned on environment state, and show how these can be learnt from a set of value functions. We apply action priors in the setting of reinforcement learning, to bias action selection during exploration. Aggressive use of action priors performs context based pruning of the available actions, thus reducing the complexity of look ahead during search. We additionally define action priors over observation features, rather than states, which provides further flexibility and generalisability, with the additional benefit of enabling feature selection. Action priors are demonstrated in experiments in a simulated factory environment and a large random graph domain, and show significant speed ups in learning new tasks. Furthermore, we argue that this mechanism is cognitively plausible, and is compatible with findings from cognitive psychology.
DA - 2015-04
DB - ResearchSpace
DP - CSIR
KW - Search pruning
KW - Action selection
KW - Action ordering
KW - Transfer learning
KW - Reinforcement learning
LK - https://researchspace.csir.co.za
PY - 2015
SM - 1943-0604
T1 - Action priors for learning domain invariances
TI - Action priors for learning domain invariances
UR - http://hdl.handle.net/10204/8114
ER -
|
en_ZA |