ResearchSpace

Hierarchical subtask discovery with non-negative matrix factorization


dc.contributor.author Earle, AC
dc.contributor.author Saxe, AM
dc.contributor.author Rosman, Benjamin S
dc.date.accessioned 2018-05-23T06:57:35Z
dc.date.available 2018-05-23T06:57:35Z
dc.date.issued 2018-04
dc.identifier.citation Earle, A.C., Saxe, A.M. and Rosman, B.S. 2018. Hierarchical subtask discovery with non-negative matrix factorization. Sixth International Conference on Learning Representations (ICLR2018), 30 April 2018 - 3 May 2018, Vancouver Convention Center, Vancouver, Canada en_US
dc.identifier.uri https://iclr.cc/Conferences/2018/Schedule?type=Poster
dc.identifier.uri https://openreview.net/forum?id=ry80wMW0W
dc.identifier.uri http://hdl.handle.net/10204/10228
dc.description Paper presented at the Sixth International Conference on Learning Representations (ICLR2018), 30 April 2018 - 3 May 2018, Vancouver Convention Center, Vancouver, Canada en_US
dc.description.abstract Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions. en_US
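
The abstract's core idea, posing subtask discovery as a non-negative low-rank approximation over the ensemble of tasks an agent will face, can be sketched with off-the-shelf NMF. The sketch below is illustrative rather than the paper's implementation: the matrix Z of per-task desirability functions is random stand-in data, and the state count, task count, and rank k are assumed values, not ones from the paper.

    # Minimal sketch: stack the desirability functions of an ensemble of
    # tasks into a matrix Z (states x tasks) and find a low-rank
    # non-negative factorization Z ~ W @ H. Columns of W then act as a
    # discovered basis of subtasks (distributed patterns of preferred
    # states); columns of H give each task's mixing weights.
    import numpy as np
    from sklearn.decomposition import NMF

    rng = np.random.default_rng(0)
    n_states, n_tasks, k = 100, 20, 4   # assumed dimensions for illustration

    # Z[s, t] >= 0: desirability of state s under task t (stand-in data).
    Z = rng.random((n_states, n_tasks))

    model = NMF(n_components=k, init="nndsvda", max_iter=500, random_state=0)
    W = model.fit_transform(Z)   # (n_states, k): subtask basis patterns
    H = model.components_        # (k, n_tasks): per-task mixing weights

    print("reconstruction error:", np.linalg.norm(Z - W @ H))

As the abstract notes, this procedure can be iterated, for example by factorizing the learned basis W again at a lower rank, to obtain deeper hierarchical decompositions.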
dc.language.iso en en_US
dc.relation.ispartofseries Worklist;20912
dc.subject Reinforcement learning en_US
dc.subject Subtask discovery en_US
dc.subject LMDPs en_US
dc.title Hierarchical subtask discovery with non-negative matrix factorization en_US
dc.type Conference Presentation en_US
dc.identifier.apacitation Earle, A., Saxe, A., & Rosman, B. S. (2018). Hierarchical subtask discovery with non-negative matrix factorization. http://hdl.handle.net/10204/10228 en_ZA
dc.identifier.chicagocitation Earle, AC, AM Saxe, and Benjamin S Rosman. "Hierarchical subtask discovery with non-negative matrix factorization." (2018): http://hdl.handle.net/10204/10228 en_ZA
dc.identifier.vancouvercitation Earle A, Saxe A, Rosman BS. Hierarchical subtask discovery with non-negative matrix factorization; 2018. http://hdl.handle.net/10204/10228. en_ZA
dc.identifier.ris TY - Conference Presentation
AU - Earle, AC
AU - Saxe, AM
AU - Rosman, Benjamin S
AB - Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions.
DA - 2018-04
DB - ResearchSpace
DP - CSIR
KW - Reinforcement learning
KW - Subtask discovery
KW - LMDPs
LK - https://researchspace.csir.co.za
PY - 2018
T1 - Hierarchical subtask discovery with non-negative matrix factorization
TI - Hierarchical subtask discovery with non-negative matrix factorization
UR - http://hdl.handle.net/10204/10228
ER - en_ZA

