Hierarchical subtask discovery with non-negative matrix factorization

Earle, AC; Saxe, AM; Rosman, Benjamin S

dc.contributor.author	Earle, AC
dc.contributor.author	Saxe, AM
dc.contributor.author	Rosman, Benjamin S
dc.date.accessioned	2018-05-23T06:57:35Z
dc.date.available	2018-05-23T06:57:35Z
dc.date.issued	2018-04
dc.identifier.citation	Earle, A.C., Saxe, A.M. and Rosman, B.S. 2018. Hierarchical subtask discovery with non-negative matrix factorization. Sixth International Conference on Learning Representations (ICLR2018), 30 April 2018 - 3 May 2018, Vancouver Convention Center, Vancouver, Canada	en_US
dc.identifier.uri	https://iclr.cc/Conferences/2018/Schedule?type=Poster
dc.identifier.uri	https://openreview.net/forum?id=ry80wMW0W
dc.identifier.uri	http://hdl.handle.net/10204/10228
dc.description	Paper presented at the Sixth International Conference on Learning Representations (ICLR2018), 30 April 2018 - 3 May 2018, Vancouver Convention Center, Vancouver, Canada	en_US
dc.description.abstract	Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	Worklist;20912
dc.subject	Reinforcement learning	en_US
dc.subject	Subtask discovery	en_US
dc.subject	LMDPs	en_US
dc.title	Hierarchical subtask discovery with non-negative matrix factorization	en_US
dc.type	Conference Presentation	en_US
dc.identifier.apacitation	Earle, A., Saxe, A., & Rosman, B. S. (2018). Hierarchical subtask discovery with non-negative matrix factorization. http://hdl.handle.net/10204/10228	en_ZA
dc.identifier.chicagocitation	Earle, AC, AM Saxe, and Benjamin S Rosman. "Hierarchical subtask discovery with non-negative matrix factorization." (2018): http://hdl.handle.net/10204/10228	en_ZA
dc.identifier.vancouvercitation	Earle A, Saxe A, Rosman BS. Hierarchical subtask discovery with non-negative matrix factorization; 2018. http://hdl.handle.net/10204/10228.	en_ZA
dc.identifier.ris	TY - Conference Presentation AU - Earle, AC AU - Saxe, AM AU - Rosman, Benjamin S AB - Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We present a novel algorithm for subtask discovery, based on the recently introduced multitask linearly-solvable Markov decision process (MLMDP) framework. The MLMDP can perform never-before-seen tasks by representing them as a linear combination of a previously learned basis set of tasks. In this setting, the subtask discovery problem can naturally be posed as finding an optimal low-rank approximation of the set of tasks the agent will face in a domain. We use non-negative matrix factorization to discover this minimal basis set of tasks, and show that the technique learns intuitive decompositions in a variety of domains. Our method has several qualitatively desirable features: it is not limited to learning subtasks with single goal states, instead learning distributed patterns of preferred states; it learns qualitatively different hierarchical decompositions in the same domain depending on the ensemble of tasks the agent will face; and it may be straightforwardly iterated to obtain deeper hierarchical decompositions. DA - 2018-04 DB - ResearchSpace DP - CSIR KW - Reinforcement learning KW - Subtask discovery KW - LMDPs LK - https://researchspace.csir.co.za PY - 2018 T1 - Hierarchical subtask discovery with non-negative matrix factorization TI - Hierarchical subtask discovery with non-negative matrix factorization UR - http://hdl.handle.net/10204/10228 ER -	en_ZA

Files in this item

Name: Earle_20912_2018.pdf

Size: 1.098Mb

Format: PDF

Description: Conference paper

This item appears in the following Collection(s)

Conference Publications

