University of Tsukuba
Fumihide Tanaka Laboratory

Multitask Reinforcement Learning

This was Dr. Tanaka’s Ph.D. work. Its goal was to extend the basic framework of reinforcement learning to a continuous, lifelong learning scenario in which the agent (robot) is given multiple (possibly infinitely many) learning tasks one by one. The agent is expected to use its past learning experience to solve the current task effectively. The problem was formulated by introducing a distribution over MDPs (Markov Decision Processes), and a reinforcement learning algorithm was then derived for this setting.
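
As a rough illustration of the setting described above, the following minimal Python sketch draws tasks one by one from a distribution over MDPs and starts each new task from the mean of the Q-tables learned on earlier tasks. Everything in it is an assumption made for illustration: the five-state toy chain, the names `sample_mdp`, `q_learning`, `mean_q`, and `lifelong_learning`, the hyperparameters, and the choice of plain tabular Q-learning with mean initialization. It is not the algorithm derived in the original work.

```python
import random

N_STATES = 5          # states 0..4 on a line (hypothetical toy environment)
ACTIONS = (-1, +1)    # move left / move right


def sample_mdp(rng):
    """Draw one task from the assumed distribution over MDPs.

    All tasks share states and actions; only the goal state differs.
    """
    return {"goal": rng.choice([3, 4])}


def step(mdp, state, action):
    """Deterministic transition; reward 1 on reaching the goal, else 0."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == mdp["goal"]
    return next_state, (1.0 if done else 0.0), done


def q_learning(mdp, q_init, rng, episodes=200, alpha=0.5, gamma=0.9, eps=0.2):
    """Plain tabular Q-learning, started from q_init instead of zeros."""
    q = [row[:] for row in q_init]  # copy so past tables are not modified
    for _ in range(episodes):
        state = 0
        for _ in range(50):  # step limit per episode
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # greedy, ties broken at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(mdp, state, ACTIONS[a])
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def mean_q(tables):
    """Element-wise mean of past Q-tables: the experience carried over."""
    n = len(tables)
    return [[sum(t[s][a] for t in tables) / n for a in range(len(ACTIONS))]
            for s in range(N_STATES)]


def lifelong_learning(n_tasks=5, seed=0):
    """Solve tasks one by one, initializing each from past experience."""
    rng = random.Random(seed)
    past = []
    q_init = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(n_tasks):
        mdp = sample_mdp(rng)             # a new task arrives
        q = q_learning(mdp, q_init, rng)  # learn it, starting from q_init
        past.append(q)
        q_init = mean_q(past)             # prior for the next task
    return past, q_init
```

Calling `lifelong_learning()` returns the per-task Q-tables together with the averaged table that would seed the next task. Averaging is only one simple way to carry experience across tasks; it is used here because it keeps the sketch short.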

The original study was conducted about 20 years ago; however, the ideas of knowledge transfer and developmental learning remain important, and are indeed a required element for today's robots in general.