Artificial Consciousness for Improving Reinforcement Learning

Martin Nilsson

RWCP Novel Functions SICS Laboratory, POB 1263, S-128 Kista, Sweden

E-mail: mn@sics.se

Abstract: Reinforcement learning methods are useful for robot learning, but become slow when robots possess many degrees of freedom. We suggest equipping robots with fast on-board simulators, in order to accelerate learning. Such simulators will resemble forms of consciousness, enabling the robots to perform run-time trials in a simulated world, rather than tediously performing them in practice. We have applied this method to locomotion for a friction-propelled snake-like robot. The simulator on this robot uses an accurate non-linear model of isotropic friction that is fast enough to be executable in real time. Although our original goal was to propose a method for robot programming, the approach appears useful for reinforcement learning in a general context.

Keywords: reinforcement learning, conscious, snake-like, mobile, autonomous, robot, locomotion, friction, simulation.

1 Introduction

Programming autonomous robots is difficult for humans, especially when there are many degrees of freedom and the interaction between the environment and the degrees of freedom is complex. In order to control such a robot, some kind of learning or adaptation is necessary. One class of methods suitable for such applications is the Reinforcement Learning methods. However, these methods have the serious drawback of being slow for large state spaces, which makes them impractical for learning by physical trials.

2 Method

An alternative to learning by physical trials is to learn in a simulator, provided that trials can be performed faster in the simulator than in reality, and that the simulation reflects the real world accurately enough. If a robot is able to use such a simulator at run time, simulating the robot itself and its interaction with the environment, the simulator can be said to constitute a form of robot "consciousness."

The advantage of the consciousness model is that it can both accelerate reinforcement learning and reduce memory requirements. Learning is sped up, since trials can be performed without physical execution. Less memory is required, since tentative actions can be re-evaluated rather than tabulating the results of previous evaluations. If the simulator is adaptive, it may also be used for problem solving in a dynamic environment. It is important to note that the simulator doesn't have to be perfectly accurate: it is sufficient that it is accurate enough to search the local state space of interest. Sensor feedback can serve to correct simulation drift. However, it is very important that simulated trials are at least as fast as physical trials. Here, it seems that it is often advantageous to trade simulation accuracy for speed.
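The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the controller used on the robot: the one-dimensional world, the names `simulate`, `reward`, `plan`, `real_world` and `run`, the action set, the planning horizon and the 0.8 slip factor are all hypothetical. The sketch shows three points from the text: trials are run in an internal model instead of physically, candidate actions are re-evaluated on demand so that no table of results is stored, and the sensed state replaces the predicted one after each physical step.

```python
import itertools

def simulate(state, action):
    """Hypothetical internal model: a point on a line moved by the action."""
    return state + action

def reward(state, goal):
    """Hypothetical reward: negative distance to the goal."""
    return -abs(goal - state)

def plan(state, goal, actions=(-1.0, 0.0, 1.0), horizon=3):
    """Return the first action of the best simulated action sequence.

    Each candidate sequence is re-evaluated in the simulator when needed,
    so nothing is tabulated between calls.
    """
    best_action, best_value = None, float("-inf")
    for seq in itertools.product(actions, repeat=horizon):
        s, value = state, 0.0
        for a in seq:
            s = simulate(s, a)
            value += reward(s, goal)  # accumulate reward along the trial
        if value > best_value:
            best_action, best_value = seq[0], value
    return best_action

def real_world(state, action):
    """Stand-in for the physical robot: it slips, so the model is inexact."""
    return state + 0.8 * action

def run(start=0.0, goal=5.0, steps=10):
    """Alternate simulated planning with one physical step at a time."""
    state = start
    for _ in range(steps):
        action = plan(state, goal)
        # Sensor feedback: continue from the measured state, not the predicted one.
        state = real_world(state, action)
    return state
```

With these toy numbers, `run()` ends close to the goal even though the internal model overestimates every step, because each planning round restarts from the sensed position.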

3 Results

We have tested our model on a snake-like, creeping robot. This robot is composed of a number of straight links, connected by joints. The robot moves in a plane, and propels itself by changing the angles of the joints, using friction as the only means of locomotion. Even with only two joints and three links, this is a difficult programming problem for a human programmer, due to the non-linear nature of friction. Equipped with consciousness in the form of a rudimentary simulator, the robot is able to crawl efficiently in real time. Results of these experiments have been described in [1].
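To illustrate why friction makes the problem non-linear, the following Python sketch uses the textbook Coulomb model of isotropic sliding friction. This is an assumption made for illustration only: the friction model actually used on the robot is not given here, and the function name `coulomb_friction` and the default values of `mu` and `normal_force` are hypothetical.

```python
import math

def coulomb_friction(vx, vy, mu=0.5, normal_force=1.0):
    """Isotropic Coulomb friction force on a point sliding with velocity (vx, vy).

    The force opposes the direction of motion and has magnitude
    mu * normal_force regardless of speed, so it is not linear in velocity.
    """
    speed = math.hypot(vx, vy)
    if speed == 0.0:
        # At rest, this simplified sketch applies no sliding friction.
        return (0.0, 0.0)
    scale = -mu * normal_force / speed
    return (scale * vx, scale * vy)
```

Doubling the sliding velocity leaves the force from `coulomb_friction` unchanged, which is one way to see that a linear controller cannot simply be scaled to produce a desired motion.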

4 Conclusions and Discussion

This work is still at an early stage of development, but results so far seem to confirm that consciousness can be a useful concept for improving robot learning. Although our original goal was to propose a tentatively viable method for programming motion of snake-like robots and other hyper-redundant robots, the approach appears useful for learning agents in a more general context.

Our proposed idea of consciousness seems to agree fairly well with the intuitive idea of animate consciousness. Its potential use as a learning accelerator could perhaps also serve as an explanation of why consciousness developed during evolution.

References

[1] Nilsson, M. and Ojala, J.: Toward Conscious Robots: "Self-awareness" Speeds Learning. In Proc. Robotikdagarna 1995. Linköping University, Sweden. June 1995.
