2023, No. 12, Vol. 50, pp. 74-80
PPO Algorithm-Based Control and Simulation of Robotic Peg-in-Hole Assembly
Email: williamydp@scu.edu.cn
Abstract:

A control method based on the proximal policy optimization (PPO) algorithm is proposed for large-diameter peg-in-hole assembly, a task common in pipeline transportation and aerospace. First, an interactive training framework between the reinforcement learning algorithm and the assembly environment is established, with two networks designed to fit the assembly policy and to evaluate the value function, respectively. Second, the action space output by the robot and the state space output by the assembly environment are designed to ensure effective exploration during learning. Third, a nonlinear reward function is designed to ensure fast and stable convergence of training. Finally, a simulation platform for robotic large-diameter peg-in-hole assembly is built on the MuJoCo physics engine, and the proposed algorithm is trained and tested on it. The results show that the PPO-based training framework converges quickly and that the improved advantage function estimation method makes training more stable. The trained model ensures not only that the peg is inserted into the hole and the flange faces mate, but also that the assembly process is safe.
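The abstract names three ingredients without giving their formulas: an advantage estimate, the PPO policy update, and a nonlinear reward. The sketch below shows textbook forms of the first two (generalized advantage estimation and the clipped surrogate objective) and one hypothetical reward shape. It is an illustration under stated assumptions, not the authors' implementation: the paper's improved advantage estimator and its actual reward function are not described on this page, and the function names, the exponential reward, and the default values of `force_limit` and `scale` are all assumptions.

```python
import math


def gae_advantages(rewards, values, last_value, gamma=0.99, lam=0.95):
    """Standard generalized advantage estimation (not the paper's improved variant).

    rewards[t] and values[t] belong to step t; last_value is the critic's
    estimate for the state that follows the final step.
    """
    advantages = [0.0] * len(rewards)
    next_value = last_value
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step temporal-difference error at step t.
        delta = rewards[t] + gamma * next_value - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
        next_value = values[t]
    return advantages


def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios[i] is pi_new(a|s) / pi_old(a|s) for sample i.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - clip_eps, min(r, 1.0 + clip_eps))
        total += min(r * a, clipped * a)
    return total / len(ratios)


def shaped_reward(distance, contact_force, force_limit=50.0, scale=10.0):
    """Hypothetical nonlinear reward, assumed for illustration only.

    The progress term grows exponentially as the remaining insertion
    distance shrinks; a fixed penalty applies when the contact force
    exceeds force_limit.
    """
    progress = math.exp(-scale * distance)  # equals 1.0 when distance is 0
    penalty = 1.0 if contact_force > force_limit else 0.0
    return progress - penalty
```

In an actor-critic setup like the one the abstract describes, the critic network would supply `values`, the actor network would supply the probability ratios, and a simulated environment would supply `distance` and `contact_force` at each step.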

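Basic information:

CLC number: TG95; TP18; TP242

Citation:

[1] Shen Yuxin, Liu Xiaoming, Xiao Yi, et al. PPO Algorithm-Based Control and Simulation of Robotic Peg-in-Hole Assembly [J]. 机械 (Machinery), 2023, 50(12): 74-80.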
