MacOS-RL 01. Getting Started

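It has been a long while since my last update, and I finally have some time to pick things up again. The goal of this project is to learn reinforcement learning from scratch by building a simple simulation and training environment on top of PyTorch, Gymnasium, and MuJoCo, to be used for training robot controllers. The most common robot simulation and training environments today are Nvidia's IsaacGym and IsaacLab, but both essentially only run on Ubuntu 20.04/22.04 machines with an Nvidia GPU. They are very efficient, but not very portable. So, partly to tinker and partly to learn, I decided to put together a simulation environment with PyTorch, Gymnasium, and MuJoCo while studying RL. Of course, this setup cannot run as many robots in parallel, so training will be less efficient, but that is a minor issue. Let's start with the first piece of code.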

This first post covers the basic Gymnasium simulation loop:

import gymnasium as gym

# Initialise the environment
# (LunarLander requires the Box2D extra, e.g. pip install "gymnasium[box2d]")
env = gym.make("LunarLander-v3", render_mode="human")

# Reset the environment to generate the first observation
observation, info = env.reset(seed=42)

for _ in range(1000):
    # This is where you would insert your policy;
    # for now we sample a random action
    action = env.action_space.sample()

    # Step (transition) through the environment with the action,
    # receiving the next observation, the reward, and whether the
    # episode has terminated or been truncated
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended, reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()

# Release the renderer and other resources
env.close()
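
The loop above throws the reward away, so here is a small sketch of how the same reset/step pattern could be used to record the return of each episode. Everything named here is my own illustration: `CountingEnv` is a hypothetical stand-in that only mimics the Gymnasium `reset`/`step` signatures so the snippet runs without a simulator, and `collect_returns` is a helper name I made up, not something from Gymnasium.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class CountingEnv:
    """Hypothetical toy environment (NOT part of Gymnasium).

    It only copies the reset/step signatures. The agent starts at position 0,
    action 1 moves one step forward, action 0 stays in place.
    """

    def __init__(self, goal: int = 3, max_steps: int = 5) -> None:
        self.goal = goal
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed: Optional[int] = None) -> Tuple[int, Dict[str, Any]]:
        # The seed is accepted for API compatibility; this toy is deterministic.
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action: int) -> Tuple[int, float, bool, bool, Dict[str, Any]]:
        self.steps += 1
        self.position += action
        # terminated: the task itself ended (goal reached)
        terminated = self.position >= self.goal
        # truncated: the episode was cut off by the step limit
        truncated = (not terminated) and self.steps >= self.max_steps
        reward = -1.0  # constant cost per step
        return self.position, reward, terminated, truncated, {}


def collect_returns(
    env: Any,
    policy: Callable[[Any], Any],
    num_steps: int,
    seed: Optional[int] = None,
) -> List[float]:
    """Run `num_steps` environment steps and return the sum of rewards
    of every episode that finished within those steps."""
    returns: List[float] = []
    episode_return = 0.0
    observation, info = env.reset(seed=seed)
    for _ in range(num_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
        if terminated or truncated:
            returns.append(episode_return)
            episode_return = 0.0
            observation, info = env.reset()
    return returns


if __name__ == "__main__":
    toy_env = CountingEnv(goal=3, max_steps=5)
    # Always moving forward reaches the goal in 3 steps per episode
    print(collect_returns(toy_env, lambda obs: 1, num_steps=9, seed=42))
```

With a real Gymnasium environment the same helper should work unchanged, for example by passing `lambda obs: env.action_space.sample()` as the policy to reproduce the random agent from the loop above.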