MacOS-RL 01. Getting Started
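It has been a long time since my last update, but I finally have some time to start posting again. This project is about learning reinforcement learning from scratch by building a simple simulation and training environment on top of PyTorch, Gymnasium, and MuJoCo, to be used for training robot controllers. The most common robot simulation and training environments today are Nvidia's IsaacGym and IsaacLab, but these essentially only run on Ubuntu 20.04/22.04 systems with an Nvidia GPU. They are very efficient, but not very portable. So, for the sake of tinkering + learning, I decided to build my simulation environment with PyTorch, Gymnasium, and MuJoCo while studying RL. Of course, this setup cannot run as many robots in parallel, so training will be less efficient, but that is a minor issue. Let's get started with the first piece of code.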
This time we look at the basic Gymnasium simulation loop:
import gymnasium as gym

# Initialise the environment
env = gym.make("LunarLander-v3", render_mode="human")

# Reset the environment to generate the first observation
observation, info = env.reset(seed=42)
for _ in range(1000):
    # this is where you would insert your policy
    action = env.action_space.sample()

    # step (transition) through the environment with the action
    # receiving the next observation, reward and if the episode has terminated or truncated
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended then we can reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()

env.close()
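
The loop above throws away the reward, so here is a minimal sketch of the same reset/step pattern wrapped in a function that also records the return of each finished episode. The names `rollout` and `CoinFlipEnv` are my own and hypothetical. `CoinFlipEnv` is only a toy stand-in that mimics the Gymnasium `reset`/`step` signatures, so the sketch runs without a display or Box2D installed. It is not part of Gymnasium.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that mimics the Gymnasium reset/step signatures."""

    def __init__(self, max_steps=10):
        self.max_steps = max_steps
        self.rng = random.Random()
        self.t = 0

    def reset(self, seed=None):
        # Re-seed only when a seed is given, as in the loop above
        if seed is not None:
            self.rng.seed(seed)
        self.t = 0
        return 0.0, {}  # observation, info

    def step(self, action):
        self.t += 1
        coin = self.rng.randint(0, 1)
        reward = 1.0 if action == coin else 0.0  # reward for guessing the coin
        terminated = False                       # this toy task has no terminal state
        truncated = self.t >= self.max_steps     # episode ends at the time limit
        return float(coin), reward, terminated, truncated, {}

    def close(self):
        pass


def rollout(env, policy, total_steps, seed=None):
    """Run `policy` for `total_steps` steps; return the returns of finished episodes."""
    returns = []
    episode_return = 0.0
    observation, info = env.reset(seed=seed)
    for _ in range(total_steps):
        action = policy(observation)
        observation, reward, terminated, truncated, info = env.step(action)
        episode_return += reward
        # When an episode ends, store its return and start a new episode
        if terminated or truncated:
            returns.append(episode_return)
            episode_return = 0.0
            observation, info = env.reset()
    env.close()
    return returns


if __name__ == "__main__":
    # A fixed "always guess 1" policy stands in for a trained one
    print(rollout(CoinFlipEnv(max_steps=10), lambda obs: 1, total_steps=35, seed=42))
```

Since `rollout` only relies on `reset`, `step`, and `close`, passing it an environment created with `gym.make(...)` and a policy that calls `env.action_space.sample()` should behave like the original loop, though I have only sketched it against the toy environment here. An episode still in progress when the step budget runs out is not counted.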