Let's Train an AI to Play Breakout: Applied Reinforcement Learning with Advantage Actor-Critic (A2C)

Rating 4.0 out of 5 (18 ratings on Udemy)

What you'll learn

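How the Actor-Critic dual network works (the AlphaZero network)
Synchronous distributed-processing algorithm
How to train an agent to play Breakout with A2C
Applications of reinforcement learning
Policy gradient methods and more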

Description

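This course shows how to use the reinforcement learning algorithm Advantage Actor-Critic (A2C) to make an agent learn the game Breakout automatically, starting from zero experience. The Advantage Actor-Critic network model is the same kind used in AlphaGo Zero, and it learns the policy and the value at the same time. In addition, a technique called "synchronous processing" makes efficient use of the GPU, so training that used to take several days now finishes in a few hours. The network-model portion is the foundation for the follow-up AlphaGo Zero course, so be sure to take this course first.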
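To make "learning the policy and the value at the same time" concrete, here is a minimal sketch in plain Python of how an A2C update is commonly scored. It is not taken from the course: the function names, the coefficients `value_coef` and `entropy_coef`, and the discount `gamma` are illustrative assumptions. The advantage is the n-step return minus the critic's value estimate, and a single loss combines the actor term and the critic term.

```python
def discounted_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Compute n-step returns backwards from the critic's estimate of the last state.

    rewards, dones: per-step lists collected from one environment.
    bootstrap_value: critic's value estimate for the state after the last step.
    """
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # When an episode ended at this step, nothing after it counts.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def a2c_loss(log_probs, values, returns, entropies,
             value_coef=0.5, entropy_coef=0.01):
    """Combine actor and critic objectives into one scalar (assumed coefficients).

    log_probs: log-probability of each action actually taken (actor output).
    values:    critic's value estimate for each visited state.
    returns:   targets from discounted_returns().
    entropies: policy entropy at each step, used as an exploration bonus.
    """
    n = len(returns)
    # Advantage: how much better the outcome was than the critic expected.
    # In a real autodiff framework this is treated as a constant in the actor term.
    advantages = [ret - val for ret, val in zip(returns, values)]
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    value_loss = sum(adv ** 2 for adv in advantages) / n
    entropy = sum(entropies) / n
    return policy_loss + value_coef * value_loss - entropy_coef * entropy
```

In a synchronous setup, several copies of the game would be stepped in lockstep and their per-step data batched together before one such loss is computed, which is what lets a single GPU pass serve all of them.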

The manga Spot's Story also explains the flow of A2C training in an easy-to-follow way. If you get lost …

Duration 2 Hours 58 Minutes

Paid

Self paced

Intermediate Level

Japanese

131
