Chapter . Operating policy generation using a reinforcement learning agent in a melt facility

Doug Creighton and Saeid Nahavandi

Intelligent Systems Research Lab, Deakin University, Victoria 3217, Australia

Abstract

This study presents a methodology that allows a reinforcement learning (RL) agent to generate near-optimal policies for a melt facility. Applying the learning method to this industrial-scale, dynamic, stochastic problem poses a number of challenges. The process is formulated as a semi-Markov Decision Problem. A novel method for applying RL agents to continuous state and action spaces, based on mapping continuous to discrete state and action spaces, is developed. The agent successfully identified robust policies that improved on ...
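
As a rough illustration of what mapping continuous to discrete state and action spaces can look like, the sketch below uses plain equal-width binning. This is an assumption made for illustration only, not the mapping developed in the chapter; the function names, the temperature range, and the number of bins are all hypothetical.

```python
from bisect import bisect_right


def make_bins(low, high, n_bins):
    """Return the n_bins - 1 interior edges splitting [low, high] into equal-width bins."""
    width = (high - low) / n_bins
    return [low + i * width for i in range(1, n_bins)]


def to_discrete(value, edges):
    """Map a continuous value to a bin index in 0..len(edges).

    Values below the first edge fall in bin 0; values at or above the
    last edge fall in the final bin.
    """
    return bisect_right(edges, value)


def to_continuous(index, low, high, n_bins):
    """Map a discrete index back to the midpoint of its bin."""
    width = (high - low) / n_bins
    return low + (index + 0.5) * width


# Hypothetical state variable: a temperature in [600, 800] split into 4 bins.
temp_edges = make_bins(600.0, 800.0, 4)
state_index = to_discrete(712.5, temp_edges)

# Hypothetical action variable: a discrete choice mapped back to a set point.
set_point = to_continuous(state_index, 600.0, 800.0, 4)
```

With a mapping of this kind, a tabular agent can index its value estimates by `state_index`, and a chosen discrete action can be turned back into a continuous set point with `to_continuous`. Finer bins give better resolution at the cost of a larger table.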

From: Intelligent Production Machines and Systems - First I*PROMS Virtual Conference