Chapter . Operating policy generation using a reinforcement learning agent in a melt facility

Doug Creighton and Saeid Nahavandi

Intelligent Systems Research Lab, Deakin University, Victoria 3217, Australia


This study presents a methodology that allows a reinforcement learning (RL) agent to generate near-optimal policies for a melt facility. Applying the learning method to this industrial-scale, dynamic, stochastic problem poses a number of challenges. The process is formulated as a semi-Markov decision problem. A novel method for applying RL agents to continuous state and action spaces, based on mapping the continuous spaces to discrete ones, is developed. The agent successfully identified robust policies that improved on ...
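As a rough illustration of the two ideas named in the abstract, the sketch below bins a continuous observation into a discrete state index and applies one tabular Q-learning step for a semi-Markov decision problem, where the discount depends on the time the transition took. It is a generic textbook-style sketch, not the authors' method: the function names, the bin edges, the learning rate `alpha`, the discount rate `beta`, and the lump-sum reward convention are all assumptions made for the example.

```python
import math
from bisect import bisect_right
from collections import defaultdict


def discretize(value, edges):
    """Map a continuous value to a bin index, given sorted interior bin edges.

    With k edges there are k + 1 bins, indexed 0..k (hypothetical helper).
    """
    return bisect_right(edges, value)


def smdp_q_update(q, state, action, reward, tau, next_state, actions,
                  alpha=0.1, beta=0.05):
    """Apply one discounted Q-learning step for a semi-Markov decision problem.

    `reward` is assumed to be a lump sum earned over the transition and
    `tau` the (variable) time the transition took. The next-state value is
    discounted by exp(-beta * tau) rather than by a fixed per-step factor.
    """
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + math.exp(-beta * tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical usage: a temperature reading becomes a discrete state index,
# and a discrete action index selects one of a few continuous set-points.
temperature_edges = [1400.0, 1500.0, 1600.0]  # assumed bin edges
power_levels = [0.0, 0.5, 1.0]                # assumed continuous actions
actions = list(range(len(power_levels)))

q_table = defaultdict(float)
s = discretize(1450.0, temperature_edges)
s_next = discretize(1520.0, temperature_edges)
smdp_q_update(q_table, s, 1, reward=1.0, tau=2.0,
              next_state=s_next, actions=actions)
```

A coarser or finer set of bin edges trades table size against resolution; the choice here is arbitrary and would need tuning for any real process.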

In: Intelligent Production Machines and Systems - First I*PROMS Virtual Conference
