Reinforcement Learning using Computer Application and Artificial Intelligence

Shangding Gu

doi:10.36846/2349-7238-10.3.11

Commentary Article - (2022) Volume 10, Issue 3

Shangding Gu^*

Department of Computer Science, Technical University of Munich, Germany

^*Correspondence: Shangding Gu, Department of Computer Science, Technical University of Munich, Germany, Email:

Received: 02-May-2022, Manuscript No. IPACSES-22-13602; Editor assigned: 04-May-2022, Pre QC No. IPACSES-22-13602(PQ); Reviewed: 18-May-2022, QC No. IPACSES-22-13602; Revised: 23-May-2022, Manuscript No. IPACSES-22-13602(R); Published: 30-May-2022, DOI: 10.36846/2349-7238-10.3.11

Description

Over the past decades, reinforcement learning (RL) has been adopted in many fields, e.g. transportation scheduling, traffic signal control, energy management, wireless security, satellite docking, edge computing, chemical processes, video games, board games such as Go, shogi, and chess, the arcade game PAC-MAN, finance, autonomous driving, recommender systems, resource allocation, communication and networks, smart grids, video compression, and robotics. However, a challenging problem arises in this domain: how do we guarantee safety when we apply RL to real-world applications? After all, unacceptable catastrophes may arise if we fail to take safety into account when applying RL in real-world scenarios. For example, robots must not hurt humans when interacting with them in human-machine interaction environments; false or racially discriminatory information should not be recommended to people by recommender systems; and safety has to be ensured when self-driving cars carry out tasks in real-world environments. More specifically, we introduce several types of safety definition from different perspectives, which might be useful for safe RL research.

Safety definition

The first type of safety definition: according to the Oxford dictionary, the word “safety” is commonly interpreted to mean “the condition of being protected from or unlikely to cause danger, risk, or injury.” The second type of safety definition: according to Wikipedia, the state of being “safe” is defined as “being protected from harm or other dangers”; “controlling recognized dangers to attain an acceptable level of risk” is also referred to as “safety”. The third type of safety definition: according to Hans et al., humans need to label environmental states as “safe” or “unsafe,” and agents are considered “safe” if “they never reach unsafe states”.
The fifth type of safety definition: Moldovan and Abbeel consider an agent “safe” if “it meets an ergodicity requirement: it can reach each state it visits from any other state it visits, allowing for reversible errors”.

In this review, based on the various definitions above, we investigate safe RL methods that address optimising cost objectives, avoiding adversarial attacks, improving undesirable situations, reducing risk, and controlling agents to be safe. Safe reinforcement learning is often modelled as a Constrained Markov Decision Process (CMDP), in which we need to maximise the agent's reward while ensuring that the agent satisfies safety constraints. A substantial body of literature has studied CMDP problems in both the tabular and linear cases. However, deep safe RL for high-dimensional and continuous CMDP optimisation problems is a relatively new area that has emerged in recent years, in which neural networks are generally used to approximate the optimal values of safe states or actions. In this section, we illustrate the general deep safe RL problem formulation with respect to the objective functions of safe RL, and offer an introduction to safe RL surveys.
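
As a rough illustration of the CMDP idea described above (maximise the expected discounted reward J_r subject to an expected discounted cost J_c staying within a budget d), the following minimal sketch may help. It is not taken from the article: the single-state environment, the action names "fast" and "slow", the reward and cost numbers, the discount factor, and the helper names `discounted_values` and `best_feasible_policy` are all assumptions chosen for illustration, and a simple grid search stands in for the optimisation methods used in actual safe RL work.

```python
# Toy single-state CMDP. Every number and name here is a hypothetical
# assumption for illustration, not something defined in the article.
GAMMA = 0.9                            # discount factor
REWARD = {"fast": 1.0, "slow": 0.5}    # per-step reward r(a)
COST = {"fast": 1.0, "slow": 0.0}      # per-step safety cost c(a)


def discounted_values(p_fast):
    """Return (J_r, J_c) for a policy choosing "fast" with probability p_fast."""
    r = p_fast * REWARD["fast"] + (1.0 - p_fast) * REWARD["slow"]
    c = p_fast * COST["fast"] + (1.0 - p_fast) * COST["slow"]
    horizon = 1.0 / (1.0 - GAMMA)      # closed form of sum over t of GAMMA**t
    return r * horizon, c * horizon


def best_feasible_policy(cost_budget, grid=101):
    """Grid-search p_fast: maximise J_r subject to J_c <= cost_budget.

    Returns (p_fast, J_r, J_c), or None if no grid point is feasible.
    """
    best = None
    for i in range(grid):
        p = i / (grid - 1)
        j_r, j_c = discounted_values(p)
        feasible = j_c <= cost_budget + 1e-9   # small tolerance for rounding
        if feasible and (best is None or j_r > best[1]):
            best = (p, j_r, j_c)
    return best


if __name__ == "__main__":
    for budget in (0.0, 2.0, 100.0):
        print(budget, best_feasible_policy(budget))
```

In this toy setting, a tighter cost budget forces the policy to choose the costly "fast" action less often, so the achievable reward drops; a very loose budget recovers the unconstrained reward-maximising policy. Deep safe RL methods face the same trade-off, but with neural-network policies and without access to closed-form values.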

Acknowledgement

None.

Conflict of Interest

The author declares there is no conflict of interest in publishing this article.

Citation: Shangding G. (2022) Reinforcement Learning using Computer Application and Artificial Intelligence. Am J Comp Science Eng Surv. 10:11.

Copyright: © Shangding G. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.