Abstract: When man-made or natural disasters strike, the rapid deployment of search and rescue (SAR) robots is crucial for saving lives. To accomplish rescue tasks, SAR robots must autonomously plan paths through continuously changing, unknown environments to reach rescue target locations. This paper proposes a sensor configuration scheme for SAR robots and applies Q-learning, implemented with both Q-tables and neural networks, to achieve autonomous control. It addresses the challenge of path planning in unknown environments, specifically the avoidance of both static and dynamic obstacles. Balancing exploration and exploitation during training is a central challenge in reinforcement learning; this paper introduces a mixed optimization method that dynamically selects the search strategy, building upon greedy search and Boltzmann search. Simulations conducted in MATLAB indicate that the proposed method is feasible and effective: SAR robots equipped with this sensor configuration respond to environmental changes and reach target locations while avoiding both static and dynamic obstacles.
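The mixed exploration strategy described above can be illustrated with a minimal sketch. The exact blending rule the paper uses is not given in the abstract, so the following is a hypothetical combination: with probability 1 - epsilon the agent exploits greedily, otherwise it explores by sampling from a Boltzmann (softmax) distribution over the Q-values; the `epsilon` and `temperature` parameters are assumptions for illustration.

```python
import math
import random

def boltzmann_action(q_values, temperature):
    """Sample an action index from the Boltzmann (softmax) distribution over Q-values."""
    max_q = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(weights)
    probs = [w / total for w in weights]
    r = random.random()
    cum = 0.0
    for action, p in enumerate(probs):
        cum += p
        if r < cum:
            return action
    return len(q_values) - 1  # guard against floating-point round-off

def mixed_select(q_values, epsilon, temperature):
    """Hypothetical mixed strategy: greedy exploitation with probability
    1 - epsilon, Boltzmann exploration otherwise."""
    if random.random() > epsilon:
        return max(range(len(q_values)), key=lambda a: q_values[a])
    return boltzmann_action(q_values, temperature)
```

In practice, `epsilon` and `temperature` would be annealed over training so the policy shifts from exploration toward exploitation, which is the dynamic selection the abstract alludes to.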