APPROXIMATE DYNAMIC PROGRAMMING FOR OPTIMAL SEARCH WITH AN OBSTACLE

In this paper, we study a class of optimal search problems where the search region includes a target and an obstacle, each of which has some shape. The search region is divided into small grid cells and the searcher examines one of those cells at each time period with the objective of finding the target with minimum expected cost. The searcher may either take an action that is quick but risky, or another one that is slow but safe, and incurs different cost for these actions. We formulate these problems as Markov Decision Processes (MDPs), but because of the intractability of the state space, we approximately solve the MDPs using an Approximate Dynamic Programming (ADP) technique and compare its performance against heuristic decision rules. Our numerical experiments reveal that the ADP technique outperforms heuristics on majority of problem instances. Specifically, the ADP technique performs better than the best heuristic policy in 17 out of 24 problem sets. The percent improvement in those 17 problem sets is on average 7.3%.

Engelli Optimal Arama için Yaklaşımsal Dinamik Programlama

Bu makalede her biri bir şekle sahip olan, arama bölgesi hedef ve engel içeren optimal arama problemlerini çalışmaktayız. Arama bölgesi küçük grid hücrelerine bölünmüştür ve araştırmacı her bir zaman peryodunda minimum maliyetle hedefi bulma amacıyla bu hücrelerden birini inceler. Araştırmacı hızlı ama riskli eylemi veya yavaş fakat güvenilir olanı seçebilir ve bu eylemler için farklı ücret öder. Bu problemleri Markov Karar Süreçleri (MKS) ile formüle etmekteyiz, fakat durum uzayının çetinliğinden dolayı MKS’leri bir Yaklaşımsal Dinamik Programlama (YDP) tekniği kullanarak yaklaşık olarak çözmekteyiz ve onun performansını sezgisel karar kurallarıyla karşılaştırmaktayız. Nümerik deneylerimiz problem örneklerinin çoğunda YDP tekniğinin sezgisel yöntemlerden üstün olduğunu ortaya çıkarmıştır. Spesifik olarak YDP tekniği 24 problem kümesinden 17’sinde en iyi sezgisel yöntemden daha iyi sonuç vermektedir. Bu 17 problem kümesindeki yüzde gelişme ortalama %7,3’tür.

___

Adelman D. “Price-directed replenishment of subsets: methodology and its application to inventory routing”. Manufacturing and Service Operations Management. 5, 4, 348-371, 2003.

Adelman D. “A price-directed approach to stochastic inventory routing”. Operations Research. 52, 4, 499-514, 2004.

Benkoski S J. Monticino, M. G., Weisinger, J. R.. “A survey of the search theory literature”. Naval Research Logistics. 38, 469-494, 1991.

Bertsekas D, Tsitsiklis J. “Neuro-Dynamic Programming”, Athena Scientific, 1996.

Botea A, Baier J, Harabor D, Hernandez C. “Moving target search with compressed path databases”. In Proceedings of ICAPS-13, 2013.

Chang HS, Fu MC, Hu J, Marcus SI. “Simulation-based algorithms for Markov Decision Processes”, Springer, 2007.

Chung TH, Burdick JW. “A decision-making framework for control strategies in probabilistic search”. Proceedings of IEEE International Conference on Robotics and Automation, 2007.

Chung TH. “On probabilistic search decisions under searcher motion constraints”. Algorithmic Foundations of Robotics, 2010.

De Farias DP, Roy BV. “The linear programming approach to Approximate Dynamic Programming”. Operations Research. 51, 850-865, 2003.

Dell RF, Eagle JN. Martins, G. H. A., Santos, A. G. “Using multiple searchers in constrained-path, moving-target search problems”. Naval Research Logistics. 43, 463-480, 1996.

Derr K, Manic, M. “Multi-robot, multi-target particle swarm optimization search in noisy wireless environments”. Proceedings of the 2nd conference on human system interactions, Catania, Italy, 2009.

Dobbie JM. “A survey of search theory”. Operations Research. 16(3), 527–537, 1968.

Eagle JN, Yee JR. “An optimal branch-and-bound procedure for the constrained path, moving target search problem”, Operations Research. 38, 110-114, 1990.

El-Hadidy, M. A. A., El-Bagoury, A. A. H., “Optimal search strategy for a three-dimensional randomly located target”, International Journal of Operational Research. 29, 2015.

El-Hadidy, M. A. A. “Fuzzy Optimal Search Plan for N-Dimensional Randomly Moving Target”, International Journal of Computational Methods. 13, 2016.

Gabal, H. M. A., El-Hadidy, M. A. A., “Optimal searching for a randomly located target in a bounded known region”, International Journal of Computing Science and Mathematics. 6, 2015.

Haley WB, Stone LD. “Search Theory and Applications”, Plenum Press, New York, 1980.

Kulich M, Preucil L, Jose J, Bront M. “Single robot search for a stationary object in an unknown environment”, IEEE International Conference on Robotics Automation, 2014.

Lossner U, Wegener I. “Discrete sequential search with positive switch cost”. Mathematics of Operations Research. 7(3), 426–440, 1982.

Maxwell MS, Henderson SG, and Topaloglu H. “Tuning Approximate Dynamic Programming Policies for Ambulance Redeployment via Direct Search”. Stochastic Systems. 3, 1-40, 2013.

Najemnik J, Geisler WS. “Eye movement statistics in humans are consistent with an optimal search strategy”, J. Vis. 8, 1-14, 2008.

Nitinawarat, S., Veeravalli, V. V., “Universal scheme for optimal search and stop”, Bernoulli. 23, 1759- 1783, 2017.

Powell WB. “Approximate Dynamic Programming: Solving the Curses of Dimensionality”, Wiley, 2007.

Ross SM. “A problem in optimal search and stop”. Operations Research. 17, 984–992, 1969.

Shechter SM, Ghassemi F, Gocgun Y, Puterman ML. “Trading off quick versus slow actions in optimal search”. Operations Research. 63, 353-362, 2015.

Singh S., Krishnamurthy, V. “The optimal search for a markovian target when the search path is constrained: the infinite-horizon case”. IEEE Transactions on Automatic Control, 2003.

Snider J. “Optimal random search for a single hidden target”. Physical Review. 83, 011105, 2011.

Stone LD. “Theory of Optimal Search”, Academic Press, 1975.

Suman B, Kumar B. “A survey of simulated annealing as a tool for single and multiobjective optimization”. Journal of the Operational Research Society. 57, 1143–1160, 2006.

Sutton RS, Barto AG. “Reinforcement Learning : an introduction”, MIT Press, 1998.

Washburn A.” Search and Detection”, Fourth Edition, Institute for Operations Research and the Management Sciences, 2002.

Wegener I. “The discrete sequential search problem with nonrandom cost and overlook probabilities”. Mathematics of Operations Research. 5, 373–380, 1980.

Wettergren TA. “Performance of search via track-before-detect for distributed sensor networks”, IEEE Transactions on Aerospace and Electronic Systems, 44, 2008.

Zhao Y, Patek SD, Beling PA. “Decentralized Bayesian Search Using Approximate Dynamic Programming Methods”. IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 38, 2008.