Agents for Integrating Distributed Data for Complex Computations

Algorithms for many complex computations assume that all the relevant data are available on a single node of a computer network. In the emerging distributed and networked knowledge environments, the databases relevant to a computation may reside on several nodes connected by a communication network. These data resources cannot be moved to other network sites because of privacy, security, and size considerations. The desired global computation must therefore be decomposed into local computations that match the distribution of data across the network. This decomposition capability must be general enough to handle different distributions of data and different sets of participating nodes in each instance of the global computation. In this paper, we present a methodology in which each distributed data source is represented by an agent. Each agent can decompose a global computation into local parts, for itself and for agents at other sites. The agent then performs the global computation either by exchanging minimal summaries with other agents or by travelling to each site and performing the tasks that can be done locally there. The objective is to perform global tasks with a minimum of communication or travel by the participating agents across the network.
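
As an illustration of the summary-exchange idea, the following minimal sketch computes a global mean and variance over data held at several sites without moving any raw records. It is not the methodology of the paper itself: it assumes horizontally partitioned numeric data, and the names `SiteAgent`, `Summary`, and `global_mean_variance` are hypothetical, introduced only for this example. Each agent performs the local part of the computation and shares three numbers; the global result is assembled from those summaries alone.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Summary:
    """Minimal summary a site shares instead of its raw records (assumed form)."""
    count: int
    total: float
    total_sq: float


class SiteAgent:
    """Hypothetical agent representing one local data source."""

    def __init__(self, values: Iterable[float]) -> None:
        self._values = list(values)  # raw data stays at this site

    def local_summary(self) -> Summary:
        # Local part of the decomposed global computation.
        return Summary(
            count=len(self._values),
            total=sum(self._values),
            total_sq=sum(v * v for v in self._values),
        )


def global_mean_variance(summaries: List[Summary]) -> Tuple[float, float]:
    """Combine per-site summaries into the global mean and population variance."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no data at any site")
    mean = sum(s.total for s in summaries) / n
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return mean, variance


# Two sites; only the three-number summaries cross the network.
agents = [SiteAgent([1.0, 2.0, 3.0]), SiteAgent([4.0, 5.0])]
mean, variance = global_mean_variance([a.local_summary() for a in agents])
```

The communication cost in this sketch is three numbers per site, independent of how many records each site holds, which is the kind of saving the stated objective aims for. Computations that do not reduce to such additive summaries would need a different decomposition.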
