Bloom's Filters: Their Types and Analysis

Bloom filtrelerini ve çeşitlerini inceleyen bir çalışmanın özetidir. Bloom filtresi sorgulama üyeliklerini desteklemek amacıyla setleri temsil eden rasgele bir veri yapısıdır. 1970'lerde daha çok veri tabanı optimizasyonlarında kullanılmıştır. Bu yakınlarda bilgisayar ağları ile ilgili çalışma yapanlar daha sık kullanmaya başlamıştır. Bu çalışmada filtrelerin çeşitleri analiz edilecektir.

Bloom Filtreleri: Çeşitleri ve Analizi

In this paper we discuss Bloom filter in its original form and the varieties of its extensions. A Bloom filter is a randomized data-structure for concisely representing a set in order to support approximate membership queries. Although it was devised in 1970 for the purpose of spell checking, it was seldom used except in database optimization. In recent years, it has been rediscovered by the networking community, and has become a key component in many networking systems applications. In this paper, we will examine and analyse the different types of this filter.

___

  • BLOOM, B. (1970). Space/time tradeoffs in hash coding with allowable errors. Communications of the ACM, 13(7).
  • BRODERY, A. & MITZENMACHERZ, M. (2002). Network applications of Bloom Filters : a survey. Proceedings of 40th Annual Allerton Conference. Also available at _____http://www.eecs.harvard.edu/_michaelm>
  • BYERS, J., CONSIDINE, J., MITZENMACHER, M., & ROST, S. (2002). Informed content delivery across adaptive overly networks. Proceedings of ACM SIGCOMM 2002, pp. 47-60.
  • CARTER, L., & WEGMAN, M. (1979). Universal classes of hash functions. Journal of Computer and System Sciences, pp. 143-154. CS223 Final Project Report.
  • CHANG, F., FENG, W. & LI, K. (2004). Approximate caches for packet classification NSF Grant EIA-0130344, IEEE INFOCOM 2004.
  • CZERWINSKI, S., ZHAO, B.Y., HODES, T., JOSEPH, A.D., & KATZ, R. (1999). An architecture for a secure service discovery service. Proceedings of MobiCom- 99, pp. 24-35.
  • DHARMAPURIKAR, S., KRISHNAMURTHY, P., & TAYLOR, D. (2003). Longest prefix matching using Bloom Filters. Proceedings of the ACM SIGCOMM, pp. 201-212.
  • FAN, L., P. CAO, J. ALMEIDA, & BRODER, A. Z. (2000). Summary cache : a scalable wide-area web cache sharing protocol. IEEE/ACM Transactions on Networking, 8(3), pp. 281-293.
  • FELDMANN, A., & MUTHUKRISHNAN, S. (2000). Tradeoffs for packet classification, IEEE INFOCOM, 2000.
  • FENG, W., KANDLUR, D., SAHA D., SHIN, K., BLUE (1999). A new class of active queue management algorithms, U. Michigan CSE-TR-387-99.
  • FENG, W. & KANDLUR, D.D. (2001). Stochastic fair blue : a queue management algorithm for enforcing fairness. Proceedings of IEEE INFOCOM 2001.
  • HSIAO, P. & HUANG, J. (2001). Geographical region summary service for wireless routing. MobiHoc 2001, Long Beach, CA, USA, pp. 263-266. Also available at _____http://www.eecs.harvard.edu/~shawn/papers/courses/cs223_ final .pdf> as CS223 Final Project Report.
  • HSIAO, P (2001). Geographical region summary service for geographical routing. Mobile Computing and Communications Review, 5(4), pp. 25-39.
  • IAMNITCHI, A., RIPEANU, M. & FOSTER, I. (2002). Locating data in (Small-World?) peer-to-peer scientific collaborations. 1st International Workshop on Peer-to-Peer Systems IPTPS 2002. Also available at http://arxiv.org/abs/cs/0209031.
  • KOLONIARI, G. & PITOURA, E. (2003). Bloom-based filters for hierarchical data. 5th Workshop on Distributed Data Structures and Algorithms (WDAS'03).
  • -----. (2004). Filters for XML-based service discovery in pervasive computing. The Computer Journal, 47(4). Oxford University Press for the BCS.
  • KUMAR, A. & LI, L. (2003). Space-code Bloom Filter for efficient traffic flow measurement. IMC'03 2003, Miami Beach, Florida, USA. Also available at: http://www.imconf.net/imc-2003/papers/p312-kumarl .pdf.
  • LI, J. & JANNOTTI, J. & DE COUTO, D & KARGER, D. & MORRIS, R. (2000). A scalable location service for geographic ad-hoc routing. Proceedings ofMobiCom 2000, pp. 120-130.
  • MARAIS, H. & BHARAT, K. (1997). Supporting cooperative and personal surfing with a desktop assistant. In ACM Symposium, on User Interface Software and Technology, pp. 129-138
  • MITZENMACHER, M. (2001). Compressed Bloom Filters. Proceedings of the 20"' ACM SIGACT Symposium on Principles of Distributed Computing (PODC 2001).
  • ----- . (2001). Compressed bloom filters. Proceedings of the 20th ACM SIGACT- SIGOPS Symposium on Principles of Distributed Computing, pp. 144-150.
  • MULLIN, J.K. (1983). A Second look at Bloom Filters. Communications of the ACM, 26(8).
  • ----- . (1990). Optimal semijoins for distributed database systems. IEEE Transactions on Software Engineering, 16(5).
  • Passive measurement and analysis project, National Laboratory for Applied Network Research (NLANR). Available at http://pma.nlanr.net/Traces/Traces/
  • REYNOLDS, P. AND VAHDAT, A. (2003). Efficient Peer-to-peer keyword searching. Middleware 2003, Rio de Janeiro Brazil, pp. 21-40. Also available at http://issg.cs.duke.edu/search/.
  • RIVEST, R. (1991). THE MD5 Message-Digest Algorithm. RFC1321.
  • ROUSSKOV, A & WESSELS, D (1998). Cache digests. Computer Networks and ISDN Systems, 30(22-23), pp. 2155-2168.
  • SANCHEZ, L., MILLIKEN,W., SNOEREN, A.,TCHAKOUNTIO, F., JONES, CKENT, S.,PARTRIDGE,C, & STRAYER, W. (2001). Hardware support for a hash-based IP traceback. Proceedings of the 2nd D ARPA Information Survivability Conference and Exposition.
  • SNOEREN, A. C, PARTRIDGE, L. A. SANCHEZ, C. E. JONES, F. TCHAKOUNTIO, S. T. KENT, & STRAYER, W. T. (2001). Hash-Based IP traceback. Proceedings of the ACM SIGCOMM 2001 Conference (SIGCOMM- 01), volume 31:4 of Computer Communication Review, pp. 3-14.