k-anonymity based framework for privacy preserving data collection in wireless sensor networks

In this paper, k-anonymity notion is adopted to be used in wireless sensor networks (WSN) as a security framework with two levels of privacy. A base level of privacy is provided for the data shared with semi-trusted sink and a deeper level of privacy is provided against eavesdroppers. In the proposed method, some portions of data are encrypted and the rest is generalized. Generalization shortens the size of the data transmitted in the network causing energy saving at the cost of information loss. On the other hand, encryption provides anonymization with no information loss and without saving energy. Thus, there is a tradeoff between information loss and energy saving. In our system, this tradeoff is intelligently managed by a system parameter, which adjusts the amount of data portions to be encrypted. We use a method based on bottom up clustering that chooses the data portions to be encrypted among the ones that cause maximum information loss when generalized. In this way, a high degree of energy saving is realized within the given limits of information loss. Our analysis shows that the proposed method achieves the desired privacy levels with low information loss and with considerable energy saving.

k-anonymity based framework for privacy preserving data collection in wireless sensor networks

In this paper, k-anonymity notion is adopted to be used in wireless sensor networks (WSN) as a security framework with two levels of privacy. A base level of privacy is provided for the data shared with semi-trusted sink and a deeper level of privacy is provided against eavesdroppers. In the proposed method, some portions of data are encrypted and the rest is generalized. Generalization shortens the size of the data transmitted in the network causing energy saving at the cost of information loss. On the other hand, encryption provides anonymization with no information loss and without saving energy. Thus, there is a tradeoff between information loss and energy saving. In our system, this tradeoff is intelligently managed by a system parameter, which adjusts the amount of data portions to be encrypted. We use a method based on bottom up clustering that chooses the data portions to be encrypted among the ones that cause maximum information loss when generalized. In this way, a high degree of energy saving is realized within the given limits of information loss. Our analysis shows that the proposed method achieves the desired privacy levels with low information loss and with considerable energy saving.

___

  • A.PŞtzmann, M. Khntopp. Anonymity, unobservability, and pseudonymity - A proposal for terminology. in H. Federrath, editor, DIAU’00, Lecture Notes in Computer Science 2009/2001: 1-9, 2000.
  • D. Chaum. The dining cryptographers problem: Unconditional sender and receipent untraceability. in Journal of Cryptology, 1(1):65-75, 1988.
  • D. Chaum. Untraceable electronic mail, return addresses, and digital pseudonyms. in Communications of the Associations for Computing Machinery, 24 (2):84-88, 1981.
  • A. PŞtzmann, B. PŞtzmann, M. Waidner. ISDN-Mixes: Untraceable communication with very small bandwith overhead. in GI/ITG Conference: Communication in Distributed Systems, Mannheim, pp.451-463 Feb 1991.
  • C. Gulcu, G. Tsudik. Mixing e-mail with BABEL. in Symposium on Network and Distributed Systems Security (NDDS ’96), San Diego,California, 1996.
  • M. K. Reiter, A.D. Rubin. Anonymous web transactions with crowds. in Communications of the ACM, 42(2):32-48, Feb 1999.
  • M. G. Reed, P.F. Syverson, D.M. Goldschlag. Anonymous connections and onion routing. in IEEE Journal on Selected Areas in Communications,16 (4): 482-494, May 1998.
  • J. Kong, X. Hong, M. Gerla. An anonymous on demand routing protocol with untraceable routes for mobile ad hoc networks. in Proceedings of the 4th ACM International Symposium on Mobile Ad Hoc Networking & Computing, 291-302, 2003.
  • P. Samarati, L. Sweeney. Generalizing data to provide anonymity when disclosing information. in Proc. of the 17th ACM SIGMOD-SIGACT-SIGART Symposium on the Principles of Database Systems, 1998.
  • B. C. M. Fung, K. Wang, R. Chen, P. S. Yu. Privacy-preserving data publishing: A survey on recent developments. in ACM Computing Surveys, 2009.
  • A. Meyerson, R. Williams. On the complexity of optimal k -anonymity. in Proc. of the 23rd ACM SIGMOD-SIGACT- SIGART Symposium on the Principles of Database Systems, June 2004.
  • J. Yick, B. Mukherjee, D. Ghosal. Wireless sensor network survey. in Computer Networks, Vol:52 No: 12 page: 2292-2330, 2008.
  • D. W. Carman, P.S. Kruus, B. J. Matt. Constraints and approaches for distributed sensor network security. NAI Laboratories, Tech. Rep. 00-010, 2000.
  • C. D. Michener, R. R. Sokal. A quantitative approach to a problem in classiŞcation. in Evolution, 11:130-162, 1957. [15] H. Chan, A. Perrig, D. Song. Random key predistribution schemes for sensor networks. in Proceedings of 2003 Symposium on Security and Privacy, 2003.
  • W. Du, J. Deng, Y. S. Han, P. K. Varshney, J. Katz, A. Khalili. A pairwise key predistribution scheme for wireless sensor networks. in ACM Transactions on Information and System Security (TISSEC), 8(2):228-258 May 2005.
  • W. Heinzelman, J. Kulik, H. Balakrishnan. Adaptive protocols for information dissemination in wireless sensor networks. in Proc. of the 5th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom’99), 1999.
  • C. Intanagonwiwat, R. Govindan, D. Estrin. Directed diffusion: A scalable and robust communication paradigm for sensor networks. in the Proc. of the 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom’00), 2000.
  • B. C. M. Fung, K. Wang, P. S. Yu. Top-down specialization for information and privacy preservation. in Proc. of the 21st Int’l Conf. on Data Engineering, April 2005.
  • L. Sweeney. Achieving k -anonymity privacy protection using generalization and suppression. Int. J. Uncertain. Fuzziness Knowledge-Based Systems, 10(5):571- 588, 2002.
  • K. Wang, P. Yu, S. Chakraborty. Bottom-up generalization: A data mining solution to privacy protection. in Proc. of the 4th IEEE International Conference on Data Mining, November 2004.
  • P. Andritsos, V. Tzerpos. Software clustering based on information loss minimization. in Proc. of 10th Working Conference on Reverse Engineering(WCRE’03), page:334, 2003.
  • G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigraphy, D. Thomas, A. Zhu. k -anonymity: Algorithms and hardness. Technical Report, Stanford University, 2004.
  • M. Gruteser, D. Grunwald. Anonymous usage of location-based services through spatial and temporal cloaking. in First International Conference on Mobile Systems, Applications, Services (MobiSYS), USENIX, 2003.
  • M. Gruteser, G. Schelle, A. Jain, R. Han, D. Grundwald. Privacy-aware location sensor networks. in Proc. 9th USENIX Workshop on Hot Topics in Operating Systems (HotOS), 2003.
  • C. Ozturk, Y. Zhang, W. Trappe. Source-location privacy in energy-constrained sensor network routing. in Proc. of the 2004 ACM Workshop on Security of Ad Hoc and Sensor Networks, pp.88-93, 2004.
  • Y. Jian, S. Chen, Z. Zhang, L. Zhang. Protecting receiver-location privacy in wireless sensor networks. in Proc. of IEEE INFOCOM, 2007.
  • A. Wadaa, S. Olariu, L. Wilson, M. Eltoweissy, K. Jones. On providing anonymity in wireless sensor networks. in Proc. of the Tenth International Conference on Parallel and Distributed Systems (ICPADS’04), 1521, 2004.
  • C. Castelluccia, E. Mykletun, G. Tsudik. Efficient aggregation of encrypted data in wireless sensor networks. in the Second Annual International Conference on Mobile and Ubiquitous Systems: Networking and Services, 2005, page: 109-117, July 2005.
  • B. Gedik, L. Liu. Protecting location privacy with personalized k -Anonymity: Architecture and algorithms. IEEE Transactions on Mobile Computing, Volume. 7, No. 1, January 2008.
  • R. Agrawal, R. Srikant. Privacy preserving data mining. in Proc. of the ACM SIG-MOD Conference on Management of Data, pp.439-450, May 2000.
  • D. Agrawal, C. C. Aggarwal. On the design and quantiŞcation of privacy preserving data mining algorithms. in Proc. of the 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principal of Database Systems, pp. 247- 255, May 2001.
  • P. Samarati. Protecting respondent’s privacy in microdata release. IEEE Transactions on Knowledge and Data Engineering, 13(6):1010- 1027, 2001.
  • L. Sweeney. k -anonymity: A model for protecting privacy. Int’l Journal on Uncertainty, Fuziness, and Knowledge- based Systems, 10(5):557-570, 2002.
  • G. Aggarwal, T. Feder, K. Kenthapadi, R. Motwani, R. Panigraphy, D. Thomas, A. Zhu. Anonymizing tables. in Proc. of the 10th Int’l Conference on Database Theory, January 2005.
  • A. Machanavajjhala, J. Gehrke, D. Kifer, M. Venkitasubramaniam, l-diversity: Privacy beyond k-anonymity. in Proc. 22nd Intnl. Conf. Data Engg. (ICDE), page:24, 2006.
  • T.M. Truta, V. Bindu. Privacy protection: p-sensitive k-anonymity property. in Proc. of the Workshop on Privacy Data Management, In Conjunction with 22th IEEE International Conference of Data Engineering (ICDE), Atlanta, Georgia, 2006.
  • N. Li, T. Li, S. Venkatasubramanian. t-Closeness: privacy beyond k-anonymity and l-diversity. CERIAS Tech. Report 2007-78, Purdue University, 2007.
  • J. W. Byun, Y. Sohn, E. Bertino and N. Li. ”Secure anonymization for incremental datasets.” in SDM, pp. 48-63, 2006.
  • X. Xiao and Y. Tao. ”m-variance: towards privacy preserving re-publication of dynamic datasets.” in SIGMOD, pp. 689-700, 2007.
  • F. Li and S. Zhou. ”Challenging more updates: Towards anonymous re-publication of fully dynamic datasets” in arXiv.org:0806.4703v2, 2008. Appendix
  • Analytical derivation of energy saving with k -ACM
  • (15) Here f(x) and f(y) are the probability distribution functions of the sensor coordinates. Since sensor nodes are uniformly distributed in the sub-region, they are chosen as f(x) = 1/Xsregionand f(y) = 1/Ysregion. The expected number of hops an event message travels from a sensor node to the aggregation node is:
  • hsregion= dsregion/R
  • From the sensor node to the aggregation node, an event message travels hsregionhops, which is calculated as follows: dsregion= x=0 y=0 XsregionYsregionR dxdy
  • (17) The number of hops between aggregation node and sink, hregion, is calculated as follows: dregion= x=0 y=0 XregionYregionR dxdy
  • Table 10. Energy consumption ratios.
  • Energy Consumption Ratios
  • Transmission/Reception
  • Transmission/Encryption
  • Encryption/Decryption
  • Ratio Value 1.5 2333.34 1
  • The total consumed energy without k -anonymization is denoted as CN. In this case, all event messages are sent to the aggregation node, but the aggregation node just relays them to the sink without any performing k -anonymization operation. At each hop, each event packet is transmitted and received once, so the energy consumed in one hop is (TT+ TR)e, which is actually 2.5e units. The number of hops that each event message is transmitted is hregion+ hsregion. The energy consumption, CN, can now be calculated as follows:
  • CN= (hregion+ hsregion)(TT+ TR)e = (hregion+ hsregion)2.5e
  • (19) Total consumed energy in the case where k -anonymization is used is denoted by CK. The length of an event message, which is transferred from a sensor node to an aggregation node, is assumed to be e bytes. However, this length is reduced to (1− D)e after the aggregation node due to the shortening affect of the k -anonymization. Here, D is the decrease ratio of the k -anonymization operation. Suppose that β bytes of the event message are encrypted. The energy consumption, CK, can now be calculated as follows:
  • CK= hsregione(TT+ TR) + hregion(1− D)e(TT+ TR) + β(TE+ TD)
  • (20) The total energy savings CGis calculated as follows: CG= 1− CN = 1−
  • (hregion+ hsregion)2.5e
  • Complexity analysis of k -ACM