Please use this identifier to cite or link to this item: https://knowledgecommons.lakeheadu.ca/handle/2453/3901
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | Wei, Ruizhong
dc.contributor.author | Hsu, Shih-Ying Peter
dc.date.accessioned | 2017-06-08T13:27:20Z
dc.date.available | 2017-06-08T13:27:20Z
dc.date.created | 2008
dc.date.issued | 2008
dc.identifier.uri | http://knowledgecommons.lakeheadu.ca/handle/2453/3901
dc.description.abstract | Database anonymization has become increasingly important for organizations that publish their databases to the public. Current security measures for anonymization each have drawbacks: k-anonymity is prone to many varieties of attack; ℓ-diversity does not work well with categorical or numerical attributes; t-closeness erases too much information in the database. Moreover, some measures of information loss are designed for anonymization measures such as k-anonymity, where sensitive attributes play no part in measuring the database's security. Not measuring the redistribution of sensitive attributes results in an underestimate of information loss for measures such as ℓ-diversity or t-closeness, which intentionally remove the association between non-sensitive and sensitive attributes to better protect individuals from being identified. This thesis provides a more generalized version of ℓ-diversity that better protects categorical and numerical attributes, and analyzes the effectiveness and complexity of this new security scheme. Another focus of this thesis is to design a better approach to measuring information loss, to lay down a new standard for evaluating information loss under security measures such as ℓ-diversity and t-closeness, and to quantify the actual information loss caused by deliberately hiding relations between non-sensitive and sensitive attributes. This new standard of information loss measurement should provide a better estimate of the data mining potential remaining in a generalized database. This thesis also proves that, unlike k-anonymity, which can be solved in polynomial time when k=2, ℓ-diversity remains NP-hard in the special case where ℓ=2, even when there are only two possible sensitive attributes in the alphabet.
dc.language.iso | en_US
dc.subject | Database security
dc.title | Database anonymization and protections of sensitive attributes
dc.type | Thesis
etd.degree.name | Master of Science
etd.degree.level | Master
etd.degree.discipline | Computer Science
etd.degree.grantor | Lakehead University
Appears in Collections: Retrospective theses

Files in This Item:
File | Description | Size | Format
HsuS2008m-1b.pdf | | 3.34 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.