We've had some interest on campus in developing an archive for data generated by research performed by the college's students and faculty. We're working on developing procedures for mitigating disclosure risk for human subjects data, but we haven't been able to find published guidance on what is an acceptable disclosure risk level. My statistical consultant informs me that it is mathematically impossible to get the disclosure risk (as measured by mu-Argus) down to 0, but nobody seems to be able to tell me what level of disclosure risk other archives consider appropriate, what level the U.S. government uses when preparing its microdata products, etc.


Here is a link to a disclosure project out of ICPSR:

Disclosure project

You are right that you can never get the disclosure risk down to 0. Data providers often add some element of uncertainty to the data via data swapping. By doing this, even when it appears that you are able to identify a respondent, you can never be certain whether this is the actual respondent or a swapped respondent. The data providers never disclose how many cases were swapped.
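To make the idea of data swapping concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the procedure any particular data provider uses: the function name `swap_values`, the `swap_fraction` parameter, and the sample records are all hypothetical, and real swapping routines usually pair records that are similar on other variables rather than pairing them purely at random.

```python
import random


def swap_values(records, field, swap_fraction, seed=None):
    """Return a copy of `records` in which the value of `field` has been
    exchanged between randomly chosen pairs of records.

    Hypothetical illustration only: pairs are chosen purely at random.
    """
    rng = random.Random(seed)
    swapped = [dict(r) for r in records]  # copy so the input is not modified

    # Number of records involved in swaps, rounded down to an even count
    # so that every chosen record has a partner.
    n_to_swap = int(len(swapped) * swap_fraction)
    n_to_swap -= n_to_swap % 2

    chosen = rng.sample(range(len(swapped)), n_to_swap)
    for i, j in zip(chosen[0::2], chosen[1::2]):
        swapped[i][field], swapped[j][field] = swapped[j][field], swapped[i][field]
    return swapped


# Made-up example records (not real data).
records = [
    {"id": 1, "county": "A", "income": 30},
    {"id": 2, "county": "B", "income": 45},
    {"id": 3, "county": "C", "income": 52},
    {"id": 4, "county": "D", "income": 61},
    {"id": 5, "county": "E", "income": 38},
    {"id": 6, "county": "F", "income": 47},
]

released = swap_values(records, "county", swap_fraction=0.5, seed=0)
```

The overall distribution of the swapped field is unchanged, since values are only exchanged between records, but a user of the released file cannot tell which records carry their original value. Keeping the swap rate confidential is what preserves that uncertainty.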
