Displaying posts tagged with


Data anonymization is hard – this time shown with NYC taxi data

One often hears that some massive collection of data will not have privacy implications because it has been “anonymized”. Any time you hear that, treat the statement with great skepticism. It turns out that effectively anonymizing data, making it impossible to identify the individuals in the data set, is much harder than you might think. [...]