Washington: MIT researchers are developing a new technology that would let people track how their private data is used online.
Researchers in the Decentralized Information Group (DIG) at Massachusetts Institute of Technology's Computer Science and Artificial Intelligence Laboratory (CSAIL) are developing a protocol they call “HTTP with Accountability,” or HTTPA, which will automatically monitor the transmission of private data and allow the data owner to examine how it's being used.
At the IEEE's Conference on Privacy, Security and Trust next month in Toronto, Oshani Seneviratne, an MIT graduate student in electrical engineering and computer science, and Lalana Kagal, a principal research scientist at CSAIL, will present a paper that gives an overview of HTTPA.
With HTTPA, each item of private data would be assigned its own uniform resource identifier (URI), a key component of the Semantic Web, a set of technologies that would convert the Web from, essentially, a collection of searchable text files into a giant database.
Remote access to a Web server would be controlled much the way it is now, through passwords and encryption.
But every time the server transmitted a piece of sensitive data, it would also send a description of the restrictions on the data's use.
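As a rough illustration of data travelling together with its usage restrictions, the following Python sketch builds a response that carries both. The header name `X-Example-Usage-Restrictions`, the restriction labels and the function `build_response` are hypothetical and are not taken from the HTTPA paper.

```python
import json


def build_response(data: str, restrictions: list) -> dict:
    """Return a response-like dict that pairs the data with its usage restrictions.

    The header name below is a made-up placeholder, not a real HTTPA header.
    """
    return {
        "status": 200,
        "headers": {
            "Content-Type": "text/plain; charset=utf-8",
            # Hypothetical header: restrictions serialised as a sorted JSON list.
            "X-Example-Usage-Restrictions": json.dumps(sorted(restrictions)),
        },
        "body": data,
    }


response = build_response("blood type: O+", ["no-sharing", "no-commercial-use"])
```

A receiving programme could read the restrictions back with `json.loads` before deciding what it may do with the body.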
An HTTPA-compliant programme also incurs certain responsibilities if it reuses data supplied by another HTTPA-compliant source, researchers said.
Suppose, for instance, that a consulting specialist in a network of physicians wishes to access data created by a patient's primary-care physician, and suppose that she wishes to augment the data with her own notes.
Her system would then create its own record, with its own URI. But using standard Semantic Web techniques, it would mark that record as “derived” from the PCP's record and label it with the same usage restrictions.
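The specialist example can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Record` class, the `derive` function and the example URIs are hypothetical, and the real system expresses the "derived" link with Semantic Web vocabulary rather than a Python field.

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    uri: str                            # identifier of this record
    content: str                        # the private data itself
    restrictions: frozenset             # usage restrictions attached to the data
    derived_from: Optional[str] = None  # URI of the source record, if any


def derive(source: Record, new_content: str,
           base: str = "https://specialist.example/records/") -> Record:
    """Create a new record with its own URI that inherits the source's restrictions."""
    return Record(
        uri=base + uuid.uuid4().hex,
        content=new_content,
        restrictions=source.restrictions,  # same restrictions as the source
        derived_from=source.uri,           # marks the record as derived
    )


pcp_record = Record(
    uri="https://pcp.example/records/1",
    content="primary-care notes",
    restrictions=frozenset({"no-commercial-use"}),
)
specialist_record = derive(pcp_record, "primary-care notes plus specialist notes")
```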
The network of servers is where the heavy lifting happens. When the data owner requests an audit, the servers work through the chain of derivations, identifying all the people who have accessed the data, and what they've done with it.
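An audit of this kind amounts to walking a graph of "derived from" links and collecting the log entries for every record reached. The Python sketch below shows the idea with a breadth-first traversal; the data layout (tuples of URIs, actors and actions) and the function name `audit` are assumptions made for illustration, not the researchers' implementation.

```python
from collections import defaultdict, deque


def audit(root_uri, derivations, log):
    """Return log entries for root_uri and every record derived from it.

    derivations: list of (derived_uri, source_uri) pairs
    log:         list of (uri, actor, action) entries
    """
    # Index the derivation links by source so they can be followed forwards.
    children = defaultdict(list)
    for derived, source in derivations:
        children[source].append(derived)

    # Breadth-first walk over the chain of derivations.
    reached = {root_uri}
    queue = deque([root_uri])
    while queue:
        uri = queue.popleft()
        for child in children[uri]:
            if child not in reached:
                reached.add(child)
                queue.append(child)

    # Keep only the entries that touch a reached record, in log order.
    return [entry for entry in log if entry[0] in reached]


derivations = [("urn:b", "urn:a"), ("urn:c", "urn:b")]
log = [
    ("urn:a", "primary-care physician", "create"),
    ("urn:b", "specialist", "annotate"),
    ("urn:x", "unrelated user", "read"),
    ("urn:c", "researcher", "read"),
]
report = audit("urn:a", derivations, log)
```

Here `report` contains the three entries for `urn:a`, `urn:b` and `urn:c`, and leaves out the unrelated record `urn:x`.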
Seneviratne uses a technology known as distributed hash tables — the technology at the heart of peer-to-peer networks like BitTorrent — to distribute the transaction logs among the servers.
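The core idea of a distributed hash table is that a key's hash decides which server stores it, so any participant can work out where to look. The Python sketch below uses a simple consistent-hashing ring to show this; it is a simplified stand-in, not the DHT the researchers used, and the class name `HashRing` and the server names are hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, servers):
        # Place each server on the ring at the position given by its hash.
        self._ring = sorted((_hash(name), name) for name in servers)
        self._points = [point for point, _ in self._ring]

    def server_for(self, uri: str) -> str:
        """Return the first server clockwise from the URI's hash position."""
        index = bisect.bisect_right(self._points, _hash(uri)) % len(self._ring)
        return self._ring[index][1]


servers = ["log-server-1", "log-server-2", "log-server-3"]
ring = HashRing(servers)
home = ring.server_for("https://pcp.example/records/1")
```

Because placement depends only on the hashes, two servers that know the same list of participants will agree on where the log for a given URI lives.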
To test the system, Seneviratne built a rudimentary healthcare records system from scratch and filled it with data supplied by 25 volunteers.
She then simulated a set of transactions — pharmacy visits, referrals to specialists, use of anonymised data for research purposes, and the like — that the volunteers reported as having occurred over the course of a year.
Seneviratne used 300 servers on the experimental network PlanetLab to store the transaction logs; in experiments, the system efficiently tracked down data stored across the network and handled the chains of inference necessary to audit the propagation of data across multiple providers.