What Latanya Sweeney Knows About Data Privacy Will Trouble You
The Harvard professor and data privacy pioneer will speak at this week’s Women in Cybersecurity conference about cyber problems we’re still not addressing.
Voter registration hacks, insecure personal health data during the COVID-19 pandemic, insidious patterns of online discrimination—data privacy could not be more critical to navigating what is shaping up to be an even more challenging year.
Latanya Sweeney, the Daniel Paul Professor of the Practice of Government and Technology at Harvard and director and founder of Harvard’s Data Privacy Lab, has some important insights to share about challenges like these. The data privacy pioneer is a keynote speaker at this month’s Women in Cybersecurity (WiCyS) conference (March 17-19) in Cleveland.
Sweeney studies how technology can help safeguard voter registration files and solve other societal, political, and governance problems. She has been a leader in the fields of data privacy and algorithmic fairness.
Technology designers are the new policymakers.
Her work was cited in the Health Insurance Portability and Accountability Act (HIPAA), which set medical privacy regulations at the federal level. “I just had an uncanny history of being early to a technology and society clash, doing an experiment that highlights the problem, and then getting a thousand great minds to help try to solve it,” Sweeney said in an interview with Endpoint.
Previously, Sweeney was chief technology officer at the U.S. Federal Trade Commission and a recipient of the prestigious Louis D. Brandeis Privacy Award. She was the first Black woman to earn a Ph.D. in computer science from the Massachusetts Institute of Technology, in 2001. Sweeney is scheduled to present a speech at the WiCyS 2022 conference on March 18.
(The following interview has been condensed and edited for clarity.)
Could you give us highlights of your keynote speech, “How Technology Will Shape Our Civic Future,” at the Women in Cybersecurity conference?
Technology designers are the new policymakers. We don’t elect them; we don’t vote for them. But the arbitrary decisions they make and the technology they create dictate how we live our lives. Our ability to enforce any particular value or standard or law depends a lot on what that technology allows us to do.
How did you get interested in data privacy?
Actually, I wasn’t [at first]. So the year was 1996. All my life I wanted to be a computer scientist, and I wanted to build a “thinking machine,” a computer that could learn and think like humans. I was well on my way as a Ph.D. student in computer science at MIT.
I was walking through the lab and I heard an ethicist say, ‘Computers are evil.’
While I was a graduate student, one day I was walking through the lab and I heard an ethicist say, “Computers are evil.” The example she gave was health data that had been collected on state employees, their families and retirees and given to researchers and companies. She said, “Don’t they realize, people could use that data to blackmail judges or to compromise law enforcement? Is it anonymous?”
[The health data] did have month, day and year of birth, gender and five-digit zip code. So I went up to City Hall and I bought the Cambridge, Mass., voter list for 20 bucks. The reason I went to Cambridge City Hall to buy the voter list was twofold. One, William Weld was the governor of Massachusetts at that time and he had collapsed. Information about his collapse would be in that health data. And the second reason was because the voter list shared date of birth, gender, and zip code.
It turns out six people in Cambridge had his date of birth. Only three of them were men, and he was the only one in his zip code. That meant that date of birth, gender, and zip code were unique for William Weld in the voter file and would be unique for him in the health data, and that I could link those [data sets] and put his name to his health record uniquely. And that was a pretty eye-popping experience.
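In code terms, the re-identification she describes amounts to a database join on those three fields. Here is a minimal sketch of that kind of linkage attack, with hypothetical file and column names rather than the actual 1996 data:

```python
# A minimal sketch of the linkage attack Sweeney describes: join a public voter
# list to a "de-identified" health dataset on the quasi-identifiers date of
# birth, gender, and ZIP code. File names and columns here are hypothetical
# illustrations, not the actual 1996 data.
import pandas as pd

voters = pd.read_csv("voter_list.csv")      # name, dob, gender, zip
health = pd.read_csv("health_records.csv")  # dob, gender, zip, diagnosis (no names)

quasi_identifiers = ["dob", "gender", "zip"]

# Keep only voters whose combination of dob, gender, and ZIP is unique in the
# voter list; for those people the combination acts like a fingerprint.
unique_voters = voters.drop_duplicates(subset=quasi_identifiers, keep=False)

# Joining on that fingerprint attaches a name to any matching health record.
# (A careful attacker would also confirm the combination is unique on the health side.)
reidentified = unique_voters.merge(health, on=quasi_identifiers, how="inner")
print(reidentified[["name", "dob", "gender", "zip", "diagnosis"]])
```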
[Read also: Remote work increases data privacy concerns]
Very quickly laws around the world changed because of that experiment. Here, in the United States, that would be the HIPAA privacy rule. It was really an amazing moment.
You encountered some obstacles trying to publish the results of that experiment, right?
That’s absolutely true. There was a huge debate for years about: Is this computer science? Some people would say it’s not really computer security because with the privacy we’re talking about, nobody broke in to get [the data]. It’s the data I give away freely.
That was a pretty eye-popping experience. Very quickly laws around the world changed.
Computer security at that time was pretty much about perimeter security: making sure people weren't breaking in to get information, or that it was encrypted. None of those ways of thinking, with that kind of toolbox, were going to solve this problem. It made it very, very difficult to publish because it was just not the kind of problem that computer science had gotten used to solving.
Recently you established the Public Interest Tech Lab at Harvard, which helps scholars “reimagine how technology can be used by governments and civil society for public good.” But it’s not just a place to discuss theory. You’re producing actual apps and platforms.
In many ways, the Public Interest Tech Lab, which we just call the Tech Lab, is a natural extension of the work that started when I was a graduate student. The first wave of these technology and society clashes was around data privacy. And then along comes algorithmic fairness and all these issues with the 2016 election. And in many ways, all of these waves are still with us. We really haven’t solved them in their entirety.
What new technologies could we offer that could begin to solve these problems? And sometimes the solution is a legal or regulatory change, and so we've done that. But we've also produced new technologies.
[Read also: Shining a light on dark patterns and personal data collection]
But we still need more and more technologists—in government, in Congress. We need technologists who are thinking like this in the public interest, even working at the big tech companies. And so the question is: How do you create that pipeline? That’s the reason for the Tech Lab.
What are you learning through this work?
In 2016, among other things, we were the first to point out vulnerabilities in voter registration websites. And for the most part, those vulnerabilities are still with us. There’s no quick fix.
The problem is that you could impersonate a voter and then change their voter registration in a way that could make their vote not count or not count fully. There's nothing the voter would know or be able to do, and you could do this at scale, so that you could shave off one or two percentage points across the state without anyone really noticing.
We came up with this idea of VoteFlare. The idea is to monitor people’s voter registration in real time. It’s kind of like credit monitoring, where if something goes wrong with your credit, the credit company will let you know. Well, we do the same thing with voter registrations and mail-in ballots. If something goes wrong, we’ll send you a text message, an email, or a voice message, depending on what you select, to let you know something’s wrong, so that you have a chance to correct it and so that your vote will count fully.
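As a rough illustration of that monitoring idea, here is a conceptual sketch in Python. The lookup and notification functions are hypothetical placeholders, not VoteFlare's actual code or any state's real API.

```python
# A conceptual sketch, in the spirit of the monitoring Sweeney describes:
# periodically re-check a voter's registration record and alert the voter if it
# changes. The lookup and notification functions are hypothetical placeholders,
# not VoteFlare's actual code or any state's real API.
import time

def fetch_registration(voter_id: str) -> dict:
    """Placeholder: look up the voter's current record (status, address, precinct)."""
    raise NotImplementedError("connect to the relevant state lookup here")

def notify(contact: str, message: str) -> None:
    """Placeholder: deliver a text, email, or voice message, per the voter's preference."""
    raise NotImplementedError("connect to a messaging service here")

def monitor(voter_id: str, contact: str, interval_seconds: int = 86400) -> None:
    """Watch one voter's registration and alert them when anything changes."""
    baseline = fetch_registration(voter_id)
    while True:
        time.sleep(interval_seconds)
        current = fetch_registration(voter_id)
        if current != baseline:
            # Something changed: give the voter a chance to correct it in time.
            notify(contact, f"Your voter registration record changed: {current}")
            baseline = current
```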
[Read also: The Strengthening American Cybersecurity Act aims to fix critical gaps]
We were able to stand it up for the Georgia runoff election. And so we were able to bring a lot of relief across the political spectrum to voters in Georgia. And this year we’re trying to roll it out to 48 states and the District of Columbia. Texas had its primary, and it’s already operational in Texas.
You mentioned algorithmic fairness. What stands out to you in terms of bias and stereotypes?
When I first became a professor at Harvard, I was being interviewed by a reporter, and I needed to show him a particular paper [of mine]. So I typed my name into the Google search bar, and he pointed to an advertisement that showed up on the right and said, "Well, forget that paper. Tell me about the time you were arrested."
The ad said, ‘Latanya Sweeney arrested’ with a question mark. And I told him, ‘Well, I’ve never been arrested.’
The ad said, “Latanya Sweeney arrested” with a question mark, and “Find out about her arrest and more.” And I told him, “Well, I’ve never been arrested.”
The ad came in two varieties: Either it implied you had an arrest record, or it was neutral and said, “Find out more information.”
Why was this happening? Which names were coming up with the arrest ads, and which ones weren’t? When I typed “Latanya” into Google’s image search, up pops all these Black faces, and when I typed “Tanya” into the image search, up pops all these white faces.
[Read also: Bridging the gender gap in cybersecurity will keep us safer]
That led me to do a study. Sure enough, if your first name is given more often to Black babies than to white babies, you got these ads implying an arrest. Remember, the ad is placed on the person’s full name. It’s not just on a first name. It’s not some Latanya. It’s Latanya Sweeney.
It was the first time an algorithm, in this case Google AdSense, qualified to be investigated by the Department of Justice as a violation of the Civil Rights Act.
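As a rough sketch of the kind of association test such a study implies, assuming a hypothetical table of observed ads labeled by name group and ad wording (not the study's actual data or methodology):

```python
# A minimal sketch, on hypothetical data, of the kind of association test the
# study implies: is the ad template ("arrest"-suggestive vs. neutral) independent
# of whether the searched first name is more often given to Black or to white
# babies? The CSV file and column names are illustrative assumptions.
import pandas as pd
from scipy.stats import chi2_contingency

ads = pd.read_csv("ad_observations.csv")  # columns: name_group, ad_type

# Contingency table: name group (rows) by ad type (columns).
table = pd.crosstab(ads["name_group"], ads["ad_type"])
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
# A very small p-value indicates the ad wording is not independent of name group.
print(f"chi2 = {chi2:.2f}, p = {p_value:.4g}")
```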
What do you tell women considering entering your field?
It’s incredibly important, because different [women] will see different things. If you think about that story around discrimination and online ads, it never would’ve happened if I didn’t have a name given more often to Black babies than white babies.
Nothing should be off-limits for women. We need the best and brightest working in an area—not the one who checks one box or the other, but the one who really is the best and brightest and helps us understand better how to compete and perform in a global world.