AI Safety and Effective Altruism

  • 2025-10-26

Summary and excerpts from Chapter 14 of 패권: 누가 AI 전쟁의 승자가 될 것인가 (Supremacy).

Chapter 14. A Vague Sense of Doom

Part of the reason enormous sums of money have flowed into the AI safety camp is effective altruism:

Why has so much money gone to engineers tinkering on larger AI systems on the pretext of making them safer in the future, and so little to researchers trying to scrutinize them today? The answer partly comes down to how Silicon Valley became fixated on the most efficient way to do good, and to the ideas spread by a small group of philosophers at Oxford University, England.

Back in the 1980s, an Oxford philosopher, Derek Parfit, started writing about a new kind of utilitarian ethics, one that looked far into the future. Imagine, he said, that you left a broken bottle on the ground and one hundred years later, a child cuts their foot on it. They might not yet be born, but you would shoulder the same burden of guilt as if that child was injured today. …

Peter Singer's The Life You Can Save:

In 2009, an Australian philosopher named Peter Singer expanded on Parfit's work with a book called The Life You Can Save. Here now was a solution: wealthy people should not just donate money based on what felt right but use a more rational approach to maximize the impact of their charitable giving and help as many people as possible. By helping many of those yet-to-be-born people in the future, you could be even more virtuous.

Will MacAskill, 80,000 Hours, and the Centre for Effective Altruism:

These ideas started to make the leap from academic papers to the real world and form the basis of an ideology in 2011, when a twenty-four-year-old Oxford philosopher named Will MacAskill cofounded a group called 80,000 Hours. … It often steered the technically minded ones toward AI safety work. But the group also encouraged graduates to pick careers that paid the highest salaries, allowing them to donate as much money as possible to high-impact causes.

MacAskill and his young team eventually reincorporated themselves as the Centre for Effective Altruism and a new credo was born. …

This offered graduates a counterintuitive way of looking at all the inequalities of modern capitalism. Now there was nothing wrong with a system that allowed a handful of humans to become billionaires. By amassing unfathomable amounts of wealth, they could help more people!

Meeting Sam Bankman-Fried:

The movement picked up its biggest name in 2012, when MacAskill reached out to someone whom he hoped to recruit to the cause, an MIT student with dark curly hair named Sam Bankman-Fried. The two had coffee, and it turned out that Bankman-Fried was already a fan of Peter Singer and interested in causes related to animal welfare. … He took a job at a quantitative trading firm and eventually founded the cryptocurrency exchange FTX in 2019. …

“I’m in on crypto because I want to make the biggest global impact for good.” He positioned himself as an ascetic character who, despite his billionaire status, drove a Toyota Corolla, lived with roommates, and often looked disheveled.

Silicon Valley meets effective altruism:

Many technologists saw this approach to morality as a breath of fresh air. When engineers saw a problem, they often solved it formulaically, debugging code and optimizing software through constant testing and evaluation. Now you could also quantify moral dilemmas, almost like they were math. …
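
The "almost like math" framing is quite literal: effective altruists rank causes by expected value, such as estimated lives saved per dollar donated. Below is a minimal sketch of that style of calculation; every number in it is a hypothetical placeholder for illustration, not real charity data:

```python
# Toy illustration of EA-style expected-value reasoning about giving.
# All figures are hypothetical placeholders, not real cost-effectiveness data.

def lives_saved(donation_usd: float, cost_per_life_usd: float) -> float:
    """Expected number of lives saved by a donation of a given size."""
    return donation_usd / cost_per_life_usd

# Hypothetical interventions with made-up cost-per-life estimates.
interventions = {
    "malaria nets": 5_000,       # assumed USD cost to save one life
    "local food bank": 250_000,  # assumed, far less cost-effective
}

budget = 10_000  # USD to allocate
for name, cost in interventions.items():
    print(f"{name}: ~{lives_saved(budget, cost):.2f} lives per ${budget:,}")
```

On this arithmetic, the same $10,000 "buys" roughly 2.00 lives via the first intervention and only 0.04 via the second, which is exactly the kind of comparison that steered early EA money toward malaria nets.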

As EA took greater hold in Silicon Valley, its focus shifted from buying cheap malaria nets and helping as many people as possible in Africa, to issues with a more science fiction flavor. Elon Musk, who tweeted that MacAskill's 2022 book was a "close match for my philosophy," had wanted to send people to Mars to ensure the long-term survival of humans. And as AI systems became more sophisticated, it made sense to keep them from going rogue and wiping out humanity too. Many of the staff at OpenAI, Anthropic, and DeepMind were effective altruists. …

Bankman-Fried could rationalize his duplicity because he was working toward a bigger goal of maximizing human happiness. Musk could wave off his own inhumane actions, from baselessly calling people pedophiles on Twitter to alleged widespread racism at his Tesla factories, because he was chasing bigger prizes, like turning Twitter into a free speech utopia and making humans an interplanetary species. And the founders of OpenAI and DeepMind could rationalize their growing support for Big Tech firms in much the same way. So long as they eventually attained AGI, they would be fulfilling a greater good for humanity.

SBF's interview after the FTX bankruptcy:

Soon after FTX's collapse, Bankman-Fried gave a remarkable interview to the news site Vox:

  • “So the ethics stuff - mostly a front?” the reporter asked.
  • “Yeah,” Bankman-Fried replied.
  • “You were really good at talking about ethics, for someone who kind of saw it all as a game with winners and losers,” the reporter noted.
  • “Ya,” said Bankman-Fried. “Hehe. I had to be.”

MacAskill backing away from FTX:

After FTX imploded, MacAskill took to Twitter to do damage control:

“A clear-thinking [effective altruist] should strongly oppose ‘ends justify the means’ reasoning,” he tweeted. Yet the movement’s own principles incentivized people like Bankman-Fried to reach their goals by whatever means necessary, even if that meant exploiting people.