Why the World Wants to Abandon US AI Models

In late February, while attending the digital rights conference RightsCon in Taiwan, I watched in real time as civil society organizations around the world, including in the United States, grappled with the loss of one of the largest funders of global digital rights work: the United States government.

As I’ve written before, the Trump administration’s shocking and accelerated dismantling of the U.S. government (and its shift toward what some prominent political scientists have called “competitive authoritarianism”) is also affecting the operations and policies of U.S. technology companies—many of which, of course, have users far beyond U.S. borders.

People at RightsCon said they were already seeing changes in the willingness of these companies to engage with and invest in smaller communities—especially those that don’t speak English. As a result, some policymakers and business leaders—especially in Europe—are rethinking their reliance on U.S.-based technology and wondering whether they can quickly develop better, local alternatives. This is particularly true in the case of AI.

One of the clearest examples of this is on social media. Yasmin Curzi, a postdoctoral fellow at the Karsh Institute of Democracy at the University of Virginia and a professor on leave at FGV Direito Rio who researches national technology policies, summed it up this way: “Since the second Trump administration, we can no longer count on [US social media platforms] to do even the bare minimum.”

Social media content moderation systems—which already rely on automation and are also experimenting with large language models (LLMs) to flag problematic posts—are failing to detect gender-based violence in countries as diverse as India, South Africa, and Brazil. If platforms become even more reliant on LLMs for content moderation, this problem will likely get worse, says Marlena Wisniak, a human rights lawyer specializing in AI governance at the European Center for Not-for-Profit Law. “LLMs are already poorly moderated, and those same poorly moderated LLMs are then used to moderate other content,” she told me. “It’s so circular, and the mistakes just keep repeating and amplifying themselves.”
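To make that circularity concrete, here is a minimal sketch (in Python, using the Hugging Face transformers library) of the scoring step such pipelines rely on. The checkpoint name, label handling, and threshold are illustrative assumptions, not any platform’s actual stack; the point is that the same automated scoring runs on every post, however thin the model’s training data is for a given language.

```python
from transformers import pipeline

# Hypothetical choice of an off-the-shelf multilingual toxicity checkpoint;
# real platforms run proprietary models inside much larger pipelines.
classifier = pipeline(
    "text-classification",
    model="unitary/multilingual-toxic-xlm-roberta",
)

THRESHOLD = 0.8  # illustrative cutoff, not any platform's standard

posts = [
    "example post in English",
    "exemplo de postagem em português",  # non-English input; reliability depends on training coverage
]

for post in posts:
    top = classifier(post, truncation=True)[0]  # e.g. {"label": ..., "score": ...}
    # Label names vary by checkpoint; here we assume a "toxic"-style label.
    flagged = "toxic" in top["label"].lower() and top["score"] >= THRESHOLD
    print(f"{post!r} -> {top} -> {'flag for review' if flagged else 'keep'}")
```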

Part of the problem is that these systems are trained primarily on data from the English-speaking world (particularly American English), which causes them to underperform in local languages and contexts.

Even multilingual language models, which are supposed to be able to process multiple languages at once, still perform poorly in non-Western languages. For example, an evaluation of ChatGPT’s responses to health questions showed that it performed much worse in Chinese and Hindi—languages that are less represented in North American datasets—than in English and Spanish.

For many RightsCon attendees, this validates calls for more community-based approaches to AI development—both within and outside the context of social media. Such approaches could include small language models, chatbots, and datasets designed for specific uses and targeted to particular languages and cultural contexts. These systems could be trained to recognize slang and insults, interpret words or phrases written in a mix of languages (and even alphabets), and identify “reclaimed language” (offensive terms that have been reappropriated by the groups they once targeted). All of these aspects tend to be ignored or misclassified by language models and automated systems trained primarily on Anglo-American English.
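To illustrate what training a system for a particular language and cultural context could involve, here is a hedged sketch of fine-tuning a small multilingual model on a community-curated dataset. The label scheme, the examples, and the distilbert-base-multilingual-cased base checkpoint are all hypothetical choices made for the sake of the sketch; a real effort would need far more data, careful annotation guidelines, and evaluation in the target languages.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Hypothetical label scheme: community annotators distinguish abuse from
# reclaimed language, a distinction generic English-centric models miss.
LABELS = ["benign", "abusive", "reclaimed"]

# Tiny illustrative dataset; real community corpora would hold thousands of
# annotated posts in the target languages, scripts, and mixed-language forms.
raw = {
    "text": [
        "yeh movie toh full paisa vasool thi",  # Hindi-English code-mixing, benign
        "<abusive example withheld>",           # placeholder for an abusive post
        "<reclaimed term used in-group>",       # placeholder for reclaimed usage
    ],
    "label": [0, 1, 2],
}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-multilingual-cased", num_labels=len(LABELS)
)

def tokenize(batch):
    # Pad/truncate so every example fits the model's expected input shape.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=64)

dataset = Dataset.from_dict(raw).map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="community-moderation-model", num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()  # a real run would also need an eval split and metrics
```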

The founder of startup Shhor AI, for example, hosted a panel at RightsCon to showcase her new content moderation API for Indian vernacular languages.

Many similar solutions have been in development for years — and we’ve covered several of them, including a voluntary initiative facilitated by Mozilla to collect training data in languages other than English, and promising startups like Lelapa AI, which is developing AI for African languages. Earlier this year, we even included small language models in our list of the 10 Breakthrough Technologies for 2025.

Still, this moment feels a little different. The second Trump administration, which is directly influencing the actions and policies of American tech companies, is obviously a key factor. But there are other elements at play, too.

First, recent advances in language model development have reached a point where dataset size is no longer the determining factor in performance — meaning more people can build them. In fact, “smaller language models can be a worthy competitor to multilingual models in specific languages with limited resources,” says Aliya Bhatia, a visiting scholar at the Center for Democracy & Technology who studies automated content moderation.

Then there’s the global landscape. Competition over AI was a major theme at the recent Paris AI Summit, held the week before RightsCon. Since then, a series of announcements have been made highlighting “sovereign AI” initiatives, which aim to give one country (or organization) complete control over all aspects of AI development.

AI sovereignty is just one part of a broader push for “tech sovereignty,” which is gaining momentum, driven in part by concerns about privacy and the security of data transferred to the United States. The European Union appointed its first commissioner for technology sovereignty, security and democracy last November, and has been working on plans to create a “Euro Stack,” or “digital public infrastructure.” The concept is still loosely defined, but it could include the energy, water, chips, cloud services, software, data, and AI systems needed to sustain modern society and future innovation.

All of these elements are currently largely provided by U.S. technology companies. The European efforts are partly inspired by the “India Stack,” India’s digital public infrastructure that includes the Aadhaar biometric identity system. Last week, Dutch lawmakers passed several motions urging the government to reduce the country’s reliance on U.S. technology providers.

All of this is in line with what Andy Yen, CEO of Swiss digital privacy company Proton, told me at RightsCon. Trump, he said, is “moving Europe faster…recognizing that the continent needs to reclaim its technological sovereignty.” That’s partly because of the president’s influence over tech CEOs, Yen said, but also because “technology is where the future economic growth of any country is.” But just because governments are getting involved doesn’t mean that issues of inclusion in language models will go away.

“I think there needs to be clear boundaries around what the government’s role should be in this. It gets complicated when the government decides, ‘These are the languages we want to promote,’ or ‘These are the types of views we want to represent in a dataset,’” Bhatia says.

“Fundamentally, the training data a model is trained on is the worldview it develops,” she adds.

It’s still too early to know how all this will play out — and how much of it will turn out to be reality rather than just hype. But whatever happens, this is a scenario we’ll continue to monitor closely.

(Source: MIT Technology Review)