Most Supposedly ‘Open’ AI Systems Are Actually Closed—and That’s a Problem

The Illusion of Openness: Why "Open" AI Needs a Revamp

Some of the most powerful tools for understanding language – "open-source" AI models – are sparking intense debate. These tools promise transparency and accessibility: the sense that code and information are freely available for anyone to inspect or modify. But a new analysis in Nature argues that this rosy picture may be an illusion, masking a more complex reality.

Authors David Widder, Meredith Whittaker, and Sarah West argue that the term "open" has been co-opted, often used by large tech companies to repackage their own powerful tools while doing little to truly democratize the technology. In essence, "openness" becomes a marketing tactic, a tool of control rather than empowerment.

Open-source software, with roots in programs like Linux, hinges on the concept of accessible code, allowing anyone to modify and build upon it. These principles have empowered startups and individuals, offering a fairer alternative to proprietary software.

Applied to the world of AI, the ideal of openness should empower researchers and startups, allowing them to build upon existing models, fostering innovation and agility.

However, "open " today often stumbles; its meaning deteriorates, distorted by corporate strategy.

Currently, "open" often refers solely to making the weights – numerical values defining an AI’s understanding – publicly accessible. This is akin to releasing a recipe without the ingredients or the kitchen to actually bake the cake.

While some accessibility is afforded through APIs, these interfaces restrict usage. Think of it as ordering a customized cake from a bakery: you can specify ingredients, but you can't peek behind the scenes or use their oven.
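API access hands over even less: a hosted endpoint accepts a prompt and a few parameters and returns text, while the model itself never leaves the provider's servers. A minimal sketch follows, assuming a hypothetical hosted endpoint; the URL, request fields, and key are illustrative, not any real provider's API.

```python
# Minimal sketch of API-mediated access: you pick the "ingredients"
# (prompt, a few sampling parameters), but the provider's "oven"
# (weights, hardware, filters) stays behind the endpoint.
# The endpoint URL and request schema here are hypothetical.
import requests

response = requests.post(
    "https://api.example-provider.com/v1/generate",  # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "prompt": "Openness in AI means",
        "max_tokens": 30,
        "temperature": 0.7,
    },
    timeout=30,
)

# Text comes back; the model, its weights, and its training data do not.
print(response.json())
```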

Perhaps most concerning, these "open-source" AI models rely on massive datasets: vast warehouses of information scraped from the internet. This reliance is problematic on several fronts.

These datasets can perpetuate the biases present in the material they are scraped from, reproducing and refracting society's inequalities.

Furthermore, these datasets are often shrouded in secrecy. Their contents are difficult to inspect for bias, and legal questions arise when copyrighted material is scraped without permission.

Here the core of the issue emerges: when powerful AI development is concentrated in the hands of tech giants, "openness" bends towards serving those powerful entities rather than democratization. Consider the case studies discussed in the Nature analysis.

Meta’s Llama 3, while celebrated as open, allows modification only through an API. Developers can customize it, but they remain reliant on Meta’s infrastructure and control; a Stanford forum has explored this tension further.

Developing truly open AI would mean renegotiating these power structures, allowing genuine scrutiny and control by a wide range of developers.

Breaking down this illusion of openness requires a comprehensive understanding of the ecosystem, from training data to deployment.

As AI surges forward, policymakers must tread carefully. A true "open" AI future hinges on vigilance and a broader understanding of how AI is built and controlled, ensuring benefits extend beyond the walls of tech giants.

How can the concept of “open” AI be redefined to be more equitable and beneficial for the broader AI community?

## The Illusion of Open AI: A Conversation with Dr. Sarah West

**Host:** Welcome back to the show. Today we’re diving into a fascinating topic: the concept of “open” AI and its potential pitfalls. Joining us is Dr. Sarah West, one of the authors of a provocative new analysis published in *Nature* titled “The Illusion of Openness: Why ‘Open’ AI Needs a Revamp”. Dr. West, thank you for being here.

**Dr. West:** Thanks for having me.

**Host:** Your analysis suggests that the way we currently understand “open” AI might be misleading. Can you explain?

**Dr. West:** Absolutely. We often think of “open” as meaning free access to code and data, right? But in the context of AI, simply releasing the “weights” – those numerical values that define an AI’s understanding – is like giving someone a recipe but not the ingredients or the tools to cook. It’s insufficient for true development and innovation.

**Host:** So you’re arguing that the term “open” is being co-opted?

**Dr. West:** Exactly. Large tech companies often release models in a way that seems “open”, but it’s often a carefully curated release designed to benefit them more than the broader community. It can act as a marketing tool, giving the impression of transparency while maintaining control over crucial aspects of the technology.

**Host:** Can you give an example of what this might look like in practice?

**Dr. West:** Imagine a company releases the weights of a powerful language model. They might claim it’s “open”, but they withhold the training data, the specific hardware required to run it efficiently, or the expertise needed to fine-tune it for specific tasks. This creates a significant barrier to entry for startups and independent researchers, limiting the true democratization of AI.

**Host:** What solutions do you propose?

**Dr. West:** We need to move beyond simply releasing weights. True openness requires making the entire development process transparent, including the data used, the training methods, and the infrastructure required. This would empower researchers and developers to truly build upon existing models, leading to faster innovation and a more equitable AI landscape.

**Host:** Powerful stuff. Thank you for shedding light on this important issue, Dr. West.

**Dr. West:** My pleasure.
