1 Comment

  1. (Some context: Larry Lessig founded Creative Commons and is a longtime open-source advocate, which makes this rather unusual)

    “There are important differences between ordinary software and AI technology …

    AI is more a category than a technology. Like the category “weapon”, it ranges from the relatively harmless to the potentially catastrophic. No one would believe that the access we allow to pea-shooters should be the same for stinger missiles. Neither should we believe that the software norms developed for operating systems or media players must apply in the same way to highly capable AI systems with the potential to cause immense harm.

    Nor is it even obvious how the norms of free and open-source software should apply. AI models consist of at least four types of digital components, only three of which are actually software. The fourth—model weights—is both the most potent and the most obscure.

    Model weights are the variables, or numerical values, used to translate inputs into outputs. They encapsulate all that the model learned during its training. Thus, if the training cost $1bn, the model weights reflect that value. If the training cost $1,000, they are obviously less powerful and less valuable.

    So, which among these four components must be shared to be consistent with open-source values?

    Source code is certainly one, for it teaches how the model was built. But model weights are just strings of numbers. On their own, they don’t teach anything. With the other software components and the data used to train the model, they certainly could teach how the model understands. But distinct from what they teach, they are simply the power of the model. On the analogy to weapons, model weights are not the design or plans for a weapon. They are the weapons.

    In my view, all four components should be freely available for models of limited capability. Hugging Face, an AI community platform, offers over 350,000 AI and machine-learning models, 75,000 datasets and 150,000 demonstration applications, all open-source and publicly available. These models are likely not powerful enough to do significant harm. Making them available supports an ecology of free knowledge that is critical to improving the understanding of AI.

    Yet that same logic does not apply to highly capable AI models, especially when it comes to releasing model weights. Whatever model weights can teach, that benefit must be weighed against the enormous risk of misuse that highly capable models present. At some point, that risk is clearly too great.

    Mark Zuckerberg, founder of Meta (the creator of Llama, the most powerful open-weight release to date), assures us that open releases “should be significantly safer since the systems are more transparent and can be widely scrutinised”. They can be widely scrutinised, but when? If the danger is discovered after the code is in the wild, the assurance that all can see the problem equally is not much consolation.

    Mr Zuckerberg promises that the foundation models behind freely released model weights have guardrails to protect against harmful or dangerous misuse, and that “using Llama with its safety systems like Llama Guard will likely be safer and more secure than closed models.” However, researchers are now demonstrating just how easily these guardrails can be removed. Llama 2 had guardrails to block users from deploying it for improper or unsafe purposes. But in 2023, and for less than $200, a team from Palisade Research was able to disable these and produce an unconstrained version of Llama 2. Just how dangerous could these Frankenstein open-weight models become, as the foundation models behind them become more powerful and the techniques for removing guardrails become more sophisticated?

    The point is not that only open-weight releases can be hijacked. But they do create a unique risk because once released, they cannot be recalled. By contrast, models that give access through web portals or regulated APIs could, in principle, identify when users are attempting a hijack. In principle, then, they could more easily shut down malicious use than could models that have been freely distributed.

    For low-capability models, we should encourage the Hugging Face ethic. The risks are low and the contribution to understanding is vast. For high-capability models, we need regulation that ensures both closed and open models are safe before they are released—and that they are not released in ways that could create catastrophic risk. No simple line will divide low capability from high. But if we’re to secure the potential for open-source development, we must develop the regulatory capacity to draw this line and enforce it.

    Private companies alone, in fierce competition with each other, do not have sufficient incentives to avoid catastrophic risk. Neither would simply banning open-source AI avoid the risk of great harm.

    Today, these risks are imposed upon all of us by private actors with little public oversight. That formula has not worked with dangerous technologies in the past. It will not work with the AI systems of the future.”