Last week, the CCC responded to the National Telecommunications and Information Administration’s Request for Information on Dual Use Foundation Artificial Intelligence Models with Widely Available Model Weights. The CCC’s own Daniel Lopresti (CCC Chair and Lehigh University) and David Danks (CCC Executive Committee and University of California, San Diego) helped author this response along with several other members of the computing community. Markus Buehler (Massachusetts Institute of Technology) and Duncan Watson-Parris (University of California, San Diego), who both spoke at this year’s CCC-sponsored AAAS panel, Generative AI in Science: Promises and Pitfalls, also contributed to the RFI response, along with Casey Fiesler (University of Colorado, Boulder), who attended the CCC’s Future of Research on Social Technologies workshop in November.
In their response, the authors focused on a few specific questions from the RFI, one of which asked how the risks associated with making model weights widely available compare to those associated with non-public model weights. The authors responded that the majority of risks associated with generative models are only minimally exacerbated by making model weights widely available. Most of these risks are inherent to the models themselves, stemming from their capacity to quickly generate enormous amounts of believable content from user inputs and from their almost limitless range of application areas. Making model weights publicly available does not change how generative models function, and there is currently little evidence that wide availability of weights creates significant additional risk beyond what could already be done with proprietary or closed systems. One risk that could potentially be worsened by releasing the weights of proprietary models is the exposure of training data: it is unlikely that model weights could be reverse engineered to reveal training data, but this has not been shown to be mathematically impossible. However, in our response we emphasized that, because generative models are likely to continue to be used heavily by the general public, the biggest risks come from not making the weights of representative foundation models openly available. Denying researchers and interested community members access to the weights of such models will prevent society from gaining a better understanding of how these models function and how to design more inclusive and accessible models.
Continuing the practice of releasing closed models will perpetuate a lack of diversity in tech and will prevent certain kinds of research from being conducted, such as bias audits of these models, which large tech companies are not incentivized to perform. Education of the future workforce is another critically important consideration. The United States cannot hope to maintain leadership in the field of generative AI without training the future generation of developers on these types of models in graduate and post-graduate education. It is important that students can explore these models during their education, both to understand their basic functionality and to learn how to incorporate ethical considerations into the development of new models. Allowing only large tech companies to possess the tools to train the next generation could also result in siloed thinking, and these organizations may forgo the holistic education that access to these models can provide in favor of a more efficient, learn-as-needed framework. In our response, we also highlighted the importance of establishing a culture of openness around these models’ development, emphasizing that such a culture can be as important as regulating the technologies themselves. If tech companies are expected to build generative models transparently, then future regulation becomes much easier to conduct.
Finally, the CCC stressed the need for additional research on foundation models, citing the public’s current lack of knowledge about how these models actually function and arrive at the results they produce. In our response, we listed a number of unanswered research questions that researchers, scientists, scholars, and experts in social issues are poised to begin answering, provided they receive the open access they need to the kinds of large foundation models that industry is now exploiting. Our continued success as a society depends on it.