Computing Community Consortium Blog

The goal of the Computing Community Consortium (CCC) is to catalyze the computing research community to debate longer range, more audacious research challenges; to build consensus around research visions; to evolve the most promising visions toward clearly defined initiatives; and to work with the funding organizations to move challenges and visions toward funding initiatives. The purpose of this blog is to provide a more immediate, online mechanism for dissemination of visioning concepts and community discussion/debate about them.


CCC @ AAAS 2024: Generative AI in Science: Promises and Pitfalls Recap – Part Four

March 21st, 2024 / in AAAS / by Catherine Gill

CCC supported three scientific sessions at this year’s AAAS Annual Conference. This week, we will summarize the highlights of the session, “Generative AI in Science: Promises and Pitfalls.” This panel, moderated by Dr. Matthew Turk (president of the Toyota Technological Institute at Chicago), featured Dr. Rebecca Willett, professor of statistics and computer science at the University of Chicago, Dr. Markus Buehler, professor of engineering at the Massachusetts Institute of Technology, and Dr. Duncan Watson-Parris, assistant professor at Scripps Institution of Oceanography and the Halıcıoğlu Data Science Institute at UC San Diego. In Part Four, we summarize the Q&A portion of the panel.

 

A Q&A session followed the panelists’ presentations, and Dr. Matthew Turk kicked off the discussion. “‘Promises and Pitfalls’ is in the title of this panel. We’ve discussed many of the promises, but we haven’t addressed many of the pitfalls. What worries you about the future of generative AI?”

 

“The reliability and trustworthiness of these models is a big concern,” began Dr. Rebecca Willett. “These models can predict things that are plausible, but are missing key, salient elements. Can I, as a human, recognize that there is something missing there?”

 

Dr. Markus Buehler added that the actual prediction of a model may take a second, but the experimental process of validation can take months or a year, or longer. So how should we operate in the interim when we have not verified the results? “We also need to educate the next generation of generative AI developers so that they design models which are trustworthy and verifiable, and that we can use physics-based insights in our construction of these models.”

 

Dr. Duncan Watson-Parris built on both of the previous points, saying, “Because these models are designed to generate plausible results, we can’t just look at the results to verify their accuracy. Generative AI researchers need to have a deep understanding of how these models work in order to verify their results, which is why correctly educating the next generation is so important.”

 

Audience Member: “In materials science, we know the direction forward for studying some materials, but for others, like room-temperature superconductors, we don’t know how to move forward. What do you think the path forward in studying these unknown materials will look like? And how should this type of research be enabled from a regulatory standpoint?”

 

“Well, I’m not an expert in superconductor research,” said Dr. Buehler, “so I won’t speak directly to that, but I can talk generally about how we make advances in materials science, specifically in my area of protein and biomaterials development. The way we make advances is having the ability to push the envelope. We run new experiments and test outlandish ideas and theories and see which ones work and why. As for how we should enable this research, we need more open-source models with collective access. I would encourage politicians to not over-regulate these technologies, such that researchers and the public have access to these types of models. I don’t think it is a good idea to prevent people from using these models, especially when we can crowdsource ideas and developments and introduce knowledge from diverse fields of human activity. For example, when the printing press was invented, authorities tried to limit the availability of this technology so few books could be read en masse, but this effort failed miserably. The best way to protect the public is to facilitate access to these models in such a way that we can develop, explore, and evaluate them extensively for maximum benefit of society.”

 

Audience Member: “Most generative AI models today are regression models that focus on simulating or emulating different scenarios. However, discovery in science is fueled by the hypotheses and predictions we dream up. So how do we create models which are intended to conceive new predictions instead of the current models which are used mostly for experimentation?”

 

Dr. Buehler responded first, saying, “You’re right, most traditional machine learning models are often regression-based, but the models we spoke about today work differently. When you put together multi-agent systems with many capabilities, they actually begin to explore new scenarios and they begin to reason and make predictions based on the experiments they’ve run. They become more human. You, as a researcher, wouldn’t run an experiment and just be finished – you would run an experiment and then begin to look at the data and validate it and make new predictions based off of this data, to connect the dots and extrapolate by making hypotheses and imagining how a new scenario would unfold. You would experiment, collect new data, develop a theory and propose perhaps an integrated framework about a particular matter of interest. Then you would defend your ideas against your colleagues’ critiques and perhaps revise your hypothesis when new information is used. This is how new multi-agent adversarial systems work, but of course they complement human skills with a far greater ability to reason over vast amounts of data and representations of knowledge. These models can already generate new hypotheses that push the envelope far beyond what has been studied already, adding to the scientific process of discovery and innovation.”

 

“I would complement that,” interjected Dr. Willett, “with the area of completion discovery and symbolic regression as being another area much more targeted towards hypothesis generation. There is a lot of ongoing work in this space.”

 

Audience Member: “How do we increase access to these types of models and overcome hurdles, such as most models being created for English speakers?”

 

Dr. Rebecca Willett answered, saying, “Lots of people have access to using these models, but designing and training them costs many millions of dollars. If only a small set of organizations are able to set up these models, then only a very small set of people are making the decisions and setting priorities in the scientific community. And often the priorities of these organizations and individuals are profit driven. That said, I think that landscape is starting to change. Organizations like the NSF are trying to build infrastructure that can be accessed by the broader scientific community. This effort resembles the early development of supercomputers. In the early days, researchers had to submit lengthy proposals to get access to a supercomputer. I think we are going to see similar emerging paradigms in AI and generative AI.”

 

“I agree,” said Dr. Watson-Parris. “Adding to that from a regulatory side, I don’t think we should regulate basic research; perhaps the application spaces, but not the research itself.”

 

Thank you so much for reading, and stay tuned for the recaps of our other two panels at AAAS 2024.
