An open book and a closed book lie at an angle on top of each other; individual letters fall out of their pages and gather beside them, unsorted.

The Use of ChatGPT in Science: Is Text just a Means to an End?

The emergence of generative large language models opens new questions about the relevance of textual work in science. We spoke with researchers Anne Krüger and Ingmar Mundt about publication surges, research ethics and labeling requirements.

You’re currently researching how AI language models are being used by scientists. Where are these tools already applied and where can they be helpful, or even improve the quality of work?

Anne Krüger: ChatGPT and other text-generating AI have made quite an impact in the field of science and have been extensively discussed since. On the one hand, their impact on teaching and education is highly debated, and on the other hand, they have implications for research and the daily routines of scientific work, which often involve numerous administrative tasks. Particularly in the latter, the potential of text-generating AI is seen in a positive light, as it can assist in administrative processes by generating context-appropriate text from keywords, for example.

One potential improvement discussed is the support in creating scientific texts. For instance, it's argued that individuals whose native language is not English, the predominant "language of science," can benefit from translation capabilities and linguistic suggestions. This could also help mitigate existing inequalities within the scientific system.

Ingmar Mundt: However, it quickly becomes evident that the extent of support varies significantly depending on the academic discipline. For a computer scientist, using ChatGPT to write or review program code is quite different from employing AI in the social sciences to edit an article. The crucial factor here is the role that text plays within a discipline. Is it a means to an end for disseminating results, or is the text and its argumentation the very product of scholarly work?

What sort of problems are we facing when using ChatGPT in scientific work?

Anne Krüger: Crucial here are both technical skills and critical reflection. On the one hand, it’s necessary to have a comprehensive understanding of how to operate these tools. On the other hand, it remains essential to assess the quality of the AI-generated text independently. One problem is that it is often unclear what underlying data the generated text is based on. This has raised concerns about potential new inequalities because text generation – in simplified terms – relies on probabilities and frequencies, which can further marginalize less frequently addressed topics and theses.

Ingmar Mundt: Another question to consider is whether these texts have any informative novelty at all. Some even fear the end of innovation as text-generating AI increasingly relies on AI-generated texts. While we may not go that far for now, we do see that this allows for the rapid production of scientific or scientific-sounding texts. There are tools designed to create a summary of the current state of the literature. There are tools meant to automate data analysis and present it in graphics. ChatGPT assists in both the development of research questions and the creation of linguistically suitable texts for a corresponding journal. The concern is that, alongside responsible use, there will be many individuals who lack scientific rigor in this process, which will be much more challenging to verify in the future.

Anne Krüger: At the same time, there is also the issue of commercialization of such digital infrastructures. This could become an even more significant concern in the future if AI-supported tools for the research process become more relevant but continue to be distributed by commercial providers. This situation could potentially lead to research institutions' available resources determining who can use them, thereby gaining an advantage in grant applications, research practices, and research publications.

Are these problems new, or do they reveal existing weaknesses in the scientific system?

Anne Krüger: The sheer volume of the potential flood of publications is certainly new. However, the incentive to produce as many publications as possible has been there for a while, not least because publications are the main "currency" in science. Far too often, scientific achievement is measured just by the number of publications. We have long needed a change here, so that the focus returns to the content, emphasizing that science is not solely defined by publications. Perhaps this deluge of publications and grant applications will lead to an inflation of this "currency." After all, all these publications and proposals need to be reviewed. Maybe it's better to invest time in evaluating scientists based on their substantive contributions rather than relying solely on quantitative metrics.

What questions arise from this for your research?

Anne Krüger: Text-generating AI is going to change scientific practices. Everyone agrees on that. The question is, how? That's what we need science and technology research for, to understand the impact this technology will have on science. We’re studying – depending on the discipline – what new potentials and risks arise, how this will affect scientific quality control, and what effect the use of text-generating AI has on the overall scientific system, its political governance, and resource distribution.

We need to consider which AI activities can facilitate scientific work in the future and which tasks will continue to require human abilities and competencies. What skills will scientists need to learn to use these tools and enhance the quality of their research? And how does this alter the scientific process?

Another critical aspect here is understanding where these different tools come from, who develops them under what conditions and with what notions of scientific practice, and what kind of services will be emerging. It's becoming evident that companies that have positioned themselves in the area of research analytics in recent years, boasting extensive databases, are currently exploring new commercial applications.

Ingmar Mundt: An important consideration when directly engaging with these tools is discerning the boundary between support and independent contribution, as well as navigating the use of AI-generated text. Should the text simply be adopted uncritically, or should this opportunity for rapid data processing be used to derive insights and information against the backdrop of personal reflections and knowledge? How much expertise in the research field is required not only to generate corresponding text, but also to assess and evaluate the text produced in this manner?

Ultimately, we must ponder how to handle the outcomes of AI-supported scientific work and consider the possible necessity of labeling requirements to transparently outline the technologies employed in the process.

Many universities are currently grappling with the challenge of verifying ChatGPT's use in term papers and dissertations. How useful is the implementation of these AI-based checks?

Ingmar Mundt: As seen in many other aspects of digitalization, the technological advancement of AI tends to outpace its societal or political means of control. The scope for controlling it is thus somewhat limited; what's more important is imparting the proper handling of these technologies, teaching how to utilize them as tools, and making them accessible. Otherwise, inequalities will arise between those who have access and those who do not.

Those seeking to misuse technologies will continue to find new methods in the future. However, this doesn't change the fact that these tools will or should play a significant role. In a way, it echoes past discussions about Wikipedia. Initially, the knowledge stored there wasn't considered trustworthy and wasn't citable in educational or scientific contexts. While legitimate criticisms about certain Wikipedia practices persist, today, the platform is often the primary resource for getting an overview of a topic. From there, one can delve deeper into the subject matter using the cited sources or critically question the knowledge presented. Similarly, in due course, there will be a normalization effect concerning ChatGPT, even though developing an appropriate approach to it will undoubtedly present us with several challenges.

How could comprehensive guidelines for handling ChatGPT and other tools be established? Which stakeholders are relevant in this context?

Anne Krüger: Essentially, multiple stakeholders are involved in achieving a consensus on the use of language models in science. Primarily, it falls upon the respective professional societies and associations to establish regulations aligning with good scientific practice within each discipline. Scientific journals also need to agree on the extent to which text-generating AI can be utilized and how it should be disclosed. Additionally, funders need to specify guidelines for proposal writing. It's crucial to collectively agree on the understanding of authorship and authenticity across these domains and insist on those standards. This doesn't solely pertain to drafting publications and grant applications but also extends to their peer review. Especially for this often unpopular yet profoundly important task, which is extremely labor-intensive for the scientists involved, regulating the use of text-generating AI should be considered.

Ingmar Mundt: Similar to many technological advancements, an ethical commitment from researchers is crucial. Research ethics, such as handling sources or anonymizing personally identifiable data, serve as the foundation for good scientific practice today. Of course, they do not entirely prevent misuse. Those seeking an apparent advantage will always find ways to exploit and leverage new digital technologies. However, over time, methods will likely emerge to trace, for instance, unmarked uses.

A political challenge for regulation - not of the users but of the manufacturers - pertains to data protection. Everything inputted into ChatGPT is stored there. This raises concerns regarding copyright, as well as the protection of intellectual property that is still in the research process and currently evolving. For projects with high confidentiality, it is probably advisable to refrain from using ChatGPT until there are binding regulations in place regarding these matters.

Thank you for the interview!

Anne Krüger leads the Weizenbaum research group Reorganisation of Knowledge Practices. As a sociologist, her research is situated at the intersection of Critical Data Studies, Organizational Sociology, the Sociology of Valuation and Evaluation, and Science and Technology Studies.

Ingmar Mundt is a research associate in the research group Reorganisation of Knowledge Practices. Also a sociologist, he researches the role of digital technologies (esp. predictive algorithms), data and knowledge in future and risk predictions. He focuses in particular on the sociology of technology and knowledge, science and technology studies, as well as critical algorithm and data studies.

They were interviewed by Leonie Dorn.

artificial&intelligent? is a series of interviews and articles on the latest applications of generative language models and image generators. Researchers at the Weizenbaum Institute discuss the societal impacts of these tools, add current studies and research findings to the debate, and contextualize widely discussed fears and expectations. In the spirit of Joseph Weizenbaum, the concept of "Artificial Intelligence" is also called into question, unraveling the supposed omnipotence and authority of these systems. The AI pioneer and critic, who developed one of the first chatbots, would have celebrated his 100th birthday this year.

Prof. Dr. Anne K. Krüger

Ingmar Mundt