Recent major advances in large language models — sophisticated generative artificial intelligence (AI) algorithms trained on massive amounts of language data — have led to widely available writing tools such as OpenAI’s popular chatbot, ChatGPT, that are able to analyze text and produce new content in response to user prompts. This technology has important and immediate implications for scholars who write articles and for the journals that publish them. In May 2023, the International Committee of Medical Journal Editors (ICMJE) updated its recommendations for scholarly work published in medical journals with specific directives related to AI-assisted technologies.1 As CMAJ follows the ICMJE guidance, these new recommendations now apply to all manuscripts submitted to CMAJ.
Large language models have a powerful capacity to search and repackage information from their training data set in a wide variety of formats and styles that users can specify. They can be used to generate ideas and outlines for scholarly manuscripts, or even the full text of articles. Because contemporary AI tools can be remarkably well trained to imitate human speech and writing styles, their outputs may seem very much as if they came from a human author and can convey the impression of accuracy and authority, as well as emotional connection.
However, this impression is an illusion. Because of the way that computers compress and store extremely large data sets, large language models estimate much of the information that they retrieve and compile instead of being able to reproduce it accurately, rather like trying to reconstruct the exact text of a lecture you didn’t attend based on point-form notes taken by someone else. As a result, the outputs of large language models are highly error prone and much content can be fabricated (e.g., references), while what is reproduced accurately may constitute plagiarism.2
Current AI algorithms can’t reliably distinguish whether the information on which they have been trained is true or false and can’t identify when they are fabricating (“hallucinating”) information rather than accurately reproducing it, further contributing to the unreliability of their outputs. Additionally, because current versions of large language models can’t update their training data in real time, their outputs may be out of date. For example, ChatGPT was trained on information accessible on the Internet as of 20213 — a long time ago for some research fields with very rapid information turnover.
And while all of the above assumes that authors are trying to be accurate and honest, AI creates unprecedented potential for unscrupulous individuals to commit scientific misconduct by generating convincingly fabricated or falsified papers. Such individuals would be foolish, however, as tools are being rapidly developed to detect use of AI-assisted technologies in scientific articles.4
For these reasons, the ICMJE — and CMAJ — now require that authors disclose any use of AI-assisted technologies in the generation of any part of a submitted manuscript (Box 1). Disclosure should occur in both the manuscript itself and the cover letter that accompanies manuscript submission, and authors should be prepared to provide detailed information regarding the nature of their use. Artificial intelligence–assisted technologies must not be named as authors of articles, because they are incapable of fulfilling several required ICMJE criteria for authorship, including being able to take responsibility for the published work, declare competing interests and enter into copyright and licensing agreements.1 Instead, human authors must assume full responsibility for ensuring that all content generated by AI is accurate and free from error, fabrication and plagiarism. Similarly, AI-assisted technologies must not be cited as a primary source for any information provided in a manuscript, as AI merely reproduces (often inaccurately) other primary sources, the identity and quality of which may not be known.
Box 1: Requirements for reporting use of artificial intelligence–assisted technologies for manuscripts submitted to CMAJ (based on recommendations of the International Committee of Medical Journal Editors1)
At article submission, CMAJ requires authors to disclose any use of artificial intelligence (AI)–assisted technologies (e.g., large language models, chatbots, image creators) in any aspect of the creation of the submitted work. Authors should describe the nature of such use in the cover letter as well as in the manuscript itself.
Artificial intelligence and AI-assisted technologies must not be listed as an author or co-author of a manuscript.
Artificial intelligence and AI-assisted technologies must not be cited as a reference or other primary source or as an author of a reference.
Human authors are responsible for any submitted material that includes the use of AI-assisted technologies, including its correctness, completeness and accuracy.
Authors must be able to assert that there is no plagiarism in the article, including in text and images produced by AI-assisted technologies, and must ensure appropriate attribution of all material, including full citations where appropriate.
Peer reviewers must not upload CMAJ manuscripts to software or other AI technologies where confidentiality cannot be assured.
Even if AI-assisted technologies are used in a manner in which manuscript confidentiality can be guaranteed, peer reviewers who choose to use such technologies to facilitate their review must disclose their use and its nature to CMAJ and are responsible for ensuring that any AI-generated content incorporated into reviews is correct, complete and unbiased.
Because AI-assisted technologies can be highly efficient at processing and analyzing text, people who agree to act as peer reviewers of manuscripts might be tempted to use these tools to assist their review process. However, the ICMJE recommendations forbid this in many cases. Many AI tools retain a record of all content uploaded to them as part of their ongoing process of development. Uploading part or all of an unpublished manuscript to such an AI tool would therefore constitute a violation of the requirement that reviewers must keep these manuscripts confidential.
Incidental use of simpler, commonplace AI tools — such as those that correct spelling and grammar in many modern word-processing software applications — need not be disclosed. Moreover, research designed to study the nature and effect of AI is, of course, permitted, with appropriate description of methodologies employed.
These new recommendations prudently address some foreseeable harms of emerging AI-assisted technologies to science and scholarly publishing. Although these technologies may offer promising theoretical benefits for clinical practice, administrative tasks and everyday life, their overly rapid adoption risks serious unintended consequences.5 The propensity of current versions of AI-assisted technologies to generate misinformation and error is unacceptable in science, which depends fundamentally on accuracy, precision and reproducibility; all researchers and scholars tempted to use AI tools in their work should be aware of this. Development and use of AI-assisted technologies must proceed with transparency, accountability and caution. However sophisticated AI technologies become, they remain algorithms designed by humans; they will never replace the human creativity, curiosity and ingenuity that form the foundation of science and scholarship.
Footnotes
Competing interests: www.cmaj.ca/staff
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY-NC-ND 4.0) licence, which permits use, distribution and reproduction in any medium, provided that the original publication is properly cited, the use is noncommercial (i.e., research or educational use), and no modifications or adaptations are made. See: https://creativecommons.org/licenses/by-nc-nd/4.0/