
Wikipedia:Village pump (proposals)/RfC LLMCOMM guideline


RfC: Turning LLMCOMM into a guideline


Request for Comment


Should the proposed guideline at User:Athanelar/Don't use LLMs to talk for you be accepted?

Please see the collapsed sections above for the pre-RfC workshopping and discussion of the topic.

Please indicate whether you Support or Oppose the proposed guideline and why. Athanelar (talk) 19:37, 15 January 2026 (UTC)[reply]

  • Oppose for all the reasons that WhatamIdoing explained in the discussion far more eloquently than I can. It's not a guideline to help editors understand the issues and good practice around LLM use on talk pages; it's an overly long essay proselytising about the evils of LLMs (well, that's a bit hyperbolic, but not by a huge amount). Don't get me wrong, we should have a guideline in this area, but this is not it. Thryduulf (talk) 19:44, 15 January 2026 (UTC)[reply]
  • Support I see no problem with it. Nononsense101 (talk) 19:51, 15 January 2026 (UTC)[reply]
  • Oppose the guideline per WhatamIdoing and support her alternative proposal at User:WhatamIdoing/Sandbox. This policy conflates all sorts of problems with AI (what is the section User:Athanelar/Don't use LLMs to talk for you#Yes, even copyediting doing here when the substance of that section is about copyediting article text in a guideline that is about talk page comments?), makes a number of dubious claims about LLMs that, rather than being supported by evidence, are supposed to be taken on faith, and is once again either dubiously unclear or internally contradictory (the claim that the guideline does not aim to restrict the use of LLMs [for those with certain disabilities or limitations], for example). This would be great as a WP:Essay, but definitely not as a guideline. Katzrockso (talk) 23:03, 15 January 2026 (UTC)[reply]
    what is the section User:Athanelar/Don't use LLMs to talk for you#Yes, even copyediting doing here when the substance of that section is about copyediting article text in a guideline that is about talk page comments? Showing that LLMs have trouble staying on task when copyediting is relevant regardless of where that copyediting takes place, whether it's in article text or talk page comments. It's a supplement to the caution in the 'Guidance' section about using LLMs to cosmetically enhance comments. Athanelar (talk) 23:11, 15 January 2026 (UTC)[reply]
    Elsewhere I have used LLMs to copyedit a few times, and I have noticed this phenomenon (LLMs making additional changes beyond what you asked) using the freely available LLMs (I believe the behavior of models is wildly variable, so I cannot speak about the paid ones, which I refuse to pay for on principle). However, this was not a problem when I gave the LLM more specific instructions (e.g., do not change text outside of the specific sentence I am asking you to fix). The gist of the argument in that section is a non sequitur: from the three examples given, the conclusion LLMs cannot be trusted to copyedit text and create formatting without making other, more problematic changes does not follow. Katzrockso (talk) 23:41, 15 January 2026 (UTC)[reply]
    Athanelar, guidelines don't normally spend a lot of time trying to justify their existence. Think about an ordinary guideline, like Wikipedia:Reliable sources. You don't expect to find a section in there about what would happen to Wikipedia if people used unreliable sources, right? This kind of content is off topic for a guideline. WhatamIdoing (talk) 00:24, 16 January 2026 (UTC)[reply]
    Sure, but LLMs are a topic that people are uniquely wont to quibble about, whether because their daily workflow is already heavily LLM-reliant or simply because they have no idea why anybody would want to restrict the use of LLMs. I think it's sensible to assume that our target audience here will be people who aren't privy to LLM discourse, especially Wikipedia LLM discourse, and so some amount of thesis statement is sensible. Athanelar (talk) 01:44, 16 January 2026 (UTC)[reply]
    I think that separating simple clearly defined guidelines from longer essays justifying the reasoning would be the prudent thing to do here. Some people just need to be told that no means no without getting bogged down in the weeds, while others might genuinely be interested in a deeper understanding of the issues. ChompyTheGogoat (talk) 09:25, 28 January 2026 (UTC)[reply]
  • Oppose We should do something, but this manifesto isn't it. For example:
    • This is supposed to be about Talk: pages, and it spends 200+ words complaining about LLMs putting errors into infoboxes and article text.
    • Sections such as A large language model can't be competent on your behalf repeatedly invoke an essay, while apparently ignoring the advice in that same essay (e.g., "Be cautious when referencing this page...as it could be considered a personal attack"). In fact, that same essay says If poor English prevents an editor from writing comprehensible text directly in articles, they can instead post an edit request on the article talk page – something that will be harder for editors to do, if they're told they can't use machine translation because the best machine translation for the relevant language pair now uses some form of LLM/AI – especially DeepL Translator.
    • Anything an LLM can do, you can do better is factually wrong. It's a nice slogan, but LLMs are better at some tasks than humans, for the same reason that Wikipedia:Bots are better at some tasks than humans.
    Overall, this is an extreme, maximalist proposal that doesn't solve the problems and will probably result in more drama. In particular, if adopted, I expect irritable editors to improperly revert comments that sound (in their personal opinion) like they were LLM-generated when they shouldn't. IMO "when they shouldn't" includes comments pointing out errors and omissions in articles, people with communication disorders such as severe dyslexia (because they'll see "bad LLM user" and never stop to ask why they used it), people with autism (whose natural, human writing style is more likely to be mistaken for LLM output), and people who don't speak English and who are trying to follow the WP:ENGLISHPLEASE guideline. WhatamIdoing (talk) 23:07, 15 January 2026 (UTC)[reply]
    This has probably (hopefully?) been mentioned elsewhere and I just haven't found it yet, but there needs to be a clear distinction drawn between simple tools that correct spelling and grammar or provide direct translation vs LLMs that actually generate original (using the term loosely) content. I can't imagine anyone taking issue with using spellcheck. Direct translations are usually sufficiently intelligible for a simple request like examples given here when they're in common languages, though that may be less true if the original language is obscure, and I don't know whether or not LLMs are any better at those. Personally I'm quite willing to engage in conversations via simple translations and attempt to work out any errors, whereas blatant LLM text from a user that can't possibly understand what it's saying on their behalf is very off-putting - and concerning IMHO. I would suggest a simple requirement for a disclaimer when someone is using any method of translation for their comments to help prevent misunderstandings. I would also like to see a preference for simple tools, but in the name of WP:ACCESS LLMs could be permitted if the user truly feels they cannot explain themselves sufficiently without it, with emphasis on minimizing use as much as possible. That would require an assumption of good faith as no one else is entitled to make such a judgement call on someone else's behalf. Of course this still only applies to TPs etc, not article content. ChompyTheGogoat (talk) 10:03, 28 January 2026 (UTC)[reply]
  • Oppose We should be judging edits and comments on their actual content, not on whether any of it appears to be LLM-generated. - Donald Albury 00:31, 16 January 2026 (UTC)[reply]
    I agree in principle. That said, from the discussion above, I do think we need to redesign the unblock process to make it less dependent on English skills, because needing to post a well-written apology is why many people turn to their favorite LLM. I'm looking at the Wikipedia:Unblock wizard idea, which I think is sound, but it still wants people to write "in your own words". For most requests, it would probably make more sense to offer tickboxes, like "Check all that apply: □ I lost my temper.  □ I'm a paid editor.  □ I wrote or changed an article about myself, my friends, or my family.  □ I wrote or changed an article about my client, employer, or business" and so forth. WhatamIdoing (talk) 00:40, 16 January 2026 (UTC)[reply]
    From my limited experience reading unblock requests, it appears that the main theme that administrators are looking for is admission of the problem that led to the block and a genuine commitment to avoiding the same behavior in future editing. I think some people might object to such a formulaic tickbox (likely for the same reasons they oppose the use of LLMs in unblock requests) as it removes the ability of editors to assess whether the appeal is 'genuine' (whether editors are reliable arbiters of whether an appeal is genuine or not is a different question), which is evinced from the wording and content of the appeal. Katzrockso (talk) 01:25, 16 January 2026 (UTC)[reply]
    I think we need to move away from a model in which we're looking for an emotional repentance and towards a contract- or fact-based model: This happened; I agree to do that. WhatamIdoing (talk) 04:06, 16 January 2026 (UTC)[reply]
    I think the key thing that needs to be communicated is that they understand why they were blocked. Not just "I got blocked for editwarring" but "I now understand editwarring is bad because...". Agreeing on what happened is a necessary part of that (if you don't know why you were blocked, you don't know what to avoid doing again) but not sufficient, because if you don't understand why we regard doing X as bad, then you're likely to do something similar to X and get blocked again. Thryduulf (talk) 04:33, 16 January 2026 (UTC)[reply]
    @WhatamIdoing, I have good news for you: most or even all of us already use the model you desire (for which tickboxes would be effectively useless). You may be interested in watching CAT:RFU yourself to see how these work out. I've also got an in-progress guide on handling unblocks. -- asilvering (talk) 17:43, 16 January 2026 (UTC)[reply]
    My thought with tickboxes is that there is no opportunity to use an LLM when all you're doing is ticking a box.
    I partly agree with your view that "It doesn't matter whether they're sorry". It doesn't matter in terms of changing their behavior, but it can matter a lot in terms of restoring relationships with any people they hurt. This is one of the difficulties. WhatamIdoing (talk) 17:50, 16 January 2026 (UTC)[reply]
    Sure, there's no opportunity to use an LLM. But then we have exactly the same problem that we have when they're using LLMs: we don't actually know that they understand anything at all. -- asilvering (talk) 18:38, 16 January 2026 (UTC)[reply]
    I think it depends on what you put in the checkboxes. Maybe "□ I believe my actions were justified under the circumstances" or "□ I was edit warring, but it was for a good reason". WhatamIdoing (talk) 20:28, 16 January 2026 (UTC)[reply]
    Are we still talking about language barriers, or just trolls? Because if someone doesn't have a firm grasp on English, it's quite possible they also don't grasp the underlying problem behind the block for that reason, as opposed to being an ESL troll. Neither tick boxes nor LLM rules can solve that. Perhaps (and feel free to correct me if such things already exist) there needs to be an easy and obvious way to request translation assistance regarding blocks and bans, so the translator can ensure the user knows what they're being told and has communicated the extent of their understanding in their own words? I know that might cause a delay, but if users choose to bypass it because they're impatient, they'd need to live with the consequences. ChompyTheGogoat (talk) 10:14, 28 January 2026 (UTC)[reply]
    We're talking about a general purpose unblock request system. Currently, there is no easy and obvious way to do anything related to unblock requests. WhatamIdoing (talk) 19:04, 28 January 2026 (UTC)[reply]
    I have no experience with the process, but there's generally a template involved with blocks, right? My recommendation would be to add something along the lines of a warning that their request/response may have permanent consequences, should be in their own words, and a link to wherever translation requests are usually handled. It won't make any difference with bad actors, but might help when someone with a language barrier is acting in good faith and doesn't realize that a well phrased LLM response will hurt their case. ChompyTheGogoat (talk) 11:48, 29 January 2026 (UTC)[reply]
  • I'd support it if it was tweaked First, a preamble. We continue to nibble around the edges of the LLM issue without addressing the core issues. I still think we need to make disclosure of AI use mandatory before we're going to have any sort of effective discussion about how to regulate it. You can't control what you don't know is happening. That might take software tools to auto-tag likely AI revisions, or us building a culture where it's okay to use LLMs as long as you're being open about it.
    General grumbles aside, let's approach the particular quibbles with this proposal. This guideline is contradictory. The lead says that using LLMs is forbidden...but the body is mostly focused on trying to convince you that LLM use is bad. It's more essay than guideline. I also think that it doesn't allow an exemption for translation, which is...let's be honest...pervasive. Saying you can't use translation at all to talk to other editors will simply be ignored. I think this needs more time on the drawing board, but I'd tentatively support this if the wording was "therefore using them to generate user-to-user communication is strongly discouraged." rather than forbidden. CaptainEek Edits Ho Cap'n! 01:33, 16 January 2026 (UTC)[reply]
    The text of the guideline explicitly allows translation subject to review, but if that's unclear, that's a problem in itself. Athanelar (talk) 01:45, 16 January 2026 (UTC)[reply]
    Just one small point, but from a literal reading of two current rules, you are already required to disclose when you produce entirely LLM-generated comments or comments with a significant amount of machine-generated material; the current position of many Wikipedia communities (relevantly, us and Commons) is that this text is public domain, and all editors, whenever they make an edit with public domain content, "agree to label it appropriately". [5] Therefore, said disclosure is already mandatory - mainspace, talkspace, everywhere. The fact that people don't disclose, despite agreeing that they will whenever they save an edit, is a separate issue from the fact that those rules already exist. GreenLipstickLesbian💌🧸 06:31, 16 January 2026 (UTC)[reply]
    @GreenLipstickLesbian, I think that's a defensible position, but not one that will make any sense to the vast majority of people who use LLMs. So if we want people to disclose that they've used LLMs, we have to ask that specifically, rather than expecting them to agree with us on whether LLM-generated text is PD. -- asilvering (talk) 18:40, 16 January 2026 (UTC)[reply]
    @Asilvering Yes, but the language not being clear enough for people to understand is, from my perspective, a separate issue from whether or not the rule exists. We don't need to convince editors to agree with us that LLM-generated text is PD, just the same way I don't actually need other editors to agree with me on whether text they find on the internet is public domain, or that you can't use the Daily Mail for sensitive BLP issues - there just needs to be a clear enough rule saying "do this", and they can follow it and edit freely, or not and get blocked.
    And just going to sandwich on my point to @CaptainEek - it is becoming increasingly impossible to determine if another editor's text in any way incorporates text from an LLM, given their ubiquity in translator programs and grammar/spellcheck/tone-checking programs, which even the editors using them may not be aware rely on such technology. So LLMDISCLOSE, as worded, will always remain unenforceable and can never be made mandatory - and that's before getting into the part where it says you should state what version of an LLM you used, when a very large segment of the population using LLMs simply is not computer literate enough to provide that information. (Also, I strongly suspect that saying "I used an LLM to proofread this" after every two-line post which the editor ran through Grammarly, which is technically what LLMDISCLOSE calls for, would render the disclosures somewhat equivalent to Prop 65 labels - somewhere between annoying and meaningless in many cases, and something which a certain populace of editors would stick on the end of every comment because they believe that's less likely to get them sanctioned than forgetting to mention they had Grammarly installed)
    However, conversely, what the average enWiki editor cares about is substantial LLM interference - creation of entire sentences, extensive reformulation - aka the point at which the public domain aspect of LLM text and the PD labeling requirement start kicking in. It's not a perfect relationship, admittedly, but it covers the cases that I believe most editors think should be disclosed, while leaving alone many of the LLM use cases (like spellcheck, limited translation, formatting) that most editors are fine with or can, at the very least, tolerate. GreenLipstickLesbian💌🧸 19:35, 16 January 2026 (UTC)[reply]
    Perhaps a mandatory checkbox:
    I confirm that no part of this text was generated by AI/LLM, with an accompanying "What's this?" explanation script. A second option disclosing AI use (actual generated content) could require an additional review process, and that delay might help discourage it. ChompyTheGogoat (talk) 10:31, 28 January 2026 (UTC)[reply]
    People will tick whatever boxes they need to tick to accomplish their goal. If they have to swear to hand over their first-born child to post something, they'll do that. WhatamIdoing (talk) 19:05, 28 January 2026 (UTC)[reply]
    Longer explanation now at Wikipedia:Checking boxes encourages people to tell lies. WhatamIdoing (talk) 20:53, 28 January 2026 (UTC)[reply]
    People will (and do) lie in any format if they're acting in bad faith. @GreenLipstickLesbian said So LLMDISCLOSE, as worded, will always remain unenforceable and can never be made mandatory...when a very large segment of the population using LLMs simply is not computer literate enough to provide that information.
    This would make the point clear every time something is posted (with the "What's this" explanation being an important inclusion), providing simple justification for deleting/ignoring if they DO lie, which is the bigger purpose. "You checked the box agreeing that you understand the policy and did not use AI, then copypasta'd blatant slop anyway. Into the bin." Done.
    People who truly believe they're using it ethically could agree to the additional review option, so they don't feel forced into the "lie or don't post anything" corner. And I don't think built-in review tools are strictly necessary, because if it's just copyediting their own work, it's not likely to cause the same issues as full-on intentional generation. Reviewers can use their own judgement and let it slide, or maybe just give a warning if the content itself is acceptable. It usually won't be obvious that an LLM was used in that case anyway, which is kind of the point. We want to discourage problematic use and minimize time and energy spent on trolls without violating WP:BITE. ChompyTheGogoat (talk) 12:11, 29 January 2026 (UTC)[reply]
    If the "reviewers" (you mean "ordinary editors who are supposed to be focusing on content", right?) take the "simple" step of reverting or warning people who use AI tools under this proposal, they'd probably be screwing up. This proposal would explicitly authorize their use for people who have relevant disabilities or are English language learners. Therefore, complying with the proposed guideline would probably mean that your first response as a reviewer would be a polite question: "Did you use ChatGPT or a similar tool to write this? We don't normally like those on wiki, though we make a few exceptions as explained in User:Athanelar/Don't use LLMs to talk for you#Caution." WhatamIdoing (talk) 19:00, 29 January 2026 (UTC)[reply]
    Sorry, I got a bit off track - I was referring to the hypothetical checkbox option and people who DON'T disclose more subtle uses. Theoretically that could qualify it to be summarily deleted for lying, the same way a wall of slop would be under the same potential policy, but editors could give more leeway if it's less obvious and/or possibly unintentional. By "reviewers" I meant volunteers for the additional AI process, or anyone who notices that's been dodged. Ideally people who are using it legitimately and knowingly would disclose and go through that process, so not doing so would be either a mistake or bad faith. If it's ChatGPT copypasta it's hard to claim that checking the "no AI" box was a mistake because they didn't know. ChompyTheGogoat (talk) 23:25, 29 January 2026 (UTC)[reply]
    Do you want to check a "no AI" box every time you post a comment? I don't. WhatamIdoing (talk) 23:58, 29 January 2026 (UTC)[reply]
    @GreenLipstickLesbian WP:LLMDISCLOSE isn't mandatory though, just advised. In a system where it is not mandated, it won't be done unless folks are feeling kindly. But I acknowledge that with the current text of LLMDISCLOSE, we could begin to foster a culture that encourages, rewards, and advertises the importance of LLM disclosure. We may need a sort of PR campaign where it's like "are you using AI? You should be disclosing that!" But I think it'd be more successful if we could say you *must*. CaptainEek Edits Ho Cap'n! 18:57, 16 January 2026 (UTC)[reply]
    For the most part, people do what's easy and avoid what's painful. If you want LLM use disclosed, then you need to make it easy and not painful. For example, do we have some userboxes, and can that be considered good enough disclosure? If so, let's advertise those and make it easy for people to disclose. Similarly, if we want people to disclose, we have to not punish them for doing so (e.g., don't yell at them for being horrible LLM-using scum). WhatamIdoing (talk) 20:18, 16 January 2026 (UTC)[reply]
    Userbox is easy to make and a good start. A checkbox for every edit, like the minor edit checkbox...now that could get us somewhere! CaptainEek Edits Ho Cap'n! 21:09, 16 January 2026 (UTC)[reply]
    Maybe... or maybe a checkbox would just get some people to check it always, even if they're not using an LLM (we don't have a rule against false disclosures), and still be ignored by most LLM-using editors. WhatamIdoing (talk) 21:23, 16 January 2026 (UTC)[reply]
    One argument I've seen against a per-edit checkbox is that it presumes acceptability. I.e., if you tick the box then it means your use of LLMs was fine because you disclosed it. Athanelar (talk) 21:28, 16 January 2026 (UTC)[reply]
    See my above suggestion re: additional review process. That would help to discourage it without an outright ban. It could also include a dropdown menu of common AI programs if someone does disclose it. An example for the explanation box might be something like: Wikipedia strongly encourages editors to use their own words when writing articles or comments and to avoid AI use whenever possible. However, we recognize that some users find value in such tools, so use is permitted with full disclosure. If you are not sure whether a program you use is considered AI, please check the dropdown list below. There is an additional review process for AI content, which may result in a delay.

    Someone else could probably phrase it better - maybe with a mention of accessibility, but you don't want people to assume "I don't have a disability so it doesn't apply to me." Keep it broad. But you get the gist. Heavy emphasis on avoidance but an option to disclose without being punished, and justification for swift consequences if they reject those options in favor of lying. It could also contain a "Why is this required" link, preferably to a new guideline written for that express purpose.

    As far as said review process, my suggestion would be to send them into a queue at WP:AIC with any article content requiring review BEFORE it goes live (and maybe a bot could create a TP section for it on the article in question) while TP comments could be posted immediately and just given a quick skim whenever someone gets to them. That might take longer to implement, but it could start as just a simple flag that anyone can glance over if they see it. ChompyTheGogoat (talk) 12:53, 29 January 2026 (UTC)[reply]
    Please be mindful of WP:BLUDGEON, I see you're replying a lot in this RfC. Especially considering you're proposing something entirely different and unrelated to the topic here. Athanelar (talk) 13:05, 29 January 2026 (UTC)[reply]
    I wouldn't say wholly unrelated, though I do see how it's off track in terms of !votes for your proposal. My intention with multiple comments was only to expand on the concept as I see more input from others, not to keep pushing the same point, but I'll step back regardless, since it's better suited for discussion elsewhere (somewhere). My brain tends to take off like that whenever something sparks an idea. ChompyTheGogoat (talk) 13:29, 29 January 2026 (UTC)[reply]
  • Support, enough is enough and the level of obstructionism is mind-blowing. Gnomingstuff (talk) 01:45, 16 January 2026 (UTC)[reply]
  • Unfortunately, there's no foolproof way to tell whether a comment was LLM generated or not (sure, there are WP:AISIGNS, but again, those are just signs). Agree with Katzrockso that this would work better as an essay than a guideline. Some1 (talk) 02:30, 16 January 2026 (UTC)[reply]
  • Support creating a guideline banning LLM usage on talk pages. Oppose User:Athanelar/Don't use LLMs to talk for you for being too verbose and complex. Perhaps we can just add a paragraph or two to an existing talk page guideline. –Novem Linguae (talk) 05:39, 16 January 2026 (UTC)[reply]
  • Oppose this proposal, would support a simpler proposal, especially a simple addition to existing guidelines as NL says above. ~~ AirshipJungleman29 (talk) 13:30, 16 January 2026 (UTC)[reply]
    'Simpler' in terms of scope or prose? Athanelar (talk) 13:53, 16 January 2026 (UTC)[reply]
  • Oppose. Too long, and I don't think a fourth revision would address the problems; this is trying to do too much, some of which is unnecessary and some of which is impossible to legislate. I agree with those who say a paragraph (or even a sentence) somewhere saying LLMs should not be used for talk page communication would be reasonable. Mike Christie (talk - contribs - library) 13:38, 16 January 2026 (UTC)[reply]
  • Support the crux of the proposal, which would prohibit using an LLM to "generate user-to-user communication". This is analogous to WP:LLMCOMM's "Editors should not use LLMs to write comments generatively", and would close the loophole of how the existing WP:AITALK guideline does not explicitly disallow LLM misuse in discussions or designate it as a behavioral problem. A review of the WP:ANI archives shows that editors are regularly blocked for posting LLM-generated arguments on talk pages and noticeboards, and the fact that our policies and guidelines do not specifically address this very common situation is misleading new editors into believing that this type of LLM misuse is acceptable. Editors with limited English proficiency are, of course, welcome to use dedicated machine translation tools (such as the ones in this comparison) to assist with communication. The passage of the WP:NEWLLM policy suggests that LLM-related policy proposals are more likely to succeed when they are short and specific, so I recommend moving most of the proposed document to an information or supplemental page that can be edited more freely without needing a community-wide review. — Newslinger talk 14:17, 16 January 2026 (UTC)[reply]
    I do wonder if some of the 'overlong' complainants, @Mike Christie @Novem Linguae etc., would support the guideline with the explanatory sections removed and only the substance (i.e., the 'Guidance for Editors' section) remaining. Athanelar (talk) 14:24, 16 January 2026 (UTC)[reply]
    For what it's worth, @Athanelar, I think User:Athanelar/Don't use LLMs to talk for you would make an excellent essay. Agree it's too long to be a Guideline (as learnt from my recent LLM policy RfCs). qcne (talk) 14:37, 16 January 2026 (UTC)[reply]
    I said above under version 2 that I don't think much of what is being addressed here is legislatable at all, but if anything is to be added I'd like to see a sentence or two added to a suitable guideline as Novem Linguae suggests. I think making this into an essay is currently the best option. Essays can be influential, especially when they reflect a common opinion, so it's not the worst thing that can happen to your work. Mike Christie (talk - contribs - library) 16:41, 16 January 2026 (UTC)[reply]
    @Newslinger, I've read that some of the "dedicated machine translation" tools are using LLMs internally (e.g., DeepL Translator). Even some ordinary grammar check tools (e.g., inside old-fashioned word processing software like MS Word) are using LLMs now. Many people are (or will soon be) using LLMs indirectly, with no knowledge that they are doing so. WhatamIdoing (talk) 17:44, 16 January 2026 (UTC)[reply]
    Which is one of the reasons why 1) people who can't communicate in English really shouldn't be participating in discussions on enwiki and 2) people who use machine translation (of any type) really should disclose this and reference the source text (so other users who either speak the source language or prefer a different machine translation tool can double-check the translation themselves). -- LWG talk (VOPOV) 17:52, 16 January 2026 (UTC)[reply]
    We sometimes need people who can't write well in English to be communicating with us. We need comments from readers and newcomers that tell us that an article contains factual errors, outdated information, or a non-neutral bias. When the subject of the article is closely tied to a non-English-speaking place/culture, then the person most likely to notice those problems is someone who doesn't write easily in English. If one of them spots a problem, our response should sound like "Thanks for telling us. I'll fix it" instead of "People who can't communicate in English really shouldn't be participating in discussions on enwiki. This article can just stay wrong until you learn to write in English without using machine translation tools!" WhatamIdoing (talk) 19:51, 16 January 2026 (UTC)[reply]
    IMO if they are capable of identifying factual errors, outdated information, or non-neutral bias in content written in English, then they should be capable of communicating their concerns in English as well, or at least of saying "I have some concerns about this article, I wrote up a description of my concerns in [language] and translated it with [tool], hopefully it is helpful." With that said, I definitely don't support biting newbies, and an appropriate response to someone who accidentally offends a Wikipedia norm is "Thanks for your contribution. Just so you know, we usually do things differently here, please do it this other way in the future." -- LWG talk (VOPOV) 20:04, 16 January 2026 (UTC)[reply]
    Because English is the lingua franca of the internet, millions of people around the world use browser extensions that automatically translate websites into their preferred language. Consequently, people can be capable of identifying problems in articles but not actually be able to write in English. WhatamIdoing (talk) 20:22, 16 January 2026 (UTC)[reply]
    I don't speak Dutch or Portuguese beyond a handful of words, but I can tell you that if I found an article about a political party or similar group saying "De leider is een vreselijke man." or "O líder é um homem horrível." (both meaning "The leader is a horrible man."), it needs to be changed. Similarly, I can tell you that an article infobox saying the gradient of a railway is 40% is definitely incorrect, but I can't tell you what either needs changing to, and I can't articulate what the problem is in Dutch or Portuguese, but I can use machine translation to give any editors there who don't speak English enough of a clue that they can fix it. The same is true in reverse. Thryduulf (talk) 21:48, 16 January 2026 (UTC)[reply]
    @WhatamIdoing @Thryduulf those are fair points, and situations like that wouldn't bother me (though transparency and posting the source text would still be preferred, to reduce possible misunderstandings). -- LWG talk (VOPOV) 22:17, 16 January 2026 (UTC)[reply]
    I don't know. A recent example: I removed a paragraph containing a hallucinated source from an article here recently. That paragraph had made it to the Korean Wikipedia (I checked dates to confirm the direction of transit), so I removed it there too, and used Google Translate to post an explanation, because otherwise it'd look like I was just removing text for no reason. Gnomingstuff (talk) 01:38, 18 January 2026 (UTC)[reply]
    Yes, I also recognize that many machine translation tools now incorporate LLMs in their implementation. (The other active RfC at Wikipedia talk:Translation § Request for comment seeks to address this for translated article content, but not translated discussion comments.) When an editor in an LLM-related conduct dispute mentions that they are using an LLM for translation, I have always responded that there is a distinction between using a dedicated machine translation tool (such as Google Translate or DeepL Translator) that aims to convey a faithful representation of one's words in the target language, and an AI chatbot that can generate all kinds of additional content. If someone uses a language other than English to ask an AI chatbot to generate a talk page argument in English, the output would not be acceptable in a talk page discussion. But, if someone uses an LLM-based tool (preferably a dedicated machine translation tool) solely to translate their words to English without augmenting the content of their original message, that would not be a generative use of an LLM and should not be restricted by the proposal. — Newslinger talk 01:00, 17 January 2026 (UTC)[reply]
    Unless there is some reliable way for someone other than the person posting the comment to know which it was, the distinction is not something we can or should incorporate into our policies, etc. Thryduulf (talk) 01:02, 17 January 2026 (UTC)[reply]
  • Oppose this version, support in principle. I agree with every point, but it's overly long and essay-y. I'd back something more like User:WhatamIdoing/Sandbox. JustARandomSquid (talk) 16:17, 16 January 2026 (UTC)[reply]
  • Support. I agree with the concerns that it is too long, and certainly far from a perfect proposal, but having something imperfect is better than a consensus against having any regulation at all. I do also agree with Newslinger's proposal of moving the bulk of it to an information page if there is consensus for it. Chaotic Enby (talk · contribs) 17:22, 16 January 2026 (UTC)[reply]
  • Oppose - Too restrictive and long. There is a reasonable way to use LLMs and this effectively disallows it, which is a step too far. That, coupled with the at-best educated guessing about whether something actually is an LLM and the assumption that it is all unreviewed, makes it untenable. PackMecEng (talk) 17:31, 16 January 2026 (UTC)[reply]
  • Support the spirit of the opening paragraph, but too long and in need of tone improvements. Currently the language in this feels like it is too internally-oriented to the discussions we have been having on-wiki about this issue, whereas I would prefer it to be oriented in a way that will help outsiders with no context understand why use-cases for LLMs that might be accepted elsewhere aren't accepted here. The version at User:WhatamIdoing/Sandbox is more appropriate in length and tone, but too weak IMO. I would support WhatamIdoing's version if posting the original text/prompt along with LLM-polished/translated output were upgraded from a suggestion to an expectation. With that said, upgrading WP:LLMDISCLOSE and WP:LLMCOMM to guidelines is the simplest solution and is what we should actually do here. -- LWG talk (VOPOV) 17:34, 16 January 2026 (UTC)[reply]
    I would also support those last two proposals, with the first one being required from a copyright perspective (disclosure of public domain contributions) and the second one being a much more concise version of the proposal currently under discussion. Chaotic Enby (talk · contribs) 17:36, 16 January 2026 (UTC)[reply]
    Also, if we did that then User:Athanelar/Don't use LLMs to talk for you would be a useful explanatory essay. -- LWG talk (VOPOV) 17:38, 16 January 2026 (UTC)[reply]
  • Support per Chaotic Enby and Newslinger. I don't see an issue with length, since the lead and nutshell exist for this reason, but am fine with some of it being moved to an information page. LWG's idea above is also good, though re LLMDISCLOSE, Every edit that incorporates LLM output should be marked as LLM-assisted by identifying the name and, if possible, version of the AI in the edit summary is something nobody is going to do unprompted (and which I've personally never seen). Kowal2701 (talk) 19:01, 16 January 2026 (UTC)[reply]
    something nobody is going to do unprompted True, but it's something people should be doing. Failing to realize you ought to disclose LLM use is understandable, but failing to disclose it when specifically asked to do so is disruptive - there's simply no constructive reason to conceal the provenance of text you insert into Wikipedia. So while I don't expect people to do this unprompted, I think we should be firmly and kindly prompting people to do it. -- LWG talk (VOPOV) 19:11, 16 January 2026 (UTC)[reply]
    I'd rather have something like Transparency about LLM use is strongly encouraged, and we should have practically zero tolerance for people denying LLM use in unambiguous cases; that ought to be met with a conditional mainspace block. I'll be bold and add something. Kowal2701 (talk) 20:47, 16 January 2026 (UTC)[reply]
  • Oppose as written. For an actual guideline, I would prefer something like User:WhatamIdoing/Sandbox. It makes clear the general expectations of the community should it be adopted. This proposal reads like an essay; it's trying to convince you of a certain viewpoint. Guidelines should be unambiguous declarations about the community's policies. For me, the proposed guideline is preaching to the choir; I agree with basically all of it, but I don't see it as appropriate for a guideline. I second what Chaotic Enby, Newslinger, and CaptainEek have said, and absolutely support the creation of a guideline of this nature. -- Agentdoge (talk) 19:27, 16 January 2026 (UTC)[reply]
  • Support per Newslinger and Chaotic Enby. fifteen thousand two hundred twenty four (talk) 19:50, 16 January 2026 (UTC)[reply]
  • Oppose. This guideline is too long and too complicated. The guideline should be pretty simple - the length and the complexity should be similar to User:WhatamIdoing/Sandbox. An explanation of the problems with LLMs could be added later as an explanatory essay, but should not be in this guideline. Provisions that editors are expected to WP:AGF before blaming LLM use should also be made. SunDawn Contact me! 02:05, 17 January 2026 (UTC)[reply]
  • Weak support for the original proposal, because the crux of it is still better than the status quo in spite of its flaws (too long, unfocused, essay-like); Strong support for WhatamIdoing's version, whether or not tweaks happen to it. Choucas0 🐦‍⬛ 14:32, 17 January 2026 (UTC)[reply]
  • I support a ban on LLM-written communication, but oppose this draft, as I find it to be poorly written in several respects. It is too long. The "Remedies" section entirely duplicates current practice elsewhere, some (maybe all?) of which is already documented in other guidelines. It says something is "forbidden" but then goes on to give an example of how LLMs actually can be used. And it generally contains far too much explaining of its own logic, which makes it much weaker and open to wikilawyering. "Anything an LLM can do, you can do better"? Nope. "Large language models cannot interpret and apply Wikipedia policies and guidelines"? Dubious. Toadspike [Talk] 18:22, 17 January 2026 (UTC)[reply]
    I support WhatamIdoing's proposal, which essentially addresses all of my concerns with the original proposal by Athanelar. Toadspike [Talk] 18:24, 17 January 2026 (UTC)[reply]
  • Oppose as written, support a ban/regulation on LLM communication. Toadspike and WhatamIdoing expressed my concerns here quite nicely; I think this proposal is not yet ready to become a guideline. For example, it uses examples from article editing for comparisons to communication. I also think that it insufficiently addresses WP:CIVILITY; some people may write in a way others interpret as LLM-generated, and if we fire the harshest wording we have at them we might scare away some great contributors from the project. Therefore I too support WhatamIdoing's proposal. Best, Squawk7700 (talk) 22:23, 17 January 2026 (UTC)[reply]
    Just to be clear: The thing I bashed together in my sandbox is IMO not ready for a WP:PROPOSAL. I created it only as an illustration of what could be done. WhatamIdoing (talk) 23:10, 17 January 2026 (UTC)[reply]
    Thanks for the clarification! I myself meant it as more of a literal proposal (not a WP:PROPOSAL); I can't speak for the others here, of course, and do think we'd probably need to do some workshopping. That said, I think you already did quite a good job on that draft. Kind regards Squawk7700 (talk) 23:21, 17 January 2026 (UTC)[reply]
    Now I feel bad knowing I supported someone's bashed-together draft over a third-iteration proposal. JustARandomSquid (talk) 23:49, 17 January 2026 (UTC)[reply]
    Don't feel bad. I've been writing Wikipedia's policies and guidelines for longer than some of our editors have been alive, and we spent hours discussing the problems we're having with AI comments before Athanelar launched this RFC. Given all that, it would be surprising if I couldn't throw together something that looks okay. WhatamIdoing (talk) 23:59, 17 January 2026 (UTC)[reply]
    Any outcome here which results in more restriction against LLM usage is a positive one for me. I might not be fully satisfied with WAID's approach to the matter, but more community consensus against AI will only ever be an improvement as far as I'm concerned. I'll be happy if my RfC leads to that; whether I wrote the final result is irrelevant. Athanelar (talk) 02:19, 18 January 2026 (UTC)[reply]
    It could use some tweaks, but I definitely think you're closer to a viable solution. It needs to be simple and concise, and it needs to recognize that some people ARE going to use them regardless of what we decide, so a blanket ban probably isn't the best way to go. There's nothing preventing others from composing longer essays on the matter, but newbies and casual users probably aren't going to read something that long. ChompyTheGogoat (talk) 13:20, 29 January 2026 (UTC)[reply]
    Regarding examples from articles: Finding an example of a talk page edit of this nature would be difficult bordering on impossible; people aren't supposed to edit others' comments, and they almost never do except to vandalize them or occasionally fix typos. Gnomingstuff (talk) 23:33, 18 January 2026 (UTC)[reply]
  • Oppose. Most of the section "Large language models are not suitable for this task" digresses from the main topic and relies on a very outdated understanding of what LLMs can do. Among its issues, the idea that LLMs only repeat text from their training data is untenable in 2026. The section also completely ignores LLM fine-tuning. However, like dlthewave, I would Support something similar to WhatamIdoing's sandbox, which is well-reasoned, properly nuanced and relatively concise. Alenoach (talk) 20:37, 18 January 2026 (UTC)[reply]
    [It] relies on a very outdated understanding of what LLMs can do.[citation needed] Among its issues, the idea that LLMs only repeat text from their training data is untenable in 2026.[citation needed] SuperPianoMan9167 (talk) 21:08, 18 January 2026 (UTC)[reply]
    For example, it claims that LLMs are not able to perform arithmetic operations, and that they instead only retrieve memorized results. But what if you ask it to calculate, e.g., 73616*3*168346/4? Surely this can't be in its training data, and yet ChatGPT gets the exact answer even without code execution. Alenoach (talk) 21:19, 18 January 2026 (UTC)[reply]
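    (Note: the reference value itself is easy to verify without any AI. A minimal check in plain Python, assuming nothing beyond a standard interpreter:)

        # Exact arithmetic for the expression quoted above - no LLM involved.
        >>> 73616 * 3 * 168346 / 4
        9294719352.0
        >>> 73616 * 3 * 168346 // 4   # integer division; the product is divisible by 4
        9294719352

    (So the exact answer a chatbot needs to reproduce, whether by prediction or by quietly delegating to a code interpreter as discussed below, is 9,294,719,352.)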
    ChatGPT secretly writes Python code to do the math. The actual LLM cannot do math. SuperPianoMan9167 (talk) 21:21, 18 January 2026 (UTC)[reply]
    Who cares about the LLM part of the final product; people aren't going in and using specifically the LLM part of ChatGPT. Katzrockso (talk) 21:44, 18 January 2026 (UTC)[reply]
    It can use a code interpreter, but it generally doesn't, and it's often visible in the chain-of-thought when it uses one. You can also see that when the number of digits in a multiplication is too high (e.g. > 20 digits), it will start making mistakes (similarly to a human brain), whereas a Python interpreter would get an exact answer. I haven't found any great source assessing ChatGPT on multiplications, but the graph shown here gives an idea of what ChatGPT's performance profile on multiplications looks like. Alenoach (talk) 21:54, 18 January 2026 (UTC)[reply]
    Because there's such a staggering amount of correct calculations in its training data that it usually gets things right, but it's fundamentally still a large language model. JustARandomSquid (talk) 21:23, 18 January 2026 (UTC)[reply]
    That, and it literally writes Python code to do the math when some non-AI code detects that the prompt contains arithmetic. See this article. SuperPianoMan9167 (talk) 21:25, 18 January 2026 (UTC)[reply]
    ...which means that we're wrong to say that these tools can't do these things.
    Remember that not everyone is going to draw a distinction between "the LLM part of ChatGPT" and "the non-LLM parts of ChatGPT". Non-technical people will often say "LLM" or "AI" when they mean something vaguely in that general category. The proposal here doesn't distinguish between the LLM and non-LLM components of these tools. It seeks to ban them all. WhatamIdoing (talk) 21:32, 18 January 2026 (UTC)[reply]
    Mine doesn't seem to disclose that. If they've started secretly doing stuff under the hood, that's messed up. Still, though, guidelines shouldn't have to apologetically explain themselves like this proposal does. JustARandomSquid (talk) 21:32, 18 January 2026 (UTC)[reply]
    The section "Large language models are not suitable for this task" relies on outdated information about LLMs, which undermines its relevance and accuracy. (Or possibly even earlier.) The subsection's premise that LLMs are limited to "repeating training data" is outdated and has been disproven by recent research showing a "neuron-level differentiation" between memorization and generalization. Distributional generalization benchmarks also show that LLMs are not merely recalling information, but actively generating new data. A further fatal flaw with this subsection is that it does not account for Parameter-Efficient Fine-Tuning (PEFT) and LoRA, techniques which allow for fine-grained control over a model's domain and rigidly enforce task-specific constraints, nor does it reflect the reality of how LLMs are used in practice. Furthermore, it fails to acknowledge the rapid pace at which developers are delivering improvements and addressing concerns.
    I would, like dlthewave, support WhatamIdoing's sandbox, which is well-reasoned and properly nuanced and does not fall prey to any of the above flaws. Sparks19923 (talk) 22:21, 28 January 2026 (UTC)[reply]
    [citation needed] SuperPianoMan9167 (talk) 00:12, 29 January 2026 (UTC)[reply]
    recent research showing a "neuron-level differentiation" between memorization and generalization Would you be willing to provide a link to said research?
    Distributional generalization benchmarks also show that LLMs are not merely recalling information, but actively generating new data. Which benchmarks? And what about this research? SuperPianoMan9167 (talk) 00:16, 29 January 2026 (UTC)[reply]
  • Support LLM outputs tend to be repetitive, irrelevant, and to cite common guidelines that everyone knows in support of the poster's argument; they are a time sink, because even an AI-generated response with no effort put into it has to be addressed by human effort. It is better to just ban it all. Zalaraz (talk) 01:50, 20 January 2026 (UTC)[reply]
  • Oppose per Thryduulf, "we should have a guideline in this area, but this is not it." Benjamin (talk) 06:28, 20 January 2026 (UTC)[reply]
  • Oppose as guideline, but retain this laudable effort on the part of Athanelar to address a growing problem as a proto-essay. While I am in favor of adapting WhatamIdoing's proposal into a badly-needed guideline, I fully understand OP's frustration. I say this as someone fresh out of a not-so-pleasant encounter with a disruptive LLM-using sockpuppet. As someone who uses next to no automated tools or bots himself (I appreciate 'Find on page'—though it feels a bit like cheating—and (sarcasm alert) it's nice to not have to constantly format a custom sig), you can imagine the frustration of having to watch someone spit out (I use the phrase advisedly) rapid-fire artificial responses to your questions. Worse, the LLM argues on its own behalf due to the lack of a clear guideline forbidding it from doing so, while repeating things you've brought up—such as articles, guidelines, and shortcuts—in an imbecilic way. I don't feel the same way about participating in a translated discussion, and the fact that I've never noticed being in one probably means the process works well. In any case, the sooner an LLM-talk guideline forbidding AI-produced discussion is implemented, the better. StonyBrook babble 07:55, 20 January 2026 (UTC)[reply]
    As a practical matter, if someone is breaking the WP:SOCK rules, then they'll also break any anti-LLM rules we adopt. Having rules only prevents behavior if the user wants to follow the rules (which is most people) and they know about the rules and the cost to them(!) of following the rule isn't too high in their(!) opinion. WhatamIdoing (talk) 17:52, 20 January 2026 (UTC)[reply]
    It's actually not about the cost - it's about the perceived cost to them of complying with the rules vs the perceived cost to them of not complying with the rules. In some cases whether they understand why the rule exists and whether they think it makes sense are also part of the equation. Thryduulf (talk) 18:22, 20 January 2026 (UTC)[reply]
    This was essentially my line of thinking in including the 'essay-like' content of the guideline. Athanelar (talk) 01:27, 21 January 2026 (UTC)[reply]
    I agree. It is not unreasonable to expect our colleagues to use their own words for communicating with each other instead of relying on AI. Zalaraz (talk) 01:14, 21 January 2026 (UTC)[reply]
    Earlier today, I removed some controversial and misleading content from the Polish Wikipedia. I don't speak any Polish. How would you have me "use my own words" there? I assume you think the rules should be the same, and that the English Wikipedia shouldn't make rules that we wouldn't want to follow ourselves. WhatamIdoing (talk) 03:02, 21 January 2026 (UTC)[reply]
    Not sure how this is related. You can use a language you do speak to speak your own words. There's no ideal way to edit in a language you don't speak. CMD (talk) 03:09, 21 January 2026 (UTC)[reply]
    I can use English, and get the equivalent of WP:ENGLISHPLEASE in reply, because not everyone there reads English. Alternatively, I can use some type of machine translation, and everyone will figure it out. Which would you prefer, if the situation was reversed? WhatamIdoing (talk) 05:00, 21 January 2026 (UTC)[reply]
    Why is this a hypothetical? People post in non-English here all the time. Usually I take a second or two to translate with browser tools. (99% of the time it's nonsense, but that is not limited to non-English posts.) CMD (talk) 13:04, 21 January 2026 (UTC)[reply]
    You do know that those browser tools incorporate LLM technologies to the same degree as things like Google Translate (if you are using Chrome then it is Google Translate). Thryduulf (talk) 13:07, 21 January 2026 (UTC)[reply]
    Not on Chrome, but yes, and? I am not posting the results as my own words. CMD (talk) 13:13, 21 January 2026 (UTC)[reply]
    So LLM translations are good enough for you to use, but not good enough for the editor who is posting? Even though you don't have access to machine translation for every language, and even though you often won't know which translation tool is best for the language in question?
    I prefer our recommendation of providing the original plus a machine-translated version for a talk-page comment, but I don't think it makes sense for each of us to use machine translation while trying to ban the editor from doing exactly what we're each individually doing and posting the results. WhatamIdoing (talk) 18:19, 21 January 2026 (UTC)[reply]
    This is a bizarre interpretation. Machine and LLM translations are not "good enough"; when I use them, they are simply the best at hand. They are not optimal. Furthermore, when I am using a translator, I know I am using a translator. I would have the original text of what the person was saying on hand, and I would know that this original text had gone through both a linguistic filter and a pattern-matching filter. For some languages I am even aware of common translation issues, and if other editors can see the original text as well, that means a higher chance another editor will have such information or even know the original language. I note the motte-and-bailey switch in the second paragraph; that would be great, but it was not your original argument, and what flows from it follows the bizarre interpretation. CMD (talk) 07:29, 22 January 2026 (UTC)[reply]
    They're obviously good enough (for some purposes), because you keep using them. If they weren't doing something desirable for you, you'd stop using them entirely.
    I do take your point about the benefits of knowing that machine translation was being used.
    From my POV, and remembering that we're specifically talking here about editor-to-editor communication (and not, e.g., writing an article), there's a spectrum of functionality: labeled machine translation → unlabeled machine translation → non-English → not getting information we need. WhatamIdoing (talk) 18:57, 22 January 2026 (UTC)[reply]
    Something being good enough "for some purposes" does not mean that thing is good enough for other purposes. Those are distinct purposes and equating them is a fallacy. Editor to editor communication is enhanced if editors know what they're saying; the Dirty Hungarian Phrasebook does not enhance functionality beyond the native language. "unlabeled machine translation" can equal "not getting the information we need" in some situations, and in all situations does equal "not getting some of the information we need" because it means we lack the very knowledge of machine translation. CMD (talk) 08:20, 23 January 2026 (UTC)[reply]
  • Oppose as written - very wordy and convoluted. And honestly, User:WhatamIdoing/Sandbox seems to be a much more straightforward guideline. -- Aunva6talk - contribs 18:29, 20 January 2026 (UTC)[reply]
  • Support something. Athanelar's version is more of an explanatory essay; WhatamIdoing's is too weak. Fully support LWG's idea of upgrading LLMDISCLOSE and LLMCOMM; they would work well with LLMTALK. -- LCU ActivelyDisinterested «@» °∆t° 19:09, 20 January 2026 (UTC)[reply]
  • Oppose. This guideline is too strict, and will chill legitimate use of assistive devices. This proposal would essentially prohibit LLM-assisted text even when the fundamental point and argument are presented by the editor themselves. I don't feel that this treatment squares with Wikipedia's mission of fostering helpful communication. Just as we currently expect editors to communicate in their own voice and ideas, Grammarly or LLM tone checkers assist with quality contributions without taking away an editor's ability to have original thoughts. Allow editors to edit grammar/tone while maintaining the argument. Sparks19923 (talk) 19:59, 20 January 2026 (UTC)[reply]
  • Oppose: I would prefer a simpler rule: if you write a comment, by whatever means you use to do so, that comment is yours and you are responsible for it. If you use an AI and the AI misrepresents your idea and you did not notice... that is your problem, not ours. Besides, the page has a strong anti-AI bias: "AI may commit mistakes, therefore it always does and it's useless". Actually, more often than not AI does a good job, that's why it's so widespread. Cambalachero (talk) 07:18, 21 January 2026 (UTC)[reply]
  • Oppose as written; works better as an essay than as a guideline. WAID's proposed guideline reads far better and I would support it, with caution (as we have no tool to reliably detect AI-generated comments other than the "duck test" cases already covered by WP:HATGPT). JavaHurricane 14:17, 21 January 2026 (UTC)[reply]
  • Oppose for the eight reasons I listed in the RFCBEFORE discussion. WAID's draft is much, much better, for the various reasons people have listed above, but I wouldn't support it as written (e.g., I disagree with "Don't outsource your thinking"). I prefer to expand an existing guideline rather than make a new one; WAID's is a good start for that expansion.
Fundamentally, though, the rule should simply be "you are responsible for your edits, regardless of what technology you used to make them". Doesn't matter if you used a pen or Visual Editor or WikiPlus or a script or a bot or an API or LLM or typewriter or carrier pigeon, what you publish on this website with your account is your responsibility. If you cut and paste LLM text and it's a hoax, that's the same as if you wrote the hoax yourself. If it has fake cites, it'll be treated as if you added fake cites, because you did. If the talk page comment is verbose and repetitive, no one might read it, whether it's AI slop or you're just a bad writer. That's what the guidelines should warn editors about. Telling people not to use LLMs at all is not only wrong, but pointless. It's like trying to tell people not to use a computer or not to use their mobile phones. Good luck with that. :-) Levivich (talk) 18:40, 21 January 2026 (UTC)[reply]
  • Support – I don't think it's perfect or even decent, but it is something that we can discuss and improve later, rather than proposing a brand-new guideline every time. Ping on reply. FaviFake (talk) 16:34, 26 January 2026 (UTC)[reply]
  • Support per Gnomingstuff, enough is enough. More workshopping will just lead to this becoming even more verbose and complex, when the core of "do not use LLMs to communicate" just really needs to be written as a PAG yesterday. Dealing with LLM editors/pasters is exhausting enough as it is. --Gurkubondinn (talk) 11:12, 27 January 2026 (UTC)[reply]
  • Oppose - A classic case of "scope creep." I can understand why people would want to do something about LLM slop, but this proposed policy is literally unenforceable, and will do more damage than it will prevent. Sparks19923 (talk) 22:02, 28 January 2026 (UTC)[reply]
  • Support. Chatbot text is spam applied to human attention. Nobody should be expected to treat spam with human consideration, as if it were not spam attacking human consideration. David Gerard (talk) 21:33, 21 January 2026 (UTC)[reply]
  • Support. WhatamIdoing's take is, however, probably more workable than Athanelar's, even if it doesn't go as far as I might like. I am actually of the opinion that LLM-generated content should be banned outright on Wikipedia (except, obviously, in direct quotes and the like when it's the subject of an article!), with no other exceptions of any kind under any circumstances whatsoever -- though I'm going to admit such a hardline stance is both hard to practically enforce and unlikely to gain traction. Something like this, at any rate, is the least we can do. Gimubrc (talk) 20:36, 23 January 2026 (UTC)[reply]
  • Support the proposal of turning the prohibition on using LLMs for even user-to-user communication into a guideline. There isn't a need for AI junk here; it's a waste of time if people aren't communicating themselves. I can't imagine any circumstance where it is better than direct communication, other than for those who may not be proficient enough in English, but if you can't communicate directly then you are likely not going to be able to edit well. LLMs should just be kept out of Wikipedia under any circumstances. Omen2019 (talk) 10:21, 25 January 2026 (UTC)[reply]
  • Oppose - it does not differentiate sample AI content presented as such, like "here's what the AI can do" or "here's what the AI says about that, and I'm inclined to agree". Ask perplexity.ai what will likely happen to Wikipedia, and you might be shocked. If all you want to do is ask for others' input on it, why would you have to re-compose it from scratch? Just quote the AI and be done with it.    — The Transhumanist   11:28, 25 January 2026 (UTC)[reply]
  • Oppose - Too wordy (which causes WP:CREEP issues) and essay-like. I would support the further development and future consideration of User:WhatamIdoing/Sandbox. Cheers, Suriname0 (talk) 18:07, 25 January 2026 (UTC)[reply]
  • Support any version that can reach consensus. We need some sort of guideline, ideally yesterday; LLMs threaten to overwhelm volunteers and waste huge amounts of editorial time and energy. This is certainly better than nothing, and putting it in place will encourage people to compromise on future incremental improvements rather than stalling out and leaving us with no guideline at all. And the fear that it will lead to arguments due to the inability to clearly identify LLM-generated comments is absurd; the same is true for huge swaths of our other conduct policies, such as WP:CANVASS, WP:MEAT, sockpuppetry and COIs - sometimes it is easy to tell when someone is in violation; sometimes it is harder; and sometimes, yes, we do get spurious accusations. We've always been able to deal with them in the past and we'd be able to deal with them in the future; it's not a reason to have no guideline at all. Likewise, many people say "not this one but some hypothetical better one in the future" - we've been doing this song-and-dance with AI policies for years now; no version will be perfect. It's better to get something in to encourage people to compromise on improving it, rather than constantly arguing over a perfect version that will never exist. All our other policies took time to grow incrementally; expecting AI policies to come into existence fully-formed and perfect from the get-go is unrealistic and not a valid reason to oppose something that can, at least, serve as a good starting point. --Aquillion (talk) 17:32, 29 January 2026 (UTC)[reply]
    Do you really think that LLM misuse specifically in discussions threatens to overwhelm editors? My experience is that it irritates editors (extremely so, for a small number of editors), but there's just not that much of it total. I did a quick survey of AFDs some time ago, and I think it turned up less than 1% of AFDs having any LLM-style wall of text. I glanced through the Wikipedia:Teahouse and found that less than 10% of the conversations mention LLMs (and mostly that people were asking or disclosing their use in an article or draft, not so much using it in the discussion). There are over 100 discussions there and almost 700 comments. That's not what I'd expect to find if LLMs were threatening to overwhelm us. Similarly, there are about 300 comments on this page, and I think they're also (detectable) LLM-free.
    Therefore I wonder whether the "threatening to overwhelm" part is about article creation rather than discussions. WhatamIdoing (talk) 19:14, 29 January 2026 (UTC)[reply]
  • Support in principle, per the rationale provided by Aquillion above. I agree that, at the very least, we should give this time to mature if it becomes a guideline (and while it is one). XtraJovial (talkcontribs) 03:37, 1 February 2026 (UTC)[reply]

Discussion

[edit]
  • As policy, this proposal assumes that it is the case (and will for some time continue to be the case) that people can confidently identify the output of Large Language Models. I am skeptical that, for short texts, this can reliably be done today. Worse, I can see this shifting debate on talk pages and drama boards toward partisan allegations of AI use that can neither be confirmed nor refuted. Worse still, in the coming months, expect to see LLM integration into operating systems, browsers, and text editors for writing and editing assistance; many people won't know whether the dotted red underlines derive from an LLM or a dictionary. MarkBernstein (talk) 23:08, 19 January 2026 (UTC)[reply]
    in the coming months, expect to see LLM integration into operating systems, browsers, and text editors for writing and editing assistance We're already there. Gemini is in everything Google, you can't really use any Meta product without accidentally triggering Meta AI, Microsoft Word has Copilot, Windows 11 is an "AI-focused operating system", there are multiple competing AI browsers like ChatGPT Atlas, etc. etc. etc. SuperPianoMan9167 (talk) 23:19, 19 January 2026 (UTC)[reply]
  • Putting this in the discussion section, as I feel it's too lengthy to put in the survey section (and I don't think a simple support or oppose would be nuanced enough).
I generally agree with most points made in the proposed guideline. In my opinion, there's no place for LLM content on Wikipedia. Obviously, writing articles is the primary thing most people can agree LLMs shouldn't be involved in, but furthermore I don't think there should be LLM-generated content in any part of the consensus/decision-making process here, whether on talk pages, in discussion and deletion venues, or on administrative boards. The latter part is where it seems many people disagree on the methods of finding LLM-generated content and what to do with it, both proactively and after it happens. We ultimately need some kind of guideline, as the current method of beating around the bush and dealing with LLM usage as it comes up is not a productive use of anyone's time. And this proposal gets many things right, if it is a bit lengthy. I disagree with the restriction on translation, and with the Examples section, but even then I wouldn't say they're big enough issues to make me oppose the proposition. Even if this proposal specifically isn't accepted, I would be amenable to one of the many other suggestions above, simply for the fact that there needs to be some kind of guideline about LLM usage. SmittenGalaxy | talk! 01:18, 22 January 2026 (UTC)[reply]
  • Those commenting on this RfC may want to take a look at WP:Administrators' noticeboard#Should people be using AI to create AfDs?, which has links to a few AI-generated AfDs that cite now-obsolete notability guidelines (presumably leftovers in the AI's older training data), as it's very relevant to the point made in my proposal (and other supporting arguments) that LLMs are no good at formulating arguments based on, or interpretations of, PAGs. Athanelar (talk) 12:11, 25 January 2026 (UTC)[reply]
  • (The discussion below is moved from the survey section above to avoid clogging the survey) Athanelar (talk) 19:17, 25 January 2026 (UTC)[reply]
    Ask perplexity.ai what will likely happen to Wikipedia, and you might be shocked The question is: why on Earth should one care? I will never understand the compulsion to "ask" an AI for an answer to opinion-based questions and then treat the result like it has any weight whatsoever.
    It's also interesting to me (and I'm not saying this applies to you) that such things are often said by the same people who argue that AI is just a tool and should therefore be viewed as morally neutral. Personally I've never asked my screwdriver or hammer for its opinion on the project I'm using it in. Athanelar (talk) 12:03, 25 January 2026 (UTC)[reply]
    I've been given plenty of answers from AIs without asking. Sometimes that answer is useful, sometimes it isn't, and sometimes (including for the most recent Google search I did) the answer was partly useful and partly not useful. At Wikipedia:Redirects for discussion/Log/2025 December 4#Thank you amendment I didn't quote the AI, but I did extensively reference the answer it gave me (without asking). In slightly different circumstances it would have made sense to quote the AI response when analysing that response. While I'm not sure what relevance asking an AI about the future of Wikipedia has to this discussion, TTH's underlying point is directly relevant. LLMs are tools; they are very different tools to hammers and screwdrivers, but that doesn't stop them being tools - you wouldn't use a hammer to unscrew something, nor would you use a dishwasher to mow your lawn or a lawnmower to clean your dishes, but not being appropriate for a task unrelated to their purpose does not make them any less of a tool. How good a tool is at the task it is intended to perform is also not related to whether or not it is a tool - the other day the disposable wooden knife I was given with my meal at a food outlet at Paddington station snapped rather than cut a piece of chicken, but that doesn't mean knives are not tools, it just means that this knife was a bad tool. Thryduulf (talk) 13:37, 25 January 2026 (UTC)[reply]
    I agree TTH's underlying point is relevant; I didn't address it because I don't think any supporter here (nor my original proposal) is suggesting that a ban on AI-generated communication would include a ban on quoting AI for demonstrative purposes, considering that is, in my eyes, very obviously not covered by a prohibition against generating user-to-user communication.
    Unless TTH means, like, "I asked the AI to generate a response to your comment and here's what it said:", which would be banned, but would also very clearly just be an attempt at finding a loophole to the ban in the first place, so I don't think it needs a carveout as an acceptable use case. Athanelar (talk) 13:47, 25 January 2026 (UTC)[reply]
    p.s., my point re: tools is one of agency. Every tool has a different function, but the thing that makes it a tool is that it needs to be utilised in order to do anything. You don't ask a screwdriver about a screw or your lawnmower about your lawn, you use it to accomplish the task.
    I think people who simultaneously say "AI is just a tool" but then also "ask" AI about 'how' or 'why' or other opinion-based questions are essentially trying to have it both ways. I think you would agree with me that anything capable of having an opinion cannot be described as simply a tool, and so you cannot simultaneously believe that AI is merely a tool which assists in human workflows and also that it is capable of giving you an original opinion worth listening to. Athanelar (talk) 13:51, 25 January 2026 (UTC)[reply]
    I don't agree with that. (Also, AI doesn't have opinions.) Lots of tools perform analysis. Software, for example, is a tool that has been performing that function for almost a century. Levivich (talk) 14:02, 25 January 2026 (UTC)[reply]
    Even analysis software doesn't draw conclusions for you, though. Sure, it can tell you that there's a certain likelihood a Wikipedia edit is vandalistic based on its training data, but it's not going to go that next step of saying "This is probably vandalism, you should remove it." It's a tool: it gives you the information and leaves you to do what you will with it.
    I'm well aware that LLMs can't actually have opinions, but they act as though they do, with enough efficacy that people genuinely use them for that purpose, as seen above. "Ask perplexity what it thinks is going to happen to Wikipedia" is something we're told to do, as if the conclusion perplexity will assemble based on its training data is something we're supposed to consider compelling. It's exactly the kind of 'thought-outsourcing' I'm cautioning against in my proposal. Athanelar (talk) 18:25, 25 January 2026 (UTC)[reply]
    but it's not going to go that next step of saying "This is probably vandalism, you should remove it." Cluebot and similar tools do exactly that. Thryduulf (talk) 18:27, 25 January 2026 (UTC)[reply]
    I suppose I'm not communicating my point well, and I don't want to sound like I'm constantly moving the goalposts, but those things are still far more deterministic than LLMs are. "If the likelihood of potential vandalism is >X, then revert" is still quite a markedly different sort of task from "Tell me what is going to happen to Wikipedia in the future." The way that people use LLMs is uniquely 'personal' because of their nature as language models, in a way that I think both hampers their usefulness as tools and also damages the people using them. Athanelar (talk) 18:45, 25 January 2026 (UTC)[reply]
    think both hampers their usefulness as tools and also damages the people using them Are they tools (as you say now) or not tools (as you said earlier)? Whether or not they "damage the people using them" is an entirely subjective opinion that, regardless of whether it is correct or not, is completely irrelevant to this proposed guideline. Thryduulf (talk) 19:03, 25 January 2026 (UTC)[reply]
    All I said earlier was that I perceive a contradiction in the pro-AI camp between the defense that they are merely tools (which has often been invoked as an argument for why they should not be too harshly restricted) and the tendency to use them in ways that are distinctly non-tool-like. Athanelar (talk) 19:10, 25 January 2026 (UTC)[reply]
    Even a calculator draws conclusions (2+2=4), and as Thryd points out, some software like Cluebot even acts on those conclusions. Levivich (talk) 18:40, 25 January 2026 (UTC)[reply]
    Cluebot has near perfect accuracy in detecting vandalism btw. Levivich (talk) 18:40, 25 January 2026 (UTC)[reply]
    ....no? Not to get sidetracked, but Cluebot is very good. It's not "near perfect" by any means - I just looked at its ten most recent reverts, and this one[6] is certainly not vandalism, while [7] is most likely not; the very first episode memorialized her. Good, not perfect. Nowhere close. GreenLipstickLesbian💌🧸 19:05, 25 January 2026 (UTC)[reply]
    LLMs have (more or less stable) opinions. The issue is that the opinions come either from the pre-training (reading and predicting text from the internet) or from the fine-tuning (where the developers shape the character and writing style, including via RLHF). These opinions don't necessarily come from methodical, reflective reasoning; they may just come from common opinions on the internet or from things the company trained the LLM to say.
    The question of whether they have moral agency may hinge on the exact definition, but events like LLMs strategically lying in order to remain harmless suggest at least some functional moral agency. Alenoach (talk) 19:11, 25 January 2026 (UTC)[reply]
    (edit conflict) I do disagree with you (mostly). A tool doesn't have agency, but neither does an AI - it requires a prompt in order to do anything. You prompt an AI in order to accomplish the task you want to achieve, which is very different from using a screwdriver but not so different from using a grammar checker or reading-ease calculator, which also give opinions about the input based on their training data and programming.
    I frequently ask search engines how to do something, and they give me answers that allow me to accomplish the task (at least in theory; you'd be more than welcome to refit the soft closer to my cupboard door, as despite numerous diagrams, descriptions and YouTube videos I have consistently failed!). Search engines are unquestionably a tool. Thryduulf (talk) 14:04, 25 January 2026 (UTC)[reply]
    I've assumed that if people are asking a chatbot for an opinion, they're mostly getting a summary of what the internet says about the subject. That is, if you asked something like "Is Wikipedia reliable?", then the chatbot would assemble a string of words similar to other webpages that say something about Wikipedia's reliability. WhatamIdoing (talk) 18:16, 25 January 2026 (UTC)[reply]
    If anyone wants to see an example of what LLMs can do when you ask for suggestions for improving an article, I did a demonstration of that last year at VPT here. LLMs are demonstrably useful for this purpose (but still require human review before making any edits, of course). That was before GPT-5 was released, so it'd probably be even better now. Levivich (talk) 14:00, 25 January 2026 (UTC)[reply]
  • Question: Can anybody point to an example of an instance where an LLM actually helped correct an error on a page, or was not used disruptively (intentionally or unintentionally)? Rhinocrat t c 19:20, 26 January 2026 (UTC)[reply]
    Like a link to the revision? Rhinocrat t c 19:21, 26 January 2026 (UTC)[reply]
    @Rhinocrat, I think you're looking for something like Talk:Böksta Runestone#Clarification on Image Caption – "Possibly Depicting" vs. "Showing", in which someone who isn't a native English speaker pointed out an error in an article. I don't think it'd be fair to blame his use of an LLM for the response from the Wikipedia editors, but of course YMMV. WhatamIdoing (talk) 19:51, 26 January 2026 (UTC)[reply]
    You may be interested in this Signpost article, which evaluates ChatGPT's ability to find errors in featured articles. The author (HaeB) agreed with 68% of the errors reported by ChatGPT, and disagreed with 23%. ChatGPT found at least one confirmed error in 90% of the featured articles. Alenoach (talk) 19:49, 26 January 2026 (UTC)[reply]
    Using it to identify errors for editors to fix is a different application than generating content, however. ChompyTheGogoat (talk) 11:31, 29 January 2026 (UTC)[reply]
    Here's a couple thousand LLM edits. Still working through them, and they are fairly indiscriminate, but the infobox additions are mostly OK, and some of the extremely minor copyedits are fine or only in need of minor tweaks. (When the AI copyediting goes beyond extremely minor, it starts to not be fine.) Still a bit of using a sledgehammer to hammer a nail, though.
    Similarly I am OK with people who already know what they're doing using Claude Code or the like for plugins, etc. Gnomingstuff (talk) 15:30, 27 January 2026 (UTC)[reply]
    I looked at one (1) of those thousands. All it did was cite sources. One had a broken URL (easily fixed). I'd class the added sources in the "kind of weak, but better than nothing, I guess" range. WhatamIdoing (talk) 20:38, 27 January 2026 (UTC)[reply]