Entry

Abdulhai, Marwa, Isadora White, Yanming Wan, Ibrahim Qureshi, Joel Leibo, Max Kleiman-Weiner, and Natasha Jaques. 2026. "How LLMs Distort Our Written Language." arXiv preprint arXiv:2603.18161.

Simple Title

How LLMs Distort Our Written Language

Description

Marwa Abdulhai, Isadora White, Yanming Wan, Ibrahim Qureshi, Joel Leibo, Max Kleiman-Weiner, Natasha Jaques

Type

Paper

Year

2026

Posted at

March 23, 2026 4:46 PM (GMT+9)

Overview

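The paper shows that when a large language model (LLM) is asked to rewrite a text, it does more than fix grammar: it also changes the original meaning.
People who use LLMs such as ChatGPT heavily were more likely to produce writing that lost their own voice and came out more neutral and bland.

AI writing tools such as ChatGPT and Claude are now used by more than a billion people. Leaning on them too heavily for convenience may change our writing, and our thinking, more than we expect. This study is a prompt to think carefully about how we work with AI.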

Abstract

Large language models (LLMs) are used by over a billion people globally, most often to assist with writing. In this work, we demonstrate that LLMs not only alter the voice and tone of human writing, but also consistently alter the intended meaning. First, we conduct a human user study to understand how people actually interact with LLMs when using them for writing. Our findings reveal that extensive LLM use led to a nearly 70% increase in essays that remained neutral in answering the topic question. Significantly more heavy LLM users reported that the writing was less creative and not in their voice. Next, using a dataset of human-written essays that was collected in 2021 before the widespread release of LLMs, we study how asking an LLM to revise the essay based on the human-written feedback in the dataset induces large changes in the resulting content and meaning. We find that even when LLMs are prompted with expert feedback and asked to only make grammar edits, they still change the text in a way that significantly alters its semantic meaning. We then examine LLM-generated text in the wild, specifically focusing on the 21% of AI-generated scientific peer reviews at a recent top AI conference. We find that LLM-generated reviews place significantly less weight on clarity and significance of the research, and assign scores that, on average, are a full point higher. These findings highlight a misalignment between the perceived benefit of AI use and an implicit, consistent effect on the semantics of human writing, motivating future work on how widespread AI writing will affect our cultural and scientific institutions.

Motivation

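LLMs are now widely used for writing, but how they actually affect the way people write and what they mean to say had barely been studied.
Attention has centered on how convenient AI is; the work starts from the question of whether less visible harms are occurring.
With over a billion people using AI to write, the authors argue that we need to know how it affects our cultural and scientific institutions.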

Method

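The study has three parts: a user study of how people actually use LLMs, a measurement of what changes when an LLM revises human-written essays collected in 2021, and a detailed analysis of real AI-generated peer reviews submitted to a recent top AI conference.

It combines a user study of how people actually use LLMs with experiments on a dataset of human-written essays collected in 2021, before the widespread release of LLMs.
Human readers compared each text before and after revision and rated how much the meaning had changed; the ratings were then quantified for analysis.
LLM-revised texts were compared with texts revised by humans without AI, and for real peer reviews the authors examined how review scores and the criteria emphasized differed.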
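As a rough illustration of how a before/after comparison can be turned into a number, here is a minimal sketch that scores how many word tokens changed between an original and a revised text. It is not the paper's metric: the function name `change_ratio` and the example sentences are assumptions made for illustration, and a surface-level score like this cannot tell a grammar fix from a change in meaning, which is why a semantic judgment is still needed on top of it.

```python
from difflib import SequenceMatcher


def change_ratio(original: str, revised: str) -> float:
    """Return 1 minus the word-level matching ratio: 0.0 means identical, 1.0 means no overlap."""
    a = original.split()
    b = revised.split()
    # ratio() is 2 * matches / (len(a) + len(b)); subtracting it from 1 gives a change score.
    # autojunk=False keeps the result exact for long texts with frequently repeated words.
    return 1.0 - SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    # Hypothetical sentences, not taken from the paper's dataset.
    original = "I think the policy are wrong ."
    revised = "I think the policy is wrong ."
    print(round(change_ratio(original, revised), 3))
```

A low score here only says that few words were touched. A single swapped word can still reverse the stance of a sentence, so this kind of score is best read as a first filter rather than a measure of semantic drift.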

Results

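Writing by heavy AI users loses creativity, and essays that stay neutral on the topic question increase by nearly 70%.

Among heavy AI users, essays that remained neutral on the topic question rose by nearly 70%, and significantly more of these users reported that the writing was less creative and not in their own voice.
Even when the AI was instructed to "fix only the grammar", the meaning of the text changed substantially in 33% of cases.
AI-generated reviews (21% of all reviews) placed less weight on the clarity and significance of the research than human reviews did, and gave scores about a full point higher on average.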
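The headline figures above are a relative increase and a difference of means. The sketch below shows that arithmetic under stated assumptions: the helper names `relative_increase` and `mean_gap` are hypothetical, and every number in the example is invented for illustration, not taken from the paper.

```python
from statistics import mean


def relative_increase(baseline: float, observed: float) -> float:
    """Relative change of observed over baseline; 0.7 means a 70% increase."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (observed - baseline) / baseline


def mean_gap(group_a: list[float], group_b: list[float]) -> float:
    """Mean of group_a minus mean of group_b."""
    return mean(group_a) - mean(group_b)


if __name__ == "__main__":
    # Hypothetical shares of neutral essays for light vs. heavy LLM users.
    print(relative_increase(baseline=0.10, observed=0.17))
    # Hypothetical review scores for AI-generated vs. human-written reviews.
    print(mean_gap([6, 7, 8], [5, 6, 7]))
```

Note that a relative increase depends on the baseline: a 70% rise from a small share of neutral essays is a much smaller absolute change than the same rise from a large share.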

Further Thoughts

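LLM developers need to recognize the risk that meaning changes even when only grammar fixes are intended, and they have a responsibility to tell users about it.
Take care when using AI writing tools in high-stakes settings such as classrooms and scientific peer review. What you meant to say may change without your noticing.
Going forward, techniques that keep AI from altering the meaning of a text, and guidelines for using AI, will matter more. How to distinguish AI-generated text from human writing is another open problem.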

AI writing tools change our writing without our noticing: How LLMs Distort Our Written Language