Improving text - or not - with Large Language Models
Some time ago I saw someone on LinkedIn posting a text with the note "I improved this text with ChatGPT". The text was fairly simple: professional but sober, with ordinary words and no complicated structures. It was indeed correct, but my thought was the usual one in these cases: "Was the original improved, or just changed?".
I'm very skeptical about the concept of improving. I have seen people improve furniture, their homes, their clothes, or their cars, only to make them considerably worse - in my opinion, though. The concepts of good and bad are subjective in many realms, so the concept of improving is subjective as well.
When a text is fairly well written, you have decent grammar skills, and you are a fairly good communicator, chances are your text is at least OK, probably good. And on top of that, it is yours, so it carries the value of your unique style and genuineness. Is changing it an improvement? Or just that, a change, turning it into a different version that is more generic, more standard, more flattened out? Maybe.
But OK, if your text is not that good, maybe the above can be an actual improvement.
In both cases, though, how does a large language model qualify one text as better than another? I decided to stop speculating, go to the source, and ask ChatGPT exactly that.
When I ask you to improve a text, how do you determine what is "improving", or how something can qualitatively be considered better than something else?
What ChatGPT considers to be improving a text
This is what ChatGPT told me:
When improving a text, I consider several factors to determine what makes one version qualitatively better than another. Here are the key aspects I focus on:
Clarity and Precision: Making the text clearer and more precise, ensuring that the message is understandable and unambiguous. This often involves rephrasing sentences, choosing more specific words, or reordering information for better flow.
Conciseness: Removing redundant words or phrases, and streamlining the text to make it more concise without losing essential information. This helps in making the text more impactful and easier to read.
Grammar and Syntax: Correcting grammatical errors and improving sentence structure. Proper grammar and syntax help in communicating the message effectively and professionally.
Style and Tone: Adjusting the style and tone to suit the intended audience and purpose. This could mean making the text more formal or informal, adjusting the level of technicality, or making the tone more persuasive, informative, or friendly, depending on the context.
Engagement and Interest: Enhancing the text to make it more engaging and interesting to read. This could involve adding rhetorical devices, varying sentence lengths, or incorporating compelling examples.
Coherence and Cohesion: Ensuring the text is logically organized and the ideas flow smoothly from one to the next. This may involve adding transition words, breaking up long paragraphs, or restructuring sections for better coherence.
Each improvement is aimed at making the text more effective for its intended purpose and audience. When suggesting changes, I balance these aspects to enhance the overall quality of the text.
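As an aside, those dimensions also suggest a way to keep an "improvement" from flattening your voice: ask for only the ones you actually want. Below is a minimal sketch of how one could request such a constrained rewrite programmatically, assuming the openai Python package (v1 or later) and an API key in the environment; the model name and the instruction wording are my own illustrative choices, not anything ChatGPT disclosed.

```python
# Minimal sketch: ask for a constrained "improvement" (grammar and clarity
# only), leaving tone, style, and vocabulary untouched. Assumes the `openai`
# package (>=1.0) is installed and OPENAI_API_KEY is set; the model name and
# instructions are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

def improve_conservatively(text: str) -> str:
    """Request grammar/clarity fixes while preserving the author's voice."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {
                "role": "system",
                "content": (
                    "Fix grammar mistakes and genuinely unclear phrasing in the "
                    "user's text. Do not change its tone, style, structure, or "
                    "word choices otherwise. Return only the revised text."
                ),
            },
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content

print(improve_conservatively("This are a example text with some minor mistake."))
```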
OK, the categories make sense. But isn't this a kind of circular reasoning in a definitional context? You say you improve a text, for example, by making it more concise; but just as I can ask how you know it is improved, I can ask how you know it is more concise. I don't like feeling trapped in a logical fallacy, so I had to go deeper. I imagined this could come from semi-supervised learning, in which humans label texts along the dimensions above, and others, so the model can learn from that context. So I did a bit of research, and a bit of prompting, and below is what I found - spoiler alert, my idea was wrong.
Reference data
GPT doesn’t require previously labeled data to improve text or to evaluate elements like clarity, conciseness, or engagement. Instead, its training involves learning from a vast array of text data and editorial examples, which has given it an understanding of language norms, styles, and effective communication strategies across various contexts.
Here’s how it approaches evaluating and improving text:
Language Models: Its core is based on large-scale language models trained on diverse datasets encompassing a wide range of topics, styles, and formats. This training helps it understand what constitutes clear, concise, engaging, and grammatically correct language.
Patterns and Probabilities: It analyzes text based on learned patterns and probabilities. For instance, certain phrasing might commonly appear in well-written articles, suggesting that it’s clear or engaging. Similarly, certain structures or word choices might be flagged as unclear or verbose in less effective texts.
Contextual Understanding: It uses contextual cues to adjust tone, style, and formality. The intended audience, purpose of the text, and the medium through which the text is communicated are all considered. This contextual understanding helps tailor improvements to the specific needs of the text.
Heuristics and Rules: Based on conventional rules of grammar and style, it applies heuristics for correcting errors and improving syntax. This includes adhering to grammatical norms and style guidelines that are widely accepted in various language use contexts.
Feedback Loops: While it doesn’t learn dynamically in real-time from user interactions due to its design as a static model, the iterative improvements and updates to AI models like it are informed by user feedback and advancements in natural language processing research.
This combination of learned data, context analysis, and established linguistic rules allows it to make educated judgments on subjective aspects of text quality, aiming to enhance readability, effectiveness, and audience engagement.
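The "Patterns and Probabilities" point is the one that can be made most concrete. A language model assigns a probability to every token given its context, so a crude proxy for how "expected" or fluent a sentence is, is the perplexity the model assigns to it: lower values mean phrasing closer to the patterns it has learned. Here is a minimal sketch, assuming the Hugging Face transformers and torch packages and the public gpt2 checkpoint; it illustrates the general idea, not how ChatGPT itself scores text.

```python
# Crude "fluency" proxy: perplexity of a sentence under a small public
# language model. Assumes `transformers` and `torch` are installed and the
# "gpt2" checkpoint can be downloaded; this only illustrates the idea of
# scoring text by learned patterns and probabilities.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Exponentiated average negative log-likelihood per token."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return its cross-entropy loss.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return float(torch.exp(loss))

# Lower perplexity = phrasing the model finds more "expected".
print(perplexity("The report is clear and easy to read."))
print(perplexity("Report the clear easy is and read to."))
```

The scrambled second sentence should come out with a far higher perplexity, which is roughly the sense in which certain structures can be "flagged" as unclear.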
But again, when we say that "a phrase might commonly appear in well-written articles", how does it tell apart, within its training data, which texts are well written and which ones are not?
The distinction between well-written and poorly written texts isn’t explicitly labeled by human reviewers for each individual example. Instead, the model learns to identify quality based on several indirect factors:
Source Quality: Texts are sourced from a wide variety of contexts, including reputable publications, educational materials, and other high-quality sources known for their editorial standards. The model implicitly learns from the quality of writing these sources typically uphold.
Contextual and Comparative Learning: Through exposure to vast amounts of text, the model develops an ability to compare and recognize patterns that distinguish clearer, more effective writing from less clear or error-prone writing. It notices how certain phrases and constructions are used across different texts and starts to associate specific structures and vocabulary with more polished, professional writing.
Corrections and Revisions: Training data can include edited versions of texts, where revisions made by editors or writers improve clarity, grammar, or style. By analyzing pre- and post-edit versions of texts, the model learns which changes enhance the quality of writing.
Feedback Mechanisms: Although it doesn’t learn in real-time, overall model improvements are made by incorporating feedback on the performance of earlier models. This includes adjustments based on how well the model’s outputs align with human judgments of quality in controlled testing and evaluation scenarios.
These methods allow the model to develop a nuanced understanding of what makes text effective, engaging, and clear, even without explicit labels distinguishing every example of good or bad writing during training.
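Of all these, the "Corrections and Revisions" point is the easiest one to picture as actual data: pairs of a draft and its edited version can be turned into plain supervised examples, where the model learns to map one to the other. Here is a minimal sketch of what such pairs might look like when formatted for fine-tuning; the example pairs, field names, and prompt template are my own assumptions for illustration, not a real corpus or training pipeline.

```python
# Sketch: turning draft/edited text pairs into supervised training examples.
# The pairs, field names, and prompt template are assumptions for
# illustration only.
edit_pairs = [
    {
        "draft": "The results was very good, wich is important for we.",
        "edited": "The results were very good, which is important for us.",
    },
    {
        "draft": "The meeting it was moved to friday due of the holiday.",
        "edited": "The meeting was moved to Friday because of the holiday.",
    },
]

def to_training_example(pair: dict) -> dict:
    """Format one draft/edited pair as a prompt/completion example."""
    return {
        "prompt": f"Improve this text:\n{pair['draft']}\n\nImproved:",
        "completion": " " + pair["edited"],
    }

for pair in edit_pairs:
    print(to_training_example(pair))
```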
Conclusion
LLMs assess and improve text quality, in terms of Coherence and Cohesion, Engagement and Interest, Style and Tone, Grammar and Syntax, Conciseness, and Clarity and Precision, through a sophisticated integration of data, contextual analysis, and linguistic rules.
The model doesn't rely on explicit labels distinguishing well-written from poorly written texts. Instead, it learns to recognize quality through indirect measures, such as the inherent standards of texts sourced from reputable publications and educational materials. Through exposure to vast amounts of varied texts, the model develops a capability to identify and differentiate patterns indicative of clear and effective writing from those that are not. It also learns from revisions in texts, understanding which changes enhance clarity and correctness. Additionally, the model adapts and evolves based on feedback, aligning its output more closely with human judgments of textual quality.
This approach allows the AI to cultivate a deep understanding of what makes text engaging, effective, and clear, thereby improving its ability to enhance readability and audience engagement without direct guidance on every instance of text it analyzes.
It's not magic, it's science.