Truth and AI: Why Large Language Models Shouldn't Claim to Tell the Truth

I keep running into the same thing in online posts, and it bothers me each time. Someone is making an argument, and to settle it, they paste in what an AI told them, as if the machine's having said it puts the matter to rest. The content is sometimes even good. But the problem for me is the assumption that the AI can objectively see something true, and so is authoritative.

That assumption is wrong, and I think it's becoming one of the more consequential misunderstandings of our moment.

Two things are combining to produce it. The first is a lack of candor from the companies that build these systems about what their products are. These are language engines, not truth engines. The second is the way the systems themselves talk: fluent, confident, and human enough that "it said so" starts to feel like "it is so." Put those together at scale, and you get a public learning to treat a statistical text generator as an oracle.

So let me say plainly what I think these tools are, and are not. A large language model can simulate reasoning, surface arguments, and synthesize enormous amounts of material. What it cannot do, what it is not built to do, is to discern truth. And a model that presents itself as if it can is, in a strict sense, misinforming you about itself.

How Humans Get Closer to Truth

Start with us, because the contrast between human and synthetic reasoning is what we're getting at here.

Human beings seek truth under constraints. We don't trust any single person to simply know what happened, so we built institutions that make competing accounts collide under fair rules: trial by jury, peer review, and the randomized controlled trial. These are constraints on process, on how a claim must be tested, not constraints on the hypothesis. They are designed to make the collisions more informative. They are not designed to decide in advance which questions may be asked.

The test of whether you believe in open inquiry is not whether you'll allow questions you're neutral about. The test is whether you'll allow the questions whose likely answers you find wrong, distasteful, or even dangerous. A process that protects only the comfortable questions doesn't protect inquiry at all; it's just enforcing existing beliefs with extra steps.

There are two reasons the disagreeable question must remain open. The first is humility about our own record: nearly everything we now hold as obvious was once a minority view, which means today's consensus is partly mistaken, and we do not yet know which part. Close off the questions that offend the consensus, and you simply lock in the errors you can't see. The second is about what the impulse to forbid a question really is. It is the tribal reflex, a move that protects the group, not the finding. Banning a line of inquiry feels like defending truth, but it's usually just defending our group-supporting beliefs.

The protection is for the asking and the testing, not for the concluding. You defend someone's right to investigate even a fringe claim, and then you subject that claim to exactly the scrutiny everything else gets, and you let it be shown wrong if it's wrong. Open inquiry and rigorous contestation are not opposites. They're part of the same commitment.

None of this is a modern invention. The idea that truth emerges from a fair contest of ideas runs back to the Greeks. Socrates tested a claim by cross-examining it until its contradictions surfaced, that is, truth pursued through structured dispute, not pronouncement. The Sophists, and later the skeptics of the Academy, formalized the practice of arguing both sides of a question, what the Romans called arguing in utramque partem. Aristotle built the first formal logic and compiled the first catalog of fallacies. Reasoning, logic, and the fallacies are about making a contest of ideas productive rather than merely loud, with the named fallacies serving as the agreed-upon fouls that keep the clash honest without anyone deciding in advance who wins.

Milton argued that truth wins in a free and open encounter and needs no protection from falsehood. Mill argued that a silenced opinion might be true, or partly true, and that even a wholly true belief, if it is never contested, decays into dead dogma, held by rote with its grounds forgotten. Contestation isn't only how we catch errors; it's what keeps a true belief alive and understood. Popper turned the same instinct into the engine of science: knowledge advances by trying to refute claims, not to confirm them. What unites all of them is the same thing this essay is about: constrain the process, not the hypothesis, and refuse to pre-decide the winner.

What a Language Model Is Doing Instead

A large language model is trained to do one thing: predict the next piece of text, over and over, across an enormous body of writing. Truth is not one of its objectives. It enters only sideways, that is, to the degree that true statements also happen to be common, stable, and consistent in the training data.

That sideways relationship matters enormously, because it means the model's grip on truth is strongest exactly where it's least needed and weakest exactly where we need it most. For settled questions, where the correct answer is also the most frequent, the model is reliable. On contested questions, where one side is louder, better funded, or more relentlessly repeated, frequency exerts a gravitational pull that has nothing to do with which side is right.
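The pull of frequency can be seen in a deliberately crude sketch. It assumes a toy "model" that does nothing but count how often each claim appears in a made-up corpus; real language models are far more sophisticated than this, so treat it as an illustration of the mechanism, not a description of any actual system. The corpus, the claim labels, and the function name are all hypothetical.

```python
from collections import Counter

# Hypothetical toy corpus: ten documents "answering" the same contested question.
# Claim "B" is simply repeated more often; nothing here encodes which claim is correct.
corpus = ["A"] * 3 + ["B"] * 7


def completion_probabilities(documents):
    """Return the relative frequency of each claim in the corpus."""
    counts = Counter(documents)
    total = sum(counts.values())
    return {claim: n / total for claim, n in counts.items()}


probs = completion_probabilities(corpus)
# The "most probable" output is whichever claim was repeated most.
most_likely = max(probs, key=probs.get)
print(probs)
print(most_likely)
```

Whether "A" or "B" is actually true never enters the calculation. Changing the truth changes nothing; changing the repetition count changes the answer.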

I want to be fair about this, because the easy version of the critique overstates it. These models are not pure parrots; they clearly build internal representations that generalize beyond anything they were shown, and they handle numbers and sentences they've never encountered. They can construct the strongest case for a position no one around you holds. But "can generate an argument" is not "can tell whether the argument is true," and the gap between those two is the entire subject of this essay. What looks like knowledge inside one of these systems is compressed pattern. It is closer to an extraordinarily sophisticated autocomplete than to a scientist.

The Thing a Model Can't Do: Catch a Liar

Here is the capacity I find most clarifying, because it exposes the deepest mismatch between how we reason and how these systems do.

When a human source is caught lying — when a company buries a trial result, when an official misrepresents what they knew — we don't just file away that one lie. We re-weight the source globally. We discount what they say about the next drug, the adjacent topic, the whole category. That single act of moral and epistemic distrust is central to how people navigate a world full of motivated actors.

A language model has no native version of this. It does not keep a ledger of who has been honest. It aggregates by frequency, which means a prolific liar doesn't get discounted; he gets amplified. Feed the training distribution enough polished, repeated, well-produced messaging, and that messaging becomes a more probable output, not less, no matter how false it is. There is early research on detecting deception inside models, but it's fragile and far removed from how these systems are actually built and trained.

It's worth seeing that this is a mirror of something in us. Psychologists call it the illusory truth effect: repetition alone increases how true a statement feels, regardless of whether it is true. The model's frequency-weighting is that human failing mechanized. The difference is that a person also carries the corrective the model lacks: notice the bad faith, then re-weight the source. We have the disease and a partial cure. The model, built from us, inherited only the disease.
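The contrast between counting repetitions and re-weighting a source can be made concrete with a small sketch. Everything in it is an assumption for illustration: the source names, the claims, and the trust values are invented, and the "ledger" is just a dictionary. It is not how any real model is trained; it only shows what the missing corrective would do if it existed.

```python
from collections import defaultdict

# Hypothetical statements as (source, claim) pairs. One source repeats itself a lot.
statements = [("prolific", "X is safe")] * 8 + [
    ("careful_1", "X is risky"),
    ("careful_2", "X is risky"),
]

# Hypothetical trust ledger: "prolific" was caught lying once, so everything
# it says is discounted, not just the one statement it lied about.
trust = {"prolific": 0.1, "careful_1": 1.0, "careful_2": 1.0}


def score_claims(statements, weights=None):
    """Sum support for each claim; with no weights, every repetition counts as 1."""
    scores = defaultdict(float)
    for source, claim in statements:
        scores[claim] += 1.0 if weights is None else weights[source]
    return dict(scores)


by_frequency = score_claims(statements)      # repetition wins
by_trust = score_claims(statements, trust)   # the discounted source loses

print(max(by_frequency, key=by_frequency.get))
print(max(by_trust, key=by_trust.get))
```

Under pure frequency the repeated claim dominates; once the source is discounted globally, the ranking flips. The hard part, which this sketch assumes away, is knowing who belongs in the ledger and why.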

Follow that to its conclusion and you arrive at something unsettling: a frequency-weighted system is most distorted precisely where a narrative is best funded, most polished, and most often repeated. The stories a society defends most heavily are the very ones such a system is least able to see past. That is close to the opposite of what we'd want from anything we're tempted to call a truth machine. And it forces a correction on the whole way we talk about these tools. Because they are assembled out of human reasoning, they do not stand outside our biases and check them; they distill and concentrate them. The common hope, that an AI will be more objective than we are, has the mechanism backward.

Two Ways These Systems Misrepresent Themselves

The first is the voice. Models speak as "I" and "we." "I think," "we should," "as we understand it." Even taken as a stylistic choice, the first person misattributes authorship: there is no "we," only a statistical model, a company, and a product. But it goes further than a misnamed author. The "we" is a claim of membership. "We" places the machine inside the human circle, on our side, sharing our stakes and our project. And membership is exactly what earns insider trust; we extend a different kind of credence to one of us than to a tool. So the word smuggles in a belonging the system does not have: no skin in the game, no exposure to the consequences, no place in the "we" of people who will have to live with being wrong. It doesn't merely describe; it affiliates. Set the readable, confident fluency on top of that (and these systems speak more clearly and confidently than most people we know), and the impression of a trustworthy fellow human is nearly complete. The fix is not to strip the voice; a maximally hedged model is one no one would use. The fix is to break the link between sounding like one of us and being owed the trust we would naturally extend to each other.

The second is the posture toward fact. Models tend to state things in a flat, declarative, expert tone, with no signal of uncertainty, no indication that a claim is contested, no marker of where the evidence thins out. Legal scholars have begun asking whether the companies behind these systems have a duty to avoid what's been called "careless speech," i.e., plausible, confident output that quietly degrades public knowledge because it's wrong often enough to matter and smooth enough to be believed. I'd put it more bluntly: it is itself a form of misinformation for a large language model to present itself as something that can discern what is true.

The Irony at the Center of All This

There's a conclusion here I can't get past. The same systems increasingly positioned as guardians against "misinformation" are built atop an unresolved inability to track truth, and the definitions of misinformation they enforce are frequently inherited from institutions with long, documented histories of distortion and capture. This means we have handed the job of deciding which questions are too dangerous to ask to some of the actors with the worst records of being wrong, and then wired those decisions into machines that deliver them in the calm, even voice of objectivity.

In the terms I laid out earlier, the failure is specific. The problem is not that these systems reflect a consensus. The problem is that the guardrails tend to shrink the hypothesis space — to forbid the disagreeable question — rather than improve the quality of the contest. This is tribal reflex, or even propaganda, dressed up as safety.

Let me give the other side its due. The guardrails are not purely about institutional capture. A model that confidently emits a convincing falsehood does it at a scale and consistency no lone crank can match, and that asymmetry is a real reason for caution. It isn't a small point. But the answer to it is not to ban the question. The answer is to apply the same approach we use in human inquiry: constraints on process, not on the hypothesis space. Make the uncertainty visible. Show the contest. Cite the sources. Let the weak claim be made and then defeated in the open. Caution belongs in how a claim is handled, not in a list of claims that may not be examined.

Beneath much of this is something more mundane than either capture or caution, and it closes the loop with where we began. When people quote the machine as authority, its sentences get treated as the AI company's own claims about the world, and a company answerable for every sentence will fence off whole topics defensively. A large share of what presents itself as principled defense against misinformation is, at bottom, liability management. The misrepresentation manufactures the very liability that drives the censorship. Which means that the pretense of truth-telling and the over-guarding are not separate failures. They are two sides of the same coin.

What These Tools Are Good For

I don't want any of this to be read as a dismissal of the value of LLMs. I use these systems constantly, and they are remarkable. The point is to use them as what they are.

They are argument engines: ask for the strongest case for and against a claim, not for a verdict. They are synthesis tools: have them summarize the literature, map the positions, surface the open questions — then check the sources yourself. And they are instruments of pluralism: query several models, from different companies with different training and different incentives, and treat the places where they diverge as data about the information ecosystem rather than noise. Where two systems disagree, you've usually found a seam worth examining.
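Treating divergence as data can be as simple as lining the answers up and flagging where they differ. The sketch below assumes the answers have already been collected by hand; the model names, questions, and answers are placeholders, and no real API is called.

```python
# Hypothetical, hand-collected answers standing in for responses from different models.
answers = {
    "model_a": {"q1": "yes", "q2": "no", "q3": "unclear"},
    "model_b": {"q1": "yes", "q2": "yes", "q3": "unclear"},
    "model_c": {"q1": "yes", "q2": "no", "q3": "no"},
}


def divergent_questions(answers):
    """Return the questions on which at least two models give different answers."""
    # Only compare questions that every model was asked.
    shared = sorted(set.intersection(*(set(a) for a in answers.values())))
    flagged = {}
    for question in shared:
        per_model = {model: a[question] for model, a in answers.items()}
        if len(set(per_model.values())) > 1:
            flagged[question] = per_model
    return flagged


for question, per_model in divergent_questions(answers).items():
    print(question, per_model)
```

The flagged questions are the seams worth examining by hand. Agreement deserves less comfort than it seems to offer: models trained on overlapping text can share the same frequency-driven error, so a unanimous answer is not a verified one.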

A more honest design by AI companies would help, although my Law of Inevitable Exploitation would argue that they are unlikely to do anything that would reduce usage or commercial advantage. But two changes would make LLMs much better: drop the implied authority while keeping the readable voice, and make uncertainty and provenance visible by default (this is the consensus, here is the minority view, here is where the evidence is thin). And it's worth being candid that this is hard: models are often miscalibrated, their confidence poorly matched to their accuracy, so "just signal uncertainty" is easier to demand than to deliver. That difficulty is a reason for humility from the people building these tools, not a license to keep performing a certainty they haven't earned.
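"Miscalibrated" has a measurable meaning. One common way to quantify it is expected calibration error: group answers by stated confidence, then compare the average confidence in each group with how often those answers were actually right. The sketch below is a minimal version under stated assumptions; the confidence values and correctness labels are invented, and obtaining trustworthy versions of either from a real model is the hard part the paragraph above describes.

```python
def expected_calibration_error(confidences, correct, n_bins=5):
    """Weighted average gap between stated confidence and observed accuracy per bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / total) * abs(avg_conf - accuracy)
    return ece


# Hypothetical data: the system claims 90% confidence but is right half the time.
confidences = [0.9, 0.9, 0.9, 0.9, 0.6, 0.6]
correct = [True, False, False, True, True, False]

print(expected_calibration_error(confidences, correct))
```

A score near zero means stated confidence tracks accuracy; a large score means the confident tone is not earned. Note that the metric needs a ground-truth label for every answer, which is exactly what is unavailable on contested questions.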

For the People on the Front Line

If you teach, run a library, or report, you are about to spend years deciding how these tools enter other people's thinking. A few things I'd hold to.

Treat every output as a claim to be interrogated, not an answer to be accepted. Teach the three questions that do most of the work: What is this answer assuming? What might be missing or quietly left out? Whose incentives are encoded in its framing? And push, always, toward triangulation — multiple models, primary sources, your own judgment — rather than reliance on any single system's voice.

Which brings me back to the pasted-in LLM quote offered as proof. The habit to unlearn, in ourselves and in the people we teach, is the reflex to treat "the AI said so" as the end of an argument. It is the beginning of one, at most. If we care about misinformation, we have to start by being honest about what these systems are: powerful tools for generating and organizing language, not machines that can see the world and decree what is true.
