The Graph Said So


There is a finding from a 2024 study published at the ACM CHI Conference that deserves more attention than it has received. Researchers analyzed nearly nine thousand COVID-19 data visualization posts on Twitter, tracking over a million engagement events. Their finding: posts containing data-driven reasoning errors attracted significantly more replies than accurate ones, and the resulting discussions lasted longer. The mechanism was not persuasion. People were pointing out the problems. The effect was the manufacture of apparent controversy: the appearance of live scientific dispute where the underlying question was not, in fact, open.

That is a different and more serious problem than simple misinformation. A chart that convinces you of something false can be corrected. A chart that makes a settled question look contested has already done its damage before anyone fact-checks it. The controversy is the product. The data provides the frame.

This is the environment in which we now produce and consume data visualizations at scale.


The tool problem is structural and largely unexamined. Every major productivity platform — Notion, Salesforce, Power BI, Google Looker, dozens of others — now auto-generates visualizations from whatever data is present. The software asks no prior questions. It does not ask whether the variables have a theoretical relationship. It does not ask whether the sample is appropriate, whether confounders exist, whether the time series is stationary, or whether plotting these two things together implies anything at all. It takes X, it takes Y, and it draws the line.

The graph exists. The graph looks like analysis.

What makes this worse is that the tools are not selecting chart formats randomly. A 2019 study published in IEEE Transactions on Visualization and Computer Graphics ran a series of controlled experiments presenting participants with identical correlational data in different visual formats. Bar graphs and aggregated line graphs, the default outputs of most auto-generation tools, produced significantly higher causal attributions than scatter plots of the same data. Participants shown a bar graph were substantially more likely to conclude that one variable caused the other than participants shown a scatter plot. The data was identical. The inference was not.

The implication is direct: the standard visualization formats that tools generate by default are precisely the formats most likely to trigger false causal inference in viewers. The tool is not neutral. It is selecting, by default, for the most epistemically misleading presentation of the data.
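What the aggregated format discards can be seen in a few lines of code. The sketch below is not a reproduction of the study and says nothing about how viewers perceive charts; it uses synthetic data, and every name and parameter in it (the 0.5 slope, the median split, the variable names) is an assumption chosen for illustration. It computes the two numbers a two-bar chart would display, then the two numbers a scatter plot would make visible.

```python
import math
import random
from statistics import mean, median

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Synthetic, hypothetical data: y depends only weakly on x; most of y is noise.
xs = [rng.gauss(0, 1) for _ in range(1000)]
ys = [0.5 * x + rng.gauss(0, 1) for x in xs]

# What a two-bar chart shows: mean of y for "low x" versus "high x".
cut = median(xs)
low = [y for x, y in zip(xs, ys) if x < cut]
high = [y for x, y in zip(xs, ys) if x >= cut]
bar_low, bar_high = mean(low), mean(high)

# What the scatter plot shows and the bars hide: how weak the fit is,
# and how many "low x" points still sit above the overall median of y.
r_squared = pearson(xs, ys) ** 2
y_mid = median(ys)
low_above_mid = sum(y > y_mid for y in low) / len(low)

print(f"bar heights: low={bar_low:.2f}, high={bar_high:.2f}")
print(f"r squared: {r_squared:.2f}")
print(f"share of low-x points above the median of y: {low_above_mid:.2f}")
```

Under these assumptions the two bars differ cleanly, while the fit explains only a minority of the variation and a sizeable share of the "low" group sits above the overall median. The bars report the first fact and are silent on the other two.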


This would be a manageable problem if the people receiving these graphs understood what they were looking at. The evidence suggests they do not — including, critically, people with advanced training in data.

In February 2026, the Federal Reserve Bank of Kansas City published an economic bulletin asking whether the rise in U.S. labor productivity since late 2022 represents a new AI-driven chapter. Chart 1 plots labor productivity above its pre-pandemic trend, with the gen-AI period visually bracketed as a distinct era of stronger growth. The implication is legible before a word is read. The authors note, in the text, that the fit is moderate (R² = 0.19) and that causality can run both ways. They note that AI adoption explains little of the shift in aggregate contributions. These are significant disclaimers.

They appear after the chart has already made its argument.

This is not a fringe publication or a careless analyst. This is a Federal Reserve bulletin doing exactly what the CHI researchers described: producing a data-driven artifact that implies a causal story the underlying data cannot support, one that will circulate as evidence of that story regardless of the footnotes. The chart travels. The R² stays home.


The AI productivity literature is generating this problem at volume. Individual workers report productivity gains of 30, 40, 50 percent from AI tool use. These self-reports get graphed against business outcomes. The graphs appear in board decks. Executives approve further AI investment on the basis of charts that plot a subjective perception against an aggregate measure, with no mechanism specified and no confounders addressed. The fact that both lines go up is treated as the argument.

Meta-analytic evidence finds no robust relationship between AI adoption and aggregate productivity gains. The individual task acceleration is real in controlled settings; the organizational translation is not established. The chart shows both numbers. It does not show the gap between them.
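The "both lines go up" pattern is easy to manufacture from series that have nothing to do with each other. The following is a minimal sketch under stated assumptions: the two series are synthetic and generated independently, and the names `reported_gain` and `business_metric` are hypothetical labels, not data from any study cited here. It compares the correlation of the raw levels with the correlation of the period-to-period changes.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def trending_series(n, drift, noise, rng):
    """Cumulative sum of (drift + Gaussian noise): a series that tends to rise."""
    level, out = 0.0, []
    for _ in range(n):
        level += drift + rng.gauss(0.0, noise)
        out.append(level)
    return out

def first_differences(series):
    """Period-to-period changes, which removes the shared upward trend."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Two hypothetical series, generated independently of each other.
reported_gain = trending_series(200, drift=1.0, noise=0.5, rng=rng)
business_metric = trending_series(200, drift=1.0, noise=0.5, rng=rng)

r_levels = pearson(reported_gain, business_metric)
r_changes = pearson(first_differences(reported_gain),
                    first_differences(business_metric))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels correlate strongly because both series trend upward; the changes, which is where any real relationship would have to appear, do not. A chart of the levels shows only the first result.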


The standard response to this problem is to call for better data literacy: more statistics education, more training in correlation versus causation. This response is inadequate — not because statistics education is unimportant, but because it addresses the wrong failure.

The failure is not happening at the level of statistical technique. A person who knows how to run a regression can still produce a meaningless visualization if they have not first asked whether the two variables have any theoretical basis for a relationship. That question is prior to the statistics. It belongs to epistemology: what is a claim, what would have to be true for this to be evidence of something, what would constitute a reason to believe it.

Data science programs teach the pipeline. They do not, as a rule, teach the epistemology underneath it. The result is practitioners who are technically competent and epistemically unprepared: who know when to use which test but have not been required to ask what testing is for.

The tools have made this worse by removing a decision point. A visualization used to require a deliberate act. Someone had to choose to put those two variables on the same axes. Now the software chooses. The decision has been automated away, and with it the moment of reflection that might have caught the problem before it became a chart, before the chart became a slide, before the slide became a reason.

And the format the software selects by default (the bar graph, the aggregated line) is the one research shows is most likely to make a viewer conclude that one thing caused another. The epistemic failure is baked in before anyone even opens the file.


The CHI researchers were careful to note that their findings come from one domain (COVID-19 discourse on Twitter) and that the mechanism they describe has not been formally measured across all contexts. But the structural logic holds wherever data visualizations are produced faster than they can be interrogated. Misleading data-driven content is most effective not at convincing people of false things but at making true things look disputed. That is a precise description of what is happening now with AI productivity, and with any domain where the underlying question is genuinely complex and the visualization is not.

The graph does not have to be wrong to be harmful. It has to be accepted as more than it is. Right now, the infrastructure for producing such graphs is expanding faster than the capacity to interrogate them.

That is not, at base, a data problem. It is an epistemology problem infecting a data problem.


Sources

Lisnic, M., Lex, A., & Kogan, M. (2024). "Yeah, this graph doesn't show that": Analysis of online engagement with misleading data visualizations. Proceedings of the CHI Conference on Human Factors in Computing Systems. https://doi.org/10.1145/3613904.3642448

Xiong, C., van Weelden, L., & Franconeri, S. (2020). Illusion of causality in visualized data. IEEE Transactions on Visualization and Computer Graphics, 26(1), 853–862. https://doi.org/10.1109/TVCG.2019.2934399

Çakır Melek, N., & Miller, S. (2026). A new U.S. productivity chapter? What industry data say about AI. Federal Reserve Bank of Kansas City Economic Bulletin, February 11. https://www.kansascityfed.org/research/economic-bulletin/a-new-us-productivity-chapter-what-industry-data-say-about-ai/

Jen
