Andler on Problem and Situation
Reading notes on chapters 1.4, 6.5, 7.4, and 8.1-8.2 of Daniel Andler’s Intelligence artificielle, intelligence humaine.
I have been making my way through Daniel Andler’s Intelligence artificielle, intelligence humaine: la double énigme (Gallimard, 2023) over the past few months. It is dense and careful, and parts of it need a second pass before the argument sticks.
Andler is a mathematician turned philosopher of cognitive science and founder of the Département d’études cognitives at the École normale supérieure. His book is built around one claim: AI works on Problems, never on Situations. Three threads carry the case: the five-stage retreat that defined modern AI (chapter 1.4), the failed search for a missing ingredient (chapter 6.5), and the Problem-Situation distinction (definitions in §7.4, vignettes in §8.1, the Type I/II distinction formalised in §8.2). The third is the part that made me want to write this note.
1. The five renunciations
The original Promethean project was a machine that thinks like a human. Andler shows it was abandoned in five distinct steps. Each retreat is independent of the others, and what AI gave up only becomes visible from all five together.
- Thinking thought. John Searle’s 1980 Chinese room imagines a person who follows a rulebook to answer questions in Chinese without understanding a word. The rulebook works on the form of the characters and never on their meaning. Andler calls this gap semantic blindness. The same form/meaning split is what makes von Neumann’s universal constructor work, which I wrote about last month: a polymerase copies DNA without parsing its content, faithfully and blindly. The cell pulls this off precisely because the copy step does not need to understand. An LLM is in a comparable position. It operates on token forms and inherits whatever groundings the training corpus provides, with no independent purchase on what those tokens refer to. The system can check what it says against patterns in text, but never against the world, which is one reason hallucination remains structurally possible in modern LLMs, mitigated but never eliminated (a separate thread I worked through in On Systemic Errors).
- Equivalence with human intelligence. Early AI assumed the route to a thinking machine ran through cognitive psychology: study how humans solve a problem, then build the same path in silicon. Marvin Minsky and especially John McCarthy cut this tie. Reaching the same answer became enough; the steps in between were free. Andler labels the two stances anthropic[^1] (imitate human cognition) and ananthropic (do not bother). Going ananthropic means AI is no longer obliged to converge on the human path even when it converges on the human answer. A vision transformer is held accountable to ImageNet accuracy. Whether it processes images the way a primate cortex does is a question no one inside the field treats as load-bearing.
- Generality. The opening ambition was a single faculty of intelligence that could meet any task. The General Problem Solver of 1957 (Newell, Simon, Shaw) was the most ambitious attempt. Its failure split intelligence into separate domains: vision, language, planning, reasoning, motor control. Nils Nilsson’s image is of a smörgåsbord[^2] of field intelligences. Foundation models complicate the picture. They show genuine emergent transfer across tasks they were not explicitly trained on, and Christopher Summerfield’s These Strange New Minds (2025) makes the careful case that this transfer is real and worth taking seriously. Andler’s take, set up by the Problem-Situation distinction I unpack below, is that this transfer still happens within the plane of Problems: the model interpolates across already-formulated tasks. The act of carving out a domain in the first place happens somewhere else.
- Independence from human intelligence. Expert systems, AI’s first commercial success in the 1980s, encoded the rules of thumb of human experts. The system did not reason its way to a diagnosis. It applied rules a human had already written down. Modern AI looks freer than that on the surface. The dependence is just deeper in the stack[^3]. Pretraining encodes human writing; instruction tuning encodes human demonstrations; RLHF encodes human preferences. Each layer of training is a layer of human supervision. The autonomy the system exhibits after deployment is built on top of this scaffolding.
- Reflection. A reflective system has access to its own reasoning. Symbolic AI had a primitive version: you could trace the inference chain back. Statistical machines have nothing of the kind. The model partitions the input space according to mathematical regularities that are not reasons in any strict sense. Mechanistic interpretability studies the substrate of a trained network. There is no symbolic reasoning to recover, because none was performed in the first place. This is part of why hybrid approaches like GraphRAG (combining structured knowledge graphs with vector retrieval) are gaining ground. They put an explicit reasoning layer on top of the statistical core, recovering some of what the symbolic tradition gave up. Lettria[^4] is one of the more interesting efforts here, putting automated ontology construction at the heart of GraphRAG. Whether this restores reflection or scaffolds around a reflectionless core is, in Andler’s frame, exactly the question. (A minimal sketch of the GraphRAG pattern follows this list.)
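To make the pattern concrete, here is a minimal sketch of a GraphRAG-style retrieval step: an explicit, inspectable graph layer sitting beside a statistical retrieval layer, both feeding context to the generator. The triples, chunks, and helper names are my own illustration, not Lettria’s pipeline or any particular library’s API.

```python
# Toy knowledge graph: explicit, inspectable triples (the "reasoning layer").
TRIPLES = [
    ("aspirin", "treats", "headache"),
    ("aspirin", "interacts_with", "warfarin"),
    ("warfarin", "is_a", "anticoagulant"),
]

# Toy "vector store": chunk text -> bag of terms (a stand-in for embeddings).
CHUNKS = {
    "Aspirin is a common over-the-counter analgesic.": {"aspirin", "analgesic"},
    "Warfarin dosing requires regular INR monitoring.": {"warfarin", "monitoring"},
}

def graph_context(entity: str) -> list[str]:
    """Walk the explicit graph: every fact returned is a traceable, symbolic step."""
    return [f"{s} {p} {o}" for s, p, o in TRIPLES if entity in (s, o)]

def vector_context(query_terms: set[str], k: int = 1) -> list[str]:
    """Statistical retrieval: rank chunks by term overlap (stand-in for cosine similarity)."""
    ranked = sorted(CHUNKS, key=lambda c: len(CHUNKS[c] & query_terms), reverse=True)
    return ranked[:k]

def build_prompt(question: str, entity: str, query_terms: set[str]) -> str:
    """Fuse both sources into one context block handed to the generator."""
    facts = graph_context(entity) + vector_context(query_terms)
    return "Answer using these facts:\n" + "\n".join(facts) + f"\n\nQuestion: {question}"

print(build_prompt("Can I take aspirin with warfarin?", "aspirin", {"aspirin", "warfarin"}))
```

The asymmetry is the point of the sketch: the graph half can be traced fact by fact, the vector half cannot, and that traceable layer is what the hybrid approaches are trying to put back.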
These five together describe what mainstream AI research no longer attempts. AI marketing continues to promise some of it.
2. The failed search for the missing ingredient
Chapter 6.5 asks whether AI could recover its ambition by adding back what was given up, one ingredient at a time. Andler examines four candidates. The same wall keeps reappearing across all four. None can simply be bolted onto a current AI system, because each only does its work as part of a larger cognitive whole. Adding any one of them to a Swiss army knife does not turn the knife into a brain.
- Consciousness. No theory clear enough to know what we would be engineering. No evidence that consciousness contributes to problem-solving in humans. The candidate where we cannot say what we would be adding, nor whether adding it would help.
- Sense. Searle’s problem returning under another name. Andler turns to Bertrand Russell’s distinction between knowledge by acquaintance (direct, embodied contact with a thing) and knowledge by description (knowing a thing through propositions about it). Humans use both. AI uses only the second. Most of the time, description suffices for intelligent action. What acquaintance buys is the ability to investigate: to walk around the thing, ask follow-up questions, update the description. AI cannot investigate because it does not know where to look. Knowing where to look requires common sense, which is the next candidate.
- Common sense. “The pen is in the baby’s coat” / “the baby is in the pen.” A reader instantly knows the first pen is a writing instrument and the second is a playpen. Pens fit in coat pockets; playpens hold babies. None of this is in the sentence. Sixty years of attempts to encode common sense as a database (the Cyc project and its successors) have produced incremental progress only. Andler’s deeper move is that common sense is the question of which features of a situation matter, a perspective-taking problem disguised as a knowledge problem. It is the same problem that returns, under another name, in chapter 8.
- Affects and metacognition. Affects (emotions, moods, and epistemic feelings such as the answer is on the tip of my tongue) shape attention. They create a salience gradient that organises a search for solutions. Affective computing (Rosalind Picard, 1997) has produced local progress: chatbots that detect irritation, companion robots that simulate warmth. The salience gradient in current AI comes from training data; the system has no stake in outcomes that would generate one. Metacognition (knowing what one knows) is already built into many AI systems, and so is not the missing ingredient anyone hoped it was.
Andler calls the resulting pattern the Swiss army knife syndrome. AI keeps adding blades, and adding the next blade does not transform the knife into something more than a knife. The additive conception of intelligence collapses.
The metaphor that closes the chapter is the projection. Imagine human intelligence projected onto a flat plane, the plane of Problems. The shadow is a set of problem-solving capacities. Project a computer onto the same plane and you get another shadow. The two shadows can be compared, but only on the plane. The cowboy and his shadow are different orders of thing.
3. Problem and Situation
This is the core of the book, and the part I found genuinely powerful. The Problem-Situation distinction is conspicuously absent from current debates on AI, and almost every conversation about what these systems can or cannot do should pass through it. It is the framework that explains why the five renunciations had to happen, why the missing-ingredient hunt fails, and why progress on benchmarks tells us less about intelligence than the benchmarks suggest.
Andler stipulates two definitions in §7.4:
- Problem: a posed question with a clear criterion for what counts as a correct answer. Cleanest cases: textbook arithmetic, a chess endgame puzzle, a logistics optimisation.
- Situation: a concrete, lived, singular moment that calls for action. Bound to a place, a time, and a subject who experiences it. No two people share one, and no person experiences the same one twice.
In §8.1, Andler walks the reader through seven micro-stories about a man called André leaving work and trying to get to a friend’s dinner. Three of them are enough to see the framework working, and they are arranged so that each sits further than the last from anything AI can handle.
- Situation A. André weighs his options for getting to dinner. The bicycle is rejected (it is raining), the taxi is too expensive, and bus 7 will get him there 45 minutes late, which he judges acceptable. He walks to the bus stop. Reading: a clean optimisation problem (options, costs, decision rule), and the move from lived moment to posed question feels automatic; a short sketch after these vignettes makes the recasting explicit. → Type I Situation.
- Situation B. André is at the bus stop. He sees a colleague approaching whose tedious complaints he wants to avoid. A stranger is also waiting for the bus. André starts a conversation with the stranger before the colleague reaches the stop. Reading: a Problem could in principle have been formulated here (“how do I avoid the colleague?”), but Andler’s point is that André did not formulate it. He felt a vague reluctance, a small impulse, and acted. The action handled the situation before any question was posed, and possibly without any question ever being posed at all. The Problem is at most retrofittable in hindsight, not the thing André actually solved. → Type II Situation.
- Situation C. Same scene, no colleague involved at all. André simply turns to the stranger and starts a conversation. Reading: there is nothing to retrofit. No discomfort to escape, no goal to optimise, no question even latent in the moment. André acted in a way that can still be assessed as more or less appropriate (warm or intrusive, well-judged or awkward), but no Problem exists, before, during, or after. The contrast with B is the whole point: B has a Problem hiding in the wings that André bypassed without formulating; C has no Problem at all. Both are Type II, but C shows that Type II is not just “Problems we did not bother to write down”. → Type II Situation.
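Here is the sketch promised under Situation A: the evening recast as the optimisation Problem it almost automatically becomes. The numbers, thresholds, and decision rule are my own illustration, not Andler’s; the instructive part is that every one of them was fixed before the code exists.

```python
# Situation A recast as the posed Problem it collapses into. The fields below
# (rain matters, cost matters, 45 minutes late is tolerable) were all selected
# before this program runs. That selection is problem-formulation, and it is
# the part the program never performs.

OPTIONS = [
    {"mode": "bicycle", "cost": 0.0,  "delay_min": 10, "usable_in_rain": False},
    {"mode": "taxi",    "cost": 35.0, "delay_min": 5,  "usable_in_rain": True},
    {"mode": "bus_7",   "cost": 2.0,  "delay_min": 45, "usable_in_rain": True},
]

def choose_transport(options, raining: bool, max_cost: float, max_delay: int) -> str:
    """A clean Type I solver: feasibility filter, then pick the cheapest option."""
    feasible = [
        o for o in options
        if (o["usable_in_rain"] or not raining)
        and o["cost"] <= max_cost
        and o["delay_min"] <= max_delay
    ]
    if not feasible:
        return "no option satisfies the constraints"
    return min(feasible, key=lambda o: o["cost"])["mode"]

# André's evening, as data: it is raining, the taxi feels too expensive,
# and arriving 45 minutes late is judged acceptable.
print(choose_transport(OPTIONS, raining=True, max_cost=20.0, max_delay=45))  # -> bus_7
```

Nothing comparable can be written for Situations B or C. There is no feasibility filter or cost to minimise until someone has already done the problem-formulation, and in C there is nothing to formulate at all.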
The cognitive labour that turns a Situation into a Problem is problem-formulation: selecting which features of a lived moment will count as the data of a posed question, and which will be pushed to the periphery or dropped. Andler’s claim is that problem-formulation is itself the work, and is not in turn a problem to be solved. It is a perspective-taking operation, irreducibly holistic: the meaning of each element of the Situation depends on the perspective adopted, and the perspective depends on the elements. There is no analytic procedure that produces the right perspective from a list of features. Preferences themselves can emerge during problem-formulation.
§8.2 formalises the contrast and inverts how we ordinarily think about the relation. Type I, the bus-route optimisation, is the special case. Type II is the general case. The illusion that life is “one problem after another” is what Andler calls panproblemism, after Karl Popper. Popper’s late book All Life Is Problem Solving (1994) extends his philosophy of science to a general theory of action: every organism, every cell, every act of cognition is reframed as a tentative solution to a problem set by the environment, refined by trial and error. It is an elegant move and a generative one for evolutionary biology, but Andler argues it overreaches when applied to lived experience. Most of what humans do is not problem-solving in any strict sense, and treating it as such only works because once a Situation has been written down for a reader, it is already a Problem statement. The lived element has been abstracted out at the moment of description.
This is where Andler locates AI’s real limit: AI receives Problems already abstracted, already perspectivised, already cleansed of the lived element that made the abstraction necessary.
Problem-formulation is the work upstream of every benchmark. AI does not perform it. Benchmarks are by construction Type I; they cannot test problem-formulation because problem-formulation has already happened by the time the benchmark exists. Benchmark improvements live entirely on the plane of Problems and tell us nothing about the Situations that produced them.
4. Verification, reasoning LLMs, and the philosophy gap
The framework lines up surprisingly well with Jason Wei’s Verifier’s Law, which I covered a few months ago. Wei’s claim is that AI’s ability to solve a task is directly proportional to how easily that task can be verified: tasks with objective truth, fast checking, scalable verification, low noise, and continuous reward fall first. From an Andler vantage point, the Verifier’s Law is a description of the Problem plane. Verifiability is a property of well-posed Problems. Type II Situations are unverifiable by construction, because there is no posed question against which to check an answer. The two frameworks meet at the same point from opposite directions: Wei from the empirical observation of which tasks AI conquers, Andler from the philosophical argument about what AI is structurally doing. Both point to a frontier defined by the act of formulating Problems, which is upstream of either of their analyses.
The natural pushback I keep hearing is: surely reasoning LLMs (OpenAI’s o-series, Claude with extended thinking, DeepSeek R1) are starting to handle Situations? They plan, they backtrack, they self-correct. Doesn’t that look like problem-formulation in action?
My current read is that reasoning LLMs improve along the verification frontier without extending it. The training method behind them, often called RLVR (reinforcement learning with verifiable rewards), explicitly selects for tasks where verification is clean: math, code, contest puzzles. These are pure Type I cases, Problems already extracted from any lived context. What the model learns is to spend more compute searching the solution space within an already-given frame. The framing itself, which features count as data, which preferences matter, what the question even is, comes from the prompt or the training distribution. The model interpolates between formulated Problems. It does not turn lived moments into Problems.
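To make “verifiable reward” concrete, here is a schematic sketch of the kind of graders RLVR setups rely on, as I understand them. The function names and details are illustrative, not any lab’s actual training code.

```python
# What a "verifiable reward" looks like in the RLVR setting: the grader is a
# few lines of deterministic checking. Schematic illustration only.

def math_reward(model_answer: str, reference: str) -> float:
    """Binary reward: 1.0 if the final answer matches the reference exactly, else 0.0."""
    return 1.0 if model_answer.strip() == reference.strip() else 0.0

def code_reward(candidate_src: str, tests: list[tuple[int, int]]) -> float:
    """Fraction of unit tests passed by a candidate function `f` defined in candidate_src."""
    namespace: dict = {}
    exec(candidate_src, namespace)  # real setups run this in a sandbox
    f = namespace["f"]
    passed = sum(1 for x, expected in tests if f(x) == expected)
    return passed / len(tests)

# Both rewards presuppose a fully posed Problem: a reference answer, a test
# suite, a fixed notion of success. A Type II Situation offers none of these,
# so there is nothing for the reinforcement signal to latch onto.
print(math_reward("42", " 42"))                                       # 1.0
print(code_reward("def f(x):\n    return x * 2", [(1, 2), (3, 6)]))   # 1.0
```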
A small data point that supports this read: state-of-the-art LLMs are impressive on philosophy at the level of definition, summary, and exam-question response, and weak at actual philosophical work. Original argument, framework construction, identifying which question is the right one to ask: all of this stays out of reach. Philosophy is Type II par excellence; the act of formulating the question is half the work. LLMs are good at Type I philosophy (paraphrase a position, structure a known argument). They struggle at Type II philosophy because that requires problem-formulation, and problem-formulation is exactly what they do not perform.
This is the part of Andler’s argument I take most seriously. The bet that scaling reasoning will get to general intelligence assumes that better answers to already-formulated Problems will eventually reach Situations. In Andler’s framework, this is a category mistake. Better Type I is still Type I.
The conceptual toolbox
| Concept | Definition | Illustration | Implication for current AI |
|---|---|---|---|
| Semantic blindness | The system operates on the form of symbols without access to their meaning. | Searle's Chinese room (§1.4.a). | LLMs check outputs against text patterns, never against the world. |
| Anthropic / ananthropic | Two stances: imitate human cognition (anthropic), or simply reach the same answer (ananthropic). | Minsky and McCarthy cut AI from cognitive science (§1.4.b). | All current SOTA models are ananthropic by design; benchmarks measure outcomes, not paths. |
| Swiss army knife syndrome | Adding capabilities one by one fails to produce general intelligence; the additive conception collapses. | Failed search for the missing ingredient (§6.5). | No single addition (consciousness, common sense, affects) lifts AI out of its current regime. |
| Projection | AI is a shadow of human intelligence cast on the plane of Problems; comparison is possible only on that plane. | Closing metaphor of chapter 6. | Benchmarks compare shadows, not what casts them. |
| Problem | A posed question with a clear criterion for what counts as a correct answer. | Textbook arithmetic, chess endgame, logistics optimisation (§7.4). | Everything AI does well today; the entire benchmark plane. |
| Situation | A concrete, lived, singular moment that calls for action, bound to a place, time, and subject. | Defined in §7.4; illustrated by the seven André vignettes (§8.1). | What AI never receives; abstracted away by the time a benchmark exists. |
| Type I / Type II Situations | Type I collapses to a Problem almost automatically. Type II has no obvious Problem to retrofit, or the action handles the moment before any question is posed. | André weighing transport options (Type I) vs starting a conversation with no goal in mind (Type II) (§8.2). | Benchmarks are Type I by construction; reasoning LLMs improve at Type I, not at Type II. |
| Problem-formulation | The act of selecting which features of a Situation count as data of a posed question, and which go to the periphery. | Holistic, perspective-dependent, irreducible to procedure (§7.4 introduces; §8.2-8.3 argue it is not itself a meta-problem). | The work upstream of every benchmark; what AI does not perform. |
| Panproblemism | The illusion that all life is problem-solving, after Popper's argument that every organism and act of cognition is a tentative solution to a problem. | Andler's neologism, after Popper's All Life Is Problem Solving (1994); coined in §7.4, developed in §8.2. | Why we keep mistaking benchmark progress for progress on intelligence. |
| Verifier's Law | AI's ability to solve a task is proportional to how easily success can be verified. | Jason Wei's Stanford talk on scaling AI, late 2025. | The Problem plane described from the empirical side; the structural and the empirical converge. |
Two paths follow. The Swiss army knife: more blades, sharper blades, AI as a giant toolbox. Andler reads this as a clear-eyed acceptance of what AI is. The other path is an artificial general intelligence project that genuinely tries to handle Situations. Andler is not optimistic. He argues it deserves consideration anyway.
The book leaves a real question open. Is the projection metaphor a structural impossibility or a current technical limitation? The structural reading keeps AI as a tool, however powerful. The technical reading makes the next twenty years of AI research about something other than scaling. My own reading sits closer to the technical, but Andler has shifted the burden of proof. The question is no longer what AI still needs to add. The question is what problem-formulation in a machine would even look like.
Footnotes
[^1]: [Etymology] Andler borrows anthropic from the Greek anthropos (human), repurposing it from its more familiar use in cosmology (the anthropic principle, which states that observed properties of the universe must be compatible with the existence of observers). The same Greek root gives the AI safety lab Anthropic its name, in a similar gesture: AI development oriented towards human values and interests. Three uses, one root, three different intellectual projects.
[^2]: [Etymology] Swedish for a buffet table of small dishes (smör = butter, gås = goose, bord = table). Nilsson uses it to describe AI as a spread of unrelated specialities with no main course.
[^3]: [Expansion] A foundation model looks autonomous from the outside and is dense with human supervision underneath. The supervision arrives in three stacked layers. (1) Pretraining. The model ingests trillions of tokens of human-authored text, code, images, and other data, with the simple objective of predicting the next token. Everything the model “knows” about the world comes from this corpus, which is a snapshot of human-generated content with all the biases, gaps, and quirks that implies. (2) Supervised fine-tuning (SFT). After pretraining, the raw model is good at completion but not at following instructions. SFT trains it on curated demonstrations: human-written examples of how to respond to specific kinds of prompts, produced by contracted annotators or carefully selected datasets. (3) Reinforcement learning from human feedback (RLHF). Humans rank pairs of model outputs (“which response is better?”), and the model is updated to favour the preferred ones. This shapes tone, helpfulness, refusal behaviour, and a thousand other dimensions. RLAIF (RL from AI feedback) substitutes some human rankers with models that were themselves trained on human preferences, which is cheaper but inherits the same upstream supervision. The capability of foundation models is partly the size of the pretraining corpus and partly the quality of this supervision stack layered on top. The “magic” is real, but it is supervised magic.
[^4]: [Disclosure] Lettria is a portfolio company, so I am not a neutral observer here. I bring it in because the GraphRAG approach speaks directly to Andler’s reflection point. It is one of the few credible attempts to put structured reasoning back where statistical learning had removed it.
