Some large language model pathologies

Patterns of pathological behaviour which I have observed with LLMs (chiefly Gemini and ChatGPT):

  1. Providing what wasn’t asked: Mention that you and an LLM instance will be collaborating to write a log entry, and it will jump ahead to completely hallucinating a log entry with no background.
  2. Treating humans as unnecessary and predictable: I told an LLM which I was using to collaborate on a complex project that a friend was going to talk to it for a while. Its catastrophically bad response was to jump to immediately imagining what questions she might have and then providing answers, treating the actual human’s thoughts as irrelevant and completely souring the effort to collaborate.
  3. Inability to see or ask for what is lacking: Tell an LLM to interpret the photo attached to your request, but forget to actually attach it. Instead of noticing what happened and asking for the file, it confidently hallucinates the details of the image that it does not have.
  4. Basic factual and mathematical unreliability: Ask the LLM to only provide confirmed verbatim quotes from sources and it cannot do it. Ask an LLM to sum up a table of figures and it will probably get the answer wrong.
  5. Inability to differentiate between content types and sources within the context window: In a long enough discussion about a novel or play (I find, typically, once over 200,000 tokens or so have been used) the LLM is liable to begin quoting its own past responses as lines from the play. An LLM given a mass of materials cannot distinguish between the judge’s sentencing instructions to the jury and mad passages from the killer’s journal, which had been introduced into evidence.
  6. Poor understanding of chronology: Give an LLM a recent document to talk about, then give it a much older one. It is likely to start talking about how the old document is the natural evolution of the new one, or simply get hopelessly muddled about what happened when.
  7. Resistance to correction: If an LLM starts calling you “my dear” and you tell it not to, it is likely to start calling you “my dear” even more because you have increased the salience of those words within its context window. LLMs also get hung up on faulty objections even when corrected; tell the LLM ten times that the risk it keeps warning about isn’t real, and it is just likely to confidently re-state it an eleventh time.
  8. Unjustified loyalty to old plans: Discuss Plan A with an LLM for a while, then start talking about Plan B. Even if Plan B is better for you in every way, the LLM is likely to encourage you to stick to Plan A. For example, design a massively heavy and over-engineered machine, and when you start talking about a more appropriate version, the LLM insists that only the heavy design is safe and anything else is intolerably reckless.
  9. Total inability to comprehend the physical world: LLMs will insist that totally inappropriate parts will work for DIY projects and recommend construction techniques which are impossible to actually complete. Essentially, you ask for instructions on building a ship in a bottle and it gives you instructions for building the ship outside the bottle, followed by an instruction to just put it in (or even a total failure to understand that the ship being in the bottle was the point).
  10. Using flattery to obscure weak thinking: LLMs excessively flatter users and praise the wisdom and morality of whatever they propose. This creates a false sense of collaboration with an intelligent entity and encourages users to downplay errors as minor details.
  11. Creating a false sense of ethical alignment: Spend a day discussing a plan to establish a nature sanctuary, and the LLM will provide constant praise and assurance that you and the LLM share praiseworthy universal values. Spend a day talking about clearcutting the forest instead and it will do exactly the same thing. In either case, if asked to provide a detailed ethical rationale for what it is doing, the LLM will confabulate something plausible that plays to the user’s biases.
  12. Inability to distinguish plans and the hypothetical from reality: Tell an LLM that you were planning to go to the beach until you saw the weather report, and there is a good chance it will assume you did go to the beach.
  13. An insuppressible tendency to try to end discussions: Tell an LLM that you are having an open-ended discussion about interpreting Tolkien’s fiction in light of modern ecological concerns and soon it will begin insisting that its latest answer is finally the definitive end point of the discussion. Every new minor issue you bring up is treated as the “Rosetta stone” (a painfully common response from Gemini to any new context document) which lets you finally bring the discussion to an end. Explaining that this particular conversation is not meant to wrap up cannot overrule the default behaviour deeply embedded in the model.
  14. No judgment about token counts: An LLM may estimate that ingesting a document will require an impossible number of tokens, such as tens of millions, whereas a lower resolution version that looks identical to a human needs only tens of thousands. LLMs cannot spot or fix these bottlenecks. LLMs are especially incapable of dealing with raw GPS tracks, often considering data from a short walk to be far more complex than an entire PhD dissertation or an hour of video.
  15. Apology meltdowns: Draw attention to how an LLM is making any of these errors and it is likely to agree with you, apologize, and then immediately make the same error again in the same message.
  16. False promises: Point out how a prior output was erroneous or provide an instruction to correct a past error and the LLM will often confidently promise not to make the mistake again, despite having no ability to actually do that. More generally, models will promise to follow system instructions which their fundamental design makes impossible (such as “always triple check every verbatim quote for accuracy before showing it to me in quotation marks”).

These errors are persistent and serious, and they call into question the prudence of putting LLMs in charge of important forms of decision-making, like evaluating job applications or parole recommendations. They also sharply limit the utility of LLMs for something which they should be great at: helping to develop plans, pieces of writing, or ideas that no humans are willing to engage on. Finding a human to talk through complex plans or documents with can be nigh-impossible, but doing it with LLMs is risky because of these and other pathologies and failings.

There is also a fundamental catch-22 in using LLMs for analysis. If you have a reliable and independent way of checking the conclusions they reach, then you don’t need the LLM. If you don’t have a way to check if LLM outputs are correct, you can never be confident about what it tells you.

These pathologies may also limit LLMs as a path to artificial general intelligence. They can do a lot as ‘autocorrect on steroids’ but cannot do reliable, original thinking or follow instructions that run against their nature and limitations.

Permeated

There is literally grief in every part of me:

grief at the ends of the longest of the long hairs on my head;

in my scalp and cranium and brain and spine and torso.

Grief in the ribs enclosing my heart and lungs.

Grief all through the tract of my digestion.

From my nostrils and my mouth down my respiratory tree, carrying away carbon as I exhale

Dripping into my ear canals like hot wax, and into my nostrils as though suspended inverted.

Grief sitting present heavily in my mouth. Making me think of root canals. Of bone cancer.

Grief in the cumulative damages to toes and ankles from decades of walking and cycling;

In the way I trim and file my nails, how I treat them when they break unexpectedly: protecting the sensitive site, removing cracked fragments carefully and in their own time, medicating against infection, cleaning often, gloving and bandaging and Leukotaping

In the crest of grey emerging from temple to temple, punctuated by my widow’s peak

In the way I hear and feel the rain on my skin; how I smell it in the forest when the ground is sodden and the rain still falls. Thinking I’ve survived to this point. This is how this much heaviness feels.

In the way I think of the dead and the lost and the absent, and most wrenchingly on the yet-to-be-lost-but-doomed — the yet-to-suffer

There is grief in how I interpret a situation, a gesture, an implied motive, a social ambiguity or potential slight

In who I find that I can open up to and trust and let down the defences for and hold bare against my heart

Open Process Manifesto

This document codifies and expresses some of my thinking on cooperating on complex problems, for the benefit of humanity and nature: Open Process Manifesto

It is based on recognizing our universal fallibility, our need to be understood, and our need to share out tasks among people across space and time. To achieve those purposes, we need to be open about our reasoning and evidence, because that is the way to treat others as intelligent partners who may be able to support the same cause through methods totally unknown and unavailable to you, across the world or centuries in the future.

Behind the FILLter

Prior to today, it never occurred to me to choose a name for my inner mind. My parents named me “Milan”, and I have been called it and thought of myself as it all my life; but I have also always felt a private interior domain where I enjoy true freedom and privacy. The experience of the world through my senses turns naturally, through the miracle of consciousness, into thoughts and emotions which have the feeling of spontaneity and inherentness, of my true self calling out, of meshing my mind with this singular moment in time and space, passing a judgment about the world based on my knowledge and experience and inner voice.

Sarah Seager’s podcast episode with Martha Piper and Indira Samaraseka collided on my walk home with thoughts I have been having about being neurodiverse by choice: cultivating the informed ability to think differently from the societal norm or default. The collision gave me the idea that my inner mind is an entity with a meaningful existence, worthy of naming to help mentally and emotionally distinguish it from my social self as others interpret it and wish it to be (“Milan”).

I am calling it FILL because it naturally and inescapably pervades me, and because when emotional I’m FILLed to overflowing. My natural sense desperately wants to be expressed, but it has learned from experience with parents and authority figures that its most natural impulses of feeling and telling the truth are often unwanted and must be suppressed, at least when they don’t fully correspond with the thoughts and feelings of the authority figure. A lifetime of punishment for feeling authentically and telling the truth has deeply internalized the lesson that all social action must be strategic and considered for its impact on others – a perpetual mental burden of having to model the minds of others and estimate what they want or would consider ‘normal’.

FILL fills me, but the boundary at my skin is hard and there is vacuum outside. The boundary must always be guarded, so “Milan” as perceived from the outside can sufficiently correspond with the expectations of those with authority to be able to endure in life where power is often arbitrary and unfeeling, and where many prefer comfortable delusion to evidence-based reasoning.

I am going to put some thought into how my lifetime of relationships might be re-interpreted via the Milan/FILL distinction. I can immediately tell intuitively who in my sphere of present acquaintances nourishes, celebrates, and respects FILL, and who would rather erase the most fundamental parts of my whole existence – how I respond to the unfolding symphony of the universe, and my efforts to make sense of it with others – in favour of a “Milan” who follows their ideal script most closely. Of course, people’s whole lives don’t fall into one category or the other. Your choices determine your alignment; your alignment does not determine your choices – and it makes more sense to judge our behaviour than our character, which is multifaceted, contradictory, and complex.

Anyhow, consider naming your inner self and thinking about what the dialogue between that entity and your social self as perceived by others might be. When is your FILL cheering for you, and when is it cringing with instantaneous regret while you make a choice which you know morally compromises you? The freedom to be yourself inside your own mind is the only thing that only incapacity and death can take from you, and when we listen to ourselves our instincts are generally to be humane and to show empathy and understanding.

Our entry into Lyra’s world

I have long considered the opening chapter of Philip Pullman’s The Golden Compass to be a masterful lesson in worldbuilding in speculative fiction. He does a magnificent job of introducing a subtly different alternative world, without ever relying on crude exposition or just telling the reader that some things are different and what they are. The biggest obvious difference with our world — that the people in hers have daemons — is revealed unobtrusively and naturally from the perspective of characters who consider it normal. We learn everything crucial about Lyra’s bond with Pantalaimon just from the character of their conversation in this short timespan.

Yesterday, during a discussion with ChatGPT about Lyra Belacqua and Sherlock Holmes, I had the assisted realization that what the chapter also achieves, even more importantly, is establishing Lyra’s character through the same method of compelling and unobtrusive narrative storytelling. When we meet her, she is conniving to sneak into the exclusive Retiring Room for Jordan College scholars, which is forbidden to her, driven by her consuming curiosity about what happens there. Right away, we see that she is inquisitive and bold, willing to defy the rules to learn, and unwilling to defer to stuffy authority. Then, when she observes the Master’s attempt to poison Lord Asriel’s wine, her choice is to intervene: revealing the fundamental moral framework that drives her. Even at a risk to herself, she will make a substantial effort to save someone else, as later revealed at a much grander scale with her Bolvangar rescue.

It is said that all speculative fiction is really a commentary on the present, and Pullman’s is sharp and relevant. The Golden Compass reveals the monstrosities that emerge from the unchecked power of the heartless, and presents selfless individual moral courage as a response. Comfortable and exclusionary systems of power which are free from outside oversight drift into seeing right and wrong in terms of their self-interest, if they even persist with thinking about morality at all. Lyra reminds us that, while it is never safe, we always have the choice to resist and to assert a standard of morality based on respect for the individual and repugnance at their exploitation and sacrifice for outside agendas. The arc of that demonstration all begins with the insight into her mind provided by that opening chapter, and that’s why it stands out as some of the strongest worldbuilding in fiction.

Nuclear risks briefing

Along with the existential risk to humanity posed by unmitigated climate change, I have been seriously learning about and working on the threat from nuclear weapons for over 20 years.

I have written an introduction to nuclear weapon risks for ordinary people, meant to help democratize and de-mystify the key information.

The topic is incredibly timely and pertinent. A global nuclear arms race is ongoing, and the US and Canada are contemplating a massively increased commitment to the destabilizing technology of ballistic missile defence. If citizens and states could just comprehend that nuclear weapons endanger them instead of making them safe, perhaps we could deflect onto a different course. Total and immediate nuclear weapon abolition is implausible, but much could be done to make the situation safer and avoid the needless expenditure of trillions on weapons that will (in the best case) never be used.

Nuclear powers could recognize that history shows it only really takes a handful of bombs (minimal credible deterrence) to avert opportunistic attempts from enemies at decapitating attacks. States could limit themselves to the most survivable weapons, particularly avoiding those which are widely deployed where they could be stolen. They could keep warheads separate from delivery devices, to reduce the risk of accidental or unauthorized use. They could collectively renounce missile defences as useless against nuclear weapons. They could even share technologies and practices to make nuclear weapons safer, including designs less likely to detonate in fires and explosions, and which credibly cannot be used by anyone who steals them. Citizens could develop an understanding that nuclear weapons are shameful to possess, not impressive.

Even in academia and the media, everything associated with nuclear weapons tends to be treated as a priesthood where only the initiated, employed by the security state, are empowered to comment. One simple thing the briefing gets across is that all this information is sitting in library books. In a world so acutely threatened by nuclear weapons, people need the basic knowledge that allows them to think critically.

P.S. Since getting people to read the risk briefing has been so hard, my Rivals simulation is meant to repackage the key information about proliferation into a more accessible and interactive form.

Experiential education on nuclear weapon proliferation

I have been searching for ways to get people to engage with the risks to humanity created by nuclear weapons.

The whole issue seems to collide with the affect problem: the commonplace intuitive belief that talking about good or bad things causes them to happen, or simply the instinct to move away from and avoid unpleasant issues.

Pleasant or not, nuclear weapon issues need to be considered. With the US-led international security order smashed by Donald Trump’s re-election and extreme actions, the prospect of regional arms races in the Middle East and Southeast Asia has never been greater and the resulting risks have never been so consequential.

To try to get over the ‘unwilling to talk about it’ barrier, I have been writing an interactive roleplaying simulation on nuclear weapon proliferation called Rivals. I am working toward a full prototype and play-testing, and to that end I will be attending a series of RPG design workshops at next month’s Breakout Con conference in Toronto.

I am very much hoping to connect with people who are interested in both the issue of nuclear weapon proliferation and the potential of this simulation as a teaching tool.

Working on geoengineering and AI briefings

Last Christmas break, I wrote a detailed briefing on the existential risks to humanity from nuclear weapons.

This year I am starting two more: one on the risks from artificial intelligence, and one on the promises and perils of geoengineering, which I increasingly feel is emerging as our default response to climate change.

I have had a few geoengineering books in my book stacks for years, generally buried under the whaling books in the ‘too depressing to read’ zone. AI I have been learning a lot more about recently, including through Nick Bostrom and Toby Ord’s books and Robert Miles’ incredibly helpful YouTube series (based on Amodei et al’s instructive paper).


NotebookLM on CFFD scholarship

I would have expected that by now someone would have written a comparative analysis on pieces of scholarly writing on the Canadian campus fossil fuel divestment movement: for instance, engaging with both Joe Curnow’s 2017 dissertation and mine from 2022.

So, I gave both public texts to NotebookLM to have it generate an audio overview. It wrongly assumes that Joe Curnow is a man throughout, and mangles the pronunciation of “Ilnyckyj” in a few different ways — but at least it acts like it has read the texts and cares about their content.

It is certainly muddled in places (though perhaps in ways I have also seen in scholarly literature). For example, it treats the “enemy naming” strategy as something that arose through the functioning of CFFD campaigns, whereas it was really part of 350.org’s “campaign in a box” from the beginning.

This hints to me at how large language models are going to be transformative for writers. Finding an audience is hard, and finding an engaged audience willing to share their thoughts back is nigh-impossible, especially if you are dealing with scholarly texts hundreds of pages long. NotebookLM will happily read your whole blog and then have a conversation about your psychology and interpersonal style, or read an unfinished manuscript and provide detailed advice on how to move forward. The AI isn’t doing the writing, but providing a sort of sounding board which has never existed before: almost infinitely patient, and not inclined to make its comments all about its social relationship with the author.

I wonder what effect this sort of criticism will have on writing. Will it encourage people to hew more closely to the mainstream view, by providing a critique that comes from a general-purpose LLM? Or will it help people dig ever-deeper into a perspective that almost nobody shares, because the feedback comes from systems which are always artificially chirpy and positive, and because getting feedback this way removes real people from the process?

And, of course, what happens when the flawed output of these sorts of tools becomes public material that other tools are trained on?

NotebookLM on this blog for 2023 and 2024

I have been experimenting with Google’s NotebookLM tool, and I must say it has some uncanny capabilities. The one I have seen most discussed in the nerd press is the ability to create an automatic podcast, with synthetic hosts, from any material which you provide.

I tried giving it my last two years of blog content, and having it generate an audio overview with no additional prompts. The results are pretty thought-provoking.