Galactic Institutions:
Imagining the Institutional Framework vs. the Cylons — a thought experiment
This post is the third and last piece of my argument for creating an institutional framework to mitigate the danger of catastrophic AI risk.
It is also, as promised, the fun one — so please read it in that spirit.
The first post in this series presented the logic and the principles of such a framework, emphasising the need for adversarial design, kill switches and elements of planned unpredictability.
The second post put forward a detailed proposal for a specific institutional setup. I proposed seven institutions, five of which can analyse, advise and suggest to the legitimate governing bodies, while two wield veto powers.
After theory and structure, the next logical step is practice. But of course, practice is missing. There is no AI takeover to check it against, and one hopes there never will be. So practice, in this case, is theoretical.
I’ve been using science fiction and fantasy universes to teach important political concepts for a while. Writers of imagined universes, being creative and inconsistency-sensitive people, need to understand how their societies work.
Read more on the logic if interested here.
So it was logical to “test” my framework in such an imagined universe.
One of the most “successful” — from the evil AI point of view — examples of existential risk manifesting is in the Battlestar Galactica universe.
So here follows my test: the institutional framework against the Cylon attack!
The Risk Manifested
Contains spoilers for the Battlestar Galactica series.
The Battlestar Galactica TV series opens with the almost total wipeout of humanity. After an earlier war and a forty-year silence, intelligent machines created by humans, called Cylons, return. They have spent the interval building agents that pass as human and corrupting the software that runs the human fleet’s defenses.
Specifically, they have built malware into the fleet’s software and worked to have it installed across the human fleet. They also worked to get access to the codes of humanity’s defense network. A smart but impressionable scientist, Gaius Baltar, possesses these codes and has been arguing for installing new software across the fleet to replace programmes that demand more humans in the loop. He meets a beautiful woman and is so taken by her that he shares the codes. The woman is a Cylon, Number 6.
Once the Cylons have the codes, they strike. The shield is disabled and humanity is almost wiped out with Cylon nukes. The malware activates in the human ships, which are then easily destroyed.
Who survives: a few tens of thousands out of billions. One ship captain, Adama, refused the software update. His convoy, about fifty thousand people, is what remains of humanity.
Institutions in Action
Scenario Analysis: Do we know where these Cylons live? Otherwise, deterrence as such…
The first institution in my framework is the Scenario Laboratory, whose only task is to imagine how a catastrophe could unfold. Let us imagine it came up with six scenarios. These are shown on the chart: a nuclear strike from orbit; infiltration of key personnel; a corruptible software system across the fleet; a biological weapon; a revolt seeded from within; and advanced Cylons.
Now, why these specifically? The institution looks at the way the colonies work and builds scenarios accordingly. For instance, it builds the nuclear attack scenario because it understands the nuclear imbalance. The humans know the Cylons have nukes, and also that the Cylons know where humans live. Humans, on the other hand, do not know where the Cylons live. So humanity has no second-strike capability and therefore no credible deterrence, which makes a nuclear strike against us one of the most important scenarios.
Other scenarios also start from present weaknesses. A few humans hold codes that have the power to shut down the defense system. Obviously, a danger. Twelve colonies: just perfect for internal division! Notice how bureaucratic overreach is actually useful here: the years pass, and the institution comes up with better, more delicate scenarios, if for nothing else then by following the age-old logic of finding arguments for its own existence.
The scenario analysis institution, like all of them, gives recommendations. The most important: let’s try to find the Cylons.
Risk Monitoring: Have you noticed all these new faces around?
The Risk Monitoring Institution will try to find evidence that these scenarios are unfolding. It will, of course, note that we have no second-strike capability. More interesting for us is the infiltration idea. The institution might notice that key military personnel are surrounded by people who appeared suddenly and have little past. This might raise a flag or it might not. But the case is being built.
Noticing infiltration is never easy…
The institution would not find evidence of a biological weapon build-up, but it might notice symptoms of inner division: Adama refusing to update his ship’s software! What happens? Even this false alarm might start a reaction that improves preparation.
Ombudsman: Funny how the lunatics seem to cluster at the same places…
Ombudsmen in general are established as entities to which ordinary citizens and employees can bring complaints and flag anomalies — usually without the power to act, but with the power to draw attention and force a response. The catastrophic-AI version is the channel through which anyone can report that something seems wrong — a person behaving strangely, a system behaving oddly — without having to prove it, and have it registered and looked at. Most of what arrives is noise.
But I would argue that in the BSG case, the infiltrating Cylons might be caught via “citizens’ complaints”. With only a handful of infiltrators in a whole civilization, they are not placed at random; they cluster where the enemy needs them, near power. So of all the citizen complaints, the institution would see many more coming from near the military leadership than from, say, the economy, politics, or academia. As there is little reason for so many more lunatics to be near the military, the institution might spot something from the distribution alone.
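For readers who like to see the logic spelled out, here is a minimal sketch of that distribution check. Everything in it is an assumption made for illustration: the sector names follow the paragraph above, but the complaint counts, the headcounts, the `flag_clusters` function and the threshold of three times the median rate are all invented, and a real institution would need something far more careful.

```python
from statistics import median

# Hypothetical figures, invented purely for illustration.
complaints = {"military": 48, "economy": 11, "politics": 14, "academia": 9}
headcount = {"military": 2000, "economy": 2000, "politics": 2000, "academia": 2000}


def flag_clusters(complaints, headcount, ratio_threshold=3.0):
    """Return sectors whose complaint rate is far above the median rate.

    The threshold is an arbitrary assumption, not a calibrated value.
    """
    # Complaints per person, so that large sectors are not flagged for size alone.
    rates = {sector: complaints[sector] / headcount[sector] for sector in complaints}
    baseline = median(rates.values())
    if baseline <= 0:
        return []
    return sorted(
        sector for sector, rate in rates.items()
        if rate / baseline >= ratio_threshold
    )


flagged = flag_clusters(complaints, headcount)
print(flagged)
```

With these made-up numbers, only the military stands out. The point is not the arithmetic but the design choice: no single complaint has to be proven, because the signal is in where the complaints pile up.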
Redundant institutions. Like the money I put on the red when the black is the winner.
I wanted to make the framework realistic, which is why it shows one institution doing nothing at all. The body meant to evaluate new enemy capability has nothing to evaluate, because no new model is ever captured; it sits idle for forty years. There is also nothing to retroactively annul financially. This models reality: some institutions will be more important than others, and we might not know in advance which ones.
The Power of Specifics
So what do you see, as a decision maker? Your institutions are flagging the nuclear imbalance, the danger that a few select humans hold codes that can shut down the defences, and the autonomous systems being built. These flags could not force the colonies to act, but the colonies would have to be suicidal not to act on them.
But the use of these institutions becomes even clearer once we run the thought experiment a little further. These institutions are tasked with thinking of the worst-case scenario, meaning they would be asking specific questions. And one such question they would certainly ask is: why do we have a code for disabling the whole defense network?
Notice how disabling the defense network is very different from having a code to make it active. If the code is to disable, that means the network must be running all the time. What possible reason would anyone have for disabling — and why would that decision fall to a scientist? These are the questions that would be asked once such a framework is active.
I have an analogy for this. Many of us have a carbon monoxide detector in our homes. If someone said, why don’t we make sure there’s a system to remove these, and by the way, I want to take this one out now, wouldn’t we be suspicious?
That suspicion would lead to someone installing a backup, or giving someone the power to reinstall, but most likely to simply taking the power to end humanity away from the vain scientist who is suddenly accompanied by a woman with no past.
The Unbearable Boringness of Being
Institutions, especially those trying to prevent something, are often uncool. In our thought experiment, the Cylons probably would not attack: they would not have the code to the defense network, they would learn that a deactivated network can easily be reactivated, and their people might be flagged. They would work on finding new methods to realise their God-sanctioned genocide.
The Cylons are religious, did I forget to tell you? That is why papal encyclicals are NOT scary at all! And there is NO NEED to ask questions like who in the government is already a Cylon. It will all be all right.
But the point is that what would happen is that something doesn’t happen. The institutions would do their jobs. It would all be very mundane and life would just go on. Which is precisely their purpose. And this purpose is achieved by the logic and the framework, not by the specific institutions.
Cylons need a new plan
The showrunners, consciously or not, show us why the Cylon attack could work. In their famous opening lines, they are not talking about technology, advances and tricks. They emphasize another element: look at the last line.
They have a plan. An adversarial framework suggests we assume the “other” has a plan. A plan is not a goal. It is not a strategy. It is a concrete, specific list of actions. As such, it can possibly be countered. If we prepare. If we also … have a plan.
Like building institutions to prevent all this. Though then we would not have the show, which would be a shame.





