Retrocausality: Can the Future Change the Past?
By Eugene Sandugey · 13 min read
What is retrocausality?
Retrocausality is the idea that events in the future can influence events in the present. In physics, the equations of motion work equally well in both time directions. Run them backward and they look the same. The "arrow of time" is statistical (entropy increases), not fundamental. Physics does not actually forbid the future from influencing the past.
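The time-symmetry of the underlying laws is easy to check numerically. A minimal sketch (Newtonian free fall and a leapfrog integrator, both my choices for illustration, not anything specific to the framework): integrate forward, flip the velocity, integrate the same number of steps, and the system retraces its path back to the start.

```python
# Toy check that Newtonian dynamics is time-reversible: integrate a
# projectile forward, reverse the velocity ("run the film backward"),
# and integrate again -- the trajectory retraces itself.
g = -9.8          # uniform gravity
dt = 0.001
x, v = 0.0, 20.0  # initial height and upward velocity

def step(x, v):
    # Leapfrog (velocity Verlet for constant force): time-reversible.
    v_half = v + 0.5 * g * dt
    x = x + v_half * dt
    v = v_half + 0.5 * g * dt
    return x, v

for _ in range(3000):
    x, v = step(x, v)     # forward in time for 3 seconds

v = -v                    # reverse the velocity
for _ in range(3000):
    x, v = step(x, v)     # same law, same step, "backward"

print(round(abs(x), 6), round(abs(v), 6))  # back to the start: 0.0 20.0
```

The law itself never cared about the direction; only the reversed velocity distinguishes the two runs.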
Every particle in the universe follows the path of least action. Physicists write this down, use it every day, and never ask the obvious question: how does the particle "know" which path is optimal before it gets there?
A note on what this page does. This page proposes a MECHANISM for how 100% optimization efficiency could work. The mechanism uses the transactional interpretation (TI) of quantum mechanics. The framework's actual evidence (fine-tuning precision, cross-scale acceleration patterns, independent convergence, JWST results) stands regardless of which interpretation is correct. If you reject TI, the evidence doesn't change. Only the mechanism story does.
John Cramer's transactional interpretation offers an answer: the way things turn out in the future reaches back and influences which possibilities become real in the present.
If your reflex is "TI is fringe, therefore dismiss," stop and check the logic. Every quantum interpretation (Copenhagen, Many Worlds, Pilot Wave, Transactional) is consistent with exactly the same experimental evidence. No experiment has ever told them apart. None. Copenhagen is popular because it was first, not because it has better evidence. Many Worlds is popular in certain physics circles because it's mathematically elegant. TI is less popular because fewer researchers work on it. Popularity is a social fact about physicists, not an evidential fact about physics. (For unfamiliar terms on this page, see the glossary.)
The framework uses TI because it is the only interpretation with an explicit selection mechanism, which is what you need to explain 100% optimization efficiency.
Why use the transactional interpretation?
Four reasons. A built-in selection mechanism (the handshake). Retrocausal structure (the future influences the present). Computational efficiency (only confirmed paths pay full cost). And it respects relativity in a way the others don't.
Einstein proved there is no universal "now." Observers moving at different speeds disagree about which events happen at the same time. This is the relativity of simultaneity, and it is not controversial: every GPS satellite corrects for it.
Copenhagen says the wavefunction "collapses instantaneously upon measurement." Instantaneous in whose reference frame? There is no preferred frame. Relativity forbids it. Many Worlds says reality "branches when decoherence happens," but decoherence is a gradual process whose timing looks different to different observers. When exactly does the branch occur?
TI sidesteps this entirely. The offer wave goes forward, the confirmation wave comes back, and the transaction is a 4D spacetime event connecting two specific points. No universal "now" needed. No preferred frame. Relativity is built into the structure, not bolted on as an afterthought.
When other pages on this site discuss retrocausality and "future selects past," read those as "under TI." If TI turns out to be wrong, the mechanism story changes. The empirical claim and the core argument don't.
How does the transactional interpretation work?
John Cramer proposed this in 1986. The mechanism:
Think of it like a phone call. You dial every number simultaneously (offer wave goes forward to all possible futures). One person picks up (confirmation wave comes back from the optimal future). The connection is made (reality crystallizes). All the phones you didn't connect with don't ring, don't answer, and don't cost you anything. Only the completed call uses the line.
In Cramer's model: every quantum event sends waves forward in time to all possible futures. Certain futures send waves backward in time to the originating event. Where offer and confirmation waves meet, reality crystallizes. This is what we call wave function "collapse." Unconfirmed paths never fully decohere. Only the confirmed path becomes fully real.
Why does this enable 100% efficiency?
If every phenomenon must optimize the optimization process (the 100% claim), the universe cannot afford to waste computation on dead-end paths. But how do you avoid dead ends when you don't know which paths lead where?
Forward-only optimization faces a brutal dilemma. You could try every possible path and pick the best, but you'd pay full price for millions of paths that go nowhere. Or you could guess which path is best and only try that one, but you'll miss the real answer half the time.
Retrocausality eliminates the dilemma. Explore all paths in superposition (free until collapse). Let the future identify which one works. Only that one crystallizes and pays the computational cost. This is one mechanism that could explain how 100% efficiency is achievable. The 100% claim itself is empirical (zero counterexamples across every tested domain) and stands regardless of which quantum interpretation you prefer. TI is the framework's proposed explanation for HOW. If TI turns out to be wrong, the empirical claim still stands, but a different mechanism would be needed to explain it.
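The dilemma can be made concrete with a toy search problem (entirely illustrative; the bit-string landscape and scoring function are my own stand-ins, not anything from the framework): exhaustive forward search pays for every path, while committing to one guess up front is cheap but typically wrong.

```python
import itertools
import random

# Toy landscape: a "path" is a 12-bit string; its score is how many bits
# match a hidden optimum. A stand-in for "which path leads where".
n = 12
optimum = tuple(random.Random(0).choices([0, 1], k=n))
score = lambda path: sum(p == q for p, q in zip(path, optimum))

# Horn 1: try everything. Guaranteed optimal, but every dead end is paid for.
paths_paid_for = 0
best = None
for path in itertools.product([0, 1], repeat=n):
    paths_paid_for += 1
    if best is None or score(path) > score(best):
        best = path
print(paths_paid_for, score(best))   # 4096 12 -- 4096 paths paid for, one useful

# Horn 2: commit to a single random guess. Cheap, but typically far from optimal.
guess = tuple(random.Random(1).choices([0, 1], k=n))
print(score(guess))                  # around n/2, not n
```

Doubling the path length squares the cost of Horn 1 while Horn 2 stays cheap and stays wrong; that is the dilemma the retrocausal picture claims to dissolve.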
Why isn't the selection random?
If waves cancel and one "wins," isn't that just dice? No. Feynman showed exactly how this works with his path integral (Nobel Prize, established physics, verified to 12 decimal places in quantum electrodynamics).
A photon going from A to B doesn't physically travel every possible route. The wavefunction propagates as one thing. The path integral sums contributions from every conceivable route between A and B. Each route gets a phase based on how much action it costs. Here is the key: routes NEAR the optimal path all have similar phases. Routes far from optimal have wildly different phases from their neighbors.
Add up waves that agree with each other and they reinforce. Add up waves that disagree and they cancel to zero. The optimal path doesn't win by luck. It wins because its neighborhood votes the same way. Everything else's neighborhood argues with itself and averages out to nothing.
This is not a metaphor. This is literally how the principle of least action works in Feynman's formulation. It's how physicists calculate where photons go, how electrons scatter, how particles decay. The most precisely tested predictions in the history of science come from this math.
What TI adds: in every other interpretation, the path integral is "just a calculation tool." Nobody explains WHY this procedure works. TI says the confirmation wave from the future IS the second boundary condition that makes the nearby-optimal paths reinforce and the far-from-optimal paths cancel. It's not an abstract calculation. It's a physical process. The offer wave goes forward, the confirmation comes back, and the standing wave between them IS the path integral made real.
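The reinforce/cancel claim is directly computable. A sketch (one-intermediate-point "paths" for a free particle; the units and the value of ħ are chosen purely for illustration): sum e^(iS/ħ) over candidate paths and compare the neighborhood of the classical path to everywhere else.

```python
import numpy as np

# Paths from A=0 to B=1 with one adjustable midpoint x, free particle,
# m = dt = 1. Action of the two-segment path through x:
S = lambda x: 0.5 * ((x - 0.0) ** 2 + (1.0 - x) ** 2)

hbar = 0.05                               # small hbar: near-classical regime
x = np.linspace(-20.0, 21.0, 200001)      # candidate midpoints
contrib = np.exp(1j * S(x) / hbar)        # each path contributes e^{iS/hbar}

x_cl = 0.5                                # classical (least-action) midpoint
near = np.abs(x - x_cl) < 0.1             # the optimal path's neighborhood
far = np.abs(x - x_cl) > 5.0              # everything far from optimal

print(abs(contrib[near].sum()))   # large: similar phases reinforce
print(abs(contrib[far].sum()))    # small: scrambled phases cancel
```

The far region contains fifty times more paths than the near one, yet its total contribution is a small fraction of the near region's: the optimal neighborhood "votes the same way" and everything else argues with itself.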
The backward solutions were always there
Most people don't know this. Maxwell's equations, written in 1865, have TWO sets of solutions: retarded waves (forward in time, the normal ones) and advanced waves (backward in time). Every physics student learns to throw away the advanced solutions as "unphysical." Wheeler and Feynman used them in their absorber theory in 1945, decades before Cramer formalized TI in 1986.
TI does not add new physics. It uses solutions that were already in the equations. The comparison:
Copenhagen says: don't explain how collapse works. Many Worlds says: everything happens in infinite branches, for every particle, at every moment. Pilot Wave says: a nonlocal guide wave steers particles across the entire universe. TI says: use the backward solutions that Maxwell already wrote down. One forward wave, one backward wave, one transaction.
Of the four options, TI is the one that requires zero extra assumptions. It just stops throwing away half the math.
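That both solution families really do satisfy the wave equation is a one-minute symbolic check (sympy, with an arbitrary profile f):

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
c = sp.symbols('c', positive=True)
f = sp.Function('f')

# 1D wave equation residual: u_tt - c^2 u_xx (zero iff u is a solution)
residual = lambda u: sp.simplify(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2))

retarded = f(t - x / c)   # forward-in-time solution (the one textbooks keep)
advanced = f(t + x / c)   # backward-in-time solution (the one they discard)

print(residual(retarded), residual(advanced))  # 0 0 -- both are solutions
```

The advanced wave passes the same test as the retarded one; nothing in the equation itself prefers either direction.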
How does entanglement work without distance?
Einstein hated entanglement. He called it "spooky action at a distance" because it looked like information traveling faster than light. He proposed a simpler story: the outcome is decided at the moment the particles separate, and measurement just reveals it. Like putting a left glove in one box and a right glove in another, shipping them across the world, and opening one. You instantly know what's in the other. No spookiness, just pre-existing information.
Experiments proved him wrong. Aspect, Clauser, and Zeilinger (Nobel 2022) showed the correlations between entangled particles are stronger than any pre-assigned outcome could produce. The connection is real, not just pre-existing information.
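The "stronger than any pre-assigned outcome" claim is quantitative. In the CHSH test, any pre-assigned (local hidden variable) strategy caps the correlation combination at 2, while the quantum singlet reaches 2√2 ≈ 2.83. A sketch with the standard textbook angles:

```python
import numpy as np
from itertools import product

# Best any pre-assigned strategy can do: enumerate every deterministic
# outcome table (each measurement outcome fixed at +/-1 in advance).
classical_best = max(
    abs(A1 * B1 - A1 * B2 + A2 * B1 + A2 * B2)
    for A1, A2, B1, B2 in product([-1, 1], repeat=4)
)
print(classical_best)           # 2 -- the glove-in-a-box ceiling

# Quantum singlet correlation: E(a, b) = -cos(a - b), optimal CHSH angles.
E = lambda a, b: -np.cos(a - b)
a1, a2 = 0.0, np.pi / 2
b1, b2 = np.pi / 4, 3 * np.pi / 4
S = abs(E(a1, b1) - E(a1, b2) + E(a2, b1) + E(a2, b2))
print(round(S, 3))              # 2.828 = 2*sqrt(2), above the classical cap
```

The gap between 2 and 2.828 is what Aspect, Clauser, and Zeilinger measured: no pre-existing-information story can close it.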
But nothing travels faster than light in TI. The connection between entangled particles is not across space. It's through time. In 3D (our everyday view), two entangled particles are far apart with mysterious correlations. Measure one and the other instantly "knows." From a 4D spacetime perspective, they are connected by a transaction: a standing wave through the time dimension linking the moment they separated to the moments they're measured. One 4D event, not two separate 3D events communicating. The future measurements wrote the correlation backward to the moment of separation.
Maldacena and Susskind (2013) proposed this is literally true in a geometric sense: every entangled pair is connected by a microscopic wormhole. ER=EPR: Einstein-Rosen bridges ARE Einstein-Podolsky-Rosen pairs. The entanglement IS spacetime geometry. Van Raamsdonk (2010) showed that if you remove all entanglement from a region, spacetime itself disconnects. Space falls apart. Entanglement is not a weird side effect of quantum mechanics. It may be the material that space is made of.
Every interaction adds entanglement to the system. Entropy increasing IS that entanglement spreading. The arrow of time IS the growth of this web. And as the universe expands, the holographic boundary grows, creating more room for entanglement, which means more computational capacity. One of dark energy's optimization functions: growing the information boundary.
Three names for one thing
The diffusion model (noise refined toward a target, with the gradient coming from the goal). Retrocausality (confirmation wave from the future constraining the present). The principle of least action (optimizing the whole path at once). These sound like three different ideas. They are the same idea.
In a diffusion model, the target image guides the denoising. In TI, the future boundary guides the present. In least action, the endpoint constrains the path. Each describes a system where the outcome shapes the process, not just the starting conditions. That's what makes optimization possible: the future has a vote.
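The shared structure can be sketched as a toy guided-denoising loop (purely illustrative; the target vector, step size, and noise scale are arbitrary choices of mine, not the framework's): the endpoint supplies the drift, and noise is gradually refined toward it.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0, 0.5])     # the "future" boundary condition

x = rng.normal(size=3) * 5.0            # start as pure noise
for _ in range(300):
    drift = 0.1 * (target - x)          # gradient supplied by the goal
    noise = 0.05 * rng.normal(size=3)   # diffusion: random exploration
    x = x + drift + noise               # the outcome shapes the process

print(np.round(x, 1))                   # lands close to the target
```

Swap the labels and the same loop reads as any of the three: target image guiding denoising, future boundary guiding the present, or endpoint constraining the path.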
The least action principle has purpose built in
Every physicist uses the principle of least action daily: nature always picks the most efficient path (see The Mathematics for the full picture). The textbook says this is just a mathematical trick: you can rewrite the equations step by step so the "knowing the future" appearance vanishes.
The framework takes that purpose-like appearance at face value. Under TI, the future really does constrain the present. Particles follow optimal paths because the future is actively selecting them. The step-by-step rewrite gives the same predictions, but the variational form (whole-path optimization) is the one that actually appears in the Lagrangian and Hamiltonian formulations that underpin ALL of physics. When one form is more fundamental across every domain, calling it "just a trick" is the claim that needs defending. See The Mathematics.
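The whole-path form is directly computable: discretize the action, minimize it over the entire interior of the path at once, and Newton's step-by-step answer falls out. A sketch (free fall with pinned endpoints; scipy's generic minimizer stands in for "nature", which is my choice for illustration):

```python
import numpy as np
from scipy.optimize import minimize

g, T, N = 9.8, 2.0, 50                   # gravity, total time, grid resolution
dt = T / N
t = np.linspace(0.0, T, N + 1)

def action(interior):
    # S = integral of (kinetic - potential) over the whole path,
    # with both endpoints pinned at height 0.
    x = np.concatenate(([0.0], interior, [0.0]))
    v = np.diff(x) / dt
    kinetic = 0.5 * v ** 2
    potential = g * 0.5 * (x[:-1] + x[1:])   # midpoint heights
    return np.sum((kinetic - potential) * dt)

res = minimize(action, np.zeros(N - 1))      # optimize the path as a whole
x_path = np.concatenate(([0.0], res.x, [0.0]))

x_newton = 0.5 * g * t * (T - t)             # Newton's step-by-step answer
print(np.max(np.abs(x_path - x_newton)))     # tiny: same trajectory
```

Two very different procedures, one computing locally step by step and one optimizing the path globally, produce the same parabola; the question the framework presses is which description is fundamental.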
How it compares to other interpretations
Many Worlds says every quantum event splits reality into branches. All branches are equally real. There is no way to pick which branch is "better," and nobody agrees on how to define probability when every outcome happens somewhere.
Every quantum event proposes possibilities. Under TI, only the path confirmed by future boundary conditions becomes real. That's an explicit selection mechanism.
Copenhagen gets every prediction right but won't say why collapse happens, and its "instantaneous collapse" requires a preferred reference frame that Einstein proved doesn't exist. Many Worlds keeps the math clean but has no way to pick outcomes, and its branching events have no definite timing across different reference frames. Pilot Wave gives particles definite paths but requires instant connections across the entire universe. TI is the only interpretation with a built-in selection mechanism AND native compatibility with relativity: each transaction is a 4D spacetime event, not an instantaneous 3D process. All four interpretations make the same predictions for every experiment. The choice between them is about explanatory power, not experimental results.
| Interpretation | Selection mechanism | Relativity-compatible? | Main problem |
|---|---|---|---|
| Copenhagen | None specified | No (instantaneous collapse needs preferred frame) | Won't say why collapse happens |
| Many Worlds | None — every outcome happens | No (branching timing is frame-dependent) | Born rule probabilities and the measure problem |
| Pilot Wave | Pre-existing pilot wave | No (requires nonlocal guidance) | Instant connections across the universe |
| Transactional (TI) | Future boundary conditions confirm one path | Yes (transactions are 4D spacetime events) | Less popular — fewer working physicists develop it |
What experiments support retrocausality?
All interpretations predict the same measurements. But several experimental results are most naturally described in retrocausal language:
Wheeler's Delayed-Choice Experiment (proposed 1978, confirmed 2007 and 2012): Send a photon through an apparatus. After it's already inside, THEN decide how to measure it. The result: your choice, made after the photon committed to a path, retroactively determines what the photon did earlier. As if the photon knew what you were going to choose before you chose it. Exactly what a retrocausal model predicts.
Delayed-Choice Quantum Eraser (Kim et al., 2000): Detect a photon. Then erase information about which path it took. The erasure happens AFTER the detection. The raw data looks the same either way, but when you sort the data by whether the which-path information was erased or kept, an interference pattern appears in the erased subset that was invisible before sorting. The which-path decision, made after detection, determines whether the pattern exists in the sorted data. Exactly what a retrocausal model predicts.
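The marked-versus-erased contrast is the standard two-path amplitude rule, and it fits in a few lines (a toy calculation, not a simulation of Kim et al.'s actual coincidence-counting setup):

```python
import numpy as np

phi = np.linspace(0.0, 2 * np.pi, 9)                   # relative phase across the screen
a = np.full_like(phi, 1 / np.sqrt(2), dtype=complex)   # path-1 amplitude
b = np.exp(1j * phi) / np.sqrt(2)                      # path-2 amplitude

# Which-path info kept: paths distinguishable, PROBABILITIES add -> flat.
p_marked = np.abs(a) ** 2 + np.abs(b) ** 2

# Which-path info erased: paths indistinguishable, AMPLITUDES add -> fringes.
p_erased = np.abs(a + b) ** 2

print(np.round(p_marked, 2))   # all 1.0 -- no pattern
print(np.round(p_erased, 2))   # 1 + cos(phi) -- interference fringes
```

Whether the fringes exist in the sorted data depends only on whether the which-path information survives, which is exactly the knob the eraser turns after detection.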
Local Friendliness No-Go Theorem: recent mathematical results in quantum foundations (Bong et al., Nature Physics 2020) showed that certain assumptions physicists take for granted can't all be true at the same time. Something has to give. Retrocausality is one of the formally listed escape routes. Not mainstream yet, but no longer dismissible either.
Negative Dwell Time (Steinberg group, Physical Review Letters 2024): Photons fired through a cloud of rubidium atoms. The ones that pass straight through have a measurable dwell time inside the cloud that is negative — they exit before they would have arrived if traveling at the speed of light the whole way. The effect was known since 1993 but dismissed as an artifact (only the front of the pulse making it through). Steinberg's group used weak measurement to probe the atoms directly during transit, asking the atoms how long the photon was excited in them. The two independent measurements agree: dwell time is negative. The artifact explanation is ruled out. Standard physics still describes the result, but the natural language for what's happening is that the photon's interaction with the atomic cloud reaches into the past.
Time-domain Double Slit (Tirole et al., Nature Physics 2023): The temporal analog of Young's double-slit. Imperial College used a thin film of indium tin oxide that flips between transparent and reflective on femtosecond timescales. Two reflectivity pulses act as two "slits" — same emitter, same receiver, only the time of reflection differs. A laser through the two time slits produces a clean interference pattern in frequency. The most natural way to describe what's happening: the photon explores multiple temporal paths and interferes with itself across time. Path-integral mathematics has always allowed this; the experiment makes it visible. A direct empirical analog of "light tests every possible path, including those that go through the future."
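The frequency-domain fringes are reproducible with a transform (a cartoon in arbitrary units, not the ITO experiment's parameters): two short transmission windows in time give a fringed spectrum, just as two spatial slits give fringes on a screen.

```python
import numpy as np

N = 4096
dt = 2.0 / N
t = np.arange(N) * dt - 1.0                      # time axis, arbitrary units

pulse = lambda t0: np.exp(-((t - t0) / 0.02) ** 2)   # one short "time slit"
double = pulse(-0.25) + pulse(+0.25)                 # two slits, 0.5 apart
single = pulse(0.0)                                  # one slit, for contrast

freqs = np.fft.rfftfreq(N, d=dt)                 # bin spacing 0.5 in these units
spec_double = np.abs(np.fft.rfft(double)) ** 2
spec_single = np.abs(np.fft.rfft(single)) ** 2

# Fringe minima sit at f = 1, 3, 5, ... (spacing 1/separation = 2): the
# double-slit spectrum dips to ~0 where the single-slit envelope does not.
i1, i4 = 2, 8                                    # bins at f = 1.0 and f = 4.0
print(spec_double[i1] / spec_double[i4])         # ~0: a fringe minimum
print(spec_single[i1] / spec_single[i4])         # ~1: smooth envelope
```

The closer the two time slits, the wider the fringe spacing in frequency, the temporal mirror of the familiar spatial double-slit relation.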
Each result is exactly what a retrocausal model predicts. The framework uses the transactional interpretation because it is the only one with a built-in selection mechanism: the future confirms which path becomes real. Whether TI gains wider acceptance is secondary. What matters is that the mechanism it describes (selection via future boundary conditions) is the only one compatible with 100% optimization efficiency.
Why one time dimension?
The framework makes a structural prediction about spacetime. Zero time dimensions means no computation or change. Two or more time dimensions would allow arbitrary time travel, destroying any optimization process (anything could be undone, nothing would stick).
One time dimension is the sweet spot: enough flexibility for quantum effects (tunneling, superposition, and retrocausal selection), enough structure to prevent paradoxes from collapsing the system. The framework predicts this is the only configuration that works for optimization.
Try to Break This
Steel-manned objections, strongest counterarguments first.
The strongest objection: retrocausality breaks causation, because causes must precede effects. But that assumes time only flows forward, which is itself an assumption. The fundamental equations of physics work equally well in both directions. Run them backward and the math is just as valid. The reason we experience time as one-way is about probability (there are more ways for things to be messy than neat), not about the laws themselves. The transactional interpretation doesn't break causation. It extends it: the math has always allowed backward-in-time solutions, but physicists traditionally ignored them. TI takes them seriously.
In TI, paradoxes can't happen because reality only becomes real when past and future agree. Think of it like a phone call: the call only connects if both sides pick up. If the future you're calling contradicts the past that sent the call, the connection never completes. The paradoxical scenario never crystallizes into reality. Paradoxes are impossible by construction, not by more rules.
All quantum interpretations are empirically equivalent. No experiment has ever told them apart. Popularity is a social fact about physicists, not an evidential fact about physics. TI is the only interpretation with a built-in selection mechanism AND native compatibility with relativity. The Local Friendliness no-go theorem now lists retrocausality among the formally recognized escape routes from certain mathematical impossibility results.
Related
10 Nobel Prizes Through the Optimization Lens
Ten Nobel Prizes spanning 104 years (Planck to Aspect) showed the universe is computational. What each one says about optimization.
What Is Consciousness For? The Optimization Theory
The hard problem of consciousness dissolved. Experience is not something the brain produces. It IS optimization at neural scale. Testable and specific.