This problem is very annoying. As a non-theoretical physicist, I didn’t even know HOW to ask the question that I wanted to answer here.
Most everyone understands Maxwell’s equations and the concept of the photon, and the courses out there explain how we came to quantize light pretty well. Not as many people get exposed to QFT, but the materials are out there.
Once you start asking the more advanced questions, like “what was actually tried when we modeled the graviton as a new particle,” suddenly ALL the course providers go silent. It’s either high-level hand-waving from the news, or you’re left with the journals to fend for yourself. That made me angry, which is why this post now exists.
This is my best attempt to explain the issues that arose when we tried to fit the graviton into the Standard Model.
It all starts with two slits
The double slit experiment is annoying af. An electron fired at a wall with two holes produces an interference pattern on the detector behind it — the same pattern you’d get from a wave. But electrons are particles. So which slit did it go through?
There is no fact about which slit the electron traversed, because no measurement recorded it. And if no measurement recorded it, excluding either path would make a prediction that disagrees with experiment. You must sum over both.
The moment you measure which slit, the interference pattern vanishes. Very silly.
Observation → two clumps (one per slit, no interference)

No observation → interference pattern (the weird wave-like distribution)
The act of making the intermediate state observable destroys the effect that required you to sum over it.
So if you have a situation like this, how do you compute the probability of a particle being detected at any particular position?
The mathematical expression of this is the path integral. The probability amplitude for a particle to travel from state $|i\rangle$ to state $|f\rangle$ is:

$$\langle f | i \rangle = \int \mathcal{D}[x(t)]\; e^{iS[x(t)]/\hbar}$$

You integrate over every possible path $x(t)$ connecting $i$ to $f$, weighting each by the phase $e^{iS/\hbar}$, where $S$ is the action along that path. No path is excluded.
Paths that are wildly different from the classical trajectory don’t vanish — they contribute oscillating phases that largely cancel each other out. The classical path survives because nearby paths have nearly identical phases and add constructively. Quantum behavior emerges where they don’t.
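You can see this cancellation numerically. Below is a toy sketch (my own illustrative setup, not a real physics calculation): for a free particle, the action of a path that detours through position $x$ grows like $x^2$, with the classical (stationary-action) point at $x = 0$. Summing the phases $e^{iS}$ over a window of detours near the classical path gives a large result; the same-width window of wild detours nearly cancels itself out.

```python
import numpy as np

# Toy "sum over paths" with hbar = 1: the phase of a detour through x is
# exp(i * x^2), stationary at the classical point x = 0.
def contribution(a, b, n=400001):
    """Sum of exp(i*x^2) over detour positions x in [a, b]."""
    x = np.linspace(a, b, n)
    return np.sum(np.exp(1j * x ** 2)) * (x[1] - x[0])

near = contribution(-1.0, 1.0)    # paths close to the classical one
far = contribution(10.0, 12.0)    # wild detours, same interval width

print(abs(near))  # ~1.9: nearby phases are almost aligned, add constructively
print(abs(far))   # tiny: rapid oscillation cancels almost everything
```

The far window isn’t excluded by hand; it simply contributes almost nothing because its phases rotate so fast that they self-cancel.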
Feynman diagrams were developed to apply this principle to particle interactions. Instead of summing over paths in space, you sum over all possible interaction histories consistent with what you observed at the boundaries — particles in, particles out. Each diagram represents one such history.
The “virtual particles” appearing inside diagrams are not claims about what “really happened” — they are terms in the sum over unobserved intermediate states, exactly as both slits are terms in the double slit sum. The only thing that is real, in the sense of being observable and measurable, is $|\mathcal{A}|^2$ — the probability that emerges from summing everything you cannot see. We’ll come back to this idea later, but first, we have to talk about some ideas from general relativity.
How does one… ‘quantize’ the gravity?
To quantize gravity, the idea is this: we assume that this particle exists, then try to model its contributions to the physics of situations we already understand, so we can compare our theory with experiment and determine its predictive effect.
So let’s try to do that here with gravity. General relativity gives us “the metric,” $g_{\mu\nu}$. It encodes spacetime curvature, meaning it defines the distance function between any two points in spacetime.
In flat space you know how to use the Pythagorean theorem to derive the distance between two points:

$$ds^2 = dx^2 + dy^2 + dz^2$$
But we don’t live in “flat” space. Part of why GR was so wild is that reality itself appears to curve in the presence of massive objects.
The metric contains the coefficients that tell you how to compute distance when space is curved:

$$ds^2 = g_{\mu\nu}\, dx^\mu dx^\nu$$
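That contraction is just a matrix sandwich, which is easy to see in code. Here’s a minimal numpy sketch; the $(-,+,+,+)$ sign convention, units ($c = 1$), and the sample numbers are my own illustrative choices:

```python
import numpy as np

# ds^2 = g_mn dx^m dx^n as a matrix contraction (toy 4D example, c = 1).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # flat Minkowski metric
dx = np.array([1.0, 0.5, 0.0, 0.0])    # small displacement (dt, dx, dy, dz)

ds2_flat = dx @ eta @ dx               # -dt^2 + dx^2 + dy^2 + dz^2
print(ds2_flat)                        # -1 + 0.25 = -0.75

# A curved metric just changes the coefficients; same contraction.
g = eta.copy()
g[0, 0] = -0.9                         # e.g. a weak-field time-time coefficient
ds2_curved = dx @ g @ dx
print(ds2_curved)                      # -0.9 + 0.25 = -0.65
```

The point: “curvature” lives entirely in the coefficients of $g$; the recipe for distance never changes.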
It encodes the result of all the mass/energy in the universe already. Now back to the graviton particle.
To quantize gravity we assume the graviton exists as a small ripple on a flat spacetime.
So we take a “basic spacetime,” a.k.a. a flat spacetime, to start, that is modeled this way:

$\eta_{\mu\nu}$ — flat spacetime (our baseline approximation of a flat Minkowski spacetime)
Then we embed our particle in it, the thing that “somehow” encodes and conveys the force of gravity throughout the universe:

$h_{\mu\nu}$ — the difference — what we’re trying to quantize as the graviton
Now we can take the true metric of our actual universe from general relativity and relate it to the flat spacetime:

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$$

Naturally, the contribution from the graviton, $h_{\mu\nu}$, is what we care about.
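The split is literally just subtraction: take the full metric, subtract the flat background, and what remains is the small perturbation you’d quantize. A minimal numpy sketch (the metric coefficients here are made-up weak-field numbers, not an exact GR solution):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])        # flat background

# Schematic weak-field metric near a mass (illustrative coefficients only)
g = np.diag([-0.98, 1.02, 1.02, 1.02])

h = g - eta   # the graviton field: small, symmetric, rank-2
print(np.max(np.abs(h)))   # 0.02 -- a "small ripple", which is what
                           # justifies treating h perturbatively
```

If the entries of `h` were order 1 instead of order 0.01, the whole “ripple on a flat background” picture would break down.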
As an aside, the current best estimate of the true large-scale metric of our reality is called the FLRW metric, after its creators (Friedmann, Lemaître, Robertson, and Walker). Note that it’s a function of time, which makes sense, as the universe is expanding at an increasing rate.
Anyway we have a usable metric from GR, so we relate it to a flat spacetime and we have something we can use to model our graviton.
A massive object (gold sphere) sits in the spacetime fabric. The flat grid (ημν) bends into a funnel-shaped depression — the curvature field hμν. The deeper the well, the stronger the gravitational pull.
From here, all we know is that we have the metric that defines how spacetime curves, which we can use to accurately model the distance between any two points in space.
We think — we don’t know for sure — that the graviton is the particle that mediates gravity in the universe.
The force particle must deliver the “message” between any two particles with mass in order for gravity to “happen”. That particle must be exchanged. That is the theory.
For example, we can draw two electrons. Just as there are other forces between them, like the ordinary Coulomb force $F = k\,q_1 q_2 / r^2$, there’s a force of gravity acting on them as well.
So we now need to model all of the ways this hypothetical interaction could occur.
And similar to normal probability in order to have a complete picture, we need to actually look at all possible scenarios in order to model the situation accurately.
Feynman’s insight was that a particle takes all possible paths simultaneously, and so you must sum the probability amplitudes of all of them. Each Feynman diagram corresponds to one possible “history” of the interaction. You sum them all to get the total amplitude.
Which brings us to a famous integral, which we will use to sum the probability amplitudes over all the possible momentum states that could have occurred in this particular electron interaction:

$$\int \frac{d^4k}{(2\pi)^4}\, f(k)$$
Theoretically, we could calculate these probabilities, then run over to the LHC, run an experiment, and see what happens. If the theory accurately predicts the experiment, we’ve increased our confidence in the theory, and so on.
So let’s look at how this worked SUCCESSFULLY for photons, and then we’ll come back to gravitons.
unlike gravitons, photons barely interact with each other, so it’s easy to predict their behavior
We are verrrry lucky with Electricity and Magnetism. We can look at singular photons and model them explicitly.
Photon-photon scattering exists but is incredibly suppressed because it can only happen indirectly — via a loop of virtual electrons:
The coupling is $e$ at each vertex, and you need 4 vertices for this diagram, so the amplitude goes as $e^4$ (i.e., $\alpha^2$). Negligibly small at low energies — which is why light beams pass through each other unaffected in everyday life.
This is why classical E&M is linear — superposition, interference, diffraction all work cleanly. The self-interaction is weak so it doesn’t interfere with experiment.
Spin and Forces
Before we can go further, you need to understand “spin.” Spin is an intrinsic property of particles — similar to angular momentum (the particle isn’t actually spinning; we can blame Uhlenbeck & Goudsmit some other time, but it was 1925). It comes in integer or half-integer multiples of $\hbar$ (a constant of nature), and it controls the mathematical structure of the field and the form of the interactions.
The spin of a force-carrying thing determines the angular dependence of the force it mediates:
| Spin | Example | Field type |
|---|---|---|
| 0 | Higgs boson | Scalar |
| 1/2 | Electron, quark | Spinor (fermion) |
| 1 | Photon, W, Z, gluon | Vector |
| 2 | Graviton | Symmetric rank-2 tensor |
What this means is that spin directly determines how a force acts in different directions. For example, spin-0 has no angular dependence: the force looks identical from every direction. Symmetric.
But to be more specific, what we’re really talking about is modeling particles as fields, classified by what kinds of interactions those particles have and what forces result from them.
A spin-1 field (for example, the field creating the photon) carries one power of momentum in its field strength and is described by a 4-item vector $A_\mu$. The index $\mu$ runs over the 4 spacetime directions: $0$ (time) and $1, 2, 3$ (space). That’s it. One index, four components. The photon’s momentum is a separate 4-vector $k^\mu$.

$A_\mu$ works with one index because the EM field just needs to know which direction it points at each spacetime point. One index = one direction. Think back to any model of electricity or magnetism you’ve dealt with: you model the forces as a direct relationship between regular objects in 3 dimensions.
A spin-2 particle (the alleged graviton, created by its field) carries TWO powers of momentum — its interactions involve two derivatives — and is described by the rank-2 tensor $h_{\mu\nu}$ that we talked about earlier.
Programmer’s note: the rank of a tensor is just the number of indices — the number of dimensions of the array, not matrix rank from linear algebra. A rank-2 tensor could be shape [3,3]. Rank-3 is [3,3,3].
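In numpy terms, rank is just `ndim`. A quick sketch (the 3-component examples are arbitrary; the spacetime objects are the 4-component ones):

```python
import numpy as np

# "Rank" = number of indices = number of array dimensions (ndim),
# NOT matrix rank in the linear-algebra sense.
vector = np.zeros(3)          # rank 1, shape (3,)
rank2 = np.zeros((3, 3))      # rank 2, shape (3, 3)
rank3 = np.zeros((3, 3, 3))   # rank 3, shape (3, 3, 3)

print(vector.ndim, rank2.ndim, rank3.ndim)  # 1 2 3

# The actual spacetime objects run each index over 4 values:
A = np.zeros(4)       # photon field A_mu: 4 components
h = np.zeros((4, 4))  # graviton field h_mu_nu: 16 components
                      # (10 independent, since h is symmetric)
print(A.size, h.size)  # 4 16
```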
Now I know what you’re thinking: “David, if there’s a Higgs boson that mediates a force just like these other ones, doesn’t that mean there’s a Higgs force too?” The short answer is no, and I don’t have the time to walk through the math behind why. I would remind you that you’re reading a blog post on quantum gravity by a venture capitalist — I’m doing my best here.
Perturbation Theory
Like we said above, most interesting quantum field theories can’t be solved directly with formulas — the interactions are too complicated. (Particles interact with each other, of course, so isolating and predicting activity is hard.) Perturbation theory is a workaround: you start with a free theory (no interactions, exactly solvable) and then treat the interactions as small corrections layered on top. Each correction is suppressed by higher powers of the coupling constant $g$, so the full answer looks like a power series (think of a Taylor series from calculus):

$$A = A_0 + g\,A_1 + g^2 A_2 + g^3 A_3 + \cdots$$
Each term in this series corresponds to one “loop order” in the Feynman diagram expansion. This works beautifully when $g \ll 1$ (QED has $\alpha \approx 1/137$, so the terms shrink fast). It falls apart when the coupling is large — or, as we’ll see, when the coupling grows with energy. For gravity, the effective dimensionless coupling at energy $E$ is $\sim G E^2 = (E/M_{\text{Pl}})^2$, which blows up near the Planck scale.
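You can get a feel for the difference with a few lines of arithmetic. This is a schematic comparison (the $(E/M_{\text{Pl}})^2$ form of gravity’s effective coupling is the standard power-counting estimate; the energy values are just examples):

```python
# QED's expansion parameter is fixed and small; gravity's grows with energy.
alpha = 1 / 137.0       # QED fine-structure constant (dimensionless)
M_planck = 1.22e19      # Planck scale in GeV

def gravity_coupling(E_GeV):
    # effective dimensionless coupling ~ (E / M_Pl)^2
    return (E_GeV / M_planck) ** 2

# Successive terms of a QED-like series shrink fast:
qed_terms = [alpha ** n for n in range(4)]
print(qed_terms)   # [1.0, ~7.3e-3, ~5.3e-5, ~3.9e-7]

# Gravity at collider energies: utterly negligible. Near the Planck scale: order 1.
print(gravity_coupling(1e4))       # ~1e4 GeV (LHC-ish) -> ~6.7e-31
print(gravity_coupling(1.22e19))   # Planck scale       -> 1.0
```

That last line is the whole problem in one number: at the Planck scale the “small correction” is not small, and the series stops meaning anything.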
Feynman Diagrams and Loops
These diagrams are gonna keep coming up in particle physics. I find them very annoying in a sense because they really abstract away what’s going on but I understand why they’re used. They are simple tools to intuitively think about the physics of a situation.
Let’s think about a simple example. You have two electrons. They repel each other. The question is: what is the probability amplitude for this interaction? In QFT the answer is computed by asking: what fields exist at every point in spacetime, and how do they interact? — which is a very large question. The electron has a field $\psi$. The photon is also defined by a field, $A_\mu$. They’re coupled — the photon field conveys the electromagnetic force, which is how the repulsion is transmitted between the two electrons.
To get the amplitude you evaluate:

$$\mathcal{A} = \langle f |\; T \exp\!\left(i \int d^4x\; \mathcal{L}_{\text{int}}\right) | i \rangle$$

That exponential in there, when expanded, is an infinite series:

$$T \exp\!\left(i \int d^4x\; \mathcal{L}_{\text{int}}\right) = 1 + i\!\int\! d^4x\, \mathcal{L}_{\text{int}}(x) + \frac{i^2}{2!}\!\int\! d^4x\, d^4y\; \mathcal{L}_{\text{int}}(x)\,\mathcal{L}_{\text{int}}(y) + \cdots$$
Each term is a multiple integral over spacetime of products of field operators. When you actually evaluate these integrals using the known behavior of the fields, you get momentum-space integrals — because fields are easier to analyze in momentum space via Fourier transform.
Feynman diagrams are a visual notation for the terms in a perturbation series. Each diagram represents one specific mathematical contribution — an integral — to the probability amplitude for some physical process (say, two photons scattering off each other).
The diagrams are built from two ingredients: propagators (internal lines, representing virtual particles traveling between points) and vertices (the points where particles meet and interact).
Propagators
A propagator is a mathematical object attached to every internal line in a Feynman diagram. It represents the amplitude for a virtual particle to travel from one interaction vertex to another, carrying momentum .
For a massless spin-1 particle (the photon), the propagator in momentum space (in Feynman gauge) is:

$$D_{\mu\nu}(k) = \frac{-i\,\eta_{\mu\nu}}{k^2 + i\epsilon}$$

For a massless spin-2 particle (the graviton), it’s… messier:

$$D_{\mu\nu\rho\sigma}(k) = \frac{i\left(\eta_{\mu\rho}\eta_{\nu\sigma} + \eta_{\mu\sigma}\eta_{\nu\rho} - \eta_{\mu\nu}\eta_{\rho\sigma}\right)}{2\left(k^2 + i\epsilon\right)}$$
To put it more simply, the propagator is the “how likely is a particle to get from A to B” factor in a Feynman diagram calculation.
Photon propagator: Dμνγ ∼ 1/k²
Graviton propagator: Dμνρσh ∼ 1/k²
The $1/k^2$ part means: higher momentum (shorter distance) means a smaller contribution. Both photons and gravitons have this same falloff, so neither blows up at high energy from the propagator alone.
The graviton’s extra indices ($\mu\nu\rho\sigma$ vs the photon’s $\mu\nu$) reflect that it’s spin-2 — it needs two pairs of indices to describe how a rank-2 tensor (the metric perturbation $h_{\mu\nu}$) propagates. More indices, uglier formula, same basic behavior.
Both fall off as $1/k^2$ at large momentum. Both the photon and graviton propagators are equally well-behaved at high $k$.
vertices
A vertex in these diagrams is a point where fields interact.
For example, an electron comes in, emits a photon, electron goes out. Three lines meeting at a point. That’s a vertex.
A “tree diagram” is the casual term for a diagram with no closed loops — in a tree diagram, the momentum of every internal line is completely fixed by conservation at the vertices.
loops
A loop is what happens when a “virtual particle” is emitted and reabsorbed by the same diagram — forming a closed path.
At a tree level (no loop): electron emits photon, other electron absorbs it. Momentum of the photon is completely fixed by the two electrons’ momenta via conservation.
With a loop: electron emits a photon, that photon splits into two particles, they recombine back into a photon, which is then absorbed. The momentum of those two internal particles is not fixed by conservation — they just have to sum to the photon’s momentum. So the loop momentum $k$ is free, and you integrate over all possible $k$.
Tree-level diagrams have no closed loops. The momenta of all internal particles are completely fixed by conservation laws — these give finite, classical-looking results.
Loop diagrams have closed loops of internal lines. Because the momenta flowing around a loop are not fixed by conservation, you must integrate over all possible values — from zero to infinity:

$$\int \frac{d^4k}{(2\pi)^4}\, f(k)$$
Here’s an example of a loop diagram
The question the loop integral is asking is: what is the momentum of every possible “virtual particle”? By this I mean: we couldn’t observe it, so we have to account for every hypothetical particle that COULD have been part of the interaction, each with some (possibly very low) probability. A particle has 4 momentum components, $k^\mu = (E, k_x, k_y, k_z)$: energy ($E$) plus momentum in all 3 spatial directions.

$d^4k$ means you’re integrating over 4-dimensional momentum space.
That function $f(k)$ often doesn’t fall off fast enough as $k \to \infty$, so the integral diverges. The number of independent loops in a diagram determines how many unconstrained momentum integrations you face — and for gravity, each additional loop makes things dramatically worse.
This is what the full integral actually looks like, just for some clarity on the shorthand:

$$\int \frac{d^4k}{(2\pi)^4}\, f(k) = \frac{1}{(2\pi)^4}\int_{-\infty}^{\infty}\! dE \int_{-\infty}^{\infty}\! dk_x \int_{-\infty}^{\infty}\! dk_y \int_{-\infty}^{\infty}\! dk_z\; f(E, k_x, k_y, k_z)$$

Again, the idea is that you’re integrating along all four dimensions to take the sum over all possible momentum states.
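A sanity check that the 4D measure really is four nested integrals: estimate the volume of a 4D ball by Monte Carlo over $(E, k_x, k_y, k_z)$ and compare with the closed form $\pi^2 R^4 / 2$. (This is just a geometry check I added, not a physics calculation.)

```python
import numpy as np

rng = np.random.default_rng(0)
R = 1.0
N = 1_000_000
k = rng.uniform(-R, R, size=(N, 4))        # sample the 4-cube [-R, R]^4
inside = (k ** 2).sum(axis=1) <= R ** 2    # points with |k| <= R

vol_mc = inside.mean() * (2 * R) ** 4      # fraction of cube * cube volume
vol_exact = np.pi ** 2 * R ** 4 / 2
print(vol_mc, vol_exact)                   # both ~4.93
```

The same 4D geometry is what gives the $2\pi^2 k^3$ shell factor used in the derivation below: the surface area of the ball is the derivative of $\pi^2 k^4/2$ with respect to $k$.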
So what do you do when your loop integrals diverge?
So we come up with a hack: just stop integrating! Literally stop integrating at some large momentum and see if you can make sense of it without accounting for all possible momenta. The theory will only apply for LOWER momentum states — where most of life happens.
We set a cutoff constant $\Lambda$ that is artificially low. Your integral is now finite but depends on $\Lambda$.

The physical intuition is that high momentum implies short distance. A UV cutoff is implicitly saying: “I don’t trust my theory above energy $\Lambda$, but we can make something very useful for most other normal circumstances.”
Tree level: finite, no loop integration needed
One loop: ∼ Λ⁴, generates new R² counterterms
Two loops: ∼ R³ counterterm — Goroff & Sagnotti's definitive result
the derivation of charge and mass
For a photon interaction like we’ve been talking about, you can handle the divergence this way:
Start with the integrand at large $k$, where $k^2 \gg m^2$:

$$\frac{1}{(k^2 - m^2)^2} \;\longrightarrow\; \frac{1}{k^4}$$

So the integrand becomes $1/k^4$.

Now switch to 4D spherical coordinates. In 4D, a shell at radius $k$ has surface area $2\pi^2 k^3$, so:

$$d^4k = 2\pi^2 k^3\, dk$$

The integral becomes:

$$\int^{\Lambda} \frac{2\pi^2 k^3}{k^4}\, dk = 2\pi^2 \int^{\Lambda} \frac{dk}{k}$$

So the full integral splits as:

$$2\pi^2 \ln\Lambda + (\text{finite part})$$

The divergence is a single log. It has the same form as the charge term already in the Lagrangian — so one redefinition of $e$ absorbs it at every loop order.
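The “it’s only a log” claim is easy to check numerically: integrate the radial integrand $2\pi^2 k^3 / k^4 = 2\pi^2/k$ up to increasing cutoffs and watch the answer grow by the same amount for each factor of 100 in $\Lambda$. (The grid choice and lower limit $k_{\min}=1$ are my own; only the growth pattern matters.)

```python
import numpy as np

def qed_like_integral(cutoff, k_min=1.0, n=200001):
    # radial integral of (2*pi^2*k^3) * (1/k^4) = 2*pi^2/k, trapezoid rule
    # on a log-spaced grid so the small-k region is well resolved
    k = np.logspace(np.log10(k_min), np.log10(cutoff), n)
    f = 2 * np.pi ** 2 / k
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(k))

for lam in (1e2, 1e4, 1e6):
    print(lam, qed_like_integral(lam))
# Each 100x increase in the cutoff adds the SAME step, 2*pi^2*ln(100) ~ 90.9.
# Divergent, yes -- but so slowly that one constant absorbs it.
```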
We don’t know the TRUE (“bare”) value of the electron’s charge — it can never be observed. We just have a very, very accurate measured value, which is the bare charge after it has ‘absorbed’ the divergent part of this integral.
trying to do the same with gravity doesn’t work
For the gravity analogue, the vertices contribute extra powers of momentum, so instead of $1/k$ the radial integrand grows like $k^3$:

$$\int^{\Lambda} k^3\, dk \;\sim\; \frac{\Lambda^4}{4}$$

This is called a quartic divergence. Each loop order produces a divergence of a new type, requiring a new parameter — infinitely many parameters, no predictive power.
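Contrast this with the QED log above. With a $k^3$ integrand (schematic power counting, not a real amplitude), the cutoff integral has a closed form, and each factor of 100 in the cutoff multiplies the answer by $10^8$ instead of adding a constant:

```python
def gravity_like_integral(cutoff, k_min=1.0):
    # integral of k^3 dk from k_min to cutoff = (cutoff^4 - k_min^4) / 4
    return (cutoff ** 4 - k_min ** 4) / 4

for lam in (1e2, 1e4, 1e6):
    print(lam, gravity_like_integral(lam))
# ~2.5e7, ~2.5e15, ~2.5e23: each 100x in cutoff multiplies the result
# by 10^8. The answer is dominated by the cutoff itself -- no single
# redefined constant can absorb that.
```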
In QED the divergence has the same shape as a term already in your Lagrangian, so you just redefine that one number.
Long story short: with gravity, you can’t absorb the divergences into a finite set of constants that map to experiment. We’ll go into detail below if you want to dig further.
Degree of Divergence
The degree of divergence is a power-counting estimate of how badly a Feynman diagram (again just a representation of that integral we keep talking about) diverges.
For a loop integral in 4 spacetime dimensions, you have $d^4k$ — four powers of momentum from the integration measure. The propagators in the denominator suppress it (two powers of $k$ each), and any derivative factors at the vertices add powers of $k$ in the numerator. The net power (as in exponent) of $k$ in the integrand is the degree of divergence $D$:

- $D > 0$: the diagram diverges as $\Lambda^D$ (power-law divergence)
- $D = 0$: logarithmic divergence, $\ln\Lambda$
- $D < 0$: superficially convergent (the individual diagram is finite)
In a renormalizable theory like QED, $D$ depends only on the external particles in the diagram — not on the loop order $L$. So you only ever encounter a finite set of divergence structures, all absorbable into a fixed set of counterterms.
With gravity, $D = 2 + 2L$ — it grows linearly with loop order, generating a new, inequivalent infinity at every order of perturbation theory.
In every other successful quantum field theory — quantum electrodynamics, the weak force, QCD — we encounter infinities from virtual particle loops. But these infinities are manageable: they can be absorbed into a small, fixed set of physical parameters (masses and coupling constants) through a procedure called renormalization. Once you fix those parameters by measurement, all predictions are finite and agree with experiment.
Renormalizability means that the number of independent infinities equals the number of free parameters you can tune. A theory with infinitely many inequivalent infinities is not predictive.
General relativity, when quantized around flat spacetime (as in: no gravity, no curvature), gives increasingly divergent loop diagrams at each order. You would need infinitely many measurements to fix infinitely many parameters — rendering the theory unusable.
Where $D = 2 + 2L$ Comes From
For a Feynman diagram with $L$ loops, $I$ internal graviton lines, and $V$ vertices (each Einstein–Hilbert vertex carrying 2 derivatives):

$$D = 4L - 2I + 2V$$

Using the topological identity $L = I - V + 1$ (i.e., $I = L + V - 1$):

$$D = 4L - 2(L + V - 1) + 2V = 2 + 2L$$

At $L$ loops, diagrams diverge as $\Lambda^{2+2L}$ — a new, inequivalent infinity at every order.
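The derivation above can be spot-checked in a few lines. This sketch applies the same schematic counting ($+4$ per loop, $-2$ per propagator, $+2$ per two-derivative vertex, with $V$ eliminated via $L = I - V + 1$) to several diagram topologies; the particular $(L, I)$ pairs are arbitrary examples:

```python
# Superficial degree of divergence for pure-gravity diagrams (schematic
# power counting): D = 4L - 2I + 2V, with V = I - L + 1.
def degree_of_divergence(loops, internal_lines):
    vertices = internal_lines - loops + 1   # topological identity L = I - V + 1
    return 4 * loops - 2 * internal_lines + 2 * vertices

# Different topologies with the same loop count give the same D = 2 + 2L:
print(degree_of_divergence(1, 2))  # 4
print(degree_of_divergence(1, 5))  # 4 (more propagators, more vertices, same D)
print(degree_of_divergence(2, 7))  # 6
print(degree_of_divergence(3, 9))  # 8
```

Note how the number of internal lines drops out entirely: only the loop count survives, which is exactly why every new loop order brings a genuinely new divergence.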
Why Negative Mass Dimension = Non-Renormalizable
Imagine a gravitational loop diagram at $L$ loops. The coupling $\kappa$, with mass dimension $[\kappa] = M^{-1}$, picks up two extra powers at every loop order. For the amplitude to stay dimensionless, every factor of $\kappa$ must be compensated by a power of energy — so amplitudes grow like powers of $\kappa E$, getting worse at every loop order and blowing up at high energy.
The Hierarchy of Divergences: At each loop order $L$, we get a new divergence multiplying a new operator. These operators have increasing numbers of derivatives and curvature tensors, each requiring a new free parameter to absorb. The theory has infinitely many free parameters.
✓ QED (Renormalizable)
Coupling: e, with [e] = M⁰ (dimensionless)
Divergences at each loop: ∼ ln Λ or Λ²
All absorbed by: δme, δe, δZψ, δZA
3 parameters → all orders finite after renormalization
✗ Quantum Gravity (Non-Renormalizable)
Coupling: κ, with [κ] = M⁻¹
Divergences at L loops: ∼ Λ^(2+2L)
New operators needed: R², R³, RμνRμν, (Rμνρσ)³, …
∞ parameters → theory is not predictive
The Effective Field Theory Perspective
There is a silver lining: below the Planck scale ($\sim 10^{19}$ GeV), quantum gravity works. The higher-loop corrections are suppressed by powers of $(E/M_{\text{Pl}})^2$ at accessible energies — completely negligible. This is WHY classical general relativity works.
The conclusion we’re supposed to draw from this is that perturbative quantum gravity is not a complete, UV-finite theory. It needs a completion at the Planck scale (this is where one might substitute “string theory” or “loop quantum gravity”).
Visualizing the Momentum Integration
The central object is the one-loop integral over virtual graviton momentum $k$:

$$\int \frac{d^4k}{(2\pi)^4}\, f(k)$$
QED: the integrand falls off like $1/k$ — the integral is only log-divergent, absorbed by renormalization

Gravity: every loop order $L$ gives a faster-growing integrand — all diverge as $\Lambda \to \infty$

The left panel shows the QED integrand $\sim 1/k$: it falls rapidly, and the divergence is mild (logarithmic, easily renormalized). The right panel shows the gravity integrands for $L = 1, 2, 3$: each grows with $k$, and the integral blows up as $\Lambda \to \infty$ — with each loop order strictly worse than the last.

For $L = 1$ (one graviton loop) the integrand grows as $k^3$, giving $\Lambda^4$. For $L = 2$ it grows as $k^5$, giving $\Lambda^6$. For $L = 3$, $k^7$, giving $\Lambda^8$. Each loop order requires an entirely new counterterm — the theory generates an infinite tower of inequivalent infinities.
Why QED Works but Gravity Doesn’t
A detailed comparison of power-counting in quantum electrodynamics versus perturbative quantum gravity illustrates precisely why one theory is renormalizable and the other is not.
| Property | QED (Photon) | Perturbative Gravity (Graviton) |
|---|---|---|
| Spin | 1 | 2 |
| Coupling constant | $e$, dimensionless | $\kappa$, $[\kappa] = M^{-1}$ |
| Propagator | $\sim 1/k^2$ | $\sim 1/k^2$ |
| Vertex momentum growth | $k^0$ (gauge coupling) | $k^2$ (curvature = 2nd deriv) |
| Superficial divergence | decreases with external legs $E$ | $D = 2 + 2L$ (grows with loop order) |
| Divergence at 1 loop | $\ln\Lambda$ (log, renormalizable) | $\Lambda^4$ (quartic!) |
| New operators generated | None beyond original Lagrangian | $R^2, R^3, \ldots$ (infinitely many) |
| Free parameters needed | 3 ($e$, $m_e$, field normalization) | ∞ (one per loop order) |
| Renormalizable? | ✓ Yes — predictive to all orders | ✗ No — unpredictive beyond tree level |
| Definitive proof | Dyson 1949 | Goroff & Sagnotti 1985; van de Ven 1992 |
What This Means for Physics
The Effective Field Theory Verdict
Donoghue (1994) showed that quantum gravity can be treated as an effective field theory below the Planck scale. The leading corrections to the Newtonian potential are:

$$V(r) = -\frac{G m_1 m_2}{r}\left[1 + 3\,\frac{G(m_1 + m_2)}{r c^2} + \frac{41}{10\pi}\,\frac{G\hbar}{r^2 c^3}\right]$$

The second term is the classical post-Newtonian correction. The third term, proportional to $\hbar$, is the genuine quantum gravitational correction — computable and unambiguous below the Planck scale. At $r = 1$ m, it is of order $G\hbar/(r^2 c^3) = (\ell_P/r)^2 \sim 10^{-70}$ — immeasurably small.
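Plugging in SI constants shows just how hopeless measuring this is. Assuming the standard form of the quantum term’s relative size, $G\hbar/(r^2 c^3)$, which equals (Planck length$/r$)$^2$:

```python
# Relative size of the quantum gravity correction to Newton's potential.
G = 6.674e-11      # m^3 kg^-1 s^-2
hbar = 1.055e-34   # J s
c = 2.998e8        # m / s

def quantum_correction(r_meters):
    return G * hbar / (r_meters ** 2 * c ** 3)

print(quantum_correction(1.0))       # ~2.6e-70 at one meter: hopelessly unmeasurable
print(quantum_correction(1.6e-35))   # ~1 at the Planck length: the expansion breaks down
```

The two printed numbers bracket the whole story: the correction is real and calculable, and it is 70 orders of magnitude too small to see at human scales.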
Proposed UV Completions
Several approaches attempt to provide a UV-complete theory of quantum gravity:
- String Theory — Replaces point particles with 1D strings; graviton is a string vibration mode. UV finiteness achieved by infinite tower of massive string excitations.
- Loop Quantum Gravity — Discretizes spacetime at the Planck scale; spacetime itself has a quantum structure that provides a natural UV regulator.
- Asymptotic Safety — Proposes that Newton’s constant flows to a non-Gaussian UV fixed point, making the theory UV complete without new degrees of freedom.
- Causal Dynamical Triangulations — Lattice-like formulation of quantum gravity using Regge calculus with a causal structure.
- Supergravity — Adds supersymmetry to general relativity; the extra cancellations delay the divergences to higher loop orders, though whether any version is fully UV-finite remains debated.
The Bottom Line: Nature, unfortunately, requires new physics at the Planck scale, $\sim 10^{19}$ GeV. That search is one of the deepest problems in science.
The next time you meet someone who is pissed off they can’t get a serious understanding of what’s really happening in this area, send them this post.
Separately, feel free to send me corrections as well — I’d like to make sure I get this right.
*The definitive proof of non-renormalizability came from Goroff and Sagnotti’s two-loop calculation. They found a divergent counterterm:

$$\Delta\mathcal{L} \;\propto\; \frac{209}{2880\,(4\pi)^4}\,\frac{1}{\epsilon}\; R^{\alpha\beta}{}_{\gamma\delta}\, R^{\gamma\delta}{}_{\epsilon\zeta}\, R^{\epsilon\zeta}{}_{\alpha\beta}$$

This operator (cubic in the Riemann tensor, sixth order in derivatives) does NOT appear in the original Einstein–Hilbert action. Its divergence coefficient, $209/2880 \approx 0.0726$, is nonzero, so it cannot be tuned away. A genuinely new counterterm is required, with no experimental handle.
Physics: Einstein–Hilbert action · Goroff & Sagnotti (1985) · van de Ven (1992) · Donoghue (1994)
References: Zee, QFT in a Nutshell · Donoghue, Phys. Rev. Lett. 72 (1994) · Goroff & Sagnotti, Nucl. Phys. B 266 (1986)
Special thanks to ai tools for infinite tutoring and the very helpful diagrams.