The saga of rooted staggered quarks

Michael Creutz


I look at the rooting controversy from a historical point of view and review how I have come to the conclusion that these simulations involving staggered quarks must be discarded.

The lattice gauge community has recently been stricken with an extremely bitter controversy over a popular algorithm to study the non-perturbative interactions of quarks and gluons. This approach, based on something called "staggered" quarks, coupled with an uncontrolled approximation to adjust the number of dynamical "flavors," has come to be a major part of the US lattice QCD effort. But because of the enormous effort put into simulations with this algorithm, the practitioners have been unwilling to honestly explore theoretical issues associated with the approach, dogmatically claiming that it must become exact in the continuum limit. Unfortunately this is impossible due to an incorrect treatment of important non-perturbative effects. Many of the world's lattice theorists have recognized this problem, but the practitioners of the approach have not. Here I look at this problem from a historical point of view and review how I have come to the conclusion that these simulations involving staggered quarks must be discarded.

The important issues began to be understood during the theoretical revolutions of the mid-1970s. In this period, shortly after QCD, the theory of the confining dynamics of quarks and gluons, was formulated, it was quickly realized that there were many phenomena in quantum field theory that could not be understood in terms of conventional perturbation theory. These included solitons, instantons, confinement, and dualities between superficially quite distinct field theories. Among the more surprising results of the period was the fact that QCD had an additional non-perturbative parameter, usually expressed as the so-called CP violating angle theta. The physics that allows this parameter to appear is at the heart of the problem with staggered fermions.

How theta enters into physical processes is subtle, and over the years occasional speculations have been made that its effects may disappear in the regularized field theory. I myself made such a suggestion in a paper with T. Tudron (Phys.Rev.D16:2978,1977). I now know that this is not true.

The rooting issue is associated with a particular fermion formulation that has an inherent degeneracy in the number of fermions. In particular, the simple staggered fermion formulation has four species. The rooting prescription is to replace the fermion determinant for each flavor with its fourth root in the hope that this will give the effect of a single flavor. As I discuss shortly, this mutilates important aspects of the non-perturbative physics.
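In formula form, the prescription can be sketched as follows. The notation is an assumption introduced here purely for illustration: S_g is the gauge action, D_st the staggered Dirac operator, and m_f the mass of flavor f.

```latex
% Unrooted staggered theory: each determinant carries four species ("tastes").
Z_{\text{unrooted}} = \int [dA]\, e^{-S_g(A)} \prod_{f} \det\!\big[ D_{\rm st}(A) + m_f \big]

% Rooted prescription: replace each determinant by its fourth root.
Z_{\text{rooted}} = \int [dA]\, e^{-S_g(A)} \prod_{f} \Big( \det\!\big[ D_{\rm st}(A) + m_f \big] \Big)^{1/4}
```

The hope is that each rooted factor behaves like the determinant of a single flavor; whether it does is exactly the question at issue.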

Throughout the 1980s lattice gauge theory grew rapidly with the increasing realization that it allowed quantitative calculations independent of perturbation theory. However, the lack of the massive amounts of computing needed for dynamical fermion simulations meant that the rooting issue did not arise. And even without fermions, the presence of a sign problem giving large numerical cancellations prevented direct studies of theta dependence. I should note that it is not this sign problem that is the issue with rooting; rather it is an incorrect treatment of the phenomena that make theta a physical parameter, and these phenomena are present even when theta vanishes.

It was not until 1995 that I realized that theta could be easily understood through its role in effective chiral Lagrangians. This was not a particularly new idea; it was just that I had personally not understood it before. In Phys.Rev.D52:2951-2959,1995 (hep-th/9505112) I discussed this in terms of the venerable linear sigma model for pions. This builds on the picture of an energy potential resembling a "Mexican hat" or "wine bottle bottom." The vacuum is degenerate and the light pions represent field fluctuations around this minimum of the potential. To this one can add masses as a "tilt" in the potential favoring one particular direction in field space. But one can go further and ask how a quark mass difference modifies this picture. This gives rise to higher order effects that "warp" the potential in a quadratic way. One winds up with three parameters: the tilt, the warp, and the fact that these effects need not be in the same direction. These parameters map in a non-linear way onto the more conventional parameters, the masses of the up and down quarks and the strong CP angle theta. This simple intuitive interpretation convinced me that theta actually had physical meaning.
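The tilt and warp picture can be written schematically as below. This is only an illustrative form and not the exact potential of the cited paper: the symbols lambda, v, a, b and the unit vectors n_1, n_2 in field space are assumptions made here, and the precise field content and coefficients are those of the reference.

```latex
% Schematic only: \Phi is the multiplet of meson fields,
% \hat n_1 and \hat n_2 are unit vectors in field space.
V(\Phi) = \lambda \left( \Phi\cdot\Phi - v^2 \right)^2   % symmetric "Mexican hat"
        - a \, \hat n_1 \cdot \Phi                        % tilt (linear in the fields)
        - b \, \left( \hat n_2 \cdot \Phi \right)^2       % warp (quadratic in the fields)
```

In this form the three parameters are the tilt strength a, the warp strength b, and the angle between n_1 and n_2.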

A potential giving spontaneous symmetry
Fig. 1: Our understanding of pionic physics is based on spontaneous breaking of chiral symmetry. Pions are excitations of a quark condensate around a nearly degenerate minimum. Quark masses distort this potential. The up quark mass, the down quark mass, and theta map non-linearly onto a tilting of the potential, a quadratic warping, and an angle between these two effects.

Adjusting the warping allows one to uncover some rather interesting phenomena. In particular, moving in parameter space it becomes possible to manipulate isolated but nearly degenerate minima and expose the existence of first order phase transitions. In terms of the angle theta, these occur at pi. This paper also briefly discusses going to flavor groups larger than SU(2), and argues that these first order transitions at theta=pi are a universal phenomenon whenever multiple quarks are degenerate.
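A toy numerical check of the transition at theta=pi is sketched below. It assumes the standard leading-order chiral Lagrangian for two degenerate flavors, where the energy is E = -2 cos(theta/2) cos(alpha) in units where the mass times the condensate is one, with alpha parametrizing the trace of the SU(2) chiral field. This is simpler than the sigma model with warping discussed in the paper, and the function names are hypothetical.

```python
import math

def vacuum_energy(theta, n_grid=2001):
    """Minimum over alpha of E = -2*cos(theta/2)*cos(alpha), in units m*Sigma = 1.

    alpha parametrizes the trace of an SU(2) chiral field (Tr U = 2*cos(alpha)).
    The minimization is done by brute force on a uniform grid in [0, 2*pi].
    """
    alphas = (2.0 * math.pi * i / (n_grid - 1) for i in range(n_grid))
    return min(-2.0 * math.cos(theta / 2.0) * math.cos(a) for a in alphas)

def one_sided_slopes(theta, h=1e-4):
    """Finite-difference slopes of the vacuum energy just below and just above theta."""
    e0 = vacuum_energy(theta)
    left = (e0 - vacuum_energy(theta - h)) / h
    right = (vacuum_energy(theta + h) - e0) / h
    return left, right

if __name__ == "__main__":
    print(one_sided_slopes(math.pi))        # slopes of opposite sign: a cusp
    print(one_sided_slopes(math.pi / 2.0))  # slopes agree: smooth behavior
```

The jump in the slope of the vacuum energy at theta=pi is the signature of a first order transition in this toy model; away from pi the two one-sided slopes agree.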

At this point my suspicions about rooting began to solidify. In the chiral Lagrangian language, it is natural to think of the masses as complex numbers with theta being related to their phase. Giving N_f flavors a common mass with a common phase, the physical theta is the number of flavors times this phase angle. In the complex mass plane, the first order transition at theta of pi manifests as N_f equivalent first order phase transitions joining at the origin. Considered as a function of N_f, this is a highly non-analytic behavior, with a new transition appearing for each new flavor. Now unrooted staggered fermions have four species, so one expects four transitions meeting at the origin. The rooting prescription of taking the fourth root of the fermion determinant hopes to mimic the effects of a single flavor. Taking the root of the fermion determinant is a fairly smooth process, and I was puzzled how this could reduce these four transitions into a single one.
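The counting of transitions can be made concrete with a few lines of Python. This is a minimal sketch that uses only the relation stated above, theta = N_f times the common mass phase, together with the condition that a transition sits at theta = pi modulo 2 pi; the function names are hypothetical.

```python
import math

def transition_phases(n_flavors):
    """Phases phi in [0, 2*pi) of a common complex mass m = |m| exp(i*phi)
    at which theta = n_flavors * phi equals pi modulo 2*pi."""
    return [(2 * k + 1) * math.pi / n_flavors for k in range(n_flavors)]

def physical_theta(n_flavors, phase):
    """theta = N_f * phase, reduced to the interval [0, 2*pi)."""
    return (n_flavors * phase) % (2.0 * math.pi)

if __name__ == "__main__":
    for n_f in (1, 2, 3, 4):
        degrees = [round(math.degrees(p), 1) for p in transition_phases(n_f)]
        print(n_f, degrees)
```

For four species this gives four rays in the complex mass plane, at 45, 135, 225, and 315 degrees, while a single flavor has only one, along the negative real axis. Rooting would have to turn the first pattern into the second.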

A natural way to explore this further is to consider several flavors in the chiral Lagrangian and explore the behavior as they are made non-degenerate. In hep-th/0303254 I presented the phase diagram of three flavor QCD as a function of the up, down, and strange quark masses, all kept real. The structure is quite rich, with some regions exhibiting spontaneous CP violation. Those were anticipated much earlier by Dashen, and all occurred where the conventional angle theta was pi.

The three flavor phase diagram
Fig. 2: The phase diagram for three flavor QCD as a function of the up and down quark masses at fixed strange quark mass. The shaded regions represent the spontaneous CP violation occurring at theta of pi. The separation between the phase transition lines and the graph axes is missed in the rooting procedure.

A particularly intriguing feature of this phase diagram is the fact that nothing special happens when only a single quark mass passes through zero. Because the group is SU(N_f) and not U(N_f), the presence of non-zero values for the other quark masses stabilizes the vacuum in a region where just one mass is small. This led me to question whether there was some physical measurement one could make to determine if a quark mass was indeed zero. I could think of none, and proposed that a single vanishing quark mass might not be a physical concept. This paper was submitted to Physical Review D.

This is where the shit started hitting the fan. There was a common lore that if the up quark mass were to vanish, then the problem of why theta appeared to be phenomenologically very small would be solved. I was saying that this lore might be wrong. This drove the referees nuts, with statements like "I am somewhat concerned that the errors are so obvious." After numerous similar scathing remarks the paper went to a divisional editor for PRD, who upheld their opinion. On rejection I took the paper and split it into two parts, one on the phase diagram and the second on the vanishing mass issue. These both appeared in Physical Review Letters, Phys.Rev.Lett.92:201601,2004 (hep-lat/0312018) and Phys.Rev.Lett.92:162003,2004 (hep-ph/0312225). I do derive some visceral pleasure from having turned a rejected PRD paper into two PRLs. Nevertheless, it appears that much of the theoretical community still does not understand or believe this take on the issue. One of my presentations on the topic can be seen online here.

As a side remark, the divisional editor who rejected this paper is now one of the staunch supporters of the staggered approach. However, in an earlier work he rather beautifully used chiral Lagrangians to describe lattice artifacts with Wilson fermions. Had he pursued this with non-degenerate quarks, I expect he might have come to the same conclusions as I.

A question arises on whether one can use the topological susceptibility of the gauge fields as a tool to define a massless quark. Naively if the quark mass vanishes, the fermion determinant will vanish whenever there is non-trivial topology. So my claim would be false unless the topological susceptibility itself is ill defined. This would require that the winding number of the gauge fields is also ill defined. Much earlier Luscher had shown that if the gauge fields were smooth enough the topological number could be given a unique lattice definition. His smoothness condition also plays a role with Neuberger's overlap operator, which then gives a unique index theorem relating winding number with zero modes of the Dirac operator. This led me to explore the consequences of this smoothness condition and to realize in Phys.Rev.D70:091501,2004 (hep-lat/0409017) that forcing things to be so smooth required the Hamiltonian to be non-Hermitean. But if one does not impose the smoothness condition, there would exist gauge configurations where the index associated with the overlap operator depended on the kernel used to define it. Thus, despite its theoretical elegance, the overlap operator does not solve the problem of defining a massless quark.

Staggered quarks now reenter the picture. If rooting from four flavors down to one were correct, this would force a singularity in the theory as any single quark mass passes through zero. The underlying chiral symmetry says that the quark condensate associated with a massless quark must vanish. This is in direct contradiction to the simple chiral Lagrangian analysis, which says that the heavier quarks stabilize this condensate at a non-zero value. This was enough to convince me that rooting staggered quarks could not be correct.

Meanwhile large staggered simulations were continuing. The practitioners admitted there was an assumption that the rooting was correct, and presented various facades of formalism to support their claim for plausibility of the continuum limit. Informally I often complained about this, but was reluctant to publish anything so negative. One of my discussions on the issue can be seen here. Eventually the claims of the staggered advocates became so outrageous that I felt I had to be more aggressive. I was pushed further by statements that if someone had issues with staggered quarks, they needed to write them up. At the time I was too naive to appreciate how the stubborn nature of some personalities involved would mean that these arguments would be dismissed without serious discussion. As with the up quark mass issue, this is one of those situations where a person without tenure would be ill advised to challenge conventional lore.

So I submitted a paper (hep-lat/0603020) pointing out the inconsistencies between rooting and the expected chiral behavior. This was quickly rejected by PRL which has a policy of not publishing interesting and controversial papers. After transferring it to PRD, things got stuck, with numerous referees simply refusing to respond. After about a year and eight referee reports, some positive and some negative, PRD decided that they don't publish interesting and controversial papers either. I did not take this delay kindly and rewrote the paper with the provocative title "The evil that is rooting." This was fairly quickly accepted by Physics Letters (Phys.Lett.B649:230-234,2007; hep-lat/0701018), although the title was mollified at the editor's suggestion. This letter was accompanied by a rebuttal by Bernard, Golterman, Shamir, and Sharpe, and that was further accompanied by my response pointing out several of their errors.

The main rebuttal to my argument was that yes, rooting does give the wrong chiral behavior; so, one should stay out of the chiral region before the continuum limit is taken. This does seem a bit peculiar given one of the main motivations for staggered quarks is that they maintain a remnant chiral symmetry. And for a reduction to the one flavor theory this is particularly suspect since that theory does not have any chiral symmetry and no concept of a chiral limit. But the issue is not the chiral limit, it is an incorrect treatment of the physics related to theta.

While this was going on I was head of the program committee for the 2006 Lattice conference, held in Tucson. The question of a presentation on staggered fermions came up, and, since the approach is demonstrably wrong, I refused to have anything to do with scheduling such a talk. One of the committee members, who is among the strongest advocates of rooting, asked me to recuse myself from the discussion, while some other members of the committee strongly objected to the staggered community attitude of "either prove that the 1/4-th root is wrong, or shut up." Nevertheless, the remainder of the committee decided there should be such a talk, went ahead without my involvement, and invited S. Sharpe to present it. That talk was rather disappointing in just rehashing the old hand waving "plausibility" arguments and basically ignoring my complaints. I did present my point of view in a poster at this meeting, PoS LAT2006:208,2006 (hep-lat/0608020). It was after this point in time that the proponents of rooting ceased any discussion of the issue with me. They had made up their minds that I was wrong and would not consider it any more.

During this period I began to become aware of a rather severe split between the US and Europe in attitudes towards this subject. I gave a talk on "The Evil that is Rooting" at a meeting in Spain and found objections from only two of the participants. One of them was, and I believe remains, willing to consider the issues while the other is steadfastly with the stubborn camp.

During the discussions for the program at Lattice 2007, for which I was on the International Advisory Committee, the question arose as to whether there should be talks on the staggered issue. It was becoming clear that, since in the minds of many the issue was not resolved, I had to talk on the subject. Normally an IAC member would not give a plenary; so, I assumed I would just give a parallel talk on the topic. But deliberations of the local organizing committee decided that there would be two back to back plenary talks, one by me on the problems and one by Kronfeld, who was supposed to both rebut me and also summarize the recent results from the method.

I reformulated my arguments in terms of the 't Hooft vertex, something crucial to the understanding of how the theta dependence of QCD works. While previously I had not claimed to have a proof, here I showed specific non-perturbative effects that must come out wrong, even in the continuum limit. This proof appears in the proceedings. A more extensive discussion of the issues appears in Annals. The basic point is that the four "tastes" of staggered fermions are actually not equivalent, and rooting averages the fermion determinants of different theories. The proof consists of showing that the symmetries of the staggered determinant forbid the appearance of the correct 't Hooft vertex for the target theory. Another way to state the issue is that the 't Hooft vertex strongly couples the different tastes, and thus they cannot be considered as independent.
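For orientation, the flavor structure of the 't Hooft vertex can be sketched as below. This is a schematic form only: normalizations, the choice of which chirality is barred, and the sign convention for the theta phase vary between references and are assumptions here.

```latex
% N_f flavors: a 2 N_f-fermion operator, a determinant over flavor indices f, g.
\mathcal{L}_{\text{'t Hooft}} \;\propto\; e^{i\theta} \det_{f,g}\!\big( \bar\psi_L^{\,f}\, \psi_R^{\,g} \big) + \text{h.c.}

% One flavor: the determinant collapses to a bilinear, which acts like a mass term.
\mathcal{L}_{\text{'t Hooft}}^{(N_f=1)} \;\propto\; e^{i\theta}\, \bar\psi_L \psi_R + \text{h.c.}
```

With four tastes the natural vertex is an eight-fermion operator tying all the tastes together, whereas the target one flavor theory requires the bilinear form.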

This talk was quite well received by the European community, but apparently not by the staggered people, mostly from the US. Kronfeld's talk pretty much missed the point, but of course I am not unbiased. After the meeting, Bernard, Golterman, Sharpe, and Shamir produced a preprint claiming to refute my arguments; however, this paper is so full of obvious errors and misrepresentations that I assume it will never be published.

The staggered community has continued to ignore these problems. I feel their stranglehold on the US lattice effort approaches scientific dishonesty. As an example of the prevailing vindictiveness, a recent paper of mine on a completely different topic was rejected from a prominent US journal on the basis of a single negative referee report stating that "It is puzzling that the author ignores all these highly relevant lessons that have been learned long ago in the context of the staggered fermion formalism." I had left out the comparison because I wanted to avoid the ongoing controversy, of which the referee was certainly aware. After I did add remarks on the comparison with staggered, the paper was rejected without further review by a divisional associate editor representing the staggered community. He raised some symmetry issues based on comments by the Maryland group, to which I was never given a chance to respond. This paper was then submitted to a European journal where I hoped for a more equitable treatment. There it was quickly published.

Beyond the international ridicule that this controversy brings on the USQCD community, other aspects are particularly upsetting from a scientific point of view. First, enormous amounts of computer time continue to be wasted on generating lattice configurations from which any non-perturbative information will be questionable. About 38 percent of the current computer time allocated by the USQCD collaboration is going to continue these efforts. Second, young people associated with this project are taught to repeat, without question, the party line that all will be okay in the continuum limit. Third, the practitioners are such a powerful force that most outsiders are unwilling to look into the problems despite the fact that the underlying physics is so fascinating. And finally, I find it extremely unsettling that some physicists widely regarded as experts in chiral symmetry and lattice gauge theory can so casually and thoroughly delude themselves with bad science.

In short, the lattice has been very good to me. It is extremely painful to see it abused so blatantly.