
September 15, 2008


Nice! The model does rely on the existence of one "best" candidate. Or does it? Do you think results would also hold if --like in many voting situations-- there are, say, 50% Farmers and 50% Fisher(wo)men who disagree on what "best" is? Each one gets a signal on which candidate would be better for themselves individually.

I have not gone through the math, but it would be interesting to know whether Fisher(wo)men with unequal info quality would abstain if they know that farmers have very similar info quality.

Any theories about why there's so much social pressure to vote?

Any theories about why there's so much social pressure to vote?

I like the theory expressed by "another bob" in the comments at Econ Log, the link in the OP: "Consenting to be governed is the key benefit here! Therefore the larger the proportion that participate the more convincing are the results, even if the results are narrowly divided or even meaningless in terms of policy choices."

First a definition. The rights of the citizen in a country like the U.S. can be divided into two broad categories: property rights and political rights. Property rights include the right to participate in free markets. Political rights include the right to lobby the government, the right to donate to politicians and parties and of course the right to vote.

Any theories about why there's so much social pressure to vote?

Because people whose status and income depend heavily on influence -- and most "serious" journalists and quite a few professors are in that group -- particularly influence over public opinion, influence on decision makers and influence with elected officials, do better when prospective "customers" for that influence see their political rights as potent means to get what they want. The big danger to the status and income of the influence wielders is that most people will get what they want by exercising their property rights, e.g., by trading their labor for money and using that money in voluntary transactions. (Of course, the latter way of getting what you want increases the prosperity of your country whereas the former way decreases it, but the influence wielders usually manage to tiptoe around awareness of that fact.)

Promoting the influence of voting is a component of the general promotion of political rights. If my theory is correct then one would expect the heaviest levels of promotion of voting to come from groups whose status and income are most heavily tied to their (perceived) influence over voters, like political bloggers and campaign consultants (as well as from the more deluded and zealous adherents of America's civic religion, such as schoolteachers).

Perhaps we need to know if we can trust you, Nancy. Your stated political views signal your loyalty to our group. But you might just be paying us lip service. By voting - even tho there's no benefit & to the extent it may actually cost you - you are proving your loyalty to us. You make yourself more trustworthy & you gain status with us. So we pressure you to vote. We need you to declare yourself. Whoever "we" are.

There's something seriously wrong with your model, because it predicts, over a wide range, that it's best for just 1 or 2 people to vote.

This could be true if we had a way of identifying the person with the highest q. But we don't.

Also, you say, "we'll find quality and rank are related by a power law, q = q1*rank^-power". There have to be hidden assumptions here, which are the most important part of the model. A priori, we don't know anything at all about what the distribution of q-values is.

Finally, even supposing your power-law model is good (and you've given no evidence for it), we have no idea what the proper value for "power" is. .003? 17?

Phil Goetz: Surely we don't need to identify the person with the highest q, just have a procedure that concentrates probability mass on such people sufficiently well. That's what the electoral college originally was. I would be ecstatic to transfer all future election decisions to the 10,000th most qualified voter, provided that neither he/she nor anyone else knew the critical voter's identity. Even the millionth most qualified would surely be an improvement.

Can I propose an idea? It's one I had for some time.

One obvious problem with ignorant people not voting is (my guess) that the lower class would be under-represented. Thus, the plan:

Everyone who wants to vote takes a special test with questions on economics, politics, and the candidates, authored or reviewed by the candidates themselves. If they pass, they get a voter's card with their name and income information. After the election, the weight of each vote is multiplied so that every income bracket has influence proportional to the percentage of population that it constitutes.

Thus, democracy without idiocracy.
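The reweighting step in this proposal can be sketched in a few lines of Python. This is only an illustration of the idea, not a worked-out electoral rule: the bracket names and all the shares below are hypothetical, and the function name `bracket_weights` is made up for the example.

```python
def bracket_weights(population_share, voter_share):
    """Per-vote weight for each income bracket, chosen so that the bracket's
    total weighted vote equals its share of the whole population."""
    weights = {}
    for bracket, pop in population_share.items():
        voters = voter_share.get(bracket, 0.0)
        if voters <= 0:
            # Nobody in this bracket passed the test, so no weight can restore its share.
            continue
        weights[bracket] = pop / voters
    return weights


# Hypothetical shares, for illustration only.
population_share = {"low": 0.5, "middle": 0.4, "high": 0.1}
voter_share = {"low": 0.2, "middle": 0.5, "high": 0.3}

weights = bracket_weights(population_share, voter_share)
for bracket, w in weights.items():
    print(bracket, round(w, 3))
```

Under these made-up numbers a vote from an under-represented bracket counts for more than one vote and a vote from an over-represented bracket counts for less, so the weighted totals line up with the population shares.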

Thinking out loud here ... might it not be rational to vote for the policy that is less likely to be correct?

Candidate A talks about the various tradeoffs of different health care policies. He emphasizes there is no free lunch.

Candidate B says government health care will solve all problems, and can be implemented painlessly.

If voter C doesn't know anything about health care or economics, he should rationally choose candidate B. Even if he knows enough to believe that Candidate B has only a 5% chance of being correct, he might still vote for candidate B, if he believes that the benefits of B's policy (if correct) are much, much higher than the cost of A's policy. (Taking into account costs if the policy turns out to be wrong, etc.)

A rational voter C *should* wonder why there are so many voters who disagree with his logic. But he might think that the opposition is just voting out of self-interest. That is often true.

Your car is running badly. You go to three mechanics, and each diagnoses a complex, $1,000 engine problem. You go to a fourth mechanic, and he diagnoses a bad $2 wire. Wouldn't it be rational to let the fourth guy try to fix the problem first? Especially if he points out -- perhaps truthfully -- that the first three guys get heavy subsidies from the engine manufacturers to replace unnecessary parts.

That is: if a policy is simple enough to sound plausible, promises large benefits, and is opposed by many out of self-interest, might it not be *rational* for the ignorant voter to take a chance on it?

So as long as voters are somewhat ignorant, there will always be a market for candidates to promise unfeasible "panacea" policies, and it will always be rational for those ignorant voters to choose them.

I'm sure right-wing voters think left-wing voters are falling for this fallacy re: rent control, corporate taxation, and so on. Left-wing voters think right-wing voters are falling for this fallacy re: free markets and global warming.

Phil: The analogy doesn't work because unlike auto mechanics politicians can always charge you whatever they want after the fact and you can't replace them for several years.

Tiiba: Very good idea! Sadly it would be disastrous, which is why even very good ideas need serious criticism before implementation. It might work in a less diverse country, but in the US there's a serious problem with defining the groups that one's membership should be assigned to, which means that you get gerrymandered socio-economic brackets and ethnic identities or the like. Under the naive version, African Americans with $25K incomes living in New York, where $25K is poverty, lose their vote, and the vote of middle-class white farmers in Idaho who make $25K but own their land and partially feed themselves is multiplied.

“So I celebrate the noble abstainers”

I believe people abstain mainly for the following reasons:

a) They do not care. It may be argued that abstentions of the selfish are beneficial. However, it hardly makes sense to celebrate the abstainers themselves.

b) They do care, but seeing both sides of the issues at stake, they cannot make up their mind. I suspect that these people are in fact better informed and less biased than the general public. Hence further encouraging their abstention would be counterproductive.

Any theories about why there's so much social pressure to vote?
The purpose of government is not to produce quality decisions, but to avoid the destructive competition for power that takes place in a power vacuum. People didn't create the institution of monarchy because kings were good at ruling, but because having a clear line of succession made it harder for minor leaders to seize more power.

Voting indicates that a person has bought into the system and recognizes it as valid. More importantly, because people's beliefs shift to become compatible with their actions, if people can be induced to act in a way that indicates they accept the validity of the system, they will tend to grow to accept it.

No system of social control works without the cooperation of the people being controlled. The function of the instruments of control is to induce acceptance of the mechanism, first and foremost.

Phil Goetz: Surely we don't need to identify the person with the highest q, just have a procedure that concentrates probability mass on such people sufficiently well. That's what the electoral college originally was. I would be ecstatic to transfer all future election decisions to the 10,000th most qualified voter, provided that neither he/she nor anyone else knew the critical voter's identity. Even the millionth most qualified would surely be an improvement.

That is true. But I'm questioning Robin's model. The model seems to assume that we can identify people's q values. Also, because it predicts over large regions that it's best to have only 1 person vote, I guarantee that it doesn't take into account the fact that q should be seen not as a person's probability of voting correctly on any issue, but as an estimate of that probability, which has a variance.

I also think that Robin's choice of q1 is too low to be useful.
When power=1, q(1) = q1, q(2) = q1/2, q(3) = q1/3, etc.,
and so p(1) = .55, p(2) = .525, p(3) = .5167. The result, that 1 or 2 people should vote, is because q1 is chosen as so low that most people's opinions are worthless. Even the rank 1 person is only better than a complete moron 1 time in 20, so I think this q1 value is unrealistic.

In general, with models like this, you have to be very careful about the rank-1 item, which tends to dominate the results, if your exponent is near or less than -1. In such cases, it's sometimes best to remove the top-ranked element from your set. In any case, I also think Robin's choice of looking for exponent values around -1 is very much wrong, and something like -.1 or -.01 might be a better place to look. I would also suggest scrapping the power-law distribution entirely, for a normal or a lognormal distribution (such as we use for IQ).

Robin: Is that q = q1*(rank^-power) (usual operator precedence), or q = (q1*rank)^-power ?

londenio, there's a large lit on abstention given different preferences. Small voting costs can discourage voting in large voter pools.

Nancy and others, there seem to be many reasons voting is pushed, including system validation and loyalty signals as suggested.

Phil G, models are not intended as exact descriptions of reality; they instead simplify to clarify. You get the same results with the max possible q1, with removing the top person, and with common knowledge only of each person's expected q value. Log-normals with a wide variance look like power laws over their mid-range. It is true people don't know their own and other info levels exactly, but if they have some reasonable idea of where they rank they should abstain if they expect to have little info relative to others.

It's a commonly held theory that voting is encouraged because it increases consent to be governed. I don't know how it holds up empirically. A possible problem with reserving voting for the most qualified is that it'll be mostly elite WASPs and Ashkenazi Jews that vote (although it would probably be more gender balanced than it currently is, which is very skewed female), which could reduce consent to being governed. One thing mass voting may do is sort out the archetype of what people want their alpha leader to be: how much do they care that that archetype has the same phenotype, same life history, same worldview. I suspect, all things equal, a middle-aged Christian white woman with an associate's degree and a couple kids from heartland America and inconsistent conservative values is the archetype that'll win out until further demographic change takes its course. Palin fits the archetype imperfectly, but better than most.

We might get farther by insisting on higher levels of competence from presidential appointees. For example, Harriet Miers had some populist appeal to her narrative (though perhaps not Palin's charisma; presidents probably don't like appointees with competing charisma anyways, consider the Kissinger headache), but as an appointee people probably saw her more like a doctor or engineer than an expression of personal archetype dominance fulfillment, and thus were sensitive to signals by experts that she wasn't qualified.

So perhaps the best real world result is presidents that optimize their population's archetypal representation, and appointees that optimize competent governance. Consider how Bush's dumb desires could've been kept in check with more competent appointees at the Justice Department, the CIA, the Secretary of State, the NSA, the Treasury Department, and (with regard to higher economic competence requirements, and this goes beyond checking Bush to general consideration of social welfare) the federal judiciary.

If you want a lot of cheap media attention, you could start the Anti-Voting Drive. "Uninformed? Stay home!"

Robin: "(Voter signal qualities q>0 are common knowledge.)"

Do you mean the fact that in your model q is always positive is common knowledge, or do you mean the q values themselves (That is, which voter is associated with which q) is common knowledge?

If not the latter, do the voters know their own signal qualities or rank?

Psy, the q-values (or their expected values) are common knowledge.

Eliezer, Bryan Caplan has tried that, with limited success.

Perhaps we in the US ought to be celebrating our right to abstain. In some countries, such as Brazil, voting is mandatory.

Robin, in the first case, log-normals are not power laws. I don't know how you define "midrange", but power laws dramatically emphasize the impact of the first few ranked items; log-normals do not, unless you set the distribution's variance too high. They should give very different results in this case. And since we always use either a normal or a log-normal distribution for IQ, why on earth did you choose to use a power-law distribution for voter acuity? You need a very strong justification as to why you chose a power law, and haven't given any.

If you used the more-reasonable log-normal model, and interpreted (1+q)/2 as an estimate of the person's expected probability of being correct on any given issue (that is, a voter is not described by a probability, but by a probability distribution), and you used reasonable parameter values, you would get very different results, invalidating the point you are making with this post.

This LISP code will help demonstrate the behavior of your model:

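;; Model: q(r) = q1 * r^power, p(r) = (1 + q(r)) / 2
(setq power -1)
(setq q1 .1)
;; Signal quality of the voter at rank r
(defun q (r q1 power) (* q1 (expt r power)))
;; Probability that the voter at rank r votes correctly
(defun p (r) (/ (+ 1 (q r q1 power)) 2))
;; List of p(1) ... p(r)
(defun genlist (r)
  (cond ((= r 1) (list (p 1)))
        (t (append (genlist (- r 1)) (list (p r))))))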

Now do this:
> (genlist 20) ;; list p(correct) for top 20 voters
(0.55 0.525 0.51666665 0.5125 0.51 0.5083333 0.50714284 0.50625 0.50555557 0.505 0.50454545 0.50416666 0.50384617 0.50357145 0.50333333 0.503125
0.5029412 0.50277776 0.5026316 0.5025)

The top 20 voters are all pretty stupid. Say we have 100,000,000 voters turn out:
> (p 1000000)

According to these parameters, the one-millionth best voter (top 1%) votes better than random only 6 out of 100 million times. This is why their votes are useless in your calculations.

If you set q1 to the maximum, which is 1, you of course want only 1 voter, because the top voter is right all the time. But look:
> (setq q1 .99)
> (p 1000000)

Now the voter at the 1% boundary is better than random 5 times in 10 million. With power = -1, you get this same basic result no matter what q1 is.

Now let's try a different exponent:
> (setq power -.1)
> (p 1000000)
> (genlist 20)
(0.995 0.96185136 0.94349945 0.9309225 0.9214133 0.91379964 0.90746975
0.9020649 0.8973571 0.89319247 0.8894627 0.8860887 0.8830106 0.88018274
0.8775688 0.87513983 0.8728725 0.8707472 0.8687482 0.8668616)

These parameters are somewhat more believable - but not for the top-ranked people; no person is right 99.5% of the time in these matters, or even 96% (except by chance).

So your model has 3 problems. One is that, because it uses a power law, the top-ranked 1 to 3 people will dominate for most parameters. The second is that the benefit of having a large number of voters is reducing variance, and I don't think you're taking variance into account. The third is that it relies on being able to identify - not approximately, but precisely - everyone's rank; because you have to identify the rank 1, 2, and 3 people exactly right. Even if the power-law turns out to be justified (and I don't think it is), you would have to account for the uncertainty in ranking, which would dramatically steer your results in the direction of "more voters is better".

Phil, I most certainly am explicitly taking variance into account. Many things that can vary by large magnitudes are distributed as power laws, and I stand by my claim that "log-normals with a wide variance look like power laws over their mid-range." Mid-range is less than one standard deviation. I disagree that I need to assume folks identify rank precisely; we need only posit voters have a decent idea of how they rank. If you arbitrarily assume that the info of the top M folks is no better than that of rank M in my model, you will only ensure that at least M folks must vote; the rest will be the same.
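The claim about log-normals can be made concrete with a short calculation. This is a sketch for a log-normal density f with log-mean \mu and log-standard-deviation \sigma (symbols introduced here, not taken from the post); it concerns the density of the underlying quantity, not the q-versus-rank curve directly.

```latex
f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}
       \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right)
\quad\Longrightarrow\quad
\frac{d \ln f}{d \ln x} = -1 - \frac{\ln x - \mu}{\sigma^2}

% Within one standard deviation of the log-mean, |\ln x - \mu| \le \sigma, hence
-1 - \frac{1}{\sigma} \;\le\; \frac{d \ln f}{d \ln x} \;\le\; -1 + \frac{1}{\sigma}
```

A pure power law has a constant log-log slope. Here the slope drifts by at most 1/\sigma on either side of -1 within one standard deviation, so the wider the variance, the closer the mid-range is to a power law with exponent near -1. The calculation says nothing about the extreme tails, where the two distributions differ.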

I am curious about the magnitude of the social benefit of abstaining in this model. It seems like they only remove a very small amount of noise, so I am guessing it doesn't make a huge difference in terms of the probability of choosing the correct candidate.

In reality, I am more concerned about voters who are biased because they are uninformed. You could examine that by allowing for more correlation in the less informed voters. Maybe a model where some signals are common and some are rare, so that uninformed voters are more likely to have common signals. I am guessing you would get similar results about abstaining, but there would be a higher social cost to uninformed voting.

Phil, I most certainly am explicitly taking variance into account. Many things that can vary by large magnitudes are distributed as power laws, and I stand by my claim that "log-normals with a wide variance look like power laws over their mid-range." Mid-range is less than one standard deviation. I disagree that I need to assume folks identify rank precisely; we need only posit voters have a decent idea of how they rank. If you arbitrarily assume that the info of the top M folks is no better than that of rank M in my model, you will only ensure that at least M folks must vote; the rest will be the same.
  • Perhaps you could post more of the model, so we can see how variance is taken into account, and how you compute the numbers for cases 1 and 2.
  • You still haven't explained why you want to use a power law, beyond saying that many things that can vary by large magnitudes are expressed as power laws.
    • The quantity you are computing is a probability that ranges from .5 to 1. It cannot vary by large magnitudes. Some related quantity, something like Eliezer's "optimization power" measurement, might vary according to a power law; but the probability, being bounded below and above, is not a good candidate.
    • Even if you were looking at an underlying measure of "voting intelligence", rather than the probability, standard practice is to use a normal distribution for this kind of thing. Only radicals like me and Mike Vassar use a log-normal.
    • Using a power law is what gives you your result. Your entire case boils down to the claim that voting intelligence (actually, the probability of choosing correctly, which is an even more extreme claim) has a power-law distribution. There's no point discussing anything else until that is cleared up, because your point is that your model says fewer people should vote, and I believe it says that because you're using a power law.
    • The mid-range is not the problem. The top end, rank 1-10 or so, is the problem.
  • For case 1, you do need to identify rank exactly. You couldn't say that everybody but the top-ranked voter should abstain, unless everyone knew who the top-ranked voter was. If you were likely to end up with voter #5 instead of voter #1, you would be better off taking the (estimated) top several voters.

Mike makes a good point in bringing up correlation. If you assume that voters are uncorrelated, most models will probably conclude that everyone should vote. The main problem with uninformed voters is when their votes are correlated (because of eg. advertisements, cultural biases, or systematic errors in reasoning).

Since your model doesn't mention correlations, and yet comes up with small numbers of voters being optimal, I have to continue to suspect that you aren't taking variance into account correctly.

I wasn't very clear. Robin isn't giving the probability a power-law distribution; he is basing it on a power-law distribution.

Also, when I spoke of using a log-normal distribution for IQ, this is misleading. While it may be true that you can fit some of the data better with a log-normal distribution, this log normal distribution will be so close to a normal distribution, that if you plotted it and showed it to a statistician, he would call it a normal distribution.

Skills usually have approximately normal distributions, because they are the combination of a large number of random factors. You can sometimes use a log-normal distribution to account for a skew caused when the distribution is bounded below but not above.

Now, how does a skill, with a nearly normal distribution, map into a probability of voting correctly? Looking at our usual data, such as times in running the 100m dash, might be problematic because it isn't clear whether to consider times bounded or unbounded.

So I plotted the 1273 scores in the Netflix competition. These scores are in root-mean square error of guessed movie ratings, and theoretically range from 0 to about 1.05 (what you get if you guess the average value for each rating). In practice, the lower bound can be approximated by having a person try to guess their own ratings for movies that they already rated in the past; based on one experiment, this lower bound is about .79.

This is a pretty good substitute for a probability, because the RMSE is closely related to the probability of guessing the right rating. (It is sqrt(9*p(off by 3) + 4*p(off by 2) + p(off by 1)).)
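The relation between RMSE and the error probabilities can be written as a small Python function. This is a sketch only: the function name and the example probabilities are hypothetical, and the function sums over whatever error sizes it is given, so it also covers an off-by-4 term that the formula above leaves out (assuming ratings on a 1 to 5 scale).

```python
import math


def rmse_from_error_probs(error_probs):
    """RMSE of a guesser whose absolute error k occurs with probability error_probs[k].

    Mean squared error is sum(k^2 * p(off by k)); RMSE is its square root.
    """
    return math.sqrt(sum(k * k * prob for k, prob in error_probs.items()))


# Hypothetical error distribution, for illustration only.
example = {0: 0.40, 1: 0.45, 2: 0.12, 3: 0.03}
print(rmse_from_error_probs(example))
```

Because an exact guess contributes nothing to the sum, a lower RMSE corresponds to more probability mass on small errors, which is why it can stand in for a probability of guessing right.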

Sadly, I can't post the picture of the plot here, but I can tell you what the histogram looks like. It occupies the range .864 to .951, with a mode of about .905. Most of the mass is between .90 and .93, with a rather sharp drop-off down towards .86. The end with the lowest-RMSE scores (corresponding to our highest-ranked voters) is nearly flat in the histogram. In other words, it looks vaguely normal, but heavy in the range .9 to .945. As with all skill-based scores, in this distribution, there are a few people who are very bad, and a few (fewer) who are very good, with most being in the middle.

This is in contrast to Robin's proposed distribution, in which almost everyone is very bad. That distribution is extremely sharp; it looks like an L if you plot p(correct) vs. rank, for all parameters I have tried. That is not a realistic distribution. And it is that unrealistic distribution (and possibly some issue with how Robin handles variance) that leads directly to the conclusion that almost everyone shouldn't vote.

Expert Political Judgment by Philip Tetlock seems like a relevant book on the topic of how competence relates to getting questions right. It's worth noting that all the experts did MUCH better than Berkeley undergraduates. The outliers among the experts were pretty strikingly not from the same distribution as the bulk of experts, though all were worse than sophisticated algorithms. Unlike algorithms, however, the experts can ask questions, not just answer them.

Robin - Caplan clearly didn't have the right slogan. I'm thinking 'Rock The Ignorance!'

Or possibly 'Vote and Die!'

I think that what may have led Robin to his distribution is that elections are often nearly evenly split. If you suppose that voters have at least a 50 percent chance of being right, then to explain this, you must further suppose that nearly all voters have just a tiny bit over a 50 percent chance of being right. You would probably end up with Robin's model.

A simple solution would be to suppose that some voters have a less than 50 percent chance of being right. But I don't think that would save this model, because it isn't modeling the fact that politics is an adaptive system. Whatever the dispositions of the voters are, political parties adjust the alternatives offered until the expected vote is again split 50/50. Voters are polarized into 2 groups (in the US), which self-adjust to each claim about 50% of the voters. And if you believe that your group is right, then it is always rational to vote - even on issues you don't understand, or for candidates you've never heard of.

This seems to be biased by two things -- first, as has been noted, the top ranked voters are assumed to be much smarter than the remaining 99.9% of the population and it's going to take a lot of distributed information to overcome that, and second that the information held by the lowest ranked voters disappears to insignificance very, very quickly, which stops it from ever doing so.

If you tweak the formula to assume *everyone* has some definite, better than even chance of voting for what's good for society, so that q = q1*(r^-n)+c, where q1, n and c are constant and r is the ranking, then no matter how small c is, you eventually get more voters improving the odds of selecting the socially optimal candidate again. For q1=0.1, n=1 and c=0.001 (thus giving a minimum probability of 50.05% of selecting the best candidate), the best voter has a 55.05% chance of getting the right outcome, then it's downhill until you have 475 voters and a 52.10367% chance, then uphill after that, getting back to 55.0503% at 14013 voters and continuing to increase thereafter. This is when only the top 100 voters have a better than 50.1% chance of picking the socially optimal candidate; everyone else varies between that and 50.05%, but at least they vary independently.
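The calculation described here can be sketched in Python. The sketch rests on several assumptions not spelled out in the comment: the exponent is read as negative (q falls with rank, as the quoted 55.05% and 50.05% figures imply), each voter is right with probability (1+q)/2, the top N ranked voters vote independently, and ties are broken by a coin flip. The function names are made up for the example, and it is not claimed to reproduce the quoted figures exactly.

```python
def p_correct(rank, q1=0.1, n=1.0, c=0.001):
    """Chance that the voter at this rank picks the better candidate: (1 + q) / 2."""
    q = q1 * rank ** (-n) + c
    return (1.0 + q) / 2.0


def majority_correct(num_voters, **params):
    """Probability that a simple majority of the top `num_voters` voters is right.

    Assumes independent votes; an exact tie is settled by a fair coin flip.
    """
    # dist[k] = probability that exactly k of the voters processed so far are correct
    dist = [1.0]
    for rank in range(1, num_voters + 1):
        p = p_correct(rank, **params)
        new = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            new[k] += mass * (1.0 - p)
            new[k + 1] += mass * p
        dist = new
    win = sum(mass for k, mass in enumerate(dist) if 2 * k > num_voters)
    tie = dist[num_voters // 2] if num_voters % 2 == 0 else 0.0
    return win + 0.5 * tie


for size in (1, 3, 475):
    print(size, majority_correct(size))
```

The loop is quadratic in the number of voters, so it is fine for a few hundred voters but slow in pure Python for electorates in the tens of thousands.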

Self-interest probably works fine as a proxy for q too: the socially optimal candidate is socially optimal because he or she benefits lots of people, so they're likely to benefit any particular individual -- so if you have a signal q' for a candidate that helps you, you probably have a signal q=q'/?? for the socially optimal candidate.

aj, your variation gives an asymptotic power of zero, so as my analysis predicts it favors having everyone vote. Also, see what I added to the post.

Actually, if you take this analysis the other way -- doesn't it provide an argument for why (at least in some circumstances) it is worthwhile to vote on a purely individual basis?

Isn't the conventional "economic" argument that it's not worthwhile for any individual to vote because they'll almost never be the one vote that puts the better candidate over the line? Whereas this analysis indicates that (in many cases, but obviously depending on the distribution of q) no matter how many other people are voting, your participation actually increases the odds of a socially optimal outcome, even if every other voter is smarter than you?

aj, my analysis ignored voting costs, and even when people want to vote in my model, for most of them the benefit in terms of increasing the probability of the better candidate winning is very small.
