Escobari and Hoover’s “variance estimates” should not be taken at face value

This is the seventh in a series of blog posts responding to the paper by Diego Escobari and Gary Hoover on the 2019 presidential election in Bolivia. Their conclusions do not hold up to scrutiny, as we note in our Nickels Before Dimes report. Here, we expand on the various claims and conclusions made by Escobari and Hoover in their paper. Links to previous posts: Part One, Part Two, Part Three, Part Four, Part Five, and Part Six.

In the previous post, we noted an error in Escobari and Hoover’s margin calculations. Although the effect on their results was minimal, the incorrect use of Válidos En Acta by Escobari and Hoover (among many others) stirred up controversy by appearing to show that the official vote totals did not add up properly. In fact, these discrepancies reflect clerical errors made by jurors at individual polling stations. We now pick up where we left off in Post #5, where we observed that there was counting bias in the election. Here, we delve into the effects that this bias had on the initial findings produced by Escobari and Hoover.

We start with their ‘variance estimates’, which are approximately reproduced below. We attribute the discrepancies (indicated in red) to differences in the allocation of polling stations to constituencies, a problem we noted with the first release of their paper in 2019 and one that may not have been fully fixed.

Table 1

Replicating the “variance estimates” of Escobari and Hoover

| | CC (1) | MAS (2) | MAS-CC (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| SHUTDOWN | -8.286 (0.324) | 7.975 (0.343) | 16.26 (0.653) | 7.243 (0.437) | 6.762 (0.464) | 0.377 (0.194) |
| Constant | 36.86 (0.136) | 46.69 (0.134) | 9.830 (0.266) | 11.28 (0.162) | 11.36 (0.151) | 12.39 (0.063) |
| Observations | 34529 | 34529 | 34529 | 34529 | 34529 | 34529 |
| R² | 0.017 | 0.016 | 0.017 | 0.640 | 0.740 | 0.958 |

Fixed effects[1]: Municipal 129.6; Region 23.49; region 124.7

Source: TSE and authors’ calculations.

Notes: Dependent variables are percentages of Válidos En Acta (often missing or otherwise misreported on the tally sheets) and not of the official valid votes. Robust standard errors in parentheses. Differences from Escobari and Hoover are shown in red.

[1] The F-test statistics for the fixed effects are not robust.

Also note that the analysis is not weighted by the number of voters at each polling station. For example, the constant in column 3 indicates that the simple average margin for Morales across polling stations included in the TSE announcement was 9.83 percentage points, nearly 2 percentage points above the official result at the time. Similarly, Escobari and Hoover’s results indicate that the simple average margin for Morales across all polling stations was 12.45 percentage points, again nearly two percentage points higher than the official result. The gap arises because actual elections do not compute overall vote shares the way Escobari and Hoover do. In actual elections, it is vote totals, not average margins, that matter. Thus, polling stations with fewer votes have less influence on the final result than polling stations with more votes. Ignoring this makes it difficult to put Escobari and Hoover’s findings into context.

Consider the two-station example of Table 2. In the rural station, there were 100 valid votes and Morales won by 40. In the urban station, Mesa won by 25 votes out of 250, a margin of -10 percentage points for Morales. The simple average margin is (40-10)/2 = 15 percentage points. But overall (taking both stations as one group) Morales won by 40-25 = 15 votes out of 350, or just 4.3 percentage points.

Table 2

An illustration of the importance of weights for context
| | Valid votes | Net votes for Morales | Margin (percentage points) |
|---|---|---|---|
| Rural | 100 | 40 | +40 |
| Urban | 250 | -25 | -10 |
| Total | 350 | 15 | +4.3 |
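
The arithmetic behind Table 2 can be checked with a few lines of Python. This is only a minimal sketch: the two stations and their numbers come from the table, while the function and field names are our own illustrative choices.

```python
# Two-station example from Table 2 (names and structure are illustrative).
stations = [
    {"name": "rural", "valid_votes": 100, "net_votes": 40},
    {"name": "urban", "valid_votes": 250, "net_votes": -25},
]

def simple_average_margin(rows):
    """Average of station-level margins, each station counted equally."""
    margins = [100 * r["net_votes"] / r["valid_votes"] for r in rows]
    return sum(margins) / len(margins)

def weighted_margin(rows):
    """Margin on pooled vote totals, i.e. weighted by valid votes."""
    net = sum(r["net_votes"] for r in rows)
    valid = sum(r["valid_votes"] for r in rows)
    return 100 * net / valid

print(round(simple_average_margin(stations), 1))  # 15.0
print(round(weighted_margin(stations), 1))        # 4.3
```

The second function is the one that matches how an election is actually decided: the small rural station no longer counts as much as the larger urban one.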

We now turn to column 3 of Table 1 above. In Table 3, we present the results of Escobari and Hoover along with replications and corrections to facilitate context. First, we note that our replication (column 2) reproduces the published results (column 1) exactly. Second, we see that when the correct number of valid votes is used in the calculation, we gain 22 observations and are missing only the four polling stations that were annulled. Third, we note that once the polling station data are weighted by the number of valid votes (column 4), the constant decreases by about two percentage points. This puts the numbers into context: Morales’ lead (based on official numbers) in the polling stations included in the TSE announcement was 7.9 percent of valid votes.

Similarly, when moving from column 3 to column 4, the SHUTDOWN coefficient increases by 0.5. This means that by giving too much weight to small polling stations, Escobari and Hoover end up underestimating the jump in margin when moving from stations included in the TSE announcement to those still outstanding. As a group, Morales’ margin in the outstanding polling stations is 7.883 + 16.77 = 24.65 percentage points, not 9.843 + 16.27 = 26.12.

Table 3

Replication and re-analysis of Escobari and Hoover’s basic difference model
| | As published (1) | Replication (2) | Correct valid votes (3) | Weighted (4) |
|---|---|---|---|---|
| SHUTDOWN | 16.26 (0.653) | 16.26 (0.653) | 16.27 (0.653) | 16.77 (0.663) |
| Constant | 9.830 (0.266) | 9.830 (0.266) | 9.843 (0.266) | 7.883 (0.264) |
| Observations | 34529 | 34529 | 34551 | 34551 |
| R² | 0.017 | 0.017 | 0.017 | 0.019 |
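
To make the difference between the unweighted and weighted columns concrete, here is a minimal sketch of the same kind of regression (margin on a constant and a SHUTDOWN dummy) using NumPy. The four stations are made up for illustration and are not the TSE data; the function name `fit_difference_model` is likewise our own. The sketch does not compute robust standard errors.

```python
import numpy as np

def fit_difference_model(margin, shutdown, weights=None):
    """Least-squares fit of margin = constant + b * shutdown.

    With weights, each row is scaled by sqrt(weight), which is the
    standard way to turn ordinary least squares into weighted least squares.
    """
    margin = np.asarray(margin, dtype=float)
    shutdown = np.asarray(shutdown, dtype=float)
    X = np.column_stack([np.ones_like(shutdown), shutdown])
    w = np.ones_like(margin) if weights is None else np.asarray(weights, dtype=float)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], margin * sw, rcond=None)
    return coef[0], coef[1]  # (constant, SHUTDOWN coefficient)

# Made-up stations: margin in percentage points, SHUTDOWN dummy, valid votes.
margin = [40.0, -10.0, 60.0, 20.0]
shutdown = [0, 0, 1, 1]
valid_votes = [100, 250, 50, 200]

unweighted = fit_difference_model(margin, shutdown)
weighted = fit_difference_model(margin, shutdown, weights=valid_votes)
print(unweighted)  # constant is the simple average margin of the early group
print(weighted)    # constant is the vote-weighted margin of the early group
```

Because the only regressor is a dummy, the constant is just the (simple or weighted) mean margin of the non-SHUTDOWN group and the coefficient is the gap between the two group means. That is why weighting changes both numbers in Table 3.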

There are several ways to interpret these results. One is simply to say that they measure how much more the late polling stations favor Morales, without giving a reason why. This interpretation is purely descriptive.

Another is to say that these results measure a bias in counting, with opposition-leaning polling stations disproportionately counted early. Perhaps the rural stations that happen to favor Morales are simply more prone to reporting late, and are thus left out of the TSE’s announcement: nickels before dimes.

The third is to say that the announcement itself is causal: the mere fact that a polling station is not included in the announcement explains the increase in support, and had all stations been included, Morales would have won by only 7.9 percentage points. Because voting took place before the announcement, exclusion from the announcement should not in itself increase Morales’ support in these polling stations. The implication is that the jump must have been due to additional fraud, either committed after the announcement or through a deliberate delay in reporting polling station results already known to contain fraud. That is, in this interpretation, SHUTDOWN would be a proxy for fraud.

In this figure, we are interested in the link from fraud to margin, highlighted in red. Fraud is not something we can directly observe in the data, but one proposed mechanism is that carrying out the fraud takes time, which requires delaying verification of these tally sheets until after the TSE announcement (and thus determines whether or not a sheet ends up in the “SHUTDOWN set”).

Note that the published result is inconsistent with this interpretation. Escobari and Hoover argue in favor of a counterfactual of 7.9 percentage points, but the constant in their model indicates an expected margin of 9.8 percentage points, which is not statistically different from the 10 percentage point threshold set for the election. This reinforces our view that the use of weights in the analysis is important when one wishes to interpret results.

This third interpretation, which reads the 16 percentage point difference as a measure of actual fraud, is difficult to defend because of the confounding explanations raised in the second interpretation. That is, in the model, SHUTDOWN captures everything that affects Morales’ margin and varies across the two groups. There is a whole host of factors that complicate the interpretation of the 16 percentage point difference as fraud.

We are still only interested in the fraud effect indicated in red. Of course, tally sheets ended up in the SHUTDOWN set for good reasons as well as for any supposed malice. Consider those that were transmitted late (a “late arrival” to the electoral authorities) and those that were transmitted but could not be verified in a timely manner. We associate both ARRIVAL and SHUTDOWN with rurality, but here “Rural” is a proxy for a combination of different geographic or socioeconomic factors, any of which may have a different effect on each. Importantly, these same factors carry information about Morales’ support, and thus influence the observed margin. Finally, the number of voters at any given polling station helps determine the order of arrival, because smaller stations are able to complete their count more quickly.

The problem is that if we control for SHUTDOWN alone, it carries with it information about all of these geographic factors. For example, if a station is in the SHUTDOWN group, we can infer that it is more likely to be rural and therefore more favorable to Morales. We cannot say whether the 16 percentage point difference is due to the “fraud” that Escobari and Hoover seek to measure, or whether it is entirely due to differences in geographic and socioeconomic factors. A more complex statistical model is required.
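
A toy example makes the confounding problem visible. In the sketch below, every number is invented for illustration: rural stations favor Morales by 40 points and urban stations by 0, rural stations are simply more likely to report late, and there is no fraud at all. The naive SHUTDOWN gap is still large, and it disappears once we compare like with like.

```python
def mean(values):
    return sum(values) / len(values)

# Each station is (is_rural, in_shutdown, margin). All figures are hypothetical.
# Rural stations: margin 40, 60% report late. Urban: margin 0, 10% report late.
stations = (
    [(True, True, 40.0)] * 600 + [(True, False, 40.0)] * 400
    + [(False, True, 0.0)] * 100 + [(False, False, 0.0)] * 900
)

def shutdown_gap(rows):
    """Mean margin of SHUTDOWN stations minus mean margin of the rest."""
    late = [m for _, s, m in rows if s]
    early = [m for _, s, m in rows if not s]
    return mean(late) - mean(early)

naive_gap = shutdown_gap(stations)
rural_gap = shutdown_gap([r for r in stations if r[0]])
urban_gap = shutdown_gap([r for r in stations if not r[0]])

print(round(naive_gap, 1))   # about 22 points, with zero fraud by construction
print(rural_gap, urban_gap)  # 0.0 within each group
```

Here the entire gap comes from the mix of rural and urban stations in each group. Real data are noisier and the confounders are not a single clean indicator, which is why splitting the stations into more comparable groups is only a partial remedy.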

Of course, it is not easy to identify, let alone quantify, every confounding factor. We have to bend somewhat to the reality of data availability. We must recognize that the SHUTDOWN coefficient is a residual effect. Whatever part of the late increase in margin is not explicitly modeled is captured by SHUTDOWN. This includes both potential fraud and any nickels-before-dimes effects that were overlooked. A “statistically significant” SHUTDOWN coefficient does not indicate fraud specifically, unless we can appropriately separate the effects.

To this point, an unexplained difference of 16.77 percentage points would be politically worrisome in the absence of other information. Applied to the 16 percent of the vote included in the SHUTDOWN group, this means that a very simple model fails to explain 2.7 percentage points of Morales’ final margin. We can see this directly in the constant estimated in Table 3, column 4, which indicates that the non-SHUTDOWN group favored Morales by 7.9 percentage points. If the SHUTDOWN group were otherwise identical, the final election margin should have been close to 7.9 percentage points and not the official 10.56. Thus, the model leaves a politically significant residual unexplained, which Escobari and Hoover interpret as fraud. However, we know for a fact that the critical assumption, that the SHUTDOWN group is otherwise identical, is wrong. The model does not take into account the significant differences between the SHUTDOWN and non-SHUTDOWN groups. Nickels before dimes.
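
The 2.7 percentage point figure follows directly from the weighted estimates in Table 3, column 4. A quick check, taking the SHUTDOWN share as the rounded 16 percent quoted above (so the result is approximate):

```python
# Weighted estimates from Table 3, column 4.
constant = 7.883        # margin in stations included in the TSE announcement
shutdown_coef = 16.77   # additional margin in the SHUTDOWN group
shutdown_share = 0.16   # rounded share of the vote in the SHUTDOWN group

unexplained = shutdown_share * shutdown_coef
implied_final_margin = constant + unexplained

print(round(unexplained, 1))           # 2.7
print(round(implied_final_margin, 2))  # 10.57, close to the official 10.56
```

The small difference from the official margin is due to rounding the share to 16 percent.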

One way to deal with the dizzying array of potential differences is to divide polling stations into smaller groups. In doing so, we might hope to form groups such that within each one these confounding factors are more or less constant, so that we cannot easily distinguish one polling station from another except by its inclusion in or exclusion from the TSE announcement.

As we will see in the next post, this is the reason behind columns 4-6 of Table 1.
