It’s the Sample

Ziad Salim, a retired international civil servant

The Jakarta Post, July 15, 2014
In a previous article published in this newspaper (“Samples and souls: hazards of election prediction in Indonesia”, April 24), the difference between surveys and quick counts, and the often less than accurate results of the former, was clarified.
Surveys use hypothetical voters in their samples while quick counts use real voters; hence, the latter are more accurate.
But now we have a case where even the quick counts are disagreeing with one another.
First, the culprit is not the methodology or the statistics; all the outfits use the same, correct methodology and statistical techniques.
If the results all point the same way (or move in one direction), the difference in percentage points can be interpreted in light of the margin of error and the confidence level.
If the differences are very small (say 50 percent for one candidate against 48 percent for the other, with a margin of error of plus or minus 2 percent), it simply means the two estimates overlap and no valid inference about the winner can be drawn, that is all.
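The arithmetic behind that overlap can be sketched in a few lines (a minimal illustration only; the 95 percent confidence level and the sample size of 2,000 are assumptions, not figures from the article):

```python
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

n = 2000                  # assumed sample size, not a figure from the article
a, b = 0.50, 0.48         # the article's 50% vs. 48% example
moe_a = margin_of_error(a, n)
moe_b = margin_of_error(b, n)

# If the two confidence intervals overlap, the poll cannot tell the candidates apart.
overlap = (a - moe_a) <= (b + moe_b)
print(f"margin of error: about ±{moe_a:.1%}; intervals overlap: {overlap}")
```

With these assumed numbers the margin of error comes out near the article’s 2 percent, and the two intervals do overlap, which is exactly why no winner can be declared from such a result.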
But if the results are topsy-turvy, as they are now (with half of the survey results pointing to pair Number 1 and the rest to pair Number 2), then the culprit almost always lies in the samples, which can be biased, under-representative or outright unrepresentative, and this can happen as the result of sample manipulation. Samples can be manipulated not only in many ways but very easily too.
At the extreme end, a geek with a laptop, alone in a room, can come up with a survey result based entirely on data mined from his own head.
A real example of this showed up a few weeks ago, when one clever person cut and pasted the survey results of the earlier Barack Obama and John McCain campaigns in the US and tacked them onto a CNN website, with many lapping it all up. Even though the hoax was eventually discovered, the damage had been done.
If the person wants to make his data look a little more legitimate, he can collect a few real data points from the field.
If he collects too few (say, fewer than 1,000 respondents), his sample will be under-representative, too small relative to the population.
But all the quick counts collected large samples for this election (between 2,000 and 8,000 respondents), which means the divergent results did not come from sample size.
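Why sample size cannot explain the split can be seen from the standard formula for a proportion’s margin of error, which shrinks with the square root of the sample size (a rough illustration; the 95 percent confidence level is an assumption):

```python
import math

def moe(n, p=0.5, z=1.96):
    # Worst-case (p = 0.5) margin of error at roughly 95% confidence
    return z * math.sqrt(p * (1 - p) / n)

for n in (1000, 2000, 8000):
    print(f"n = {n:5d}: margin of error about ±{moe(n):.1%}")
```

Even the smallest quick-count sample in this range yields a margin of only about two percentage points, far too tight to produce results that point at opposite winners.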
So, where did the divergence come from? First, it is telling that the survey outfits were split between those described as “generally credible” and those suspected of leaning toward one candidate, a candidate whose supporters have been accused of running “black” campaigns and of being willing to do anything to win.
So, what would stop them from manipulating their data, when it is almost impossible to tell which data is legitimate and which is not?
As stated above, the extreme form of data manipulation is where it is all cooked up in one room (the data is fictitious, but in a country where 600,000 “fictitious cows” could be imported from Australia, coming up with 2,000 “fictitious voters” would not be hard).
Second, data can be collected from electoral districts known to be populated by supporters of one group (so that, if you collect most of your data from West Java, the stronghold of pair Number 1, you are bound to come up with pair Number 1 as the winner, due to a biased sample).
If you want to be more subtle and make your data manipulation a little less detectable, you will collect your data from some remote areas too (say, in Papua), but only from locations where you know there are many supporters of pair Number 1 (and this is easy to ascertain from the work of party members, field observers and election officials, who can easily be paid if not bought).
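The effect of such geographically skewed sampling can be sketched with a small simulation (all numbers here are hypothetical, chosen only to illustrate the mechanism, not taken from any real poll):

```python
import random

random.seed(42)

# Hypothetical electorate: two regions of equal population with opposite
# leanings, so the true national split is 50/50.
regions = {
    "stronghold_1": {"share_pair1": 0.70},  # 70% support pair Number 1
    "stronghold_2": {"share_pair1": 0.30},  # 30% support pair Number 1
}

def poll(allocation, n=2000):
    """Simulate an n-voter poll given a per-region share of the sample."""
    votes_pair1 = 0
    for region, frac in allocation.items():
        k = int(n * frac)
        p = regions[region]["share_pair1"]
        votes_pair1 += sum(random.random() < p for _ in range(k))
    return votes_pair1 / n

fair   = poll({"stronghold_1": 0.5, "stronghold_2": 0.5})  # matches population
biased = poll({"stronghold_1": 0.8, "stronghold_2": 0.2})  # over-samples region 1
print(f"fair sample:   pair 1 at about {fair:.1%}")
print(f"biased sample: pair 1 at about {biased:.1%}")
```

With an even allocation the simulated poll lands near the true 50 percent; over-sampling the stronghold pushes the estimate toward roughly 62 percent, even though every individual interview in the sample is real.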
The above possibilities explain why it is so easy to conduct a biased survey and to tailor the results to whatever somebody has ordered.
The claim by some leaders of bogus survey outfits that payment by the party ordering the survey does not influence the result is an outright lie, a form of public deception.
Even if you bare all your data (as one head of a suspected survey group challenged others to do), no one will be able to find any irregularity in it, especially if the field reports were sent to the survey centers by electronic means such as text messages via cell phones and the like, all unverifiable and subject to manipulation.
It is telling that during the legislative election last April, when there was no national polarization in the choice of candidates or parties, all the survey results had their ducks in a row.
As for the divergent results in the recent presidential election (with only two, polar-opposite pairs running, one of them determined to win by hook or by crook), it is not difficult to believe that the survey groups, too, had taken sides and that data manipulation was the route they took: not so much to convince the public of who won the election as to create uncertainty, sow doubt and muddy the waters in which they hope to do their dirty fishing.
In the end, while the presence of survey institutes is legitimate in supporting democracy, by providing early results so the powers that be can no longer manipulate elections, this time, sadly, some of them have entered the fray to subvert the process.
The problem is not that the people were split down the middle and could not, or did not, cleanly make up their minds about whom to support in this election, but that the survey groups themselves split in two, one side actively taking sides, manipulating its samples, hiding behind a scientific statistical approach and taking advantage of democracy to subvert democracy itself, all while assuming we are stupid, when in fact it is their samples that are stupid.