Old Dominion University


Glen Sussman








POLS101S


INTRODUCTION TO DATA ANALYSIS

Political Science, one of the social sciences, is the systematic study of politics and government. What does systematic mean? It means that you analyze many cases rather than simply using one observation to draw a conclusion. Political scientists who study American Politics want to have a better understanding of the behavior of individuals and groups within the American political system.

In studying politics scientifically, we do what scientists do in the natural sciences: we describe, explain, and predict as we learn more about the political world around us.

In this course, you will sometimes be examining politics and government normatively and at other times you will be studying it empirically.

Normative means thinking about the world as it ought to be.  Empirical means looking at the world as it is.

In order to pursue systematic research, we need data and variables.

Variables are characteristics of the things we are studying that vary from one case to another (e.g., type of occupation, income level, party identification).

Variables can be dependent or independent. An independent variable is assumed to be the cause of change in the dependent variable. For instance, we can analyze whether the dependent variable "voting behavior" (whether an individual votes or not) is influenced by the independent variable "gender" (male or female).

So, in doing data analysis, we are trying to understand the relationship between variables. One way to do this is to organize variables into tables called crosstabulations or frequency distributions. This way you can assess whether and to what extent there is a relationship between two variables. For instance, is there a relationship between education and voting?

We can measure the variable "education" by dividing it into years of schooling or level of education (no high school, high school degree, some college, college degree, advanced degree) and the variable "voting" into voted or didn't vote. We would then determine whether there is a relationship between the level of education and voting. Research has shown that people with a higher level of education are more likely to vote and people with a lower level of education are less likely to vote.
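To make the idea of a crosstabulation concrete, here is a minimal sketch in Python of how raw cases can be turned into a table of percentages. The ten respondents and the function name crosstab_percent are made up for illustration; they are not real survey data or part of any statistics package.

```python
from collections import Counter

# Hypothetical cases: (education level, turnout) for ten imaginary respondents.
cases = [
    ("high school", "voted"), ("high school", "didn't vote"),
    ("high school", "didn't vote"), ("high school", "didn't vote"),
    ("college", "voted"), ("college", "voted"),
    ("college", "voted"), ("college", "didn't vote"),
    ("college", "voted"), ("high school", "voted"),
]

def crosstab_percent(cases):
    """Return {education level: {turnout: percent of that education level}}."""
    counts = Counter(cases)                     # cases in each (education, turnout) cell
    totals = Counter(edu for edu, _ in cases)   # cases at each education level
    table = {}
    for (edu, turnout), n in counts.items():
        table.setdefault(edu, {})[turnout] = 100 * n / totals[edu]
    return table

table = crosstab_percent(cases)
for edu, column in table.items():
    print(edu, column)
```

Each percentage is calculated within a category of the independent variable (education), so every education level adds up to 100%. That is what lets you compare the groups directly.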

In the following table (using hypothetical data), we can evaluate the relationship between gender and voting.

               men     women
 voted         20%     80%
 didn't vote   80%     20%
 total         100%    100%
 p<.05

The percentages in this crosstabulation suggest that women are much more likely to vote than men. Therefore, we might conclude that there is a relationship between gender and voting. In order to be assured that this conclusion is correct we want to determine whether the relationship is "statistically significant." In other words, we want to have a degree of confidence in our results or findings.

In many instances, we draw conclusions about a population from studying a sample of the population. A "population" is the whole group. A "sample" is a small representative group selected from the population (the larger group). Since we can't always study the whole population (all the citizens of the U.S.) we study a sample of all of the citizens (perhaps 1500 people). What we learn about this sample we "generalize" to the larger population. In other words, we assume that what we have learned from the sample is applicable to the population, i.e., the sample reflects the characteristics of the population. But since we are looking only at the smaller group (the sample) there is going to be some "error" in our measurement.
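The relationship between a population and a sample can be illustrated with a short simulation. This is only a sketch under made-up assumptions: an imaginary population of 100,000 citizens in which 55% voted, a simple random sample of 1,500, and the usual approximate formula for a 95% margin of error.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# ASSUMPTION: a made-up population in which 55% voted (1 = voted, 0 = didn't vote).
population = [1] * 55_000 + [0] * 45_000
population_share = sum(population) / len(population)

# Draw a simple random sample of 1,500 people without replacement.
sample = rng.sample(population, 1500)
sample_share = sum(sample) / len(sample)

# Approximate 95% margin of error for a proportion from a simple random sample.
margin = 1.96 * math.sqrt(sample_share * (1 - sample_share) / len(sample))

print(f"population: {population_share:.1%}")
print(f"sample:     {sample_share:.1%} (plus or minus about {margin:.1%})")
```

The sample percentage will usually land close to the population percentage but rarely match it exactly. That gap is the "error" that comes from studying a sample instead of the whole population.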

In political science we might argue that if we get the same results 95 times out of 100, we have learned something about what we are studying. In other words, the chance that our results (research findings) are due to accident or error is less than 5 times out of 100 (p<.05 or p<5%). For instance, in the case above about gender and voting, we can be confident that the relationship we see is real: there is a relationship between gender and voting, and women are more likely to vote than men.

In contrast, the table below (using hypothetical data) shows no relationship between gender and voting. The chance that the small difference we see is due to error is more than 5 times out of 100 (p>.05 or p>5%). In this case, we conclude that men and women are about equally likely to vote or not vote.

               men     women
 voted         65%     60%
 didn't vote   35%     40%
 total         100%    100%
 p>.05

By comparing the two tables above, you can assess the percentages to make a judgement. You can then look at the probability (p<.05) or (p>.05) to ensure that you are confident in your findings regarding whether there is or is not a relationship between the two variables.
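One common way to obtain a probability like the ones reported under these tables is a chi-square test. The sketch below is not necessarily how the p values in this handout were produced. It assumes 100 men and 100 women in each sample (the tables give only percentages, so the sample sizes are invented) and uses the Pearson chi-square test without a continuity correction.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table of counts.

    table is [[a, b], [c, d]]: rows are outcomes, columns are groups.
    Returns (chi-square statistic, p-value for 1 degree of freedom).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if there were no relationship.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the p-value has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# ASSUMPTION: 100 men and 100 women per sample, so each percentage is also a count.
first_table = [[20, 80], [80, 20]]    # rows: voted, didn't vote; columns: men, women
second_table = [[65, 60], [35, 40]]

for name, table in [("first", first_table), ("second", second_table)]:
    chi2, p = chi_square_2x2(table)
    verdict = "p<.05" if p < 0.05 else "p>.05"
    print(f"{name} table: chi-square = {chi2:.2f}, {verdict}")
```

Under these assumed sample sizes, the large gap in the first table falls below the .05 threshold and the five-point gap in the second table does not, which matches the p values shown under the two tables.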

Here is another hypothetical example. Suppose you wanted to find out if there are similarities or differences between people who live in the four major regions of the country (east, midwest, south, west) and whether they would allow or not allow gay marriages. How would you interpret the following table:

               east    midwest   south   west
 allow         65%     65%       55%     67%
 don't allow   35%     35%       45%     33%
 total         100%    100%      100%    100%
 p<.05

According to the table, are there similarities and differences between regions of the country on this issue? First, what do the percentages tell you? Second, what does the probability indicator tell you?

OK, in which region of the country are people less likely to support gay marriages? Why? What have you learned from this exercise?
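When you think about the probability indicator, keep in mind that it depends on how many people were surveyed, not only on the percentages. The sketch below applies a chi-square test to the "allow" percentages from the regional table under two invented sample sizes (100 and 400 people per region). The value 7.815 is the standard chi-square cutoff for p = .05 with 3 degrees of freedom, which is what a table with two rows and four columns has.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of counts (rows x columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2

CRITICAL_05_DF3 = 7.815  # chi-square cutoff for p = .05 with 3 degrees of freedom

def counts_for(n_per_region):
    """Turn the 'allow' percentages into counts for a hypothetical sample size."""
    allow_pct = [65, 65, 55, 67]  # east, midwest, south, west
    allow = [round(p * n_per_region / 100) for p in allow_pct]
    dont_allow = [n_per_region - a for a in allow]
    return [allow, dont_allow]

# ASSUMPTION: the handout gives no sample sizes; 100 and 400 are illustrations only.
for n in (100, 400):
    chi2 = chi_square(counts_for(n))
    verdict = "p<.05" if chi2 > CRITICAL_05_DF3 else "p>.05"
    print(f"{n} people per region: chi-square = {chi2:.2f}, {verdict}")
```

With these made-up sample sizes, the same percentages are not statistically significant when only 100 people per region are surveyed, but they are when 400 per region are surveyed.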

Here is another example. In our democratic system, we use majority rule. In other words, whoever wins a majority of the votes (more than 50%) wins. What happens if no candidate wins more than 50% of the vote? How do we decide who wins? The answer is the plurality system, in which the candidate who receives more votes than any other candidate wins, even without a majority.

 Candidate     Percent Support
 I.M. Happy    45%
 I.M. Sad      35%
 I.M. Lost     20%
 total         100%

In the table above (using hypothetical data), I.M. Happy did not win a majority (more than 50%) of the vote but she did win a plurality (45%) of the vote so she wins the election.
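The difference between a majority and a plurality can be written out as a simple rule. This is a minimal sketch using the hypothetical percentages from the table above; the function name decide is made up for illustration, and ties are not handled.

```python
def decide(results):
    """Return (winner, has_majority) for a dict of candidate -> votes or percent."""
    winner = max(results, key=results.get)        # most support = plurality winner
    total = sum(results.values())
    has_majority = results[winner] * 2 > total    # more than half of all votes
    return winner, has_majority

# Hypothetical data from the table above.
results = {"I.M. Happy": 45, "I.M. Sad": 35, "I.M. Lost": 20}

winner, has_majority = decide(results)
kind = "a majority" if has_majority else "a plurality"
print(f"{winner} wins with {kind} of the vote")
```

Every majority winner is also a plurality winner, but not the other way around. Here the winner has the most support without reaching more than half of the vote.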

There are many more examples and we will explore them in class. Do not hesitate to contact your instructor if you have any questions.