Extra Credit!

Loading Package + Data

library(tidyverse)

extracredit <- read_csv("https://raw.githubusercontent.com/sonoshah/POSC149/master/extracredit.csv")

This dataset has all counties in California plus all of the variables you used in Parts I, II, and III. Using your answers from Lab 2, re-estimate your guesses for clinton_safe and trump_safe and compare them to CLINTON_FINAL and TRUMP_FINAL. These are the real vote totals that the candidates got in 2016.

YOUR ASSIGNMENT

Please do the following:

For each part of lab 2 (I, II, and III):

  1. Compare your guess with the real total, and create a variable that tells me how far off your guess was.
  • For example: in Part I, your guess for clinton_safe for Alameda was 500,000 votes but the real total was 514,842.

  • That means your guess was off by 14,842 votes. The smaller the difference between your guess and the real total, the better your guess was.

HINT you will want to use mutate to create a variable for this
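As a rough sketch of what that mutate could look like, here is a tiny made-up table. The Alameda row uses the numbers from the example above; the second row and the column name clinton_error are invented for illustration. Taking the absolute value with abs() is an assumption on my part, so that overshooting and undershooting both count as "off by" a positive number.

```r
library(tidyverse)

# Toy data: the Alameda row comes from the example in the text;
# "Example County" and its numbers are made up for illustration only.
toy <- tibble(
  County        = c("Alameda", "Example County"),
  clinton_safe  = c(500000, 12000),  # the guess
  CLINTON_FINAL = c(514842, 11500)   # the real total
)

# clinton_error is a hypothetical name; abs() makes the error positive
# whether the guess was too high or too low.
toy_errors <- toy %>%
  mutate(clinton_error = abs(CLINTON_FINAL - clinton_safe))

toy_errors
```

The same pattern should work on extracredit once you have built your own clinton_safe and trump_safe columns.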

  2. Which method for guessing safe votes was most accurate for the most counties: Part I, II, or III? Why do you think that is?
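One possible way to tally "most accurate for the most counties" is sketched below. It assumes you already have one error column per part; the names error_part1, error_part2, and error_part3, the county labels, and all of the numbers are hypothetical. Ties go to the earlier part here, which is just a choice made for this sketch.

```r
library(tidyverse)

# Made-up errors for three imaginary counties (not real data).
toy_parts <- tibble(
  County      = c("A", "B", "C"),
  error_part1 = c(100, 900, 50),
  error_part2 = c(300, 200, 75),
  error_part3 = c(250, 400, 60)
)

# For each county, label the part with the smallest error.
# case_when() checks the conditions in order, so ties favor the earlier part.
best_by_county <- toy_parts %>%
  mutate(best = case_when(
    error_part1 <= error_part2 & error_part1 <= error_part3 ~ "Part I",
    error_part2 <= error_part3                               ~ "Part II",
    TRUE                                                     ~ "Part III"
  ))

# Count how many counties each part "won".
best_counts <- best_by_county %>% count(best)
best_counts
```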

Winners and Losers

Since you have the real vote totals, and TOTAL_VOTERS tells you the number of votes in each county, you can also figure out which candidate actually won in each county.

  1. Using that information, how good were your guesses for each part?

  2. Name the counties that you got wrong in each part.

HINTS

The Winners and Losers part might be a little tricky since it might require you to use a command we haven’t used before. It’s called if_else(). It allows you to create a variable that equals one thing if some condition is true and another thing if the condition is NOT TRUE. This will be useful for you for this question.
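To see if_else() on its own before using it on the real data, here is a small sketch with made-up numbers. The table toy_votes and the columns votes_x, votes_y, and bigger are invented for illustration and are not in extracredit.

```r
library(tidyverse)

# Made-up numbers for two imaginary counties (not real data).
toy_votes <- tibble(
  County  = c("A", "B"),
  votes_x = c(60, 40),
  votes_y = c(30, 70)
)

# if_else(condition, value if TRUE, value if FALSE)
toy_labeled <- toy_votes %>%
  mutate(bigger = if_else(votes_x > votes_y, "X is bigger", "Y is bigger"))

toy_labeled
```

County A gets "X is bigger" because 60 > 30 is true, and County B gets "Y is bigger" because 40 > 70 is not true.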

Something that outputs like this would be a good idea:

correct        n
I WAS RIGHT   55
WRONG!         3

Here is some code to get you started; my comments are in there as well:

extracredit %>%
  mutate(
    clinton_safe_votes = (HOWEVER YOU DECIDED TO MAKE THIS HERE),
    trump_safe_votes = (HOWEVER YOU DECIDED TO MAKE THIS HERE),
    # If the conditional statement is true, then realwinner = "Clinton Wins";
    # if the statement is not true, then realwinner = "Trump Wins"
    realwinner = if_else(PUT CONDITIONAL STATEMENT HERE, "Clinton Wins", "Trump Wins"),
    # Same idea here: you are creating a variable called "myguess"
    myguess = if_else(clinton_safe_votes > trump_safe_votes, "Clinton Wins", "Trump Wins")
  ) %>%
  select(County, myguess, realwinner) %>%
  # Create a variable called correct: if myguess == realwinner, then correct = "I WAS RIGHT";
  # if myguess is not equal to realwinner, then correct = "WRONG!"
  mutate(correct = if_else(myguess == realwinner, "I WAS RIGHT", "WRONG!")) %>%
  # Give me a count of the variable correct
  # (i.e. the number you got right and the number you got wrong)
  count(correct)

To identify which counties you got wrong, you should be able to remove a bit of the code above and use filter to get there.
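As a sketch of the filter idea, here is a made-up table that already has myguess and realwinner columns. The table name toy_guesses and its three rows are invented for illustration.

```r
library(tidyverse)

# Made-up guesses for three imaginary counties (not real data).
toy_guesses <- tibble(
  County     = c("A", "B", "C"),
  myguess    = c("Clinton Wins", "Trump Wins", "Clinton Wins"),
  realwinner = c("Clinton Wins", "Clinton Wins", "Clinton Wins")
)

# Keep only the rows where the guess does not match the real winner.
missed <- toy_guesses %>%
  filter(myguess != realwinner)

missed
```

Only County B is kept, since that is the one row where the guess and the real winner disagree.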

GOOD LUCK!