US Mass Shooting Data Analysis

CS 390 Final Project

Ziyang Zhang, Xingbang Liu, Zijun Xia


Murder flickr photo by Asbestos Bill shared under a Creative Commons (BY) license

How did we get the data?

  • Kaggle
  • Las Vegas mass shooting
  • Question: Can we find patterns in mass shooting cases that could help prevent them from happening?

Data set introduction

Including:

ID, Title, City, State, Date, Summary, Fatalities, Injured, Total victims, Mental Health Issues, Race, Gender, Latitude, Longitude

Preprocessing

1. Missing location information

2. Titles difficult to understand

3. Data not tidy

Data details

Source: Mass shooting data (1966-2017)

  • Chosen from the Kaggle website.
  • Definition of mass shooting: three or more victims.
  • Goal:

1st Question:

How can we use maps to analyze the data?


  • Abandoned dot plotting: too messy
  • Switched to library(maps) and library(ggmap)

The greatest number of victims in each state.

The minimum number of victims in each state.
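
The per-state maximum and minimum shown on these two maps can be computed by grouping incidents by state. The project itself works in R; the following is only a minimal Python sketch using the standard library, and the rows, the function name victims_by_state, and the variable names are illustrative assumptions, not values or code from the project.

```python
from collections import defaultdict

# Illustrative rows only (NOT taken from the data set): (state, total victims)
rows = [
    ("Texas", 10),
    ("Texas", 4),
    ("Florida", 7),
    ("Florida", 3),
    ("Nevada", 5),
]

def victims_by_state(rows):
    """Group total-victim counts by state (hypothetical helper)."""
    grouped = defaultdict(list)
    for state, victims in rows:
        grouped[state].append(victims)
    return grouped

grouped = victims_by_state(rows)

# Largest and smallest single-incident victim count per state
state_max = {state: max(counts) for state, counts in grouped.items()}
state_min = {state: min(counts) for state, counts in grouped.items()}

print(state_max)
print(state_min)
```

Each dictionary can then be joined to state outlines to color a map, one value per state.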

2nd Question:

The same number of victims?

As the picture shows, Florida and Texas have a similar color

Filter the data

  • Keep only Florida and Texas
  • Sum the victims for each year
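
The two filtering steps above can be sketched as follows. This is a Python illustration rather than the project's R code; it assumes the year has already been extracted from the Date column, and the rows and the function name yearly_totals are hypothetical.

```python
from collections import defaultdict

# Illustrative rows only (NOT taken from the data set):
# (state, year, total victims)
rows = [
    ("Florida", 2015, 4),
    ("Florida", 2016, 6),
    ("Florida", 2016, 3),
    ("Texas", 2015, 5),
    ("Texas", 2016, 8),
    ("Nevada", 2016, 7),
]

def yearly_totals(rows, states):
    """Keep only the given states and sum victims per year (hypothetical helper)."""
    totals = {state: defaultdict(int) for state in states}
    for state, year, victims in rows:
        if state in totals:  # drop every other state
            totals[state][year] += victims
    return {state: dict(by_year) for state, by_year in totals.items()}

totals = yearly_totals(rows, ("Florida", "Texas"))
print(totals)
```

The result is one yearly series per state, which is the shape a two-sample test needs.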

T-Test

  • p-value ≈ 1
  • No evidence of a difference: the two states have almost the same number of victims
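
The slides do not say which variant of the t-test was used. As one possible reading, the sketch below computes Welch's two-sample t statistic (the unequal-variance form) in Python with the standard library only. The yearly totals are placeholders chosen to have equal means, and the function name welch_t is hypothetical. Turning the statistic into a p-value needs the t distribution, which the standard library does not provide, so only the statistic and the degrees of freedom are computed.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Placeholder yearly victim totals (NOT taken from the data set)
florida = [4, 9, 12, 7]
texas = [5, 8, 11, 8]

t, df = welch_t(florida, texas)
print(t, df)
```

A t statistic at or near zero corresponds to a two-sided p-value at or near 1, which is the situation the slide reports for Florida and Texas.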

Does mass shooting have any relationship with the age of the shooter?

Does mass shooting have any relationship with the mental state of the shooter?

Combined