-->

Recent articles

An Introduction to Using Google BigQuery in R

Google BigQuery is an enterprise data warehouse that lets users store massive datasets and query them with fast SQL that harnesses the processing power of Google’s infrastructure. BigQuery is also home to large public datasets, including the Medicare dataset used here. The Google Cloud project overall is a great place to start exploring open data and learning more about the various Google APIs created for accessing and presenting big data (terabyte scale and up).

In this tutorial, we will set up a Google BigQuery account, then set up and test a sample SQL query on the Medicare data in the...

Read More

Mapping Electoral Votes and Unemployment in the US

This map shows the states that Trump and Clinton won in the 2016 electoral vote, along with the unemployment rate (UER) for every county in the 50 US states. The scale starts at 0-2% and increases in increments of two percentage points; the highest bin covers 10-25%. The mean and maximum unemployment rates were very similar for Trump (4.9%, 21%) and Clinton (4.8%, 22%), but the spread among states is noticeably different.

In the states Clinton won, high unemployment (>10%) occurred only in California, in Imperial and Tulare Counties. Counties with unemployment higher than 10% were peppered throughout the states Trump won....

Read More

Bureau of Labor Statistics Data with Minor Data Munging

This foray into the somewhat recent history of unemployment rates (UR) will be a quick read. Before we even start trying to reason about what the data is telling us at first glance, we need to know everything we can about the data. What can the data at hand tell us? Are appearances deceiving (think UR percentages and the rises and falls of the time series)?

There are two lines on each of these charts, and they are there for a reason. If we separated out all the factors and used a wider array of seriesID data from BLS, we might expect to see...

Read More

Bureau of Labor Statistics Data with blscrapeR: US Unemployment

The Bureau of Labor Statistics (BLS) has made good on the Open Data Initiative's aim of providing open access to government datasets. Fortunately, there is an R package for accessing the BLS data API called “blscrapeR”, which I first found in Jason Rickert’s article on R Packages for Data Access at the Revolutions blog. This example uses BLS data for the U-3 (unemployment) and U-6 (marginally employed) datasets, series numbers LNS14000000 and LNS13327709 respectively. The years selected range from 1994 to 2016 and are grouped by the last three US presidents’ terms of office.

Unfortunately, access to the BLS...

Read More