We’ll use the 2016 US Election TV Advertisement Campaign data, which was originally hosted at Political Ad Archive and I have cleaned up and shared it here. You can download it as either CSV or EDF (Exploratory Data Format) and import it into Exploratory.
Before we start, there is one thing to note. In R, there are two basic data types around date and time in R.
One is Date, which contains only the date information like 2016–01–10.
Another is POSIXct, which contains both date and time information like 2016–01–10 10:45:20.
Having said that, let’s begin.
There are two columns called ‘start_time’ and ‘end_time’, which indicate what time each of the TV ads started and ended.
The problem is that these twese two columns came in as Character data type.
To make the analysis easier later, we want to convert these to Date/Time data type. This data contains both Date and Time so we want to convert them as POSIXct data type.
You can select ‘Change Data Type’ -> ‘Convert to Date / Time’ from the column header menu.
Since the data is in an order of ‘Year, Month, Day, Hour, Minute, Second’, you want to select ‘Year, Month, Day, Hour, Minute, Second’ from the sub-menu.
This will create a ‘mutate’ step with an expression like below.
ymd_hms(start_time)
This ‘ymd_hms’ function stands for:
It can be used for any text / numeric data that follows ‘Year, Month, Day, Hour, Minute, Second’ order.
Anyway, once you run the above step, you will get the ‘start_time’ column converted to POSIXct data type and the Summary view shows a nice histogram showing the range of the data.
Now, you can do all sorts of things with this data.
For example, you can go to Chart view, assign these new columns to X-Axis, and switch the aggregation level to something like Year, Month, Quarter, Week, etc.
What if you have only Year, Month, and Day, but not Time part of the data?
Then, there is a function called ‘ymd’.
Ok, then what if the date and time data doesn’t come in ymd_hms format?
For example, you might get data like below.
The X4 column has the characters ordered in Day, Month, Year, Hour, Minute, and Second.
Well, there is a function called dmy_hms that is literally reflecting the order of the data.
These functions are all coming from an R package called ‘lubridate’ and they are super powerful.
It doesn’t matter what characters are between the date and time attributes like Year, Month, Day, etc. For example, ymd function can take any of the following data and convert them to Date type appropriately.
2018-04-02
2018/04/02
2018/4/2
Reported on 2018, April 2nd.
What it matter is the order in which Year, Month, and Day are presetented.
It just works.
What happens if the month names came in as different languages other than English? Well, you can simply set the locale like below and that will parse the characters appropriately.
ymd_hms(start_time, locale="ja_JP")