In this article, I will introduce the Spread operation, which transforms the values in one category column into a separate column for each category. Spread is the opposite of Gather, which converts horizontally spread data (Wide data) into vertically spread data (Long data).
I will use the United Nations General Assembly Voting Data from Dataverse as an example. Each row of this data contains one country’s vote on one resolution. For the sake of simplicity, the example shown below is filtered to three countries (the United States, Canada, and Russia) and narrowed down to three columns: the resolution ID (vote_id), the country name (country), and the vote cast (vote). The data is structured such that the first row is the United States (USA) vote on Resolution 3, the second row is the Canada (CAN) vote, and the third row is the Russia (RUS) vote.
When you are processing data, you often want to transform such data in a way that each country such as USA, Canada and Russia has its own column as shown in the following figure.
You can use the Spread command to do this transformation.
By the way, the data structure before the transformation is called “Long data”. In this structure, the data gets “longer” as the number of countries increases. On the other hand, the data structure after the transformation is called “Wide data”. In this structure, the data gets “wider” as the number of countries increases. The Spread command transforms Long data to Wide data.
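To make the "longer" versus "wider" distinction concrete, here is a small Python sketch that computes the shape of each structure. It assumes every country votes on every resolution, and the helper names `long_shape` and `wide_shape` are hypothetical, introduced only for illustration.

```python
def long_shape(n_resolutions, n_countries):
    """(rows, columns) of Long data: one row per resolution/country pair."""
    # Columns are always vote_id, country, vote.
    return (n_resolutions * n_countries, 3)


def wide_shape(n_resolutions, n_countries):
    """(rows, columns) of Wide data: one row per resolution."""
    # Columns are vote_id plus one column per country.
    return (n_resolutions, 1 + n_countries)


# Two resolutions, first with three countries and then with four.
for n_countries in (3, 4):
    print(n_countries, "countries:",
          "long", long_shape(2, n_countries),
          "wide", wide_shape(2, n_countries))
```

Adding a fourth country adds rows to the Long data but only one column to the Wide data.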
Here is how to transform the Long data in the above example to Wide data in Exploratory.
Go to the table view, hold down the command key (control key on Windows), and click the country column and the vote column to select both. Then select Spread (Long to Wide) from the column header menu.
When the dialog appears, make sure that “country” is selected for the key column and “vote” is selected for the value column.
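If it helps to see what the key column and value column mean outside the dialog, the following is a minimal Python sketch of the same idea. It is not how Exploratory implements the step. The `spread` function is a hypothetical helper, and the vote codes in the sample rows are made-up placeholders rather than values from the UN dataset.

```python
# Illustrative Long data; the vote codes are placeholders, not real UN votes.
long_rows = [
    {"vote_id": 3, "country": "USA", "vote": 1},
    {"vote_id": 3, "country": "CAN", "vote": 3},
    {"vote_id": 3, "country": "RUS", "vote": 1},
    {"vote_id": 4, "country": "USA", "vote": 3},
    {"vote_id": 4, "country": "CAN", "vote": 2},
    # No RUS row for vote_id 4, so that cell stays empty after spreading.
]


def spread(rows, key, value):
    """Turn each distinct value of `key` into a column filled from `value`."""
    # Every column other than key and value identifies an output row.
    id_cols = [c for c in rows[0] if c not in (key, value)]
    # New column names, in order of first appearance.
    new_cols = list(dict.fromkeys(r[key] for r in rows))

    wide = {}
    for r in rows:
        row_id = tuple(r[c] for c in id_cols)
        if row_id not in wide:
            # Start the output row with its id columns and empty (None) cells.
            wide[row_id] = {**dict(zip(id_cols, row_id)),
                            **dict.fromkeys(new_cols)}
        # In this sketch a repeated id/key pair overwrites the earlier value.
        wide[row_id][r[key]] = r[value]
    return list(wide.values())


wide_rows = spread(long_rows, key="country", value="vote")
for row in wide_rows:
    print(row)
```

The key column supplies the new column names (USA, CAN, RUS), and the value column supplies the cell contents. Any remaining column, here vote_id, identifies the rows of the result.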
The operation of Spread is completed and the data structure has been transformed.
In this article, I introduced the Spread command, which converts Long data to Wide data. How the Spread command works is summarized in the figure.
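Because Spread is the opposite of Gather, one way to check the mental model is to run the reverse direction. The sketch below is a plain Python illustration, not Exploratory's implementation. The `gather` function is a hypothetical helper, and the vote codes are made-up placeholders.

```python
# Illustrative Wide data; the vote codes are placeholders, not real UN votes.
wide_rows = [
    {"vote_id": 3, "USA": 1, "CAN": 3, "RUS": 1},
    {"vote_id": 4, "USA": 3, "CAN": 2, "RUS": 1},
]


def gather(rows, key, value, columns):
    """Stack the given columns into key/value pairs, one output row per cell."""
    long = []
    for r in rows:
        # Columns that are not gathered are repeated on every output row.
        ids = {c: v for c, v in r.items() if c not in columns}
        for col in columns:
            long.append({**ids, key: col, value: r[col]})
    return long


long_rows = gather(wide_rows, key="country", value="vote",
                   columns=["USA", "CAN", "RUS"])
print(len(wide_rows), "wide rows ->", len(long_rows), "long rows")
for row in long_rows:
    print(row)
```

Each Wide row with three country columns becomes three Long rows, which is the structure the Spread step starts from.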
You can download the EDF file from here and import it into Exploratory to see how the steps are actually created.
If you don’t have Exploratory Desktop yet, you can sign up from here for free. If you are currently a student or teacher, then it’s free!
If you are interested in learning various powerful Data Science methods, ranging from Machine Learning and Statistics to Data Visualization and Data Wrangling, without programming, visit our Booster Training home page and enroll today!