How to Use Exploratory Part 2 - Visualization

This note is the second part of the “How to Use Exploratory” series, which is designed to help you start using Exploratory efficiently. This part focuses on “Visualization.”

It is designed to help you learn useful features for quickly discovering hidden patterns in data using charts in Exploratory, by working hands-on with sample data.

The estimated time to complete this is about 20 minutes.

Let’s get started!

1. Import Data

We will use the “Airbnb Listing Data for New York City” sample data. You can download the data from here.

In this dataset, each row represents one property, and the columns hold information about that property, such as “price” and “accommodates” (the number of people it can accommodate).

Once the data is downloaded, open the download folder and drag and drop “Airbnb Listing Data for New York City.csv” into the Exploratory window.

A dialog for importing data will open.

On the left side of the data import dialog, you can configure various settings for how the data is read during import, but for now, simply click the “Import” button.

Specify a data frame name and click the “Create” button.

Once the data is imported, the Summary View is displayed, giving you a quick overview of the data.

3. Visualize Correlation

In data analysis, it is crucial to investigate whether there is a relationship between two columns, often referred to as “correlation.”

Correlation is a relationship in which, when the value of one variable changes, the value of the other variable tends to change with it in a consistent way.

The “correlation coefficient” is an indicator that represents this correlation.

The correlation coefficient ranges from -1 to 1. A value close to 1 indicates a strong positive correlation, and a value close to -1 indicates a strong negative correlation. A value close to 0 means there is little or no linear correlation.
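To make the range from -1 to 1 concrete, here is a minimal Python sketch of the Pearson correlation coefficient, the most common form of this indicator. This note does not say exactly how Exploratory computes its value, so treat this as an illustration only. The function name `pearson_r` and all the numbers are made up for the example and are not taken from the Airbnb data.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variances up to the same constant factor).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a column is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up values for illustration only (not from the Airbnb data).
accommodates = [1, 2, 2, 3, 4, 4, 6, 8]
rising_price = [50, 70, 65, 90, 120, 110, 160, 210]    # tends to rise with x
falling_price = [210, 160, 170, 120, 110, 100, 70, 50]  # tends to fall with x

print(round(pearson_r(accommodates, rising_price), 2))   # close to 1
print(round(pearson_r(accommodates, falling_price), 2))  # close to -1
```

The first pair moves together and gives a value near 1, while the second pair moves in opposite directions and gives a value near -1.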

Investigate Correlation with a Scatter Plot

Exploratory offers various methods to investigate correlation, but this time we will try the simplest method: using a “scatter plot” to investigate correlation.

Here, we will investigate whether there is a correlation between two numerical columns: “accommodates” and “price”. Intuitively, rooms that can accommodate more people should have a higher price per night, but is that really the case?

From the chart view, click the “Add a new chart” button to create a new chart.

Select “Scatter Plot (No Aggregation)” as the chart type.

Select “accommodates” for the X-axis and “price” for the Y-axis.

The plot contains some outliers.

To exclude them, uncheck “Include outliers” for both the X-axis and the Y-axis.

Each value is now plotted as a point at its corresponding position.
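For readers curious about what excluding outliers can look like in code, here is a minimal Python sketch. This note does not describe the rule Exploratory applies when “Include outliers” is unchecked, so the sketch assumes the widely used 1.5 × IQR rule. The helper names `iqr_bounds` and `drop_outlier_rows` and the sample rows are hypothetical.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) limits using the k * IQR rule (an assumed rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_outlier_rows(rows, columns, k=1.5):
    """Keep only rows whose value is inside the limits for every listed column."""
    bounds = {c: iqr_bounds([row[c] for row in rows], k) for c in columns}
    return [
        row for row in rows
        if all(bounds[c][0] <= row[c] <= bounds[c][1] for c in columns)
    ]

# Made-up rows for illustration only (not from the Airbnb data).
rows = [
    {"accommodates": 2, "price": 80},
    {"accommodates": 2, "price": 95},
    {"accommodates": 3, "price": 110},
    {"accommodates": 4, "price": 120},
    {"accommodates": 4, "price": 150},
    {"accommodates": 3, "price": 130},
    {"accommodates": 2, "price": 100},
    {"accommodates": 2, "price": 5000},  # an extreme price
]

kept = drop_outlier_rows(rows, ["accommodates", "price"])
print(len(rows), "rows before,", len(kept), "rows after")
```

Filtering on both columns mirrors unchecking the option for both axes: a row is dropped if either of its values falls outside the limits for that column.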

Trend Line - Linear Regression

Now, to investigate whether there is a correlation between these two columns, we will add a linear regression line to the chart as a trend line.

From the Y-axis menu, select “Linear” under the “Trend Line” menu.

A linear trend line is drawn on the scatter plot, and you can see that it slopes upward.

Hovering the mouse over the trend line shows that the correlation coefficient is approximately 0.5, which indicates a moderately strong positive correlation.

In other words, there is a correlation between “accommodates” and “price”, meaning that as the number of people who can be accommodated increases, the price per night also tends to increase.
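To show what a linear trend line represents, here is a minimal Python sketch of an ordinary least squares fit, the standard way to draw a straight line through a scatter plot. It is not a description of Exploratory's internals. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("cannot fit a line when all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up values for illustration only (not from the Airbnb data).
accommodates = [1, 2, 2, 3, 4, 4, 6, 8]
price = [60, 90, 150, 80, 200, 120, 180, 260]

slope, intercept = fit_line(accommodates, price)
print(f"price is roughly {intercept:.1f} + {slope:.1f} * accommodates")
```

A positive slope corresponds to the upward trend line seen in the chart. The slope tells you how much the price tends to rise per additional guest, while the correlation coefficient tells you how closely the points follow that line.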

This concludes the visualization part of the Exploratory usage guide!

How to Use Exploratory Series

You can find the other parts of the “How to Use Exploratory” series via the links below. Please try the next part on “Data Wrangling”.

  1. Basics - Link
  2. Visualization
  3. Data Wrangling (AI) - Link / Data Wrangling (UI) - Link
  4. Analytics - Link
  5. Dashboards - Link
  6. Notes - Link
  7. Parameters - Link