A Sankey chart is a type of visualization that represents the “flow” from one category to another using the width of bands. The wider the band, the greater the volume or proportion along that specific path.
This is particularly effective when you want to intuitively understand how values change or are distributed across multiple steps, such as user behavior from a site’s entry point to exit.

For example, the chart above is a Sankey chart representing the navigation paths of users entering an e-commerce site. You can track the journey from the entry source on the left to either exit or purchase, where wider bands indicate a higher number of users (sessions) for that path.
In this example, we will use event log data from a website where each row represents one user.

The sample data used in this note can be downloaded from here.
To create a Sankey chart, we use the R package networkD3. If it is not already installed, follow these steps to install it.
From the project menu, select “Manage R Packages.”

The “Manage R Packages” dialog will open. Select “Install New Packages,” type networkD3 into the text box, and click the Install button.

Once the message confirming that networkD3 was successfully installed appears, the installation is complete.
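
For readers working in a plain R console rather than through the dialog, a roughly equivalent installation can be sketched as follows. This assumes a CRAN mirror is reachable from your machine; it is not part of the dialog-based workflow described above.

```r
# Install networkD3 from CRAN only if it is not already available.
# (Assumption: a CRAN mirror is configured and reachable.)
if (!requireNamespace("networkD3", quietly = TRUE)) {
  install.packages("networkD3")
}

# Confirm that the package can be loaded.
library(networkD3)
```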
To create a Sankey chart, the data must be formatted as required by the networkD3 package.
Specifically, you need two pieces of data: “link data” that records which element (event) transitioned to which element and how many times, and “node data” that lists all unique event names.
In this guide, we will use AI Prompt to perform this data processing.
First, open the log data and click the “AI Prompt” button at the top of the screen.

When the AI Prompt input field appears, enter the following prompt to aggregate all transition combinations and run it.
Aggregate the transition counts from source to destination, excluding cases where the destination is missing.
Review the generated code and click “Run as Step.”

This creates a data frame consisting of three columns: the source (Source), the target (Destination), and the number of transitions (Transition Count).
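
For reference, the aggregation requested by the prompt could look roughly like the base R sketch below. The data frame `log_data` and its columns `source` and `destination` are hypothetical names chosen for illustration; the code that AI Prompt actually generates will likely differ and depends on your own column names.

```r
# Hypothetical log data: one row per observed transition.
log_data <- data.frame(
  source      = c("Search", "Search", "Ad", "Product", "Product", "Cart"),
  destination = c("Product", "Product", "Product", "Cart", NA, "Purchase"),
  stringsAsFactors = FALSE
)

# Drop rows where the destination is missing.
complete <- log_data[!is.na(log_data$destination), ]

# Count how many times each source -> destination pair occurs.
links <- aggregate(
  count ~ source + destination,
  data = transform(complete, count = 1L),
  FUN = sum
)

print(links)
```

Each remaining row describes one band of the chart, and its count determines the width of that band.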

Next, create a “Node” data frame that lists all unique event names.
Since nodes require a unique list of event names appearing in both the source and target columns, we will use the Branch feature to split this into a separate data frame.
Click the “Create Branch Data Frame” button.

When the dialog opens, enter “Nodes” as the branch name and create it.

Once the “Nodes” branch is created, open AI Prompt again, enter the following prompt, and run it:
Merge source and destination into one column to generate a unique list of elements.

Review the generated code and click “Run as Step.”

This completes the node data frame, which consists of a single “Element” column.
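
As a rough illustration of what this step computes, the base R sketch below builds the unique list from a small, made-up link table. The `links` data frame and its column names are assumptions for the example, not the code generated by AI Prompt.

```r
# Hypothetical aggregated link data (source, destination, count).
links <- data.frame(
  source      = c("Search", "Ad", "Product", "Cart"),
  destination = c("Product", "Product", "Cart", "Purchase"),
  count       = c(2L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Stack both columns and keep each event name once,
# in order of first appearance.
nodes <- data.frame(
  Element = unique(c(links$source, links$destination)),
  stringsAsFactors = FALSE
)

print(nodes)
```

Taking the names from both columns matters because some events, such as a final purchase, only ever appear as a destination.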

Return to the original data frame (log data) where you created the branch, and add a new step following the branch point.
The networkD3 package requires the source and target values to be row positions in the node data frame (0-based indices) rather than event names, so we will perform the index conversion here.
Open the AI Prompt again, enter the following prompt, and run it.
Lookup the source and destination entries in the Nodes column and replace with their 0-indexed positions.

At this point, remember to specify the node data frame according to the AI Prompt data frame notation.
Once the R script is generated, click “Run as Step.”

This results in a data frame where the source and target are converted into integer indices.

This is the final form to be used as the link data for the Sankey chart.
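
The conversion itself can be sketched in base R with `match()`, which returns 1-based positions, so 1 is subtracted to get the 0-based indices that networkD3 expects. The data frames and column names below are made up for illustration and are not the script produced by AI Prompt.

```r
# Hypothetical node list and link data that still holds event names.
nodes <- data.frame(
  Element = c("Search", "Ad", "Product", "Cart", "Purchase"),
  stringsAsFactors = FALSE
)
links <- data.frame(
  source      = c("Search", "Ad", "Product", "Cart"),
  destination = c("Product", "Product", "Cart", "Purchase"),
  count       = c(2L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# match() gives the 1-based row position of each name in the node list;
# subtracting 1 converts it to a 0-based index.
links$source      <- match(links$source, nodes$Element) - 1L
links$destination <- match(links$destination, nodes$Element) - 1L

# Every name must exist in the node list, otherwise match() yields NA.
stopifnot(!anyNA(links$source), !anyNA(links$destination))

print(links)
```

Because the indices refer to row positions, the node data frame must keep the same row order from this point on.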
Select “Create Note” from the Report menu.

Once the new note is created, click the “Add Content” button and select “R Script.”

When the R script input dialog appears, paste the following code. (Please change the parts marked with # to match your own data.)
library(networkD3)
sankeyNetwork(
  Links = Log,                # Aggregated transition pairs converted to indices
  Nodes = Nodes,              # Specify the node list
  Source = "Source",          # Column name in "Links" (must be in quotes)
  Target = "Destination",     # Column name in "Links" (must be in quotes)
  Value = "Transition Count", # Column name in "Links" (must be in quotes)
  NodeID = "Element",         # Column name in "Nodes" (must be in quotes)
  units = "people",           # Unit label (e.g., "people")
  fontSize = 12,              # Adjust label font size
  nodeWidth = 25,             # Adjust width of node blocks (in pixels)
  height = 600                # Adjust chart height (in pixels)
)

After pasting the code, click the “Preview” button at the top left of the note.

The Sankey chart will be displayed as shown above, and you can adjust the positions of each node by dragging them.
If the chart does not fit within the drawing area, change the value of the height argument.
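
To try the call without preparing real data first, a small self-contained sketch with made-up data is shown below. The data frames `toy_nodes` and `toy_links` and all of their values are invented for illustration; only `sankeyNetwork()` and its arguments come from networkD3, which is assumed to be installed.

```r
library(networkD3)

# Made-up node list: one row per event name.
toy_nodes <- data.frame(
  Element = c("Search", "Ad", "Product", "Cart", "Purchase"),
  stringsAsFactors = FALSE
)

# Made-up link data: 0-based row positions into toy_nodes plus a count.
toy_links <- data.frame(
  Source      = c(0, 1, 2, 3),
  Destination = c(2, 2, 3, 4),
  Count       = c(2, 1, 1, 1)
)

chart <- sankeyNetwork(
  Links = toy_links,
  Nodes = toy_nodes,
  Source = "Source",
  Target = "Destination",
  Value = "Count",
  NodeID = "Element",
  units = "sessions",
  fontSize = 12,
  nodeWidth = 25,
  height = 300   # a smaller height is enough for five nodes
)

# Render the widget when running interactively.
if (interactive()) print(chart)
```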
Sankey charts can also be displayed in a Dashboard in the same way.
After opening the dashboard, click the “Add Text” button on the edit screen.

When the text panel is added, click the text edit icon.

When the text editor opens, click the “Add Content” button and select “R Script.”

When the R script input dialog appears, paste the same code used previously in the note.
library(networkD3)
sankeyNetwork(
  Links = Log,                # Aggregated transition pairs converted to indices
  Nodes = Nodes,              # Specify the node list
  Source = "Source",          # Column name in "Links" (must be in quotes)
  Target = "Destination",     # Column name in "Links" (must be in quotes)
  Value = "Transition Count", # Column name in "Links" (must be in quotes)
  NodeID = "Element",         # Column name in "Nodes" (must be in quotes)
  units = "people",           # Unit label (e.g., "people")
  fontSize = 12,              # Adjust label font size
  nodeWidth = 25,             # Adjust width of node blocks (in pixels)
  height = 600                # Adjust chart height (in pixels)
)

After pasting the code, apply it and run the dashboard.

You have now successfully displayed a Sankey chart on the dashboard.
