It is said that in data analysis, about 80% of the time is spent on “data wrangling,” which involves data formatting and processing tasks.
When people think of data science or data analysis, they often imagine using complex algorithms to derive insights. While there are many books and courses for learning those algorithms, there are few opportunities to learn data wrangling systematically.
We have also found that the most common questions and consultations from Exploratory users are related to this data wrangling. Although we have provided many how-to tutorials, users who are new to Exploratory or unfamiliar with the concept of data wrangling often struggle to properly process and format their data.
To lower the barrier to data wrangling and enable anyone to transform data into valuable information, we have added an “AI Prompt” feature that allows data processing in natural language.
You no longer need to work through various UI dialogs or master complex functions for data wrangling (processing and formatting). From now on, you can process data freely and quickly by simply typing what you want to do.
This prompt feature allows you to freely ask questions about data wrangling and suggests R code to solve them.
When you submit a prompt, three sections are returned: “R Command,” “Explanation of Functions Used,” and “Expected Results.”
The R Command section shows the R code that performs the data wrangling you asked for, and you can apply it to your data by pressing “Run as Step.”
If you want to modify the R Command, click the edit button.
In edit mode, you can rename the generated columns and make other adjustments to the code.
The “Explanation of Functions Used” section describes which functions and arguments appear in the R Command and what each one does.
Finally, the “Expected Results” section explains what results will be obtained by executing this code.
Now, let’s look at how to write prompts effectively!
The basic structure of a prompt in English is “verb + column name + processing content,” written in the imperative form.
For example, let’s say we’re using sales data from an e-commerce site where each row represents one ordered product.
If we want to create a column that indicates whether sales are 1000 or more, we enter “Determine if sales are 1000 or more” in the prompt.
Submitting the prompt returns an R command that checks whether sales are 1000 or more, so we click the “Run as Step” button to apply it.
This creates a column called “Sales 1000 or more.”
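As a rough illustration of the kind of R command such a prompt could produce, here is a minimal sketch using dplyr. The sample data, the column names Customer and Sales, and the exact form of the code are assumptions made for this example; the command that AI Prompt actually generates may differ.

```r
library(dplyr)

# Hypothetical sample of the e-commerce data: one row per ordered product.
orders <- data.frame(
  Customer = c("A", "A", "B", "C"),
  Sales = c(600, 700, 1000, 250)
)

# Add a logical column that is TRUE when Sales is 1000 or more.
result <- orders %>%
  mutate(`Sales 1000 or more` = Sales >= 1000)

print(result)
```

The new column holds TRUE or FALSE for each order row, and the number of rows stays the same.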
Now, what if we want to aggregate by customer and then determine if sales are 1000 or more?
In this case, we can add the grouping unit (here, the customer) to the beginning of the original prompt.
For example, change the prompt to “For each customer, determine if total sales are 1000 or more.”
This aggregates the data to one row per customer and then checks whether each customer’s total sales are 1000 or more.
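The grouped version of the prompt might translate into something like the following dplyr sketch. Again, the sample data and the column names Customer, Sales, and Total Sales are hypothetical, and the code AI Prompt actually generates may be structured differently.

```r
library(dplyr)

# Hypothetical sample of the e-commerce data: one row per ordered product.
orders <- data.frame(
  Customer = c("A", "A", "B", "C"),
  Sales = c(600, 700, 1000, 250)
)

# Aggregate to one row per customer, then flag customers whose
# total sales are 1000 or more.
per_customer <- orders %>%
  group_by(Customer) %>%
  summarize(`Total Sales` = sum(Sales)) %>%
  mutate(`Total Sales 1000 or more` = `Total Sales` >= 1000)

print(per_customer)
```

Unlike the row-level check, this version reduces the data to one row per customer, so customer A is flagged even though neither of its individual orders reaches 1000.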
Finally, there are two points to keep in mind when writing prompts:
Keep it concise: Use short, clear expressions starting with a verb (like “Calculate…”, “Convert…”, “Extract…”).
Use specific column names: Include the names of the columns you want to process in the prompt.
In this article, we introduced “AI Prompt,” a new AI feature available in Exploratory starting with v12!
Until now, you needed technical knowledge to process data. With AI Prompt, you can transform data freely and quickly using natural language: just type what you want to do in plain English!
Upgrade to the latest v12 and try this powerful new feature today!
If you haven’t used Exploratory yet, please try our 30-day free trial!