Make Data Science More Accessible
Exploratory Online Seminar
Scheduled for Future Dates
Analytics: Introduction to Logistic Regression
2019/9/18 (Wed) 11AM PT (US) / 2PM ET (US)
Duration: 1 hour
Logistic Regression is one of the most popular statistical learning algorithms. It builds prediction models and helps us explain which variables influence a binary outcome (TRUE/FALSE) of interest, and how.
In this seminar, Kan will discuss the concept of the algorithm, how to use it in Exploratory, how to interpret the result it produces, and how to communicate the insights with visualization.
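As a rough illustration of the concept (a sketch in Python, not how Exploratory implements it), a logistic regression model turns a weighted sum of the input variables into a probability of the TRUE outcome. The intercept and coefficient below are made-up values, not fitted from any data.

```python
import math

def predict_probability(intercept, coefficient, x):
    """Return P(outcome = TRUE) for one predictor under a logistic model."""
    log_odds = intercept + coefficient * x
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical model: these values are illustrative, not fitted.
intercept, coefficient = -2.0, 0.5

for x in (0, 4, 8):
    p = predict_probability(intercept, coefficient, x)
    print(f"x={x}: P(TRUE)={p:.3f}")

# Each one-unit increase in x multiplies the odds of TRUE by exp(coefficient).
odds_ratio = math.exp(coefficient)
```

Reading a coefficient as an odds ratio is one common way to explain how a variable affects the outcome.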
Join from this URL!
Past Seminars
Analytics: Introduction to Principal Component Analysis (PCA)
2019/9/11 (Wed) 11AM PT (US) / 2PM ET (US)
Duration: 1 hour
Principal Component Analysis (PCA) is an unsupervised machine learning algorithm and one of the most popular dimensionality reduction techniques. It is also often used to visualize the relationships between variables, or between subjects of interest such as customers, products, or countries.
Kan will introduce the basic concept of PCA and demonstrate how to use it to discover the patterns in data and understand the relationships better with many examples.
Recording:
Slides:
Sample Data:
How to choose the right charts for Exploratory Data Analysis
2019/9/4 (Wed) 11AM PT (US) / 2PM ET (US)
Duration: 1 hour
Kan will discuss how to choose the right charts for Exploratory Data Analysis and show you how to use charts such as Histogram, Density Plot, Boxplot, Scatter, and Stacked Bar.
Recording:
Slides:
Sample Data:
Data Wrangling: Introduction to Factor for Handling Ordered Categorical Data
2019/8/29 (Thu) 10AM PT (US) / 1PM ET (US)
Duration: 1 hour
Factor is one of the data types in R, designed to address typical challenges with categorical data. With the Factor data type, you can set the order of the categorical values and change that order as needed with a series of convenient functions.
In this seminar, Kan will introduce Factor data type and show how to manage the order in Exploratory.
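The idea of ordered categories can be sketched outside R as well. The Python snippet below is only an analogy to R's Factor, not its API; the levels and answers are hypothetical.

```python
# Hypothetical survey answers; the level order is an assumption for illustration.
levels = ["Low", "Medium", "High"]
answers = ["High", "Low", "Medium", "Low", "High"]

# Alphabetical sorting ignores the intended order of the categories.
alphabetical = sorted(answers)

# Sorting by position in `levels` respects the order we defined.
rank = {level: i for i, level in enumerate(levels)}
by_level = sorted(answers, key=rank.__getitem__)

# Reversing the levels changes the order without touching the data itself.
reversed_rank = {level: i for i, level in enumerate(reversed(levels))}
by_reversed_level = sorted(answers, key=reversed_rank.__getitem__)

print(alphabetical)
print(by_level)
print(by_reversed_level)
```

Keeping the order in a separate list of levels is what lets charts and tables show "Low, Medium, High" instead of an alphabetical order.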
Recording:
Slides:
Analytics: Exploratory: Linear Regression Part 2 - Multiple Regression & Variable Importance
2019/8/21 (Wed) 11AM - Noon (US Pacific Time)
This is a follow-up to the previous session, “Introduction to Linear Regression Part 1 - Basic”.
In this session, Kan will introduce more advanced topics such as Multiple Regression, Collinearity, and Variable Importance. He will also demonstrate how to build separate Linear Regression models for multiple groups and how to use this technique to take your analysis one step deeper.
Recording:
Slides:
Sample Data:
Analytics: Introduction to Linear Regression Part 1 - Basic
2019/8/14 (Wed) 11AM - Noon (US Pacific Time)
Linear Regression is considered a basic algorithm, yet it is still one of the most popular algorithms in the world of data science because of its simplicity and its applicability to many use cases.
Kan will introduce the basics of the Linear Regression algorithm and show how to gain useful insights from the prediction model it builds.
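For a sense of what the algorithm computes, here is a minimal Python sketch of simple (one-variable) linear regression fitted by ordinary least squares. The function name `fit_line` and the data are made up for illustration, and this is not Exploratory's implementation.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the 1/n factors cancel out).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on a straight line.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope}, intercept={intercept}")
```

The slope is the insight most people look for first: it estimates how much the target changes when the predictor increases by one unit.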
Recording:
Slides:
Sample Data:
An Introduction to Exploratory v5.3
2019/8/7 (Wed) 11AM - Noon (US Pacific Time)
Kan will be introducing new features and critical enhancements of Exploratory v5.3.
Recording:
Slides:
Analytics: An Introduction to Distance / MDS
2019/7/17 (Wed) 11AM - Noon (US Pacific Time)
A family of distance algorithms helps you understand the similarity (or difference) among your subjects, such as customers, countries, or products. By using another algorithm called Multi-Dimensional Scaling (MDS), you can visualize these relationships in a much more intuitive way.
Kan will be introducing Distance algorithms and how to use them in Exploratory.
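As a small illustration, the Python sketch below computes Euclidean distance, one member of the distance family, between a few subjects. The customer names and measures are hypothetical, and the MDS step is not shown.

```python
import math

# Hypothetical subjects, each described by two numeric measures.
customers = {"A": (1.0, 2.0), "B": (4.0, 6.0), "C": (1.0, 3.0)}

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Distance for every unordered pair of subjects.
distances = {
    (a, b): euclidean(customers[a], customers[b])
    for a in customers
    for b in customers
    if a < b
}

# The pair with the smallest distance is the most similar one.
closest_pair = min(distances, key=distances.get)
print(distances)
print(closest_pair)
```

In practice the measures are usually standardized first, so that a variable with a large scale does not dominate the distance.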
Recording:
Slides:
Analytics: Cohort Analysis with Survival Curve (Kaplan-Meier) for Subscription Business
2019/7/10 (Wed) 11AM - Noon (US Pacific Time)
Cohort Analysis is one of the most critical analyses in SaaS / Subscription businesses. It helps you understand how your customers churn from (or stay with) your service as time goes by.
If you want to do it right, use the Survival Curve algorithm, a.k.a. Kaplan-Meier. This technique has long been used in other areas such as employee attrition, machine maintenance, and patient treatment, and it works great for today’s data-savvy SaaS businesses.
Kan will be introducing what the Survival Curve is and how you can use it in Exploratory.
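To make the idea concrete, here is a minimal Python sketch of the Kaplan-Meier estimator. The function name and the subscription data are hypothetical, and the code illustrates the technique rather than how Exploratory computes it.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival probability)] at each time with a churn event."""
    # Distinct times at which at least one customer churned, in order.
    event_times = sorted({t for t, c in zip(durations, churned) if c})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still subscribed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Customers who churned exactly at time t.
        events = sum(1 for d, c in zip(durations, churned) if d == t and c)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: months subscribed, and whether the customer churned
# (True) or was still subscribed when observation ended (False, "censored").
durations = [1, 2, 2, 3, 4, 5]
churned = [True, True, False, True, False, True]

curve = kaplan_meier(durations, churned)
for month, probability in curve:
    print(f"month {month}: survival {probability:.3f}")
```

Customers who are still subscribed count toward the at-risk group until their last observed month, which is what separates this estimator from a plain retention ratio.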
Recording:
Slides:
Analytics: An Introduction to Random Forest
2019/7/3 (Wed) 11AM - Noon (US Pacific Time)
Random Forest is known as one of the ensemble machine learning algorithms that build ‘decision tree’ based models to predict either categorical or numerical outputs based on the patterns inside the data.
It is also often used to compute ‘Variable Importance’, which shows which variables matter most for predicting the target output.
Kan will be showing how to use it with Exploratory’s Analytics view along with various methods like Boruta, EDARF, and SMOTE (adjusting imbalanced data).
Recording:
Slides:
Sample Data:
References:
Analytics: An Introduction to Decision Tree
2019/6/26 (Wed) 11AM - Noon (US Pacific Time)
Kan will be introducing Decision Tree, which is one of the machine learning algorithms that build prediction models based on the patterns inside the data, by demonstrating it with Exploratory’s Analytics view.
Recording:
Slides:
Sample Data:
How to Create Dashboard, Note, and Slides in Exploratory
2019/6/19 (Wed) 11AM - Noon (US Pacific Time)
Kan will be walking through the main features of Dashboard, Note, and Slides and showing how to create them effectively.
Recording:
Slides:
Introducing Exploratory v5.1 & v5.2 New Features
2019/6/12 (Wed) 11AM - Noon (US Pacific Time)
Kan is going to introduce the new features of Exploratory v5.1 & v5.2 and walk through a quick demo of the following features.
• Boruta (Random Forest)
• Marginal Effect (Logistic Regression)
• Variable Importance with Linear Regression
• Chart - Highlight with Color
• Chart - Commenting feature
Recording:
Slides:
Data Wrangling: Working with Text Data
2019/1/15 (Tue) 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
Kan will be introducing various data wrangling techniques to clean and transform Text data. Also, he’ll be introducing the basics of Regular Expression, with which you can manipulate your text data in a much more flexible way.
Recording:
Slides:
Sample Data: User Data
Data Wrangling: Working with Date / Time Data and Visualizing It
2019/1/8 (Tue) 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
Kan will be introducing various data wrangling techniques to clean and transform Date and Time data. Also, he’ll be discussing various ways to visualize Time Series data to gain deeper insights.
Recording:
Slides:
Analytics: An Introduction to K-Means Clustering
2018/12/18 (Tue) 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
Kan will be introducing K-Means Clustering algorithm, which segments the data based on a given set of variables, by demonstrating it with Exploratory’s Analytics view.
Recording:
Slides:
Sample Data: US Baby Data
Analytics: An Introduction to Anomaly Detection for Time Series Data
2018/12/11 (Tue) 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
Kan will be introducing the Anomaly Detection algorithm, which detects anomalies in time series data, by demonstrating it with Exploratory’s Analytics view.
Recording:
Slides:
Analytics: Time Series Forecasting with Prophet
2018/11/26 (Mon) 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
Prophet is an easy-to-use time series forecasting algorithm developed by Sean Taylor and colleagues at Facebook. I’ll be demonstrating how to use it in Exploratory.
Recording:
Slides:
Introducing Exploratory v5.0
2018/11/15 10AM PT (US Pacific Time) / 1PM ET (US Eastern Time)
I’ll be introducing the new features of Exploratory v5.0 including the new UI/UX, new Analytics like K-Means Clustering, Association Rules (Market Basket), and more!
Recording: