Logistic Regression is one of the most popular statistical learning algorithms. It builds prediction models and helps us explain which variables influence a binary outcome (TRUE/FALSE) of interest, and how.
In this seminar, Kan will discuss the concept of the algorithm, how to use it in Exploratory, how to interpret the result it produces, and how to communicate the insights with visualization.
Principal Component Analysis (PCA) is an unsupervised machine learning algorithm and one of the most popular dimensionality reduction techniques. It is also often used to visualize the relationships between variables, or even between the subjects of your interest such as customers, products, countries, etc.
Kan will introduce the basic concept of PCA and demonstrate how to use it to discover the patterns in data and understand the relationships better with many examples.
Factor is one of the data types in R, and it is designed to address typical challenges with categorical data. With the Factor data type, you can set the order of the categorical values and manipulate that order as needed with a series of convenient functions.
In this seminar, Kan will introduce the Factor data type and show how to manage the order in Exploratory.
Analytics: Exploratory: Linear Regression Part 2 - Multiple Regression & Variable Importance
2019/8/21 (Wed) 11AM - Noon (US Pacific Time)
This is a follow-up to the previous session “Introduction to Linear Regression Part 1 - Basic”.
In this session, Kan will introduce more advanced topics such as Multiple Regression, Collinearity, and Variable Importance. He will also demonstrate how you can build multiple Linear Regression models for multiple groups and how you can use this technique to take your analysis one step deeper.
Analytics: Introduction to Linear Regression Part 1 - Basic
2019/8/14 (Wed) 11AM - Noon (US Pacific Time)
The Linear Regression algorithm is considered a basic algorithm, yet it is still one of the most popular algorithms in the world of data science because of its simplicity and applicability to many use cases.
Kan will be introducing the basics of the Linear Regression algorithm and how to gain useful insights from the prediction models it builds.
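As a rough illustration of the basic idea, here is a minimal Python sketch of simple (one-variable) Linear Regression using ordinary least squares. It is not how Exploratory implements the algorithm; the function name and the `ad_spend` / `sales` data are hypothetical and chosen only for illustration.

```python
from statistics import mean

def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of co-deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: each extra unit of ad spend adds two units of sales.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_simple_linear_regression(ad_spend, sales)
print(f"sales = {intercept:.1f} + {slope:.1f} * ad_spend")
```

The slope is the kind of insight the model offers: it estimates how much the target changes when the predictor increases by one unit.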
Kan will be introducing new features and critical enhancements of Exploratory v5.3.
Analytics: An Introduction to Distance / MDS
2019/7/17 (Wed) 11AM - Noon (US Pacific Time)
A family of distance algorithms helps you understand the similarity (or difference) among your subjects such as customers, countries, products, etc. By using another algorithm called Multi-Dimensional Scaling (MDS), you can visualize such relationships in a much more intuitive way.
Kan will be introducing Distance algorithms and how to use them in Exploratory.
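To make the idea of distance between subjects concrete, here is a minimal Python sketch that computes pairwise Euclidean distances, one common member of the distance family. The `customers` data and function names are hypothetical, and the MDS step is not covered here.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def distance_matrix(subjects):
    """Return {(name_p, name_q): distance} for every pair of subjects."""
    names = list(subjects)
    return {
        (p, q): euclidean_distance(subjects[p], subjects[q])
        for p in names
        for q in names
    }

# Hypothetical customers described by two numeric measures each.
customers = {"A": (1.0, 2.0), "B": (4.0, 6.0), "C": (1.0, 3.0)}

distances = distance_matrix(customers)
print(distances[("A", "B")])  # A and B are far apart
print(distances[("A", "C")])  # A and C are close, so they are more similar
```

A smaller distance means two subjects are more similar. In practice the measures are usually scaled first so that no single one dominates.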
Analytics: Cohort Analysis with Survival Curve (Kaplan-Meier) for Subscription Business
2019/7/10 (Wed) 11AM - Noon (US Pacific Time)
Cohort Analysis is one of the most critical analyses in SaaS / Subscription businesses. It helps you understand how your customers churn from (or stay with) your service over time.
And if you want to do it right, you want to use the Survival Curve algorithm, a.k.a. Kaplan-Meier. This technique has long been used in other areas such as employee attrition, machine maintenance, patient treatment, etc., and it works great for today’s data-savvy SaaS businesses.
Kan will be introducing what the Survival Curve is and how you can use it in Exploratory.
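For readers who want to see the mechanics, here is a minimal Python sketch of the Kaplan-Meier estimator. It assumes each customer has a subscription duration and a flag saying whether they churned (True) or were still subscribed when last observed (False, i.e. censored). The function name and sample data are hypothetical, and this is not Exploratory's implementation.

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival_probability)] at each time a churn occurred."""
    event_times = sorted({t for t, e in zip(durations, churned) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Customers still subscribed just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Customers who churned exactly at time t.
        events = sum(1 for d, e in zip(durations, churned) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical cohort: months subscribed, and whether the customer churned.
durations = [1, 2, 2, 3, 4, 5]
churned = [True, True, False, True, False, False]

curve = kaplan_meier(durations, churned)
for month, probability in curve:
    print(f"month {month}: {probability:.3f} still subscribed")
```

Censored customers count as "at risk" until their last observed month but never count as churn events. This is what lets the curve use customers who have not churned yet.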
Random Forest is an ensemble machine learning algorithm that builds ‘decision tree’ based models to predict either categorical or numerical outputs based on the patterns inside the data.
It is often used to measure ‘Variable Importance’, that is, to find which variables matter most for predicting the target output.
Kan will be showing how to use it with Exploratory’s Analytics view along with various methods like Boruta, EDARF, and SMOTE (adjusting imbalanced data).
Kan will be introducing Decision Tree, which is one of the machine learning algorithms that build prediction models based on the patterns inside the data, by demonstrating it with Exploratory’s Analytics view.
Kan will be introducing various data wrangling techniques to clean and transform text data. He’ll also introduce the basics of Regular Expressions, with which you can manipulate your text data in a much more flexible way.
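As a small taste of what Regular Expressions make possible, here is a minimal Python sketch using the standard `re` module. The helper names, the sample strings, and the order-ID pattern (two capital letters, a hyphen, four digits) are hypothetical and only illustrate the kind of cleaning and extraction involved.

```python
import re

def clean_text(raw):
    """Lowercase, drop punctuation, and collapse repeated whitespace."""
    text = raw.strip().lower()
    text = re.sub(r"[^a-z0-9\s]", "", text)  # keep letters, digits, whitespace
    text = re.sub(r"\s+", " ", text)         # collapse runs of whitespace
    return text

def extract_order_ids(text):
    """Find IDs shaped like 'AB-1234' (a made-up format for this example)."""
    return re.findall(r"\b[A-Z]{2}-\d{4}\b", text)

cleaned = clean_text("  Hello,   World!! ")
order_ids = extract_order_ids("Orders AB-1234 and CD-5678 shipped")

print(cleaned)
print(order_ids)
```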