Hi there!

It’s Kan from Exploratory.

Before starting this week’s Exploratory Weekly Update, I have two things to share.

First, my personal update. We had our baby a week ago. It’s been super hectic, but awesome, with so much love. There is something about a newborn. ;)

Second, we have rescheduled our online Data Science Booster training to this coming April because we needed more time to prepare the contents and the next release, Exploratory v4.3. Thanks to those who have kindly accommodated the new schedule! Enrollment is still open, and we have a student discount (50% off). If you are interested in learning Data Science without programming, sign up today!

Enroll in the April Booster Training!

Now, let’s start this week’s updates!

What We Are Reading

What AI can and can’t do (yet) for your business  - Link

When you start AI / Machine Learning projects, you will most likely hit the following five challenges.

  • Need large data set for training models
  • Need to label the training data
  • Don’t know what’s happening inside AI / ML models — Blackbox
  • Hard to generalize AI models
  • Bias in AI models

Labeling the data is a part of data preparation, which is the most critical task for building better models, given that AI models depend heavily on the quality of the data. This is why Data Scientists spend 80% of their time on Data Wrangling.

You Don’t Have to Be a Data Scientist to Fill This Must-Have Analytics Role  - Link

Companies hire more data scientists and expect them to find amazing insights magically, but this is one of the reasons why many Data Science projects fail at big companies. Data Scientists don’t understand Business and Business leaders don’t understand Data Science. To address this communication problem, a new role called ‘Translator’ in Data Analysis is emerging.

Personally, I don’t like creating another role and having business leaders throw everything onto it. The next generation of leaders and managers needs to be able to employ some Data Science methods just like they use Excel today. But if it helps make Data Science projects more successful, then why not? It might be one of the roles we need in a transitional period.

As China Marches Forward on A.I., the White House Is Silent - Link

While China plans to spend $150 billion on AI through 2030 and become the world’s leader in AI, the current US government is cutting Science and Technology research funding by 15% in 2018.


  • Gradient Boosting in TensorFlow vs XGBoost  - Link
  • Different continents, different data science  - Link
  • Artificial Intelligence Nears the Summit of Hype in Davos  - Link

Quote of the Week

Sean is the inventor of Prophet, a Time Series Forecasting algorithm that you can use in Exploratory. I’d strongly recommend you read the whole thread if you are serious about data analysis. Link

Interesting Data

LinkedIn Top 10 Skills by Year  - Link

LinkedIn publishes a list of the top 10 skills desired by companies every year. Someone collected the lists from 2013 to 2017 and merged them together. Statistical Analysis & Data Mining ranks number 2, which is no surprise given the popularity of Data Science these days.

(Introduced by ‘Data World’.)

What is Happy DB - Link

They used Amazon’s Mechanical Turk (crowdsourcing) to ask people to write about moments when they felt happy in the last 24 hours, 1 week, and 1 month, and published the data as a series of CSV files. If you have Exploratory, you can import the files and then do some text analysis to find interesting insights. Here’s a tutorial for a quick text analysis in Exploratory, though it’s a bit old.

(Introduced by ‘Data is Plural’.)
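If you would rather poke at this kind of text by hand first, here is a minimal Python sketch of the simplest text analysis, counting frequent words. It is not how Exploratory does it; the sample sentences, the `STOP_WORDS` set, and the `word_counts` helper are all made up for illustration and are not taken from the HappyDB files.

```python
import re
from collections import Counter

# Made-up sample sentences standing in for the text column of the CSV files.
moments = [
    "I had dinner with my family.",
    "My family visited me this weekend.",
    "I finished a book and had a great dinner.",
]

# A tiny, hypothetical stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"i", "a", "and", "had", "me", "my", "this", "with", "the"}


def word_counts(texts):
    """Count lowercase words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


counts = word_counts(moments)
# Show the most frequent remaining words.
print(counts.most_common(3))
```

Even a crude count like this hints at what people mention most when they describe happy moments.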

What We Are Working On

For the next release, v4.3, we are adding Statistical Test capabilities to Analytics View to make them easier to access and use. One of them is the Chi-Squared Test.

Also, by visualizing which pairs of Category A and Category B contribute most to the Chi-Squared value, you can quickly spot unusual trends.
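To make the idea of "contribution" concrete, here is a rough Python sketch of the standard calculation: each pair's contribution is (observed - expected)^2 / expected, and the Chi-Squared value is the sum over all pairs. This is not Exploratory's implementation; the counts, the category labels, and the `chi_squared_contributions` helper are hypothetical, and the sketch assumes every row has a count for every column.

```python
# Hypothetical counts: outer keys are Category A values, inner keys are Category B values.
observed = {
    "A1": {"B1": 30, "B2": 170},
    "A2": {"B1": 10, "B2": 190},
}


def chi_squared_contributions(table):
    """Return {(a, b): (observed - expected)^2 / expected} for each cell."""
    row_totals = {a: sum(cols.values()) for a, cols in table.items()}
    col_totals = {}
    for cols in table.values():
        for b, n in cols.items():
            col_totals[b] = col_totals.get(b, 0) + n
    grand_total = sum(row_totals.values())

    contributions = {}
    for a, cols in table.items():
        for b, n in cols.items():
            # Expected count if Category A and Category B were independent.
            expected = row_totals[a] * col_totals[b] / grand_total
            contributions[(a, b)] = (n - expected) ** 2 / expected
    return contributions


contributions = chi_squared_contributions(observed)
chi_squared = sum(contributions.values())

# Rank the pairs so the largest contributors come first.
ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
for (a, b), value in ranked:
    print(f"{a} x {b}: {value:.2f}")
print(f"Chi-Squared: {chi_squared:.2f}")
```

The pairs at the top of the ranking are the ones whose counts differ most from what independence would predict, which is where the unusual trends tend to show up.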

Data Science Booster Training

As mentioned at the beginning, we have rescheduled our Data Science Booster Training to April 9th to 13th. There is also a student discount (50% off)! If you are interested in learning Data Science without programming, sign up today!

Enroll in the April Booster Training!

That’s it for this week.

Have a wonderful week!

Kan CEO/Exploratory

Subscribe to Weekly Updates!

This is a weekly email update of what I have seen in Data Science / AI and thought was interesting, plus what Team Exploratory is working on.