How to create a Linear Regression Model as a step, then predict with new data

With Exploratory Desktop, you can create a Linear Regression Model in Analytics View and then visualize it. You can also create a Linear Regression Model as a step in your data pipeline. This comes in handy when you want to apply the model to a new data frame to predict the outcome, or validate it with test data.

In this note, I’ll walk you through a way to do it.

Create a Linear Regression Model

Click the plus button next to “step”, select Build and Evaluate Model from the menu, and select Build Linear Regression Model.

For example, to create a linear regression model that predicts a new baby’s weight (weight_pounds) from the Mother’s age (mother_age), Father’s age (father_age), Plurality (plurality), and Birth Month (month) columns, assign “weight_pounds” as the target and “mother_age”, “father_age”, “plurality”, and “month” as the variables.

To split your data into a training data set and a test data set, provide the ratio for the test data set. In this example, I put 0.3 (i.e. 30%) for the test data set, which means the remaining 70% of the data will be used to train the model. The model is now ready to build, so click the “Run” button.
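To make the test ratio concrete, here is a minimal Python sketch of a 70/30 split. It only illustrates the idea and is not how Exploratory implements the split internally; the function name split_train_test, the seed value, and the stand-in rows are all assumptions made for this example.

```python
import random


def split_train_test(rows, test_ratio=0.3, seed=42):
    """Shuffle the rows and hold out test_ratio of them as test data.

    Hypothetical helper for illustration only.
    """
    if not 0 < test_ratio < 1:
        raise ValueError("test_ratio must be between 0 and 1")
    shuffled = list(rows)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    # The first n_test shuffled rows become test data, the rest training data.
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(100))  # stand-in for 100 rows of birth records
train, test = split_train_test(rows, test_ratio=0.3)
print(len(train), len(test))
```

With 100 rows and a ratio of 0.3, 30 rows are held out for testing and 70 remain for training.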

When the step runs successfully, you can check the execution result and the parameter estimates. You can also see how much of your data was split into training and test data, respectively.
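To give a feel for what a parameter estimate is, the following Python sketch fits a least-squares line with a single predictor. It is a simplification: the model in this note uses four variables, and the data values here are made up for illustration. The function name fit_simple_linear is hypothetical.

```python
def fit_simple_linear(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up training data: mother's age and baby's weight in pounds.
mother_age = [20, 25, 30, 35, 40]
weight_pounds = [7.0, 7.2, 7.1, 7.5, 7.4]

intercept, slope = fit_simple_linear(mother_age, weight_pounds)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

The slope plays the role of the estimate for mother_age: it is the change in predicted weight for each additional year of age in this toy data.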

Predict with Test Data

To apply your model to the test data and get predicted values, click the plus button, then select Predict on Test Data.

Click the “Run” button in the “Predict Data” dialog that opens.

The predicted value is then stored in the predicted_value column. You can check the standard error and the confidence interval with the standard_error, conf_low, and conf_high columns.

Predict with Other Data Frame

If you have another data frame that you want to predict on, click the token for the prediction step you created, then select Other Data Frame as the data and choose the data frame that you want to use for prediction.

The predicted result is then stored in the predicted_value column.
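Conceptually, applying a linear regression model to another data frame means computing the intercept plus each coefficient times the matching column value for every row. The Python sketch below shows that calculation. The intercept and coefficients are hypothetical numbers, not the output of the model in this note, and only two numeric variables are used to keep it short.

```python
def predict_rows(rows, intercept, coefficients):
    """Return copies of the rows with a predicted_value column added.

    Hypothetical helper: coefficients maps a column name to its estimate.
    """
    predicted = []
    for row in rows:
        value = intercept + sum(
            coef * row[name] for name, coef in coefficients.items()
        )
        predicted.append({**row, "predicted_value": value})
    return predicted


# Hypothetical estimates, chosen only for illustration.
intercept = 6.5
coefficients = {"mother_age": 0.02, "father_age": 0.01}

# Stand-in for the other data frame.
new_rows = [
    {"mother_age": 30, "father_age": 32},
    {"mother_age": 25, "father_age": 40},
]

result = predict_rows(new_rows, intercept, coefficients)
for row in result:
    print(row)
```

The new rows need the same variable columns the model was trained on; a missing column would raise a KeyError in this sketch.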

Evaluate Quality of Prediction

To evaluate the quality of the prediction, click the plus button, select Build and Evaluate Model, select Evaluate Quality of Prediction, and select Regression - Metrics.

Select predicted_value for Prediction Value Column and weight_pounds for Actual Value Column, then click the Run button.

Now you can see R Squared and other information for evaluating the quality of the prediction.
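As a rough guide to what such metrics measure, here is a Python sketch that computes R Squared, root mean squared error, and mean absolute error from actual and predicted values. It uses the common definition of R Squared as 1 minus the residual sum of squares over the total sum of squares, which is an assumption; Exploratory may compute or label its metrics differently. The function name and data values are made up for illustration.

```python
import math


def regression_metrics(actual, predicted):
    """Return R squared, RMSE, and MAE for paired actual/predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    # Residual sum of squares: how far predictions are from actual values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: how far actual values are from their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "r_squared": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
    }


# Made-up values standing in for weight_pounds and predicted_value.
weight_pounds = [7.0, 8.0, 6.0, 9.0]
predicted_value = [7.5, 7.5, 6.5, 8.5]

metrics = regression_metrics(weight_pounds, predicted_value)
print(metrics)
```

An R Squared closer to 1 means the predictions explain more of the variation in the actual values, while RMSE and MAE are in the same unit as the target (pounds here), so lower is better.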