San Francisco Chronicle Wine Award Data Analysis

This is a dataset of historical wine awards given by the San Francisco Chronicle.

Summary View:

Table View:

With this data, I want to see if there is any correlation between the price of a wine and its award type. The award types are Gold, Silver, Bronze, and so on. The question is: do higher-priced wines tend to get higher awards?

Data

The data was originally hosted at data.world.

https://data.world/rdowns26/sf-chronicle-wine-competition-results

It looks like someone scraped it from this website.

http://winejudging.com/medal-winners/

I have shared the data here for you to download.

https://exploratory.io/data/kanaugust/San-Francisco-Chronicle-Wine-Awards-Data-HZO2kjT7Bb

Analysis

Most of the award-winning wines are from California.

So I will zero in on the California wine data.

Wine Award Types

The Award Types are Double Gold, Gold, Silver, Bronze, and others. The most frequent one is Silver, followed by Bronze.

I’m going to create a new variable that is TRUE if the award is either “Gold” or “Double Gold” and FALSE otherwise.
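
As a rough illustration of that step, here is a minimal Python sketch. The field names (`award`, `is_gold`) and the sample rows are made up for the example and are not taken from the actual dataset.

```python
# Award labels that count as "Gold" for this analysis.
GOLD_AWARDS = {"Gold", "Double Gold"}

def is_gold(award):
    """Return True for "Gold" or "Double Gold", False for anything else."""
    return award.strip() in GOLD_AWARDS

# Made-up rows for illustration only; the real data has more columns.
wines = [
    {"name": "Wine A", "award": "Double Gold"},
    {"name": "Wine B", "award": "Silver"},
    {"name": "Wine C", "award": "Gold"},
    {"name": "Wine D", "award": "Bronze"},
]

# Add the new TRUE/FALSE variable to every row.
for wine in wines:
    wine["is_gold"] = is_gold(wine["award"])
```

Every later step can then work with this single TRUE/FALSE flag instead of the full list of award types.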

With this new variable, we can see that North Coast and Central Coast in California have the most Gold (including Double Gold) awards.

In those regions, about 20 to 30% of the entries are awarded Gold.
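
The share of Gold awards per region could be computed along these lines. This is only a sketch: the `region` and `is_gold` field names are assumptions, and the rows are invented so the arithmetic is easy to follow.

```python
from collections import defaultdict

def gold_share_by_region(rows):
    """Return {region: fraction of entries whose is_gold flag is True}."""
    totals = defaultdict(int)
    golds = defaultdict(int)
    for row in rows:
        totals[row["region"]] += 1
        if row["is_gold"]:
            golds[row["region"]] += 1
    return {region: golds[region] / totals[region] for region in totals}

# Invented entries for illustration only, not the real counts.
rows = [
    {"region": "North Coast", "is_gold": True},
    {"region": "North Coast", "is_gold": False},
    {"region": "North Coast", "is_gold": False},
    {"region": "North Coast", "is_gold": False},
    {"region": "Central Coast", "is_gold": True},
    {"region": "Central Coast", "is_gold": False},
]

shares = gold_share_by_region(rows)
```

With these toy rows, one of four North Coast entries and one of two Central Coast entries are Gold, so the shares are 0.25 and 0.5.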

Price and Award Correlation

There is some correlation between Price and Award Type. For example, in Central Coast and North Coast, prices are higher for wines with a Gold award.

By running a Linear Regression, we can confirm that in Central Coast and North Coast there is a statistically significant relationship between Price and whether the wine is awarded Gold.
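
To make the idea concrete, here is a small sketch of an ordinary least squares fit, assuming Price is the outcome and the Gold flag (coded 0 or 1) is the predictor. The numbers are invented, and the sketch estimates only the coefficients; it does not perform the significance test mentioned above.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Invented prices for illustration only: 0 = not Gold, 1 = Gold.
is_gold = [0, 0, 0, 0, 1, 1, 1, 1]
price = [18, 22, 25, 35, 20, 30, 38, 52]

intercept, slope = fit_line(is_gold, price)
```

With a 0/1 predictor, the intercept is the average price of the non-Gold wines (25 here) and the slope is how much higher the average Gold price is (10 here).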

By running a Logistic Regression, we can confirm that even after accounting for the effects of Vintage Year and Awarded Year, Price still has some effect on whether a wine is awarded Gold.
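
The following sketch shows what such a model looks like in plain Python. It is not the code behind the result above: the data is invented, the variable names are assumptions, and the fit uses simple gradient descent with a small L2 penalty (my own choice, to keep the coefficients stable on a tiny sample).

```python
import math

def standardize(column):
    """Rescale a list of numbers to mean 0 and standard deviation 1."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]

def fit_logistic(rows, labels, steps=3000, rate=0.1, penalty=0.01):
    """Fit a logistic regression by batch gradient descent.

    Returns [intercept, coef_1, coef_2, ...]. The L2 penalty is applied
    to the coefficients only, not to the intercept.
    """
    n = len(labels)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
            error = 1.0 / (1.0 + math.exp(-z)) - label
            grad[0] += error
            for j, v in enumerate(row):
                grad[j + 1] += error * v
        new_weights = [weights[0] - rate * grad[0] / n]
        for w, g in zip(weights[1:], grad[1:]):
            new_weights.append(w - rate * (g / n + penalty * w))
        weights = new_weights
    return weights

# Invented data for illustration only.
price = [12, 15, 18, 20, 22, 25, 28, 30, 35, 40, 45, 60]
vintage_year = [2012, 2013, 2012, 2014, 2013, 2012,
                2014, 2013, 2012, 2014, 2013, 2012]
awarded_year = [2015, 2015, 2014, 2016, 2016, 2015,
                2016, 2015, 2014, 2016, 2016, 2015]
is_gold = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# One row per wine: (price, vintage year, awarded year), all standardized.
rows = list(zip(standardize(price),
                standardize(vintage_year),
                standardize(awarded_year)))
intercept, coef_price, coef_vintage, coef_awarded = fit_logistic(rows, is_gold)
```

A positive `coef_price` means that, holding the two year variables fixed, a higher price goes with higher odds of Gold. Because the inputs are standardized, the coefficients are per standard deviation rather than per dollar or per year.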

How much of the variance in Price can Award Type explain?

However, looking at the R-Squared of the linear regression model built above, the award type explains only a small part of the variance in price.
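
For reference, R-Squared is one minus the ratio of the residual sum of squares to the total sum of squares. Here is a minimal sketch of that calculation, again on invented numbers and assuming Price is regressed on a 0/1 Gold flag.

```python
def r_squared(x, y):
    """R-squared of the least squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after the fit vs. total variation in y.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Invented prices for illustration only: 0 = not Gold, 1 = Gold.
is_gold = [0, 0, 0, 0, 1, 1, 1, 1]
price = [18, 22, 25, 35, 20, 30, 38, 52]

r2 = r_squared(is_gold, price)
```

In this toy example the Gold wines cost 10 more on average, yet R-Squared is only about 0.22, because prices vary a lot within each group. That is the same pattern described above: a real difference in averages, but little of the overall price variation explained.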

This means that, yes, Price and Award Type are correlated to some degree, but even so, price is not a deciding factor for the award type.

Conclusion

The price and the award type are correlated to some extent: higher-priced wines tend to receive higher awards. However, price has little predictive power, so price alone is not a deciding factor for the award type.