Kan Nishida
Having fun analyzing interesting data and learning something I didn't know before.
This was created for Survival Analysis.
Data Source:

Data World (https://data.world/rdowns26/sf-chronicle-wine-competition-results)

---
Scraped from SF Chronicle's annual wine competition results: http://winejudging.com/medal-winners/

Data is available from 2014 to 2019, although the formatting varies slightly between years.

Wine awards range from best to worst as:

Best of Class
Double Gold
Gold
Silver
Bronze
Sweepstake wines are the top wines picked overall.
---
We can see that the northeast side of the country has a higher concentration of Maori people.

I have downloaded the population data with ethnicity information from Statistics New Zealand (http://nzdotstat.stats.govt.nz/wbos/Index.aspx) and the boundary data from this page (http://www.stats.govt.nz/browse_for_stats/Maps_and_geography/Geographic-areas/digital-boundary-files.aspx) on the same website.
I have downloaded the data from a GitHub repository called "County_Level_Election_Results_12-16" maintained by Tony McGovern (https://twitter.com/tonmcg). The 2016 data there was originally scraped from Townhall.com's 2016 election result page.
I have used the ungroup command to remove the grouping before dropping the 'Co/Dist Code' column.
This sample GitHub issue data is perfect for playing with window calculations like 'Running Total'.
This sample data would be perfect for playing around with some of the window calculations like '% of Total', 'Difference From', 'Moving Average', etc.
This is to demonstrate how to calculate 'Running Total' with dplyr.
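As a rough illustration of what a 'Running Total' computes, here is a minimal sketch in plain Python rather than dplyr. The counts are made-up values, not taken from the actual data, and the variable names are hypothetical. In dplyr the same idea is usually expressed with cumsum() inside mutate().

```python
from itertools import accumulate

# Hypothetical daily counts (illustrative values only, not the real data).
opened_per_day = [3, 1, 4, 1, 5]

# accumulate() yields the cumulative sum: each element is the total so far.
running_total = list(accumulate(opened_per_day))

print(running_total)
```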
This is to demonstrate how to calculate 'Moving Average' with dplyr.
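To show the idea behind this calculation, here is a minimal sketch in plain Python rather than dplyr. It assumes a trailing window of 3 observations, and the values and the moving_average helper are hypothetical. The data itself may well use a different window size or alignment.

```python
from collections import deque

def moving_average(values, window):
    """Trailing moving average; positions without a full window are None."""
    buf = deque(maxlen=window)  # keeps only the most recent `window` values
    out = []
    for v in values:
        buf.append(v)
        out.append(sum(buf) / window if len(buf) == window else None)
    return out

# Hypothetical series (illustrative values only).
series = [10.0, 12.0, 14.0, 16.0, 18.0]
averaged = moving_average(series, 3)

print(averaged)
```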
This is to demonstrate how to calculate 'Difference from Previous & First' with dplyr.
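The two differences can be sketched as follows, in plain Python rather than dplyr and with made-up values. In dplyr these are commonly written as x - lag(x) and x - first(x). The variable names here are hypothetical.

```python
# Hypothetical measurements in time order (illustrative values only).
values = [100, 110, 105, 120]

# Difference from the previous row; the first row has no previous value.
diff_from_previous = [None] + [b - a for a, b in zip(values, values[1:])]

# Difference from the first row of the series.
diff_from_first = [v - values[0] for v in values]

print(diff_from_previous)
print(diff_from_first)
```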
This is for demonstrating '% of Total' window calculation with dplyr.
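For reference, '% of Total' divides each value by the sum over the whole group. Here is a minimal sketch in plain Python rather than dplyr. The categories and counts are hypothetical.

```python
# Hypothetical counts per category (illustrative values only).
counts = {"bug": 6, "feature": 3, "question": 1}

# Each category's share of the overall total, as a percentage.
total = sum(counts.values())
pct_of_total = {name: 100 * n / total for name, n in counts.items()}

print(pct_of_total)
```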
I have used the quantmod package to extract the data from Yahoo Finance and prepared this data to demonstrate a time series analysis for a blog.
I have geocoded the hospitals with the zip code database from the 'zipcode' R package and kept only the top rated hospitals, using 'Overall hospital rating - star rating' as the measure.
Based on the data from NOAA, I have calculated the correlations among the regions using the 'cor' function.
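The kind of pairwise correlation involved can be sketched in plain Python. This assumes the Pearson coefficient, which is the default method of R's cor. The region names and values below are hypothetical and are not the NOAA data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical measurements per region (illustrative values only).
regions = {
    "north": [1.0, 2.0, 3.0, 4.0],
    "south": [2.0, 4.0, 6.0, 8.0],
    "west": [4.0, 3.0, 2.0, 1.0],
}

# Correlation for every pair of regions, keyed by (region_a, region_b).
corr = {(a, b): pearson(regions[a], regions[b]) for a in regions for b in regions}

print(corr[("north", "south")], corr[("north", "west")])
```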
I have used Distance and Multi-Dimensional Scaling algorithms on the United Nations General Assembly voting records from 1945 to 2014. The data is filtered for the Obama administration, but you can either remove the filter to include all the administrations or switch to a different administration.
Scraped 'Chronological List of Presidents of the United States' from the web and transformed it so that it can be used as a mapping table to map years to US Presidents.
You can use the if_else function to label days as Weekend or Weekday. This is an answer to the question posted on the Community page: https://community.exploratory.io/t/true-false-question-in-x-axis/21/1
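The labeling logic can be sketched as follows, in plain Python rather than dplyr's if_else. The dates and the day_type helper are hypothetical, chosen only to illustrate the condition.

```python
from datetime import date

def day_type(d):
    """Return 'Weekend' for Saturday/Sunday, otherwise 'Weekday'."""
    # date.weekday() counts Monday as 0, so Saturday is 5 and Sunday is 6.
    return "Weekend" if d.weekday() >= 5 else "Weekday"

# Hypothetical dates: a Friday, Saturday, Sunday, and Monday.
dates = [date(2017, 1, 6), date(2017, 1, 7), date(2017, 1, 8), date(2017, 1, 9)]
labels = [day_type(d) for d in dates]

print(labels)
```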
This is for testing the PDF parsing function.
This is for testing.
The data is from the US Department of Transportation. I have applied the dist function to calculate the distances among the columns and the cmdscale function to calculate the positions of the columns in 2-dimensional space.
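The distance step alone can be sketched in plain Python. This assumes Euclidean distance, which is the default for R's dist. The column names and values are hypothetical, and the scaling to two dimensions that cmdscale performs is not reproduced here.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Hypothetical columns, each treated as a vector (illustrative values only).
columns = {
    "col_a": [0.0, 3.0],
    "col_b": [4.0, 0.0],
    "col_c": [0.0, 0.0],
}

# Pairwise distance for every pair of columns, keyed by (name_a, name_b).
names = list(columns)
distance = {(a, b): dist(columns[a], columns[b]) for a in names for b in names}

print(distance[("col_a", "col_b")])
```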
This data is sampled from California Supply Chain Transparency data. I have tokenized, stemmed, removed stopwords, constructed n-grams, then applied SVD to reduce the dimensions, and applied K-means to build clustering models.
Comparing the currency rates of GBP and JPY against USD / EUR / JPY / GBP to see how Brexit was impacting the rates. Data: OANDA.com
Extracted tweets from Clinton's and Trump's official accounts and scored their sentiments.