Suppose you have Github issues data frame with resolved dates (i.e. closed_at column). You have another data frame with sprints that have a sprint number and begin/end dates in one month duration. Now, You want to add the sprint number to the Github issues data frame based on which sprint the resolved date (i.e. closed_at) falls in.
For example, Github issue #1002 is resolved on Oct 17, 2021 so it should fall in Sprint 5 which started on Oct 1 and ended on Oct 30.)
So the Issue Data Frame contains closed_at
column which contains the date the the issue is closed.
Sprints Data Frame contains begin and end Dates columns like below.
So you want to join GitHub issue data frame with Sprints Data Frame with the closed_at is between begin and end
condition.
So to address the issue, first you need to install fuzzyjoin Package. Click three bar icon and select Manage R Packages
.
Click CRAN tab and type in fuzzyjoin
and click Install.
And now we are ready to try the fuzzy_left_join to make the date range join condition.
Click the Plus (+) button next to Steps then select Custom R Command
.
Enter an R Script like this.
fuzzy_left_join(., Sprints,
by=c("closed_at"="begin", "closed_at"="end"),
match_fun=list(`>=`, `<=`))
it means it left joins Github issues with Sprints data frame by using closed_at column and below conditions:
With these conditions, hat it can detect which sprint is associated with each Github issue.
And after you run the command, yon can see each Github issue has a sprint number.