contribution - A Tiny Contribution Table Generator Based on ggplot2
王诗翔 · 2019-07-13
Categories:
r  
Tags:
r  
package  
contribution  
The goal of contribution is to generate contribution table for credit assignment in a project. This is inspired by Nick Steinmetz (see twitter https://twitter.com/SteinmetzNeuro/status/1147241138291527681).
Authors contributions sections are new and still don’t appear in major journals, but can be improved. Rather than text listing each author’s contributions, the same data can be presented as a table with rows corresponding to contributions and columns for each author.
Advantages include:
Graphics are faster and easier to comprehend. They are also easier to locate in the document.
This format can be readily augmented with additional information, such as distinguishing between ‘major’ and ‘minor’ contributions of each type.
This format can be read as easily in either direction, answering both “Who did X?” and “What did person Y do?” equally.
This format lends itself to a nice extension for including your author contribution information on your CV: a similar table where each column is one of your papers.
A difficulty with this is the diversity of terminology used, which would be helped by refinement and more widespread adoption of the CRediT framework (https://www.casrai.org/credit.html ).
Table of Contents
Feature
- Support table type
- 3-level contribution (i.e. ‘None’, ‘Minor’ and ‘Major’)
- numeric contribution
- Pull commit number and code lines from GitHub
- Dataset
palette
contains 27 color maps
Installation
You can install the released version of contribution from CRAN with:
install.packages("contribution")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("openbiox/contribution")
Basic example
This is a basic example which shows you how to plot a simple contribution table:
library(contribution)
data("demo")
demo
#> # A tibble: 5 x 4
#> Class WWP NAS TM
#> <chr> <chr> <chr> <chr>
#> 1 Designed research <NA> Major Major
#> 2 Performed research <NA> Major <NA>
#> 3 Developed tool Major Minor <NA>
#> 4 Analyzed data Major Minor Minor
#> 5 Wrote the paper Major Minor Major
The accepted data format is a data.frame
whose first column show the role
and the other
columns show the people or projects. This format is easy to create using R or other tools like Excel.
For a 3-level contribution table, only Minor
and Major
are valid, a NA
value should put
in cell for no contribution.
generate(demo)
The white box represents no contribution, the grey box represents minor contribution, and the black box represents major contribution.
The table is nice, easy to read. The result is a ggplot
object, so you can modify in your way!
You can also use other colors and scale_fill_*
function from ggplot2 to map colors:
library(ggplot2)
generate(demo, text_angle_x = 20, color_map = scale_fill_brewer(palette ="Oranges"))
When it is not easy to see the meaning of color, you can show the legend.
generate(demo, text_angle_x = 20, color_map = scale_fill_brewer(palette ="Set1"), show_legend = TRUE)
GitHub contribution
To show how to generate numeric contribution table, here we use commit number of GitHub project as an example.
To get data from GitHub, I created a function pull_github()
.
# Github project contributions
my_contr = dplyr::tibble(
repo = c("UCSCXenaTools", "maftools"),
owner = c("ShixiangWang", "PoisonAlien"),
username = "ShixiangWang",
role = c("Developer", "Contributor"))
my_contr
#> # A tibble: 2 x 4
#> repo owner username role
#> <chr> <chr> <chr> <chr>
#> 1 UCSCXenaTools ShixiangWang ShixiangWang Developer
#> 2 maftools PoisonAlien ShixiangWang Contributor
contr_tb = pull_github(data = my_contr)
contr_tb
#> # A tibble: 2 x 3
#> role `PoisonAlien/maftools` `ShixiangWang/UCSCXenaTools`
#> <chr> <int> <int>
#> 1 Contributor 8 0
#> 2 Developer 0 203
Now we have data format the generate()
function can accept, show the plot!
generate(contr_tb, show_legend = TRUE, hjust_x = 0)
Palette
To change colors of contribution graph, I generated a dataset palette
according to themes provided by project github-contribution-color-graph.
We can show the palette and choose the color we like.
show_palette()
Just put the name to option palette_name
.
generate(contr_tb, show_legend = TRUE, hjust_x = 0,
palette_name = "psychedelic")
generate(contr_tb, show_legend = TRUE, hjust_x = 0,
palette_name = "cyan")
More custom options please run ?generate
in R console.
CRediT
CRediT can help assign the role, this data has been stored in this package.
data("CRediT")
Show whole table.
knitr::kable(CRediT, format = "html")
Role | Definition |
---|---|
Conceptualization | Ideas; formulation or evolution of overarching research goals and aims. |
Data curation | Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later re-use. |
Formal analysis | Application of statistical, mathematical, computational, or other formal techniques to analyse or synthesize study data. |
Funding acquisition | Acquisition of the financial support for the project leading to this publication. |
Investigation | Conducting a research and investigation process, specifically performing the experiments, or data/evidence collection. |
Methodology | Development or design of methodology; creation of models. |
Project administration | Management and coordination responsibility for the research activity planning and execution. |
Resources | Provision of study materials, reagents, materials, patients, laboratory samples, animals, instrumentation, computing resources, or other analysis tools. |
Software | Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components. |
Supervision | Oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team. |
Validation | Verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs. |
Visualization | Preparation, creation and/or presentation of the published work, specifically visualization/data presentation. |
Writing – original draft | Preparation, creation and/or presentation of the published work, specifically writing the initial draft (including substantive translation). |
Writing – review & editing | Preparation, creation and/or presentation of the published work by those from the original research group, specifically critical review, commentary or revision – including pre- or post-publication stages. |
More about GitHub stats
contribution package supports the following GitHub stats:
- commit number contributed to a repo
- number of code line contributed a repo
- all
- addition
- deletion
Firstly define our targets.
library(contribution)
my_contr = dplyr::tibble(
repo = c("UCSCXenaTools", "XenaShiny", "metawho",
"maftools", "DoAbsolute", "contribution"),
owner = c("ropensci", "openbiox", "ShixiangWang",
"PoisonAlien", "ShixiangWang", "openbiox"),
username = "ShixiangWang",
role = c("Developer", "Project leader", "Developer",
"Contributor", "Developer", "Developer"))
my_contr
#> # A tibble: 6 x 4
#> repo owner username role
#> <chr> <chr> <chr> <chr>
#> 1 UCSCXenaTools ropensci ShixiangWang Developer
#> 2 XenaShiny openbiox ShixiangWang Project leader
#> 3 metawho ShixiangWang ShixiangWang Developer
#> 4 maftools PoisonAlien ShixiangWang Contributor
#> 5 DoAbsolute ShixiangWang ShixiangWang Developer
#> 6 contribution openbiox ShixiangWang Developer
Commit number
At default, pull_github()
function will pull commit number from GitHub API.
contr_tb = pull_github(data = my_contr)
contr_tb
#> # A tibble: 3 x 7
#> role `openbiox/contr… `openbiox/XenaS… `PoisonAlien/ma… `ropensci/UCSCX…
#> <chr> <int> <int> <int> <int>
#> 1 Cont… 0 0 8 0
#> 2 Deve… 0 0 0 203
#> 3 Proj… 0 0 0 0
#> # … with 2 more variables: `ShixiangWang/DoAbsolute` <int>,
#> # `ShixiangWang/metawho` <int>
generate(contr_tb, show_legend = TRUE, hjust_x = 0,
palette_name = "indigo", text_angle_x = 60)
Number of code line
If we want to get number of code line, we can set the type
option.
# Total number
contr_tb1 = pull_github(data = my_contr, report_lines = T, type = 'all')
contr_tb1
#> # A tibble: 3 x 7
#> role `openbiox/contr… `openbiox/XenaS… `PoisonAlien/ma… `ropensci/UCSCX…
#> <chr> <int> <int> <int> <int>
#> 1 Cont… 0 0 246 0
#> 2 Deve… 0 0 0 127542
#> 3 Proj… 0 0 0 0
#> # … with 2 more variables: `ShixiangWang/DoAbsolute` <int>,
#> # `ShixiangWang/metawho` <int>
# Number of additions
contr_tb2 = pull_github(data = my_contr, report_lines = T, type = 'add')
contr_tb2
#> # A tibble: 3 x 7
#> role `openbiox/contr… `openbiox/XenaS… `PoisonAlien/ma… `ropensci/UCSCX…
#> <chr> <int> <int> <int> <int>
#> 1 Cont… 0 0 210 0
#> 2 Deve… 0 0 0 81687
#> 3 Proj… 0 37233 0 0
#> # … with 2 more variables: `ShixiangWang/DoAbsolute` <int>,
#> # `ShixiangWang/metawho` <int>
# Number of deletions
contr_tb3 = pull_github(data = my_contr, report_lines = T, type = 'del')
contr_tb3
#> # A tibble: 3 x 7
#> role `openbiox/contr… `openbiox/XenaS… `PoisonAlien/ma… `ropensci/UCSCX…
#> <chr> <int> <int> <int> <int>
#> 1 Cont… 0 0 36 0
#> 2 Deve… 0 0 0 45855
#> 3 Proj… 0 29053 0 0
#> # … with 2 more variables: `ShixiangWang/DoAbsolute` <int>,
#> # `ShixiangWang/metawho` <int>
generate(contr_tb1, show_legend = TRUE, hjust_x = 0,
palette_name = "pink", text_angle_x = 60)
generate(contr_tb2, show_legend = TRUE, hjust_x = 0,
palette_name = "teal", text_angle_x = 60)
generate(contr_tb3, show_legend = TRUE, hjust_x = 0,
palette_name = "summer", text_angle_x = 60)
GitHub limit and token management
For unauthenticated requests, the rate limit allows for up to 60 requests per hour. For API requests using Basic Authentication or OAuth, you can make up to 5000 requests per hour. Here we use token to manage this. Obtain a personal access token (PAT) from here: https://github.com/settings/tokens.
You can see how many times you left with pull_github_limit()
function.
pull_github_limit()
#> {
#> "resources": {
#> "core": {
#> "limit": 60,
#> "remaining": 25,
#> "reset": 1596954378
#> },
#> "graphql": {
#> "limit": 0,
#> "remaining": 0,
#> "reset": 1596955038
#> },
#> "integration_manifest": {
#> "limit": 5000,
#> "remaining": 5000,
#> "reset": 1596955038
#> },
#> "search": {
#> "limit": 10,
#> "remaining": 10,
#> "reset": 1596951498
#> }
#> },
#> "rate": {
#> "limit": 60,
#> "remaining": 25,
#> "reset": 1596954378
#> }
#> }
Typically, you can set token to GITHUB_PAT
variable in your .Renviron
file using the following format:
GITHUB_PAT=8c70fd8419398999c9ac5bacf3192882193cadf2
You can also set it in your .Rprofile
file using the following format:
Sys.setenv(GITHUB_PAT="8c70fd8419398999c9ac5bacf3192882193cadf2")