- Motivation for combining R and Python
- Levels of integration
- Break analysis into homogeneous chunks
- Use a “bridge”
- Truly integrated workflow
April 22, 2022
Python
R
Which #superheroe are you?(#batman Vs. #Superman) == (#R Vs. #Python)? #datascience @roopamu https://t.co/B1gO8MT1Zr pic.twitter.com/GR3pUiZ6rS
— Antoine (@AntoineTrdc) November 1, 2015
- Use a “bridge” (such as rpy2) that allows calling the second language from the first
- RStudio/reticulate and Jupyter/rpy2
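The first level, breaking the analysis into homogeneous chunks, needs no bridge at all: each language runs as its own step and the steps exchange data through files. Below is a minimal sketch of the Python side. The file name `chunk1_output.csv` and the example values are hypothetical, and the R step is only attempted if an `Rscript` executable happens to be on the PATH.

```python
import csv
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical exchange file; the name and the values are illustrative only.
workdir = Path(tempfile.mkdtemp())
exchange_file = workdir / "chunk1_output.csv"

# Chunk 1 (Python): produce a result and write it in a language-neutral format.
rows = [{"x": 1, "y": "a"}, {"x": 2, "y": "b"}, {"x": 3, "y": "c"}]
with exchange_file.open("w", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["x", "y"])
    writer.writeheader()
    writer.writerows(rows)

# Chunk 2 (R): read the same file in a separate R process, if R is installed.
r_code = 'd <- read.csv(commandArgs(trailingOnly = TRUE)[1]); cat(mean(d$x))'
rscript = shutil.which("Rscript")
if rscript is not None:
    result = subprocess.run(
        [rscript, "-e", r_code, str(exchange_file)],
        capture_output=True,
        text=True,
        check=False,
    )
    r_output = result.stdout.strip()
else:
    r_output = None  # no R available; only the Python chunk ran
```

The design choice here is that neither process knows anything about the other beyond the file format, which keeps each chunk easy to run and debug on its own, at the cost of writing intermediate files.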
I will use background colors to indicate code from different languages:
# R code
R.version.string
## [1] "R version 4.1.2 (2021-11-01)"
# python code
import sys
sys.version
## '3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:34:28) \n[GCC 10.3.0]'
# shell script (bash)
echo ${BASH_VERSION}
## 4.2.46(2)-release
library(reticulate)
os <- import("os")
os$listdir(".")
##  [1] "02_exercise_rstudio_problem-based.html"
##  [2] "styles.css"
##  [3] "None-requirements.txt"
##  [4] "ideas_2022.txt"
##  [5] "04_exercise_jupyter.html"
##  [6] ".ipynb_checkpoints"
##  [7] "01_introduction.html"
##  [8] "requirements.txt"
##  [9] "FMI_python_config.R"
## [10] "01_introduction.Rmd"
## [11] ".Rhistory"
## [12] "02_exercise_rstudio_problem-based.Rmd"
## [13] "sinfo-requirements.txt"
## [14] "04_exercise_jupyter.ipynb"
## [15] "03_exercise_rstudio.html"
## [16] "R_requirements.R"
## [17] "figures"
## [18] "03_exercise_rstudio.Rmd"
## [19] "pythonenv"
## [20] "rstudio_reticulate_examples.Rmd"
## [21] "nohup.out"
import os
os.listdir(".")
## ['02_exercise_rstudio_problem-based.html', 'styles.css', 'None-requirements.txt', 'ideas_2022.txt', '04_exercise_jupyter.html', '.ipynb_checkpoints', '01_introduction.html', 'requirements.txt', 'FMI_python_config.R', '01_introduction.Rmd', '.Rhistory', '02_exercise_rstudio_problem-based.Rmd', 'sinfo-requirements.txt', '04_exercise_jupyter.ipynb', '03_exercise_rstudio.html', 'R_requirements.R', 'figures', '03_exercise_rstudio.Rmd', 'pythonenv', 'rstudio_reticulate_examples.Rmd', 'nohup.out']
R | Python | Examples |
---|---|---|
Single-element vector | Scalar | 1, 1L, TRUE, "foo" |
Multi-element vector | List | c(1.0, 2.0, 3.0), c(1L, 2L, 3L) |
List of multiple types | Tuple | list(1L, TRUE, "foo") |
Named list | Dict | list(a = 1L, b = 2.0), dict(x = x_data) |
Matrix/Array | NumPy ndarray | matrix(c(1,2,3,4), nrow = 2, ncol = 2) |
Data Frame | Pandas DataFrame | data.frame(x = c(1,2,3), y = c("a", "b", "c")) |
Function | Python function | function(x) x + 1 |
Raw | Python bytearray | as.raw(c(1:10)) |
NULL, TRUE, FALSE | None, True, False | NULL, TRUE, FALSE |
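The table can also be read from the Python side. The sketch below encodes the rows that involve only built-in Python types as a lookup. `r_counterpart` is a hypothetical helper written for illustration; it is not part of reticulate, it skips the NumPy and pandas rows, and the real conversion is done by reticulate and may involve details the table does not show.

```python
def r_counterpart(value):
    """Return the R-side name that the table pairs with a Python value.

    Hypothetical helper for illustration only; it mirrors the table rows
    for built-in Python types and does not call reticulate.
    """
    if value is None:
        return "NULL"
    if isinstance(value, (bool, int, float, str)):
        return "single-element vector"
    if isinstance(value, bytearray):
        return "raw"
    if isinstance(value, list):
        return "multi-element vector"
    if isinstance(value, tuple):
        return "list of multiple types"
    if isinstance(value, dict):
        return "named list"
    if callable(value):
        return "function"
    raise TypeError(f"no table row for {type(value).__name__}")


# Walk through one example value per table row covered by the helper.
examples = [1.0, "foo", [1, 2, 3], (1, True, "foo"), {"a": 1, "b": 2.0},
            bytearray(range(1, 11)), None, len]
for example in examples:
    print(type(example).__name__, "->", r_counterpart(example))
```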
reticulate in the background to bridge to Python
reticulate for Python calls and object translation
import rpy2.robjects as robjects
pi = robjects.r['pi']
pi[0]
## 3.141592653589793
pi
## [1] 3.141593
Code cells are declared in a header:
The special objects r and py can be used to access the “other side”:
%R [-i INPUT] [-o OUTPUT] [...] [code [code ...]]
day1_python_and_R/01_introduction.html
day1_python_and_R/02_exercise_rstudio_problem-based.html
day1_python_and_R/03_exercise_rstudio.html
day1_python_and_R/04_exercise_jupyter.html
Thank you!