R Markdown for OSGEO

CUGOS Spring Fling 2023
Bill & Melinda Gates Center for Computer Science (CSE2)

Phil Hurvitz

2023-04-21

1 Introduction

1.1 Rationale

Reproducibility and transparency are essential for research. Tools are quickly evolving that allow combined presentation of narrative documentation, methods, analysis, and results, along with the code that generated the results. Having the code and results within one package is invaluable for checking over results or responding to reviews.

Two main integrated development environments for scientific research are Jupyter Notebooks, used mainly for Python, and RStudio Desktop and Server, used mainly for R.

For this session, we will be using RStudio Desktop with R Markdown to create a self-contained HTML file including code and results of geospatial analyses done within R and PostGIS.

1.2 Overview

This session consists of a brief lecture followed by a live demonstration.

1.3 Previously presented resources

This session follows on previous sessions presented at the CUGOS Spring Fling in years past:

2 R Markdown

R Markdown is a text format that combines readable narrative and R code for analysis and/or rendering of tables and graphics.

A process diagram (images from R Markdown Quick Tour) shows that the input .Rmd file is converted using knitr to a .md (markdown) file, which is then processed by pandoc, which produces the final document format (web page, PDF, MS Word document, slide show, handout, book, dashboard, package vignette or other format). The author creates the .Rmd file and the completed output is generated in R using the render() function or by clicking the Knit button in the RStudio interface.

2.1 R Markdown syntax

The benefit of markdown is that the format is very simple compared to more complex markup languages (e.g., HTML or \(\LaTeX\)). On the left is some R Markdown text and on the right is the rendered web page. The examples show formatting for section headings, font emphasis (italics, bold), bulleted lists, monospaced code blocks, \(\LaTeX\) format equations, and footnotes.

2.2 R Markdown syntax (continued)

R code chunks are written within delimited code regions. The R code is run during the rendering process, and any output it produces becomes part of the rendered document. If the R code chunk creates a graph or table, the output is placed in the rendered document at the position of the chunk. For example, here we see a statistical summary of speed and stopping distance from R’s built-in cars data set, as well as a scatter plot and locally smoothed regression curve with confidence intervals.
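As a minimal sketch of what such a chunk might contain (the code behind the slide is not reproduced here, so this is an assumption rather than the presenter's own code), the following base R lines summarize the cars data set and draw a scatter plot with a loess curve and an approximate 95% confidence band. In an .Rmd file these lines would sit inside a chunk that opens with three backticks followed by {r} and closes with three backticks. The object names fit, speed_grid, and pred are illustrative.

```r
# Statistical summary of speed and stopping distance
print(summary(cars))

# Locally smoothed (loess) regression of stopping distance on speed
fit <- loess(dist ~ speed, data = cars)

# Predict over a grid of speeds and request standard errors
speed_grid <- data.frame(
  speed = seq(min(cars$speed), max(cars$speed), length.out = 100)
)
pred <- predict(fit, newdata = speed_grid, se = TRUE)

# Approximate 95% confidence band based on the t distribution
upper <- pred$fit + qt(0.975, pred$df) * pred$se.fit
lower <- pred$fit - qt(0.975, pred$df) * pred$se.fit

# Scatter plot with the fitted curve (solid) and the band (dashed)
plot(cars$speed, cars$dist,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")
lines(speed_grid$speed, pred$fit)
lines(speed_grid$speed, upper, lty = 2)
lines(speed_grid$speed, lower, lty = 2)
```

When the document is knit, the printed summary and the plot would appear in the output directly below the chunk.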

Graphics and tables can be captioned with automatically generated numbers, which can also be used in cross-references.

2.3 R Markdown syntax (continued)

“Inline expressions” can be used to include calculated R outputs within the narrative. Usually these are numerical values calculated from your data.

This allows narrative text to include calculated or data-driven values rather than copied and pasted text, which requires manual updating and risks including incorrect results.

An example paragraph using inline code:

For example, in the cars data set there are 50 observations; the mean and standard deviation of speed and distance were 15.4 (5.3) miles per hour and 43 (25.8) feet.

The code for this paragraph, which assumes the dplyr package has been loaded in an earlier code chunk to provide %>% and pull(), is:

For example, in the cars data set there are
`r nrow(cars)` observations;
the mean and standard deviation of speed and distance were
`r cars %>% pull(speed) %>% mean() %>% round(1)`
(`r cars %>% pull(speed) %>% sd() %>% round(1)`)
miles per hour and
`r cars %>% pull(dist) %>% mean() %>% round(1)`
(`r cars %>% pull(dist) %>% sd() %>% round(1)`) feet.
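The values behind these inline expressions can be checked at the R console. The sketch below computes them in base R, so it does not need dplyr. The variable names are illustrative and do not appear in the original presentation.

```r
# Base R equivalents of the inline expressions
n_obs      <- nrow(cars)
speed_mean <- round(mean(cars$speed), 1)
speed_sd   <- round(sd(cars$speed), 1)
dist_mean  <- round(mean(cars$dist), 1)
dist_sd    <- round(sd(cars$dist), 1)

# Assemble the numeric part of the sentence as it reads after rendering
sentence <- sprintf(
  "%d observations; %s (%s) miles per hour and %s (%s) feet.",
  n_obs, speed_mean, speed_sd, dist_mean, dist_sd
)
print(sentence)
# → "50 observations; 15.4 (5.3) miles per hour and 43 (25.8) feet."
```

Computing each value once in a chunk and referring to the stored variable inline is one way to keep long inline expressions out of the narrative text.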

2.4 R Markdown YAML (“YAML Ain’t Markup Language”) header

The first few lines of a .Rmd file contain metadata about the document as well as specifications for output formatting options. For example, this slide show uses the following header:

---
title: "R Markdown for OSGEO"
author: "[Phil Hurvitz](mailto:phurvitz@uw.edu)"
date: '`r format(Sys.time(), "%Y-%m-%d %H:%M")`'
output:
  slidy_presentation:
    css: ['styles.css', 'https://fonts.googleapis.com/css?family=Open+Sans']
---

The YAML header specifies the output as slidy_presentation; however, other options can specify a multitude of output formats and formatting characteristics. Usually only minor changes in the .Rmd code are needed to switch from one output format to another.

2.5 R Markdown YAML (“YAML Ain’t Markup Language”) header

Some other examples from https://rmarkdown.rstudio.com/authoring_quick_tour.html:

---
title: "Sample Document"
output:
  pdf_document:
    toc: true
    highlight: zenburn
---

---
title: "Sample Document"
output:
  html_document:
    toc: true
    theme: united
  pdf_document:
    toc: true
    highlight: zenburn
---

2.6 R Markdown YAML (“YAML Ain’t Markup Language”) header

Another example uses a mailto hyperlink for the author name and automatically adds the render date and time:

---
title: "GIS examples"
author: "[Phil Hurvitz](mailto:phurvitz@uw.edu)"
date: '`r format(Sys.time(), "%Y-%m-%d %H:%M")`'
header-includes: #allows you to add in your own Latex packages
- \usepackage{float} #use the 'float' package
- \floatplacement{figure}{H} #make every figure with caption = h
output:
  bookdown::html_document2:
    number_sections: true
    self_contained: true
    code_folding: hide
    toc: true
    toc_float:
      collapsed: true
      smooth_scroll: false
  pdf_document:
    number_sections: true
    toc: true
    fig_cap: yes
    keep_tex: yes
urlcolor: blue
---
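The date field in this header is an inline R expression that is evaluated each time the document is rendered. A short sketch of what the expression returns is given here. The fixed timestamp is a made-up value (it reuses the date of this session) so that the output format can be checked, and the variable names are illustrative.

```r
# The expression used in the YAML date field: current date and time,
# formatted as year-month-day hour:minute
stamp <- format(Sys.time(), "%Y-%m-%d %H:%M")
print(stamp)

# The same format string applied to a fixed, hypothetical timestamp
fixed <- as.POSIXct("2023-04-21 09:30:00", tz = "UTC")
fixed_stamp <- format(fixed, "%Y-%m-%d %H:%M", tz = "UTC")
print(fixed_stamp)
# → "2023-04-21 09:30"
```

Because Sys.time() is called at render time, every newly rendered copy of the document carries its own timestamp.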

2.7 R Markdown rendering

The .Rmd file can be rendered to different outputs using the Knit control in RStudio:

Or by using the R command line, e.g., for this presentation:

rmarkdown::render("rmarkdown_for_osgeo_hurvitz_20230222.Rmd")

Because slidy_presentation is the output format specified in the YAML header, the .Rmd file is rendered to an HTML5 (.html) file as a Slidy presentation.
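The format named in the YAML header can also be overridden from the command line with the output_format argument of rmarkdown::render(). The sketch below is an illustration only: it writes a small hypothetical document to a temporary file rather than using the presentation's own .Rmd file, and it runs only if the rmarkdown package and pandoc are available.

```r
# Sketch: render a .Rmd file to a format other than the one in its header
rendered <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  # Hypothetical minimal document written to a temporary file
  rmd <- tempfile(fileext = ".Rmd")
  writeLines(c(
    "---",
    "title: \"Demo\"",
    "output: slidy_presentation",
    "---",
    "",
    "## A slide",
    "",
    "The cars data set has `r nrow(cars)` observations."
  ), rmd)

  # output_format takes precedence over the format in the YAML header;
  # render() returns the path of the output file
  rendered <- rmarkdown::render(rmd, output_format = "html_document",
                                quiet = TRUE)
  print(basename(rendered))
}
```

Here the header asks for a Slidy presentation, but the call produces an ordinary HTML document from the same source.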

3 Demonstration

Now that we have covered the basics of R Markdown, we will shift to a live demonstration using OSGEO.

Switch to CSDE terminal server

Rendered version: CUGOS 2023 Hurvitz

4 Conclusion

4.1 Overview

We covered:

4.2 Parting thoughts

4.3 Acknowledgments

4.4 Contact information

5 Q & A, time permitting.