🗓️ Session 11: A very short introduction to Quarto

Author
Published

10 05 2024

Modified

07 05 2024

Quarto is a modern, multi-language version of R Markdown. As with its predecessor, the idea is to let you write text and code in the very same document. This makes it easy to create good-looking and reproducible reports or papers. Moreover, Quarto can also be used to build websites and interactive apps. This website, for example, is written entirely in Quarto. In this lecture, you learn everything you need to get started with writing your first Quarto documents. In fact, it's really straightforward once you get the basic idea.

👨‍🏫 Lecture Slides

Either click on the slide area below or click here to download the slides.

---
title: "What a disaster!"
author: "Claudius"
date: '2025-05-02'
format: pdf
---
# Packages used
```{r}
library(tidyverse)
library(DataScienceExercises)
library(knitr)
```
# Exploring flight data
In this short text we explore the following data set on flights departing
from New York.
```{r}
base_data <- DataScienceExercises::nycflights21_small[1:200, ]
data.frame(head(DataScienceExercises::nycflights21_small, 50))
```
To get a first look at the relationships between the variables, consider the
following scatter plots:
```{r}
#| out-width: "40%"
#| out-height: "40%"
arrival_dep <- ggplot(data = base_data) +
geom_point(mapping = aes(x=arr_delay, y=dep_delay),
alpha=0.5, color="#00395B") +
ggplot2::theme_bw() +
labs(x="Arrival delay", y="Departure delay") +
theme(
legend.position = "bottom",
legend.title = ggplot2::element_blank(),
panel.border = ggplot2::element_blank(),
axis.line = ggplot2::element_line(colour = "grey"),
axis.ticks = ggplot2::element_line(colour = "grey")
)
arrival_dist <- ggplot(data = base_data) +
geom_point(mapping = aes(x=arr_delay, y=distance),
alpha=0.5, color="#00395B") +
ggplot2::theme_bw() +
labs(x="Arrival delay", y="Distance") +
theme(
legend.position = "bottom",
legend.title = ggplot2::element_blank(),
panel.border = ggplot2::element_blank(),
axis.line = ggplot2::element_line(colour = "grey"),
axis.ticks = ggplot2::element_line(colour = "grey")
)
arrival_month <- ggplot(data = base_data) +
geom_point(mapping = aes(y=arr_delay, x=month),
alpha=0.5, color="#00395B") +
ggplot2::theme_bw() +
labs(x="Month", y="Arrival delay") +
theme(
legend.position = "bottom",
legend.title = ggplot2::element_blank(),
panel.border = ggplot2::element_blank(),
axis.line = ggplot2::element_line(colour = "grey"),
axis.ticks = ggplot2::element_line(colour = "grey")
)
arrival_carrier <- ggplot(data = base_data) +
geom_point(mapping = aes(y=arr_delay, x=carrier),
alpha=0.5, color="#00395B") +
ggplot2::theme_bw() +
labs(x="Carrier", y="Arrival delay") +
theme(
legend.position = "bottom",
legend.title = ggplot2::element_blank(),
panel.border = ggplot2::element_blank(),
axis.line = ggplot2::element_line(colour = "grey"),
axis.ticks = ggplot2::element_line(colour = "grey")
)
ggpubr::ggarrange(
arrival_dep, arrival_dist,
arrival_month, arrival_carrier,
ncol = 2, nrow = 2)
```
This suggests that there is a strong correlation between departure and arrival
delay. To compute the correlation we might use the following R code:
```{r}
#| echo: false
cor(base_data$arr_delay, base_data$dep_delay)
```
There is indeed a very strong correlation. But is it significant? Let's check
it using the Pearson correlation test:
```{r}
cor.test(base_data$arr_delay, base_data$dep_delay, method = "pearson")
```
Of course, these are just preliminary results; from a methodological point of
view there is still much to do...
---
title: "What a beauty!"
author: "Claudius"
date: '2025-05-02'
execute:
  warning: false
  message: false
format: pdf
header-includes:
  - \usepackage{setspace}
  - \onehalfspacing
---
# Packages used
```{r}
library(tidyverse)
library(DataScienceExercises)
library(knitr)
```
# Exploring flight data
In this short text we explore the following data set on flights departing
from New York.
```{r}
#| label: read-data
#| echo: false
base_data <- DataScienceExercises::nycflights21_small[1:200, ]
knitr::kable(head(DataScienceExercises::nycflights21_small, 5))
```
To get a first look at the relationships between the variables, consider the
following scatter plots:
```{r}
#| echo: false
#| out-width: "70%"
#| out-height: "70%"
#| fig-align: center
arrival_dep <- ggplot(data = base_data) +
  geom_point(mapping = aes(x=arr_delay, y=dep_delay),
             alpha=0.5, color="#00395B") +
  ggplot2::theme_bw() +
  labs(x="Arrival delay", y="Departure delay") +
  theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_blank(),
    panel.border = ggplot2::element_blank(),
    axis.line = ggplot2::element_line(colour = "grey"),
    axis.ticks = ggplot2::element_line(colour = "grey")
  )
arrival_dist <- ggplot(data = base_data) +
  geom_point(mapping = aes(x=arr_delay, y=distance),
             alpha=0.5, color="#00395B") +
  ggplot2::theme_bw() +
  labs(x="Arrival delay", y="Distance") +
  theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_blank(),
    panel.border = ggplot2::element_blank(),
    axis.line = ggplot2::element_line(colour = "grey"),
    axis.ticks = ggplot2::element_line(colour = "grey")
  )
arrival_month <- ggplot(data = base_data) +
  geom_point(mapping = aes(y=arr_delay, x=month),
             alpha=0.5, color="#00395B") +
  ggplot2::theme_bw() +
  labs(x="Month", y="Arrival delay") +
  theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_blank(),
    panel.border = ggplot2::element_blank(),
    axis.line = ggplot2::element_line(colour = "grey"),
    axis.ticks = ggplot2::element_line(colour = "grey")
  )
arrival_carrier <- ggplot(data = base_data) +
  geom_point(mapping = aes(y=arr_delay, x=carrier),
             alpha=0.5, color="#00395B") +
  ggplot2::theme_bw() +
  labs(x="Carrier", y="Arrival delay") +
  theme(
    legend.position = "bottom",
    legend.title = ggplot2::element_blank(),
    panel.border = ggplot2::element_blank(),
    axis.line = ggplot2::element_line(colour = "grey"),
    axis.ticks = ggplot2::element_line(colour = "grey")
  )
ggpubr::ggarrange(
  arrival_dep, arrival_dist,
  arrival_month, arrival_carrier,
  ncol = 2, nrow = 2)
```
These plots suggest that there is a strong correlation between departure and
arrival delay. To compute the correlation we might use the following R code:
```{r}
#| echo: true
#| results: hide
cor_coef <- cor(base_data$arr_delay, base_data$dep_delay)
```
This produces a correlation coefficient of `r round(cor_coef, 3)`, suggesting
that there is indeed a very strong correlation.
But is it significant? Let's check it using the Pearson correlation test:
```{r}
c_test <- cor.test(
  x = base_data$arr_delay,
  y = base_data$dep_delay,
  method = "pearson")
```
The most relevant statistics are:
```{r}
#| echo: false
knitr::kable(tibble(
  "t-stat"=c_test$statistic,
  "df"=c_test$parameter,
  "p-val"=c_test$p.value,
  "95% conf interval"=paste0(
    "[",
    paste(round(c_test$conf.int, 3), collapse = "; "),
    "]")
  ), align = "c")
```
Of course, these are just preliminary results; from a methodological point of
view there is still much to do...
\newpage
# The corrections we did
To make this document look *much* nicer immediately, the following changes
were made:

* Suppress warnings and messages by default
* Set line spacing to one and a half (just looked it up on the internet)
* Do not show the whole table in the beginning but only the first lines
* Do not show the R code in this context since it is not meaningful
* Use `knitr::kable()` to print tables
* Do not show the code for preparing the plot since it is not necessary to
understand the message
* Adjust the `out-width` and `out-height` options in the plot chunk such that the
plot is easier to read, and center the plot since this looks nicer
* Show the code used to compute the correlation coefficient in a readable
way, but summarize the output concisely, focusing on what is relevant
* Let the last section start on a new page using `\newpage` to avoid the
buggy page continuation of bullet lists
* Report the result of the Pearson correlation test in a more concise way

Of course, the last sentence above is true: to analyze this data in a
meaningful way, we must invest a bit more thinking into the correct analysis
method!
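
The last correction can be tried out without the course package. The following is a minimal sketch, assuming simulated delays in place of `nycflights21_small`; the vectors `dep_delay` and `arr_delay` and the object `test_summary` are made up for illustration and are not part of the original document. It collects the same statistics as the table above in a one-row base R data frame:

```r
# Simulated stand-in for the flight data (assumption: NOT the course data set)
set.seed(123)
dep_delay <- rnorm(200, mean = 10, sd = 30)
arr_delay <- dep_delay + rnorm(200, mean = -5, sd = 10)

# Pearson correlation test, as in the document above
c_test <- cor.test(x = arr_delay, y = dep_delay, method = "pearson")

# Collect the most relevant statistics in a one-row data frame;
# check.names = FALSE keeps the column names exactly as written
test_summary <- data.frame(
  "t-stat" = unname(c_test$statistic),
  "df" = unname(c_test$parameter),
  "p-val" = c_test$p.value,
  "95% conf interval" = paste0(
    "[",
    paste(round(c_test$conf.int, 3), collapse = "; "),
    "]"),
  check.names = FALSE
)
print(test_summary)
```

Inside a Quarto document, passing `test_summary` to `knitr::kable()` should give a compact table similar to the one shown above.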

🎥 Lecture videos

So far, there are no learning videos available for this lecture.

📚 Mandatory Reading

Further Reading

✍️ Coursework

  • Do the `Quarto` exercises from the DataScienceExercises package:
learnr::run_tutorial(
  name = "Quarto", 
  package = "DataScienceExercises", 
  shiny_args=list("launch.browser"=TRUE))
  • Do the following practical exercise:

Create a new Quarto document where you set the title, date, and the author explicitly. Write a sample text that comprises…

  • …at least one level 1 heading
  • …at least two level 2 headings
  • …a YAML part that specifies that R code remains hidden by default
  • …one R chunk where both the output and the code are printed in the final document
  • …one R chunk that produces a simple ggplot object and where the code producing the plot is hidden

Then do the following:

  1. Knit the document to html with a floating table of contents and a special theme.

  2. Make the document available via Netlify Drop and add the possibility to download the underlying qmd file. Note: For Netlify Drop to work, the html file must be called `index.html`!

  3. Knit the document to PDF and make sure that it includes a table of contents.
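
For the three tasks above, most of the work happens in the YAML header. The following is a minimal sketch rather than the official solution, assuming a reasonably recent Quarto version; the specific values (`left`, `darkly`) are just one possible choice, and `code-tools` only lets readers view the source from the rendered page, which is not quite the same as a download link:

```yaml
format:
  html:
    toc: true
    toc-location: left     # table of contents in the left margin instead of the body
    theme: darkly          # one of the Bootswatch themes shipped with Quarto
    code-tools: true       # adds a "Code" menu from which readers can view the source
    embed-resources: true  # bundle all resources into a single html file
  pdf:
    toc: true
```

Since Netlify Drop expects a file called `index.html`, naming the source file `index.qmd` gives the right output name automatically.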

The Netlify version can be found at: https://quarto-ex-solution.netlify.app/
---
title: "A possible solution"
author: "Claudius"
date: "2025-05-02"
execute:
  echo: false
format:
  html:
    toc: true
    toc-depth: 2
    toc-location: body
    number-sections: true
    theme: darkly
  pdf:
    toc: true
    toc-depth: 2
    toc-location: body
    number-sections: true
---
# Introduction
In the (invisible) YAML header we set the default value for the chunk option
`echo` to `FALSE`, meaning that code is hidden by default.
Also check out the header, in which we explicitly set `title`, `date`, and
`author` - the result of this can be seen in the rendered document. To comply
with the respective tasks we added the specifications
`toc: true` and `toc-location: body` to `html` to add the
table of contents at the beginning, and `theme: darkly` to use this theme.
As you could expect, `number-sections: true` ensures section headings are
numbered.
To the `pdf` format we add the TOC-related options and `number-sections: true`.
## A second level heading
Again some text. Here is the first chunk.
The code is shown in the rendered document because we set the chunk option
`echo: true`, and so is the output produced by the code because the chunk
option `include` is set to `TRUE` by default:
```{r}
#| echo: true
vector_random_numbers <- rnorm(
n = 12, mean = 5, sd = 2)
vector_random_numbers
```
## A visualization
Here we produce the desired ggplot. The code for it remains hidden because
we did not deviate from the default value of `echo`, which we set to `FALSE`
in the YAML header.
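
The chunk that creates the plot is not reproduced here. A minimal sketch of what it could look like, assuming `ggplot2` is installed; the data frame `plot_data`, its columns, and the object `simple_plot` are hypothetical, and only the idea of plotting twelve random numbers is borrowed from the previous section:

```r
library(ggplot2)

# Hypothetical example data: twelve random draws, as in the chunk above
set.seed(42)
plot_data <- data.frame(
  index = 1:12,
  value = rnorm(n = 12, mean = 5, sd = 2)
)

# A simple scatter plot; in the .qmd file this would sit in an {r} chunk
# without `#| echo: true`, so that only the plot appears in the output
simple_plot <- ggplot(data = plot_data, mapping = aes(x = index, y = value)) +
  geom_point() +
  theme_bw() +
  labs(x = "Draw", y = "Value")
simple_plot
```

Because `echo: false` is the document-wide default set in the YAML header, no chunk option is needed to hide this code.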
# Using Netlify Drop
The document was published using [Netlify Drop](https://app.netlify.com/drop),
as explained in the slides.
This is the reason the file is called `index.qmd`: Netlify Drop only works if
an HTML file called `index.html` is included in the uploaded directory.