๐Ÿ—“๏ธ Sessions 15 and 16: Sampling

Author:

Published: 07 06 2024

Modified: 14 06 2024

A central concept in data science, and in applied statistics more generally, is sampling: the strategy of using (small) samples to learn about a (large) population. For example, if you wanted to understand the effect of TV advertising on the consumer behaviour of young men in Germany, you could in principle study the whole population of young men in Germany. Since this is usually not feasible, you would instead take a sample of young men, study their behaviour, and then generalise to the whole population. In this session we discuss when and how such generalisation is possible. In this context, we also introduce Monte Carlo simulations and two results from probability theory that underlie much of modern sampling theory: the law of large numbers and the central limit theorem.
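To make these ideas concrete, here is a minimal Monte Carlo sketch in R (the simulated population, sample size, and number of replications are illustrative assumptions, not material from the lecture): it repeatedly draws small samples from a simulated population and records the sample means.

# Illustrative Monte Carlo sketch; all numbers are arbitrary assumptions
set.seed(123)

# A simulated 'population', e.g. weekly TV hours of 100,000 hypothetical consumers
population <- rgamma(n = 100000, shape = 2, scale = 1.5)

# Repeatedly draw small samples and record each sample mean
n_samples   <- 5000  # number of Monte Carlo replications
sample_size <- 50    # observations per sample
sample_means <- replicate(
  n_samples,
  mean(sample(population, size = sample_size))
)

# Law of large numbers: the sample means centre on the population mean
mean(population)
mean(sample_means)

# Central limit theorem: the sample means are approximately normally distributed
hist(sample_means, breaks = 50,
     main = "Sampling distribution of the sample mean",
     xlab = "Sample mean")

Running the script, the average of the sample means comes out very close to the population mean, and the histogram looks roughly bell-shaped even though the simulated population itself is skewed; this is the pattern the two theorems describe.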

๐Ÿ‘จโ€๐Ÿซ Lecture Slides

Either click on the slide area below or click here to download the slides.

🎥 Lecture Videos

So far, there are no lecture videos available for this session.

📚 Mandatory Reading

โœ๏ธ Coursework

  • Do the exercises "Sampling" from the DataScienceExercises package:

learnr::run_tutorial(
  name = "Sampling",
  package = "DataScienceExercises",
  shiny_args = list("launch.browser" = TRUE)
)

References

Ismay, C. and Kim, A. Y.-S. (2020) Statistical inference via data science: A ModernDive into R and the tidyverse, Boca Raton: CRC Press, Taylor and Francis Group. Available at https://moderndive.com/index.html.