Course Schedule

Data science learning pathway

R Stats Bootcamp Learning Path

This self-paced bootcamp takes most people about 40 to 60 hours to complete. The schedule below gives an overview of all course modules and resources. Work through the materials in order for the best learning experience.

Community Support
  • Discord coming soon

Module 1: R Foundations

Learn the fundamentals of R programming and the RStudio environment.

1. R and RStudio Setup

Getting started with the R programming environment

  • Installing R and RStudio
  • Navigating the RStudio interface
  • Creating your first R script
  • Understanding R packages

Slides are available for this lesson.

2. R Language Basics

Core concepts of the R programming language

  • R syntax and data types
  • Variables and assignment
  • Basic operations and calculations
  • Control structures (if/else, loops)

Slides are available for this lesson.

3. Functions and Packages

Working with functions and extending R’s capabilities

  • Using built-in functions
  • Creating your own functions
  • Installing and loading packages
  • Function documentation and help

4. Data Objects

Understanding R’s data structures

  • Vectors, matrices, and arrays
  • Lists and factors
  • Working with dates and times
  • Type conversion and coercion

5. Data Frames

Working with tabular data in R

  • Creating and manipulating data frames
  • Accessing data frame elements
  • Adding and modifying columns
  • Merging data frames

6. Data Manipulation

Techniques for cleaning and transforming data

  • Filtering and subsetting data
  • Sorting and arranging data
  • Summarizing data
  • Reshaping data (wide vs. long format)

Module 2: Statistical Analysis

Apply R programming to statistical analysis and data visualization.

7. Question, Explore, Analyze

The data analysis workflow

  • Formulating research questions
  • Exploratory data analysis
  • Data visualization principles
  • Communicating findings

8. Sampling Distributions

Understanding probability and sampling

  • Random sampling
  • Probability distributions
  • The Central Limit Theorem
  • Confidence intervals

9. Correlation

Measuring relationships between variables

  • Correlation coefficients
  • Visualizing correlations
  • Testing correlation significance
  • Correlation vs. causation

10. Simple Linear Regression

Modeling relationships between variables

  • Linear regression concepts
  • Fitting regression models in R
  • Interpreting regression output
  • Assessing model fit

11. T-tests

Comparing group means

  • One-sample t-tests
  • Independent samples t-tests
  • Paired samples t-tests
  • Effect sizes and power

12. One-way ANOVA

Comparing means across multiple groups by analyzing variance

  • ANOVA concepts and assumptions
  • Conducting ANOVA in R
  • Post-hoc tests
  • Reporting ANOVA results

Module 3: Reproducible Research

Learn essential tools and practices for reproducible data science.

13. Reproducibility Principles

Introduction to reproducible research

  • Why reproducibility matters
  • Components of reproducible workflows
  • Documentation best practices
  • File organization strategies

14. R Markdown

Creating dynamic documents

  • R Markdown basics
  • Combining code, results, and narrative
  • Document formatting options
  • Generating reports in multiple formats

15. Git and GitHub Basics

Version control for data science

  • Understanding version control
  • Setting up Git and GitHub
  • Basic Git workflow
  • Tracking changes to your code

16. Collaborative Workflows

Working effectively with others

  • Project organization
  • Sharing code and data
  • Collaboration best practices
  • Maintaining reproducibility in teams
