Program

Sunday July 15th

  • 09:00 – 09:30  Registration
  • 09:30 – 12:30  
    • Data management – packages and techniques to manipulate and maintain datasets (e.g. the packages readxl, sqldf, dplyr, spark)
    • Advanced visualization techniques – high-density scatterplots, 3-D graphs, trellis plots, using Google Maps in plots.
  • 12:30 – 13:00  Lunch Break
  • 13:00 – 16:00  Assignments

Monday July 16th

  • 09:00 – 12:00  
    • Generalized linear models – normal response, count data (Poisson regression), binomial regression, over-dispersion.
  • 12:00 – 13:00  Lunch Break
  • 13:00 – 16:00  Assignments

Tuesday July 17th

  • 09:00 – 12:00
    • Classification methods – logistic regression, classification and regression trees (CART), random forest, support vector machines.
  • 12:00 – 13:00  Lunch Break
  • 13:00 – 16:00  Assignments

Wednesday July 18th

  • 09:00 – 12:00
    • Variable selection methods – Lasso, SCAD, MCP, Spike and Slab.
    • Missing data methods – simple imputation, missing indicators, multiple imputation via chained equations (MICE).
  • 12:00 – 13:00  Lunch Break
  • 13:00 – 16:00  Assignments

Thursday July 19th

  • Hackathon