Data Collection

Methods

  • Retrospective Study
  • Observational Study
  • Designed Experiment

Treatment and Control Group

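  • Treatment Group -- the subjects that receive the treatment being studied
  • Control Group -- the subjects that do not receive the treatment; they provide the baseline for comparison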

Observational Study

Simply observes and measures a variable of interest. The researcher has no control over the situation and does not impose a treatment.

Pros

  • Can detect associations (which may or may not be meaningful)

Cons

  • Can't establish a cause and effect relationship.

Confounding (Lurking) Factors

Variables that we cannot control, or that might be hidden (not identified), and that affect the outcome of the study.
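
A small simulation can illustrate how a lurking variable produces an association with no direct cause behind it. This is only a sketch under made-up assumptions: the variable names, the normal noise model, and the sample size are all hypothetical and not taken from any real study.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

rng = random.Random(42)
n = 5000

# Hypothetical lurking variable (say, daily temperature) that nobody recorded.
lurking = [rng.gauss(0, 1) for _ in range(n)]

# Two observed variables that each depend on the lurking variable
# but have no direct effect on each other.
x = [z + rng.gauss(0, 1) for z in lurking]
y = [z + rng.gauss(0, 1) for z in lurking]

# The observational data show a clear positive association.
observed = pearson(x, y)

# Subtracting the lurking variable leaves only independent noise,
# so the association largely disappears.
adjusted = pearson([a - z for a, z in zip(x, lurking)],
                   [b - z for b, z in zip(y, lurking)])

print(f"observed: {observed:.2f}, after adjusting: {adjusted:.2f}")
```

In a real observational study the lurking variable is usually not available to subtract, which is why the association alone cannot establish cause and effect.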

Designed (Controlled) Experiments

Designed experiments deliberately impose a treatment on individuals and record their responses.

They compare the recorded responses against those of a control group, a placebo group, or another treatment.

Goals of a designed experiment:

  • Replication
  • Randomization
  • Control of Error

Pros

  • Can establish cause-and-effect relationships between the controlled variables and the responses.

Cons

  • Cannot be done when the variables cannot be controlled.
  • Cannot be applied to some studies for ethical or moral reasons.

Common Experiment Designs

  • Completely Randomized
  • Block Design -- subjects are first separated into blocks of similar individuals, then the treatment and control groups are randomly assigned within each block (for example, separate "blocks" for males and females in a medication study)
  • Matched Pairs -- each individual or unit is paired with a similar one and their outcomes are compared (for example, NASA sending one twin to space and comparing the two upon return), or the same individual is measured before and after treatment.
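
The difference between a completely randomized design and a block design can be sketched in a few lines. This is a minimal illustration, not a standard library routine: the function names, the (id, sex) subject records, and the even split into two groups are all assumptions made for the example.

```python
import random

def completely_randomized(subjects, seed=None):
    """Shuffle all subjects, then split them in half: treatment vs. control."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

def randomized_block(subjects, block_of, seed=None):
    """Group subjects into blocks, then randomize separately within each block."""
    rng = random.Random(seed)
    blocks = {}
    for s in subjects:
        blocks.setdefault(block_of(s), []).append(s)
    assignment = {}
    for label, members in blocks.items():
        rng.shuffle(members)
        half = len(members) // 2
        assignment[label] = {"treatment": members[:half],
                             "control": members[half:]}
    return assignment

# Hypothetical subjects: (id, sex) pairs invented for illustration.
subjects = [(i, "F" if i % 2 == 0 else "M") for i in range(8)]

crd = completely_randomized(subjects, seed=1)
rbd = randomized_block(subjects, block_of=lambda s: s[1], seed=1)

print(crd)
print(rbd)
```

In the blocked version each sex contributes equally to the treatment and control groups, so sex cannot be confused with the treatment effect. The completely randomized version only balances sex on average.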

Sampling

Simple Random Sampling (SRS)

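  • Every individual in the population of size N has an equal chance of being selected for our sample of size n, and every possible sample of size n is equally likely
  • Subjects cannot be sampled twice (sampling is without replacement)
  • Ex. SRS of size n=2 out of total population N=4: there are 6 possible samples, each equally likely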
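
The n=2, N=4 example can be worked through directly. The sketch below uses four made-up labels for the population. It lists every possible sample and then draws one with Python's `random.sample`, which samples without replacement.

```python
import random
from itertools import combinations

population = ["A", "B", "C", "D"]   # N = 4, hypothetical labels
n = 2

# Every possible sample of size n drawn without replacement.
all_samples = list(combinations(population, n))
for s in all_samples:
    print(s)
print(len(all_samples), "possible samples")

# Draw one simple random sample; no subject can appear twice.
rng = random.Random(0)
sample = rng.sample(population, n)
print(sample)
```

Each of the listed samples has the same probability of being drawn, and each individual appears in half of them, so every individual has an n/N = 1/2 chance of selection.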

Systematic (Probability) Sampling

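  • Uses chance to select a sample, based on known selection probabilities.
  • In systematic sampling specifically, a random starting point is chosen and then every k-th individual is selected.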
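
One common form of systematic sampling picks a random starting point and then takes every k-th unit from an ordered list. A minimal sketch follows, assuming a hypothetical numbered roster and an interval of k = N // n. The function name is invented for the example.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    k = len(population) // n          # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)          # chance enters only through the start
    return [population[start + i * k] for i in range(n)]

roster = list(range(1, 21))           # hypothetical population of 20 numbered units
picked = systematic_sample(roster, n=5, seed=3)
print(picked)
```

Every unit has the same 1/k chance of selection, but unlike SRS not every combination of units is possible: only k distinct samples can occur.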

Stratified Sampling

  • Dividing a population into subgroups (strata) and then randomly sampling from each subgroup (often the same number from each, or a number proportional to the subgroup's size)
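
As a rough sketch of stratified sampling, the code below groups a made-up student roster by class year and draws the same number of students from each stratum. The function name, the roster, and the equal allocation are assumptions for illustration only.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Split the population into strata, then draw an SRS from each stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {label: rng.sample(members, per_stratum)
            for label, members in strata.items()}

# Hypothetical roster: (id, class year) pairs, 6 freshmen and 4 seniors.
students = [(i, "freshman" if i < 6 else "senior") for i in range(10)]

result = stratified_sample(students, stratum_of=lambda s: s[1],
                           per_stratum=2, seed=0)
print(result)
```

Every stratum is guaranteed to be represented, which a simple random sample of the same total size does not guarantee.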

Cluster Sampling

  • Dividing a population into clusters, randomly selecting some of the clusters, and then sampling all individuals in the selected clusters
  • Generally the clusters already exist (Ex. the rows of seats in a classroom and who sits where)
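
Cluster sampling can be sketched with the classroom example: the rows are the clusters, a few rows are chosen at random, and everyone seated in a chosen row is included. The seating chart and function name below are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {label: list(clusters[label]) for label in chosen}

# Hypothetical seating chart: each row of seats is an existing cluster.
rows = {
    "row1": ["Ann", "Ben", "Cy"],
    "row2": ["Dee", "Eli", "Fay"],
    "row3": ["Gus", "Hal", "Ivy"],
    "row4": ["Jo", "Kit", "Lou"],
}

sampled = cluster_sample(rows, n_clusters=2, seed=5)
print(sampled)
```

The randomness is applied to the clusters rather than to individuals, which contrasts with stratified sampling, where individuals are drawn at random from every subgroup.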

Terminology

  • Factors - the explanatory variables whose effect on the response we are interested in
    • Levels - specific values of factors
    • Treatment combination (interactions) - with multiple factors, a specific combination of levels, one from each factor
  • Bias
    • Measurement bias -- issues with the measuring tools or method
    • Sampling bias -- the sample does not represent the population of interest
      • Voluntary response sampling -- subjects choose for themselves whether or not to answer
      • Convenience sampling -- only sampling people who are convenient to reach
      • Incentivized sampling
  • Populations and Samples
    • Population - consists of all subjects or items of interest. It is the group being studied.
    • Sample - a subset of the population selected to participate in our experiment or to provide data
  • Blinding - the subjects of the study don't know whether they are in the control group or the treatment group
    • Double Blinding - neither the subjects nor the people administering the study know which group is which
  • Time Terms
    • Retrospective - a study that analyzes past data
    • Prospective - a study that follows subjects forward in time, collecting data as events occur
    • Longitudinal - a study that lasts a long period of time, observing the same subjects repeatedly
  • Types of studies
    • Cross-sectional study - a study (often a survey) that collects data at one point in time
    • Case-Control
    • Cohort study
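
The factor, level, and treatment combination terms defined above can be illustrated by enumerating every combination for a small experiment. The two factors and their levels below are invented for the example.

```python
from itertools import product

# Hypothetical factors and their levels, invented for illustration.
factors = {
    "dose": ["low", "high"],
    "diet": ["standard", "low-salt", "high-fiber"],
}

# A treatment combination picks exactly one level from every factor.
treatment_combinations = [
    dict(zip(factors, levels)) for levels in product(*factors.values())
]

for combo in treatment_combinations:
    print(combo)
print(len(treatment_combinations), "treatment combinations")
```

With 2 levels of one factor and 3 of the other there are 2 × 3 = 6 treatment combinations, and the count multiplies with every factor added.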