Statistics I Midterm Flashcards

939071271Categorical VariablesConsist of Nominal and Ordinal Variables
939071272Continuous VariablesConsist of Interval Scale and Ratio Scale
939071273Nominal VariableA type of categorical variable that has 2 levels (binary, like military vs. civilian) or 3+ levels (branches of the military)
939071274Ordinal VariableA type of categorical variable where categories have logical order (e.g. ranks in the navy)
939071275Interval ScaleA type of continuous variable in which there are equal distances between intervals (e.g. questionnaire ratings)
939071276Ratio ScaleA type of continuous variable in which there is an absolute zero point (e.g. height)
939116406CovarianceA measure of the degree of relationship between 2 variables (week 6, slide 60). H₀: Covariance in the population = 0 COVxy = (∑[X-Xbar][Y-Ybar])/(N-1)
939116407Correlation (r)A measure of the degree of relationship between two variables (week 6, slide 60), effect size measure. H₀:ρ = 0 (between -1 and +1) r = COVxy/SxSy
939116408Regression Coefficient (b)Slope of the regression line (week 6, slide 60). Change in y for a 1 unit change in x. Line passes through (Xbar, Ybar) and (0, a). When r = 0, b = 0. Beta coefficient = regression coefficient when x and y standardized. H₀: b* = 0 b = COVxy/Sx²
939116409Intercept (a)Predicted y when x = 0 (week 6, slide 60). H₀: a* = 0 a = Ybar - (b)Xbar A value often not of interest
939116410R Square (r²)% of variability in the dependent variable (DV) that is accounted for by variability in the predictor variable (week 6, slide 60). H₀: ρ² = 0 r² = SS(Ŷ)/SS(Y) (Effect Size Measure) ADJUSTED r² is an (approximately) UNBIASED ESTIMATOR of ρ².
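The covariance, correlation, slope, intercept, and r² cards above chain together, so a small worked example may help. This is a minimal sketch using only the formulas on those cards; the function name `describe_line` and the five data points are made up for illustration.

```python
from math import sqrt

def describe_line(x, y):
    """Return covariance, r, slope b, intercept a, and r squared for paired scores."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # COVxy = sum of cross-products of deviations, divided by N - 1
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)
    r = cov_xy / (sqrt(var_x) * sqrt(var_y))  # r = COVxy / (Sx * Sy)
    b = cov_xy / var_x                        # b = COVxy / Sx squared
    a = y_bar - b * x_bar                     # a = Ybar - b * Xbar
    return {"cov": cov_xy, "r": r, "b": b, "a": a, "r2": r ** 2}

# Hypothetical data, not from the course slides
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
line = describe_line(x, y)
print(line)
```

For these points the covariance is 1.5, b is 0.6, a is 2.2, and r² is 0.6; plugging Xbar = 3 into a + bX returns Ybar = 4, which is the "line passes through (Xbar, Ybar)" property.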
939116411Standard Error of the Estimate***Do not need to know formula*** (week 6, slide 60) Measure of: -Degree to which points diverge from the regression line -Accuracy of prediction -Square root of error variance If the standard error of the estimate is 0 = no errors of prediction/no residuals If the standard error of the estimate is LARGE = residuals are large
9391164121-Sample Z-Test***Do not need to know formula*** Compare 1 sample mean to known population mean µ₂ (also when population SD, σ, is known) (week 5, slide 4). H₀: µ₁ = µ₂ Example: Test if ACBC scores of 15 hospitalized children are different from population mean of 50 (σ = 10).
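Although the formula is not required, seeing the z-test computed once can make the card concrete. The sketch below assumes the standard form z = (M - µ)/(σ/√N); the sample mean of 56 is a made-up value, since the card gives only N = 15, µ = 50, and σ = 10.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, pop_mean, pop_sd, n):
    """Return the z statistic and its two-tailed p value."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Probability of a z at least this extreme in either direction if H0 is true
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed

# sample_mean=56.0 is hypothetical; the other values come from the card's example
z_stat, p_value = one_sample_z(sample_mean=56.0, pop_mean=50.0, pop_sd=10.0, n=15)
print(round(z_stat, 3), round(p_value, 3))
```

With these numbers z is about 2.32 and the two-tailed p value is about 0.02, so H₀ would be rejected at α = 0.05.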
9391164131-sample (student) t-test***Do not need to know formula*** Compare 1 sample mean to known population mean (week 5, slide 4). Population mean µ₂ (but not σ) of comparison mean known. H₀: µ₁ = µ₂ Example: Test if PDI scores of 56 LBW infants are different from population mean of 100. Assumption: Score is normally distributed in the population
939116414Paired sample (student) t-test (WITHIN subject t-test)***Do not need to know formula*** Compare 2 sample means from same subjects. Population parameters are NOT KNOWN. H₀: µd = 0 (µ₁-µ₂ = 0) Example: Test of change in weight (difference scores) in 17 anorexics from pre- to post-family therapy. Assumption: Difference score is normally distributed in population.
939116415Independent-samples (Student) t-test (BETWEEN subject t-test)***Do not need to know formula*** Compare 2 sample means from different subjects. Population parameters are NOT KNOWN. H₀: µ₁ = µ₂ Example: Test if Caucasians in stereotype threat condition do worse than Caucasians in control condition on math problem. Notes: You need to use pooled variances for unequal sample sizes. Assumptions: Normality; homogeneity of variance (σ1² = σ2²). If σ1² ≠ σ2², use the Satterthwaite t' test.
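The note about pooled variances can be illustrated with a short calculation. This is a sketch of the usual pooled-variance t statistic, not a formula taken from the slides; the two groups are invented, and the p value is left out because it requires a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(group1, group2):
    """Return the pooled-variance t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Weight each sample variance by its own degrees of freedom
    s2_pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    standard_error = sqrt(s2_pooled * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / standard_error
    return t, df

# Hypothetical scores with unequal sample sizes (5 vs. 3)
control = [4.0, 5.0, 6.0, 7.0, 8.0]
threat = [1.0, 2.0, 3.0]
t_stat, df = pooled_t(control, threat)
print(round(t_stat, 3), df)
```

Here the sample variances are 2.5 and 1.0, the pooled variance is 2.0, and t is about 3.87 on 6 degrees of freedom. The larger group pulls the pooled variance toward its own variance, which is the point of weighting by degrees of freedom.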
939134801Independent VariableWhat is manipulated by the experimenter, predictor variable or explanatory variables, IV
939134802Dependent variableWhat is measured, outcome variable or criterion variable, DV
939134803Random Assignment to Conditions(Week 1, slide 41) Is important because it ensures high internal validity. -Strength of study design -Ability to draw causal Inferences
939134804Internal Validity(Week 1, slide 37) A measure of how well a research study has been designed. High: We can draw strong causal inferences. Low: We cannot draw strong causal inferences.
939134805Random SAMPLE/SELECTION of Participants(Week 1, slide 41) It is important because it ensures high external validity. -Whether study sample/s reflect population under investigation -Ability to state if results apply to population of interest
939134806External Validity(Week 1, slide 40) Whether sample/s reflects population. High: Sample representative of population - results likely generalize to population Low: Sample not representative of population - results may not generalize to (unsampled populations)
939134807Parametere.g. Mean favorable ratings in a POPULATION
939134808Statistics ("guesses")e.g. Mean favorable ratings in SAMPLE/S
939134809Population(Week 1, slide 44) Population characteristics = Parameters (Normally) Invisible to investigator Denoted by Greek letters (e.g.): µ (mu) = Population mean σ (sigma) = Population standard deviation ρ (rho) = correlation in the population
939134810SampleSample Characteristics = Statistics Visible to investigator Denoted by Roman letters (e.g.): M or Xbar = Sample mean SD or s = Sample standard deviation r = correlation in a sample
939134811Inferential StatisticsMaking inferences about POPULATION parameters
939134812Descriptive statisticsDescribing the SAMPLE/S, no reference to population parameters
939273963Normal DistributionLooks symmetrical, normal, unimodal
939273964Bimodal DistributionHas 2 humps
939273965Negatively SkewedMost of the data is on the right side (highest point) and very little to no data on the left side, i.e. tail points in the negative direction. Mean < Median < Mode
939273966Positively SkewedMost of the data is on the left side (highest point) and very little to no data on the right side, i.e. tail points in the positive direction. Mean > Median > Mode
939273967Platykurtic/Negative KurtosisKind of flat on the top, no one peak, flattest
939273968Leptokurtic/Positive KurtosisPointier than normal, peaky, a few points, pointiest
939273969Mesokurtic (Normal)In between Platykurtic and Leptokurtic, a normal peak
939273970Example of OutliersReaction time data
939273971ModeMeasure of central tendency, represents most common score
939273972MedianMeasure of central tendency, represents the middle number (N is odd) or the average of two middle numbers (N is even). MEDIAN LOCATION = (N+1)/2 Unbiased, resistant estimator.
939273973MeanMeasure of central tendency, average score Unbiased, sufficient, and efficient estimator.
939273974RangeMeasure of variability or dispersion, it is the distance from the lowest to the highest score
939273975VarianceMeasure of variability or dispersion, it is the standard deviation squared. Summation of the squared differences of X from Xbar divided by (N-1) sx² Doesn't have a natural interpretation because it is in squared units. Greater than the standard deviation whenever the SD is above 1, and less than the sum of squares whenever N > 2.
939273976Standard DeviationMeasure of variability or dispersion, it is the square root of the variance. Square root of (Summation of the squared differences of X from Xbar divided by (N-1)). Roughly the typical deviation from the mean. sx. Less than the variance whenever the SD is above 1, and less than the sum of squares for typical data.
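The relationship among the sum of squares, the variance, and the standard deviation is easiest to see by computing all three. This is a minimal sketch with two made-up data sets; note that the ordering of variance and SD depends on scale, since squaring a number below 1 makes it smaller.

```python
from math import sqrt

def sum_of_squares(scores):
    """Sum of squared deviations of each score from the sample mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def sample_variance(scores):
    """Sum of squares divided by N - 1."""
    return sum_of_squares(scores) / (len(scores) - 1)

def sample_sd(scores):
    """Square root of the sample variance."""
    return sqrt(sample_variance(scores))

# Hypothetical data: one widely spread set, one tightly clustered set
wide = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
narrow = [0.1, 0.2, 0.3, 0.4, 0.5]

for name, data in (("wide", wide), ("narrow", narrow)):
    print(name, sum_of_squares(data), sample_variance(data), sample_sd(data))
```

For `wide` the sum of squares is 32, the variance is 32/7 (about 4.57), and the SD is about 2.14. For `narrow` the variance is about 0.025 but the SD is about 0.158, so the SD exceeds the variance.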
939273977Skew(Mean-Median)/SD Measure of asymmetry of distribution Positive = right-tailed Negative = left-tailed
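The skew formula on this card can be checked against data with an obvious tail. The sketch below uses the card's (Mean - Median)/SD version as written; the function name `simple_skew` and the scores are hypothetical.

```python
from statistics import mean, median, stdev

def simple_skew(scores):
    """(Mean - Median) / SD, the skew index given on the card."""
    return (mean(scores) - median(scores)) / stdev(scores)

# One extreme high score drags the mean above the median (right tail)
right_tailed = [1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 20.0]
# Mirror image: one extreme low score (left tail)
left_tailed = [-x for x in right_tailed]
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]

print(simple_skew(right_tailed), simple_skew(left_tailed), simple_skew(symmetric))
```

The right-tailed set has mean 4.75 and median 3, giving a positive value; its mirror image gives the same value with a negative sign; the symmetric set gives 0.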
93927397895% Confidence IntervalAn interval computed from sample statistics. We don't know what the true (population) mean is; the interval gives a range, from x to y, that we are 95% confident contains it. Over repeated sampling, about 95% of intervals built this way would contain the population mean.
939273979Properties of Estimator of Population ParametersSufficiency, Unbiasedness, Efficiency, Resistance
939273980Why is the mean the predominant measure of central tendency?The mean has an equation, so it can be manipulated algebraically. It is influenced by outliers, so it does poorly on resistance. It is sufficient because every score takes part in computing it. It is efficient because sample means cluster tightly around the population mean: the SD of the sampling distribution of the mean is smaller than that of the median. It is unbiased (both mean and median are unbiased): the expected value of the sample mean equals the population mean. If you sample n=5 10,000 times, the distribution of those means will be approximately normal and the mean of those means will be very close to the population mean. The standard deviation of that distribution of sample means is the standard error.
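The "sample n=5 10,000 times" thought experiment on this card can be run directly. This is a sketch under assumptions: the population is simulated as normal with mean 100 and SD 15 (values chosen only for illustration), and the seed is fixed so the run is repeatable.

```python
import random
from statistics import mean, pstdev, stdev

random.seed(0)

# Hypothetical population: 50,000 scores, roughly normal with mean 100 and SD 15
population = [random.gauss(100, 15) for _ in range(50_000)]
mu = mean(population)
sigma = pstdev(population)

n = 5
# Draw 10,000 samples of size 5 and keep each sample's mean
sample_means = [mean(random.sample(population, n)) for _ in range(10_000)]

grand_mean = mean(sample_means)       # should land close to mu (unbiasedness)
standard_error = stdev(sample_means)  # SD of the sample means
expected_se = sigma / n ** 0.5        # theoretical value, sigma / sqrt(n)

print(round(mu, 2), round(grand_mean, 2))
print(round(standard_error, 2), round(expected_se, 2))
```

The grand mean should sit very close to the population mean, and the spread of the sample means should be close to σ/√n, which is what the card calls the standard error.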
939273981SufficiencyMakes use of all data
939273982UnbiasednessExpected value = population parameter
939273983EfficiencySamples cluster tightly around parameter
939273984ResistanceNot influenced by outliers.
939273985OutlierWeek 1, Slide 140 Often observations > ± SDs from the mean. May reflect processes not under investigation.
939273986KurtosisWeek 1, slide 140 Measure of "peakedness" of distribution. Positive = pointy, negative = flat
939273987Trimodal3 peaks in a distribution
939581300Mean = Median0 skew
939581301Mean > MedianPositive Skew
939581302Normal Distribution / "Bell-Shaped Curve"Unimodal (1 peak), Symmetrical (skew = 0), Mesokurtic (kurtosis = 0), Mathematically defined (do not need to know) Gaussian distribution, There is an infinite number of normal distributions corresponding to different values of µ and σ.
939581303Standard Normal Distributionµ = 0, σ = 1
939581304Standard Scores(Week 2, slide 43) = z scores = z values. Indicates how many standard deviations an observation is above or below the mean. The unit of measurement of the z-score is the standard deviation. Z score of +1 = score 1 SD above the mean Z score of +0.5 = score half a standard deviation above the mean Z score of -1 = score 1 SD below the mean Z score of -0.5 = score half a standard deviation below the mean Z score for population data = z = (X-µ)/σ Z score for sample data = z = (X-Xbar)/sx
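The sample z-score formula on this card, z = (X - Xbar)/sx, takes only a few lines to apply. This is a minimal sketch with made-up scores; the function name `z_scores` is an illustration, not course material.

```python
from statistics import mean, stdev

def z_scores(scores):
    """Convert raw scores to z scores using the sample mean and sample SD."""
    m = mean(scores)
    s = stdev(scores)  # sample SD, N - 1 in the denominator
    return [(x - m) / s for x in scores]

# Hypothetical raw scores
scores = [10.0, 12.0, 14.0, 16.0, 18.0]
z = z_scores(scores)
print([round(v, 3) for v in z])
```

The score equal to the mean (14) gets z = 0, scores below the mean get negative values, and the standardized set itself has mean 0 and SD 1.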
939648546Why is the normal distribution so important?-Many variables appear normally distributed -If variable normal, can make many inferences about values of variable -Many statistical procedures assume scores are normally distributed in the population
939648547Tests for normality-Eyeballing -Quantile-Quantile Plots (normal sample will have a close to straight line, y=x, non-normal sample will have large deviations from a straight line) -Kolmogorov-Smirnov Test (If significance is greater than 0.05, normality is not rejected, so we proceed as if the scores are normal) If Normal --> Use "parametric test" If NOT normal --> Use "distribution-free" test, or use a transformation
939648548T-statistics and the null hypothesisThe t-value is distributed around 0 when the null hypothesis is true. When null is true, it is unlikely to get a t value much bigger or smaller than 0.
939648549Steps in Hypothesis Testing-State Alternative/Research Hypothesis -State Null Hypothesis -Collect data -Construct/consult sampling distribution of a particular statistic on the assumption that H0 is true -Compare obtained sample statistic to the distribution above -Decision: Reject H0 or Do not reject H0 based on the probability of observing a sample statistic at least as extreme as the one obtained if the null hypothesis were true (p value)
939648550Ronald FisherIn hypothesis testing -Sampling distributions -Hypothesis Testing & p value -Design of experiments -ANOVA
939648551Karl PearsonIn hypothesis testing -Pearson's Correlation Coefficient (r) -Pearson's Chi Square Statistic
939648552P value(Week 2, slide 90) "The probability of obtaining a pattern of data at least as extreme as the one that was actually observed, given that the null hypothesis is true" -Probability -Varies between 0 and 1 -Appears in virtually every empirical study in science -Conditional probability: p(D|H0) -Widely misunderstood -NOT the probability that the null hypothesis is true (it is NOT p(H0|D))
939648553If we reject H0...We accept the alternative hypothesis (H1) (μ1 ≠ μ2)
939648554If we Do not reject H0Fisher - Suspend judgment
939648555H0 is true, Reject H0(1) Type I error, α (alpha)
939648556H0 is false, Reject H0(2) Correct Decision, 1 - β Power = 1 - β
939648557H0 is true, Do not reject H0(3) Correct Decision, 1 - α
939648558H0 is false, Do not reject H0(4) Type II Error, β
939648559α (alpha)Set by the experimenter (normally to 0.05) before data collection, probability of a type I error. Probability of (incorrectly) rejecting H₀ given that H₀ is true (conditional probability).
939648560How do you find the number of significant findings when the null is true?Multiply the number of trials/simulations by α (the p value cutoff, e.g. 0.05) to get the approximate number of significant findings.
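When H₀ is true, the expected count of significant results is the number of trials times α, and a simulation shows how close actual counts come. The sketch below is an assumption-laden illustration: it uses a 1-sample z-test on standard normal data, 2,000 trials of N = 20, and a fixed seed, none of which come from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(1)

ALPHA = 0.05    # significance cutoff set before "collecting" data
TRIALS = 2_000  # number of simulated experiments (arbitrary)
N = 20          # scores per experiment (arbitrary)

def null_true_p_value():
    """Two-tailed p value from a z-test on data where H0 (mu = 0, sigma = 1) is true."""
    sample = [random.gauss(0, 1) for _ in range(N)]
    z = fmean(sample) / (1 / sqrt(N))
    return 2 * (1 - NormalDist().cdf(abs(z)))

significant = sum(null_true_p_value() < ALPHA for _ in range(TRIALS))
expected = TRIALS * ALPHA

print(significant, expected)
```

The expected count is 2,000 × 0.05 = 100; the simulated count varies from run to run but should land near that value. Every one of those "significant" results is a Type I error.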
939648561βA function of a particular alternative hypothesis, the probability of a type II error. The probability of (incorrectly) failing to reject H₀ when a particular alternative is true (and H₀ is false) (conditional probability).
939648562PowerProbability of correctly rejecting a false H₀ when a particular alternative hypothesis is true. Power = 1 - β, where β is the probability of a Type II error.
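Because β depends on a particular alternative, power can only be computed once an alternative mean is chosen. This is a sketch for a two-tailed 1-sample z-test; the alternative mean of 55 is invented, and µ = 50, σ = 10, N = 15 are borrowed from the z-test card only as convenient numbers.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Return (power, beta) for a two-tailed 1-sample z-test when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, the z statistic is centered on this shift instead of 0
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # beta = probability the statistic falls between the two critical values
    beta = std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)
    return 1 - beta, beta

# mu1=55.0 is a hypothetical alternative
power, beta = z_test_power(mu0=50.0, mu1=55.0, sigma=10.0, n=15)
print(round(power, 3), round(beta, 3))
```

With these values power is about 0.49 and β about 0.51. Setting `mu1` equal to `mu0` returns α itself, since rejecting is then always a Type I error.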
939648563In "real" experiments, i.e. when the true state of the world is not known...If we reject H₀ we do not know if we have made a Type I error or a Correct decision. If we do not reject H₀ we do not know if we have made a Type II error or a Correct decision.
9396485641-Tailed TestsDetermined by the experimenter prior to data collection, Strong directional hypothesis (e.g. µ1 > µ2) - Reject H₀ only if difference is in one particular direction. If strong directional hypothesis and no reason to suspect the effect can go in the opposite direction, then 1-tailed can be used. Divide the 2-tailed p value by 2 to get the 1-tailed p value (valid only when the difference is in the predicted direction).
9396485652-Tailed TestsReject H₀ if difference goes in either direction. Determined by the experimenter prior to data collection. It is preferred. Ordinarily what SPSS reports. The 2-tailed p value is double the 1-tailed value.
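The factor of 2 between 1-tailed and 2-tailed p values can be seen with a z statistic. This is a minimal sketch; the value z = 1.80 is made up, and the 1-tailed result assumes the difference fell in the predicted direction.

```python
from statistics import NormalDist

def p_values_from_z(z):
    """Return 1-tailed and 2-tailed p values for a z statistic."""
    # Area beyond |z| in a single tail of the standard normal distribution
    one_tail_area = 1 - NormalDist().cdf(abs(z))
    return {"one_tailed": one_tail_area, "two_tailed": 2 * one_tail_area}

# Hypothetical z; the 1-tailed value only applies if the effect went the predicted way
p = p_values_from_z(1.80)
print(round(p["one_tailed"], 4), round(p["two_tailed"], 4))
```

For z = 1.80 the 1-tailed p is about 0.036 and the 2-tailed p about 0.072, so the same data reject H₀ at α = 0.05 under a 1-tailed test but not under a 2-tailed test. That is why the choice has to be made before data collection.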
