939071271 | Categorical Variables | Consist of Nominal and Ordinal Variables | |
939071272 | Continuous Variables | Consist of Interval Scale and Ratio Scale | |
939071273 | Nominal Variable | A type of categorical variable that has 2 levels (binary, like military vs. civilian) or 3+ levels (branches of the military) | |
939071274 | Ordinal Variable | A type of categorical variable where categories have logical order (e.g. ranks in the navy) | |
939071275 | Interval Scale | A type of continuous variable in which there are equal distances between intervals (e.g. questionnaire ratings) | |
939071276 | Ratio Scale | A type of continuous variable in which there is an absolute zero point (e.g. height) | |
939116406 | Covariance | A measure of the degree of relationship between 2 variables (week 6, slide 60). H₀: Covariance in the population = 0 COVxy = (∑[X-Xbar][Y-Ybar])/(N-1) | |
939116407 | Correlation (r) | A measure of the degree of relationship between two variables (week 6, slide 60), effect size measure. H₀:ρ = 0 (between -1 and +1) r = COVxy/SxSy | |
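The covariance and correlation cards above can be made concrete with a small worked sketch; the data below are made up purely for illustration (not course data).

```python
# Illustrative sketch of COVxy = Σ(X - X̄)(Y - Ȳ)/(N - 1) and r = COVxy/(Sx·Sy),
# using a small made-up data set.
import statistics

x = [2, 4, 6, 8, 10]
y = [1, 3, 5, 4, 9]
n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)

# Sample covariance: average cross-product of deviations, with N - 1 in the denominator.
cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Pearson r: covariance rescaled by both standard deviations, so it falls between -1 and +1.
r = cov_xy / (statistics.stdev(x) * statistics.stdev(y))

print(cov_xy, r)
```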
939116408 | Regression Coefficient (b) | Slope of the regression line (week 6, slide 60). Change in y for a 1 unit change in x. Line passes through (Xbar, Ybar) and (0, a). When r = 0, b = 0. Beta coefficient = regression coefficient when x and y standardized. H₀: b* = 0 b = COVxy/Sx² | |
939116409 | Intercept (a) | Predicted y when x = 0 (week 6, slide 60). H₀: a* = 0 a = Ybar - (b)Xbar A value often not of interest | |
939116410 | R Square (r²) | % of Variability in the dependent variable (DV) that is accounted for by variability in the predictor variable (week 6, slide 60). H₀: ρ² = 0. r² = SS_Ŷ/SS_Y (Effect Size Measure). ADJUSTED r² is an UNBIASED ESTIMATOR of ρ². | |
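A minimal sketch tying together the regression cards above (b, a, and r²), again on made-up data; SS_Ŷ here is the sum of squared deviations of the predicted values from Ȳ.

```python
# Sketch of b = COVxy / Sx², a = Ȳ - b·X̄, and r² = SS_Ŷ / SS_Y on made-up data.
import statistics

x = [2, 4, 6, 8, 10]
y = [1, 3, 5, 4, 9]
n = len(x)
x_bar, y_bar = statistics.mean(x), statistics.mean(y)

cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
b = cov_xy / statistics.variance(x)      # slope: change in y per 1-unit change in x
a = y_bar - b * x_bar                    # intercept: predicted y when x = 0

y_hat = [a + b * xi for xi in x]         # predicted values on the regression line
ss_y_hat = sum((yh - y_bar) ** 2 for yh in y_hat)   # SS of predicted values around Ȳ
ss_y = sum((yi - y_bar) ** 2 for yi in y)           # total SS of y around Ȳ
r_squared = ss_y_hat / ss_y              # proportion of variability in y accounted for by x

print(b, a, r_squared)
```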
939116411 | Standard Error of the Estimate | ***Do not need to know formula*** (week 6, slide 60) Measure of: -Degree to which points diverge from the regression line -Accuracy of prediction -Square root of error variance If the standard error of the estimate is 0 = no errors of prediction/no residuals If the standard error of the estimate is LARGE = residuals are large | |
939116412 | 1-Sample Z-Test | ***Do not need to know formula*** Compare 1 sample mean to known population mean µ₂ (also when population SD, σ, is known) (week 5, slide 4). H₀: µ₁ = µ₂ Example: Test if ACBC scores of 15 hospitalized children are different from population mean of 50 (σ = 10). | |
939116413 | 1-sample (student) t-test | ***Do not need to know formula*** Compare 1 sample mean to known population mean (week 5, slide 4). Population mean µ₂ (but not σ) of comparison mean known. H₀: µ₁ = µ₂ Example: Test if PDI scores of 56 LBW infants are different from population mean of 100. Assumption: Score is normally distributed in the population | |
939116414 | Paired sample (student) t-test (WITHIN subject t-test) | ***Do not need to know formula*** Compare 2 sample means from same subjects. Population parameters are NOT KNOWN. H₀: µd = 0 (µ₁-µ₂ = 0) Example: Test of change in weight (difference scores) in 17 anorexics from pre- to post-family therapy. Assumption: Difference score is normally distributed in population. | |
939116415 | Independent-samples (Student) t-test (BETWEEN subject t-test) | ***Do not need to know formula*** Compare 2 sample means from different subjects. Population parameters are NOT KNOWN. H₀: µ₁ = µ₂ Example: Test if Caucasians in stereotype threat condition do worse than Caucasians in control condition on math problems. Notes: You need to use pooled variances when sample sizes are unequal. Assumptions: Normality, σ₁² = σ₂². If σ₁² ≠ σ₂², use the Satterthwaite t′ test. | |
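The t-tests above are all standard library calls; a sketch using scipy.stats with simulated placeholder data (the sample sizes loosely echo the course examples but the values are invented).

```python
# Sketch of the three t-tests described above, using scipy.stats with made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# 1-sample t-test: compare one sample mean to a known population mean (here, 100).
sample = rng.normal(loc=96, scale=15, size=56)
t1, p1 = stats.ttest_1samp(sample, popmean=100)

# Paired-samples (within-subject) t-test: same subjects measured twice (pre vs. post).
pre = rng.normal(loc=80, scale=5, size=17)
post = pre + rng.normal(loc=3, scale=4, size=17)
t2, p2 = stats.ttest_rel(pre, post)

# Independent-samples (between-subject) t-test: two different groups.
group1 = rng.normal(loc=10, scale=2, size=30)
group2 = rng.normal(loc=9, scale=2, size=25)
t3, p3 = stats.ttest_ind(group1, group2)                      # assumes equal population variances
t3w, p3w = stats.ttest_ind(group1, group2, equal_var=False)   # Satterthwaite/Welch t′ if not

print(p1, p2, p3, p3w)
```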
939134801 | Independent Variable | What is manipulated by the experimenter, predictor variable or explanatory variables, IV | |
939134802 | Dependent variable | What is measured, outcome variable or criterion variable, DV | |
939134803 | Random Assignment to Conditions | (Week 1, slide 41) Is important because it ensures high internal validity. -Strength of study design -Ability to draw causal inferences | |
939134804 | Internal Validity | (Week 1, slide 37) A measure of how well a research study has been designed. High: We can draw strong causal inferences. Low: We cannot draw strong causal inferences. | |
939134805 | Random SAMPLE/SELECTION of Participants | (Week 1, slide 41) It is important because it ensures high external validity. -Whether study sample/s reflect population under investigation -Ability to state if results apply to population of interest | |
939134806 | External Validity | (Week 1, slide 40) Whether sample/s reflects population. High: Sample representative of population - results likely generalize to population Low: Sample not representative of population - results may not generalize to the (unsampled) population | |
939134807 | Parameter | e.g. Mean favorable ratings in a POPULATION | |
939134808 | Statistics ("guesses") | e.g. Mean favorable ratings in SAMPLE/S | |
939134809 | Population | (Week 1, slide 44) Population characteristics = Parameters (Normally) Invisible to investigator Denoted by Greek letters (e.g.): µ (mu) = Population mean σ (sigma) = Population standard deviation ρ (rho) = correlation in the population | |
939134810 | Sample | Sample Characteristics = Statistics Visible to investigator Denoted by Roman letters (e.g.): M or Xbar = Sample mean SD or s = Sample standard deviation r = correlation in a sample | |
939134811 | Inferential Statistics | Making inferences about POPULATION parameters | |
939134812 | Descriptive statistics | Describing the SAMPLE/S, no reference to population parameters | |
939273963 | Normal Distribution | Symmetrical, unimodal, bell-shaped | |
939273964 | Bimodal Distribution | Has 2 humps | |
939273965 | Negatively Skewed | Most of the data is on the right side (highest point) and very little to no data on the left side, i.e. tail points in the negative direction. Mean < Median < Mode | |
939273966 | Positively Skewed | Most of the data is on the left side (highest point) and very little to no data on the right side, i.e. tail points in the positive direction. Mean > Median > Mode | |
939273967 | Platykurtic/Negative Kurtosis | Kind of flat on the top, no one peak, flattest | |
939273968 | Leptokurtic/Positive Kurtosis | Pointier than normal, peaky, a few points, pointiest | |
939273969 | Mesokurtic (Normal) | In between Platykurtic and Leptokurtic, a normal peak | |
939273970 | Example of Outliers | Reaction time data | |
939273971 | Mode | Measure of central tendency, represents most common score | |
939273972 | Median | Measure of central tendency, represents the middle number (N is odd) or the average of two middle numbers (N is even). MEDIAN LOCATION = (N+1)/2 Unbiased, resistant estimator. | |
939273973 | Mean | Measure of central tendency, average score Unbiased, sufficient, and efficient estimator. | |
939273974 | Range | Measure of variability or dispersion, it is the distance from the lowest to the highest score | |
939273975 | Variance | Measure of variability or dispersion; the standard deviation squared. s_x² = Σ(X - X̄)²/(N - 1). Doesn't have a natural interpretation. Greater than the standard deviation whenever it exceeds 1, and less than the sum of squares it is computed from. | |
939273976 | Standard Deviation | Measure of variability or dispersion; the square root of the variance: s_x = √(Σ(X - X̄)²/(N - 1)). Roughly the average deviation from the mean. Less than the variance and the sum of squares when the variance exceeds 1. | |
939273977 | Skew | (Mean-Median)/SD Measure of asymmetry of distribution Positive = right-tailed Negative = left-tailed | |
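A quick sketch computing the descriptive measures from the cards above (mode through skew) on a made-up sample; the skew line uses the (Mean - Median)/SD definition from the Skew card.

```python
# Sketch of the descriptive measures on the cards above, computed on a made-up sample.
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 6, 7, 30]   # 30 is an artificial outlier

mode_ = statistics.mode(scores)             # most common score
median_ = statistics.median(scores)         # middle score; median location = (N + 1) / 2
mean_ = statistics.mean(scores)             # average score (pulled up toward the outlier)
range_ = max(scores) - min(scores)          # distance from lowest to highest score
variance_ = statistics.variance(scores)     # Σ(X - X̄)² / (N - 1)
sd_ = statistics.stdev(scores)              # square root of the variance
skew_ = (mean_ - median_) / sd_             # positive here, i.e. right-tailed

print(mode_, median_, mean_, range_, variance_, sd_, skew_)
```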
939273978 | 95% Confidence Interval | Computed from sample statistics; we don't know the true mean. Interpretation: we are 95% confident that the true population mean lies between the lower and upper limits - if we repeated the study many times, about 95% of such intervals would contain the population mean. | |
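A minimal sketch of a t-based 95% confidence interval for a mean, X̄ ± t_crit × SE, assuming made-up scores; the exact formula taught in the course may differ slightly.

```python
# Sketch of a t-based 95% confidence interval for a sample mean, on made-up data.
import statistics
from scipy import stats

scores = [12, 15, 14, 10, 13, 17, 16, 11, 14, 15]
n = len(scores)
mean_ = statistics.mean(scores)
se = statistics.stdev(scores) / n ** 0.5          # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)             # two-tailed 95% critical t value

lower, upper = mean_ - t_crit * se, mean_ + t_crit * se
print(lower, upper)   # we are "95% confident" the population mean lies in this interval
```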
939273979 | Properties of Estimator of Population Parameters | Sufficiency, Unbiasedness, Efficiency, Resistance | |
939273980 | Why is the mean the predominant measure of central tendency? | The mean has a simple equation. It is influenced by outliers, so on resistance it does not do well. It is sufficient because every score contributes to its computation. It is efficient because sample means cluster tightly around the population mean: the SD of sample means is smaller than the SD of sample medians. It is unbiased (both the mean and the median are unbiased): the sample mean is a good estimate of the population mean. If you draw samples of n = 5 ten thousand times, the distribution of those sample means will be approximately normal and its mean will equal the population mean; the standard deviation of that distribution of sample means is the standard error. | |
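The "sample n = 5 ten thousand times" idea above can be simulated directly; a sketch assuming an arbitrary placeholder population with µ = 50 and σ = 10.

```python
# Simulate the sampling distribution of the mean described above:
# draw n = 5 observations 10,000 times and look at the distribution of the sample means.
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n, reps = 50, 10, 5, 10_000       # placeholder population values

sample_means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print(sample_means.mean())        # ≈ μ: the sample mean is an unbiased estimator
print(sample_means.std(ddof=1))   # ≈ σ/√n: the standard error of the mean
```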
939273981 | Sufficiency | Makes use of all data | |
939273982 | Unbiasedness | Expected value = population parameter | |
939273983 | Efficiency | Samples cluster tightly around parameter | |
939273984 | Resistance | Not influenced by outliers. | |
939273985 | Outlier | Week 1, Slide 140 Often defined as observations more than a given number of SDs above or below the mean. May reflect processes not under investigation. | |
939273986 | Kurtosis | Week 1, slide 140 Measure of "peakedness" of distribution. Positive = pointy, negative = flat | |
939273987 | Trimodal | 3 peaks in a distribution | |
939581300 | Mean = Median | 0 skew | |
939581301 | Mean > Median | Positive Skew | |
939581302 | Normal Distribution / "Bell-Shaped Curve" | Unimodal (1 peak), Symmetrical (skew = 0), Mesokurtic (kurtosis = 0), Mathematically defined (do not need to know the formula). Also called the Gaussian distribution. There is an infinite number of normal distributions corresponding to different values of µ and σ. | |
939581303 | Standard Normal Distribution | µ = 0, σ = 1 | |
939581304 | Standard Scores | (Week 2, slide 43) = z scores = z values. Indicates how many standard deviations an observation is above or below the mean. The unit of measurement of the z-score is the standard deviation. Z score of +1 = score 1 SD above the mean Z score of +0.5 = score half a standard deviation above the mean Z score of -1 = score 1 SD below the mean Z score of -0.5 = score half a standard deviation below the mean Z score for population data = z = (X-µ)/σ Z score for sample data = z = (X-Xbar)/sx | |
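A short sketch of the two z-score formulas on the card above; the population values (µ = 100, σ = 15) and sample scores are placeholders.

```python
# Sketch of the z-score formulas on the card above, for population and sample data.
import statistics

# Population form: z = (X - μ) / σ, with μ and σ treated as known (placeholder values).
x, mu, sigma = 115, 100, 15
z_pop = (x - mu) / sigma          # 1.0 → one SD above the population mean

# Sample form: z = (X - X̄) / s, using the sample mean and sample SD.
sample = [4, 7, 9, 10, 12, 15]
x_bar, s = statistics.mean(sample), statistics.stdev(sample)
z_sample = (sample[0] - x_bar) / s

print(z_pop, z_sample)
```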
939648546 | Why is the normal distribution so important? | -Many variables appear normally distributed -If variable normal, can make many inferences about values of variable -Many statistical procedures assume scores are normally distributed in the population | |
939648547 | Tests for normality | -Eyeballing -Quantile-Quantile Plots (normal sample will have a close to straight line, y=x, non-normal sample will have large deviations from a straight line) -Kolmogorov-Smirnov Test (If significance is greater than 0.05, then we can assume normality) If Normal --> Use "parametric test" If NOT normal --> Use "distribution-free" test, use transformation | |
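The Q-Q plot and Kolmogorov-Smirnov test listed above can be reproduced outside SPSS; a sketch using scipy and matplotlib on simulated data.

```python
# Sketch of the two normality checks listed above, using scipy on made-up data.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
scores = rng.normal(loc=50, scale=10, size=200)

# Q-Q plot: points from a normal sample should fall close to a straight line.
stats.probplot(scores, dist="norm", plot=plt)
plt.show()

# Kolmogorov-Smirnov test against a normal with the sample's mean and SD:
# a p value greater than .05 means we do not reject normality.
stat, p = stats.kstest(scores, "norm", args=(scores.mean(), scores.std(ddof=1)))
print(stat, p)
```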
939648548 | T-statistics and the null hypothesis | The t-value is distributed around 0 when the null hypothesis is true. When null is true, it is unlikely to get a t value much bigger or smaller than 0. | |
939648549 | Steps in Hypothesis Testing | -State Alternative/Research Hypothesis -State Null Hypothesis -Collect data -Construct/consult sampling distribution of a particular statistic on the assumption that H0 is true -Compare obtained sample statistic to distribution above -Decision: Reject H0 or Do not reject H0 based on the probability of observing a sample statistic at least as extreme as the one obtained if the null hypothesis were true (p value) | |
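A sketch of the steps above, constructing the sampling distribution under H₀ by simulation rather than consulting a table; all numbers are invented for illustration.

```python
# Walk through the hypothesis-testing steps above by simulating the sampling
# distribution of the mean under H0 (made-up numbers throughout).
import numpy as np

rng = np.random.default_rng(3)

# H1: the treated group's mean differs from 100.  H0: the mean equals 100.
mu0, sigma, n = 100, 15, 25

# "Collect data" (a simulated sample standing in for real observations).
data = rng.normal(loc=106, scale=sigma, size=n)
observed_mean = data.mean()

# Construct the sampling distribution of the mean assuming H0 is true.
null_means = rng.normal(mu0, sigma, size=(100_000, n)).mean(axis=1)

# p value: probability of a sample mean at least as extreme as the one obtained (two-tailed).
p = np.mean(np.abs(null_means - mu0) >= abs(observed_mean - mu0))
print(observed_mean, p)   # reject H0 if p < alpha (e.g., .05)
```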
939648550 | Ronald Fisher | Key contributions to hypothesis testing: -Sampling distributions -Hypothesis testing & p value -Design of experiments -ANOVA | |
939648551 | Karl Pearson | Key contributions to hypothesis testing: -Pearson's Correlation Coefficient (r) -Pearson's Chi Square Statistic | |
939648552 | P value | (Week 2, slide 90) "The probability of obtaining a pattern of data at least as extreme as the one that was actually observed, given that the null hypothesis is true" -Probability -Varies between 0 and 1 -Appears in virtually every empirical study in science -Conditional probability: p(D|H0) -Widely misunderstood -NOT the probability that the null hypothesis is true (it is NOT p(H0|D)) | |
939648553 | If we reject H0... | We accept the alternative hypothesis (H1) (μ1 ≠ μ2) | |
939648554 | If we Do not reject H0 | Fisher - Suspend judgment | |
939648555 | H0 is true, Reject H0 | (1) Type 1 error, α (alpha) | |
939648556 | H0 is false, Reject H0 | (2) Correct Decision, 1 - β Power = 1 - β | |
939648557 | H0 is true, Do not reject H0 | (3) Correct Decision, 1 - α | |
939648558 | H0 is false, Do not reject H0 | (4) Type II Error, β | |
939648559 | α (alpha) | Set by the experimenter (normally to 0.05) before data collection, probability of a type I error. Probability of (incorrectly) rejecting H₀ given that H₀ is true (conditional probability). | |
939648560 | How do you find the number of significant findings when the null is true? | Multiply the number of trials/simulations by the α level (the p-value cutoff) to get the approximate number of significant findings expected by chance, e.g. 1,000 simulations × .05 ≈ 50 significant results. | |
939648561 | β | A function of a particular alternative hypothesis, the probability of a type II error. The probability of (incorrectly) failing to reject H₀ when a particular alternative is true (and H₀ is false) (conditional probability). | |
939648562 | Power | Probability of correctly rejecting a false H₀ when a particular alternative hypothesis is true. Power = 1 - β, where β is the probability of a Type II error. | |
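Power for a particular alternative can be estimated by simulation; a sketch assuming a one-sample t-test with an invented true mean of 105 against H₀: µ = 100.

```python
# Estimate power (1 - beta) by simulation for a one-sample t-test,
# under a made-up alternative hypothesis (true mean 105 vs. H0 mean 100).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
mu0, mu_true, sigma, n, alpha, reps = 100, 105, 15, 30, 0.05, 5_000

rejections = 0
for _ in range(reps):
    sample = rng.normal(mu_true, sigma, size=n)       # data generated under H1
    _, p = stats.ttest_1samp(sample, popmean=mu0)
    rejections += p < alpha                           # correctly rejecting a false H0

print(rejections / reps)   # estimated power; 1 minus this estimates beta
```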
939648563 | In "real" experiments, i.e. when the true state of the world is not known... | If we reject H₀ we do not know if we have made a Type I error or a Correct decision. | |
939648564 | 1-Tailed Tests | Determined by the experimenter prior to data collection, Strong directional hypothesis (e.g. µ₁ > µ₂) - Reject H₀ only if difference is in one particular direction. If strong directional hypothesis and no reason to suspect the effect can go in the opposite direction, then 1-tailed can be used. Divide the 2-tailed p value by 2 to get the 1-tailed p value. | |
939648565 | 2-Tailed Tests | Reject H₀ if difference goes in either direction. Determined by the experimenter prior to data collection. Generally preferred, and the default output in SPSS. Represents double the 1-tailed value. |
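A quick illustration of the halving/doubling relationship between the two cards above, using scipy's one-sample t-test on made-up data (the `alternative` argument assumes a reasonably recent scipy).

```python
# Show the relationship between two-tailed and one-tailed p values on made-up data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
sample = rng.normal(loc=104, scale=15, size=40)

t, p_two_tailed = stats.ttest_1samp(sample, popmean=100)                # two-tailed default
_, p_one_tailed = stats.ttest_1samp(sample, popmean=100, alternative="greater")

print(p_two_tailed, p_one_tailed)   # when t is in the predicted direction, p_one ≈ p_two / 2
```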
Statistics I Midterm Flashcards