
Statistics I Final Terms/Concepts Flashcards

Central Limit Theorem: the fact that as sample size increases, the sampling distribution of the mean becomes increasingly normal, regardless of the shape of the distribution of the population from which the samples are drawn
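To see the theorem in action, here is a small simulation sketch (not part of the original flashcards; it assumes NumPy and SciPy are installed, and the exponential population is chosen only because it is clearly non-normal):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)   # heavily skewed population

for n in (2, 30, 200):                                   # increasing sample sizes
    sample_means = rng.choice(population, size=(5_000, n)).mean(axis=1)
    print(f"n={n:>3}  skew of sampling distribution = {stats.skew(sample_means):.2f}")
# The skew shrinks toward 0 as n grows: the sampling distribution of the mean
# becomes increasingly normal even though the population is far from normal.
```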
Degrees of Freedom: roughly, the minimum amount of data needed to calculate a statistic; more practically, a number, or numbers, used to approximate the number of observations in the data set for the purpose of determining statistical significance
Expected Value of the Mean: the value of the mean one would expect to get from a random sample selected from a population with a known mean; for example, if one knows the population has a mean of 5 on some variable, one would expect a random sample selected from that population to also have a mean of 5
Inferential Statistics: statistics generated from sample data that are used to make inferences about the characteristics of the population the sample is alleged to represent
Population: the group from which data are collected or a sample is selected; it encompasses the entire group to which the data are alleged to apply
Random Chance: the probability of a statistical event occurring due simply to random variation in the characteristics of samples of a given size selected randomly from a population
Sample: an individual or group, selected from a population, from whom or which data are collected
Sampling Distribution of the Mean: the distribution of means that would be generated if one were to repeatedly draw samples of a given size from a population and calculate the mean for each sample drawn
Sampling Distribution: a theoretical distribution of any statistic that one would get by repeatedly drawing random samples of a given size from the population and calculating the statistic of interest for each sample
Probability Value (p-value): the probability of obtaining a statistic of a given size from a sample of a given size by chance, or due to random error
Standard Error: the standard deviation of the sampling distribution
Confidence Interval: an interval calculated from sample statistics that is expected to contain the population parameter with a certain degree of confidence (e.g., 95%)
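A brief sketch tying the last two terms together: estimating the standard error from a sample and building a 95% confidence interval around the mean (the data are hypothetical; NumPy and SciPy assumed):

```python
import numpy as np
from scipy import stats

sample = np.array([4.1, 5.3, 6.0, 4.8, 5.5, 5.9, 4.4, 5.1])  # hypothetical data
n = sample.size
se = sample.std(ddof=1) / np.sqrt(n)           # estimated standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)          # critical t for 95% confidence, n-1 degrees of freedom
ci = (sample.mean() - t_crit * se, sample.mean() + t_crit * se)
print(f"mean={sample.mean():.2f}, SE={se:.2f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```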
Statistical Significance: when the probability of obtaining a statistic of a given size due strictly to random sampling error, or chance, is less than the selected alpha level; it also represents a rejection of the null hypothesis
Null Hypothesis: the hypothesis that there is no effect in the population (e.g., that two population means are not different from each other, or that two variables are not correlated in the population)
Alternative Hypothesis: the opposite of the null hypothesis; usually, the hypothesis that there is some effect present in the population (e.g., two population means are unequal, two variables are correlated, a sample mean differs from a population mean, etc.)
Alpha: the probability of rejecting the null hypothesis when that hypothesis is true; also referred to as the probability of making a Type I error
Alpha Level: the a priori probability of falsely rejecting a null hypothesis that the researcher is willing to accept; it is used, in conjunction with the p-value, to determine whether a sample statistic is statistically significant
Power: the probability of rejecting the null hypothesis (that there are no differences) when, in fact, that hypothesis is false; alternatively, the probability of detecting a difference between groups when a difference truly exists
Type I Error: rejecting the null hypothesis when, in fact, the null hypothesis is true; the probability of making this type of error is referred to as alpha
Type II Error: retaining (failing to reject) the null hypothesis when it is false; the probability of making this type of error is referred to as beta
Effect Size: a measure of the size of the effect observed in some statistic; a way of determining the practical significance of a statistic by reducing the impact of sample size; a measure of the strength or magnitude of an experimental effect; a way of expressing the effect in a common metric across measures and studies (standard deviation units)
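One common effect size is Cohen's d, the mean difference expressed in pooled standard deviation units; a quick sketch with made-up groups (NumPy assumed):

```python
import numpy as np

group1 = np.array([10, 12, 11, 14, 13, 12])   # hypothetical treatment group
group2 = np.array([8, 9, 11, 10, 9, 10])      # hypothetical control group

n1, n2 = len(group1), len(group2)
# Pooled standard deviation of the two groups
sp = np.sqrt(((n1 - 1) * group1.var(ddof=1) + (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2))
d = (group1.mean() - group2.mean()) / sp       # effect expressed in standard deviation units
print(f"Cohen's d = {d:.2f}")
```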
Random Sampling Error: the error, or variation, associated with randomly selecting samples of a given size from a population
One-Tailed: a test of statistical significance that is conducted for just one tail of the distribution (e.g., that the sample mean will be larger than the population mean); when conducting this test, the researcher has ruled out interest in one of the directions, and the test gives the probability of getting a result as strong or stronger in that one direction only
Two-Tailed: a test of statistical significance that is conducted for both tails of the distribution (e.g., that the sample mean will be different from the population mean); when conducting this test, the researcher is testing the probability of getting a result as strong or stronger than the observed result, where "strong or stronger" means different in either direction (e.g., that far above or below the mean, or that different from zero in either a positive or negative direction)
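A short sketch of the one-tailed vs. two-tailed distinction, using a hypothetical t statistic and SciPy's t distribution (the survival function gives the upper-tail area):

```python
from scipy import stats

t_obs, df = 2.10, 24                            # hypothetical observed t and degrees of freedom
p_one_tailed = stats.t.sf(t_obs, df)            # probability of a result this large or larger, one direction only
p_two_tailed = 2 * stats.t.sf(abs(t_obs), df)   # "this different or more" in either direction
print(f"one-tailed p = {p_one_tailed:.3f}, two-tailed p = {p_two_tailed:.3f}")
```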
Correlation Coefficient: a statistic that reveals the strength and direction of the relationship between two variables
Covariance: the average of the cross-product deviations for two variables in a distribution
Coefficient of Determination: a statistic found by squaring the Pearson correlation coefficient; it reveals the percentage of variance in each of the two correlated variables that is explained by the other variable; tells us how much of the variance in the scores of one variable can be understood, or explained, by the scores on a second variable; equal to r^2
Cross Product: the product of multiplying an individual's scores on two variables
Cross-Product Deviations: the product of the deviations of one variable and the corresponding deviations of a second variable
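A tiny sketch linking cross-product deviations to the covariance (the data are made up; NumPy assumed):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 5.0, 9.0])
cross_product_deviations = (x - x.mean()) * (y - y.mean())
cov = cross_product_deviations.sum() / (len(x) - 1)   # sample covariance: average of the cross-product deviations
print(cov, np.cov(x, y, ddof=1)[0, 1])                # matches NumPy's covariance
```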
Curvilinear Relationship: a relationship between two variables that is positive at some values but negative at others; may result in a correlation coefficient that is quite small, suggesting a weaker relationship than may actually exist
Dichotomous Variable: a categorical, or nominal, variable with two categories
Explained Variance: the percentage of variance in one variable that we can account for, or understand, by knowing the value of the second variable in the correlation
Negative Correlation: a descriptive feature of a correlation indicating that as scores on one of the correlated variables increase, scores on the other variable decrease, and vice versa
Positive Correlation: a characteristic of a correlation in which the scores on the two correlated variables move in the same direction, on average; as the scores on one variable rise, scores on the other variable rise, and vice versa
Shared Variance: the concept of two variables overlapping such that some of the variance in each variable is shared; the stronger the correlation between two variables, the greater this overlap
Truncated Range (Restricted Variance): when the responses on a variable are clustered near the top or the bottom of the possible range of scores, thereby limiting the range of scores and possibly limiting the strength of the correlation; may attenuate (weaken/lower) the correlation coefficient
Strength (Magnitude): a characteristic of a correlation describing how strongly two variables are related
Direction: a characteristic of a correlation that describes whether two variables are positively or negatively related to each other
Perfect Positive Correlation: a correlation of +1.00; indicates that for every member of the sample or population, a higher score on one variable is related to a higher score on the other variable
Perfect Negative Correlation: a correlation of -1.00; indicates that for every member of the sample or population, a higher score on one variable is related to a lower score on the other variable
Pearson Product-Moment Correlation (r): a correlation coefficient used when both variables are measured on an interval or ratio scale (continuous variables); designed to examine linear relationships between variables; equal to the standardized covariance
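A sketch showing Pearson's r as the standardized covariance (covariance divided by the product of the standard deviations), with r^2 as the coefficient of determination; data are hypothetical, NumPy assumed:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.5, 3.0, 4.0, 7.5, 9.0])
cov = ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - 1)
r = cov / (x.std(ddof=1) * y.std(ddof=1))      # standardized covariance
print(f"r = {r:.3f}, r^2 (variance explained) = {r**2:.3f}")
```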
Point-Biserial Correlation: a correlation coefficient that should be calculated when one of the variables is continuous and the other is a discrete dichotomous variable
Phi Coefficient: a correlation coefficient that should be calculated when researchers want to know whether two dichotomous variables are correlated
Spearman Rho Coefficient: a correlation coefficient that should be used to calculate the correlation between two variables measured with ranked (i.e., ordinal) data
Bonferroni Adjustment: a correction used by researchers to adjust their level of significance; its purpose is to decrease the chance of a Type I error (rejecting the null hypothesis when it is true) when multiple tests are conducted (the experiment-wise error rate); equal to the Type I error risk (e.g., .05) divided by the number of coefficients to be tested
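A quick worked example of the adjustment (the number of tests is hypothetical):

```python
alpha = 0.05
n_tests = 5                      # e.g., five correlation coefficients to be tested
adjusted_alpha = alpha / n_tests
print(adjusted_alpha)            # 0.01; each test must reach p < .01 to be called significant
```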
Outlier: an extreme score that is more than two standard deviations above or below the mean; can attenuate (weaken/lower) correlation coefficients; can be visually identified via scatterplots
Error: the amount of difference between the predicted value and the observed value of the dependent variable; it is also the amount of unexplained variance in the dependent variable
Intercept: the point at which the regression line intersects the Y-axis; also, the value of Y when X = 0
Predicted Values: estimates of the value of Y at given values of X that are generated by the regression equation
Regression Coefficient (b): a measure of the relationship between each predictor variable and the dependent variable; in simple linear regression, it is also the slope of the regression line; indicates the effect of the IV on the DV; specifically, for each one-unit change in the IV, there is an expected change in the DV equal to the size of this value; the larger the coefficient, the steeper the slope and the more the dependent variable changes for each unit change in the independent variable
Ordinary Least Squares (OLS) Regression: a common form of regression that generates the regression line by finding the smallest sum of squared deviations
Overpredicted: observed values of Y at given values of X that are below the values of Y predicted by the regression equation
Regression Equation: the components, including the regression coefficients, intercept, error term, and X and Y values, that are used to generate predicted values of Y and the regression line
Regression Line: the line drawn through a scatterplot of the data that best "fits" the data (i.e., minimizes the squared deviations between the observed values and the line)
Residuals: errors in prediction; the difference between observed and predicted values of Y
Simple Linear Regression: the regression model employed when there is a single dependent variable and a single independent variable
Slope: the average amount of change in the Y variable for each one-unit change in the X variable
Underpredicted: observed values of Y at given values of X that are above the values of Y predicted by the regression equation
Variance of the Estimate: the variance of the scores about the regression line; indicates the degree of variability about the regression line; it is the variance of the residuals and is equal to the mean square residual (MSR); it can be used to compute the standard error of b
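A minimal OLS sketch tying together the slope, intercept, predicted values, residuals, and variance of the estimate (hypothetical data; NumPy assumed):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # dependent variable

b = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()  # slope (regression coefficient)
a = y.mean() - b * x.mean()                                                # intercept: value of Y when X = 0
y_hat = a + b * x                                                          # predicted values
residuals = y - y_hat                                                      # errors in prediction
var_estimate = (residuals ** 2).sum() / (len(x) - 2)                       # variance of the estimate
print(f"b={b:.2f}, intercept={a:.2f}, variance of estimate={var_estimate:.3f}")
```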
Categorical (Nominal) Variable: a variable that is measured using categories or names
Continuous (Interval-Scaled) Variable: a variable that is measured using numbers along a continuum, with equal distances, or values, between each number along the continuum
Dependent Variable: a variable whose values may depend on, or differ by, the value of the IV; when the DV is related to the IV, its value is predicted by the value of the IV
Independent Variable: a variable that may predict or produce variation in the dependent variable; it may be nominal or continuous and is sometimes manipulated by the researcher (e.g., when the researcher assigns participants to an experimental or control group, thereby creating a two-category variable of this type)
Matched (Paired or Dependent) Samples: when each score of one sample is matched to one score from a second sample; or, in the case of a single sample measured at two times, when each score at Time 1 is matched with the score for the same individual at Time 2
Matched (Paired or Dependent) Samples t Test: a test comparing the means of paired, matched, or dependent samples on a single variable
Standard Error of the Difference Between the Means: a statistic indicating the standard deviation of the sampling distribution of the difference between the means
One-Sample t Test: a t test used to compare the mean of a test variable (DV) with a constant, or test value; for example, the test value could be the midpoint of a variable, the average of the variable based on past research (e.g., in the target population), etc.; the null hypothesis is that there is no difference between the mean of the sample and the mean of the population, i.e., that the mean difference (MD) between the two means equals zero
Paired-Samples t Test: a t test used to compare the means of a single sample in a longitudinal design with only two time points (e.g., pretest and posttest); also used to compare the means of two variables measured within a single sample (e.g., depression and quality of life); also referred to as a dependent-samples t test or a correlated t test; the null hypothesis is that there is no difference between the mean of the sample at Time 1 (pretest) and the mean at Time 2 (posttest), i.e., that the mean difference between the two means equals zero
Independent-Samples t Test: a t test used to compare the means of two separate samples (in which a subject cannot be a member of both sub-samples) on a given variable; requires one categorical (or nominal) IV with two levels or groups, and one continuous DV (i.e., interval or ratio scale); we want to know whether average scores on the DV differ according to the group to which one belongs; the null hypothesis is that there is no difference between the mean for one condition and the mean for the other condition, i.e., that the mean difference between the two group means equals zero
Eta-Squared (η2): the effect size statistic for an independent-samples t test; interpreted as the proportion of variance in the test variable (DV) that is a function of the grouping variable; values of .01, .06, and .14 are, by convention, interpreted as small, medium, and large effect sizes, respectively
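A sketch of the three t tests above using SciPy, with eta-squared computed from the independent-samples result (all data are hypothetical):

```python
import numpy as np
from scipy import stats

pre     = np.array([10, 12, 9, 14, 11, 13])
post    = np.array([12, 14, 10, 15, 13, 15])
group_a = np.array([10, 12, 9, 14, 11, 13])
group_b = np.array([8, 9, 10, 7, 9, 8])

print(stats.ttest_1samp(pre, popmean=10))     # one-sample t test against a test value of 10
print(stats.ttest_rel(pre, post))             # paired-samples (dependent) t test
t, p = stats.ttest_ind(group_a, group_b)      # independent-samples t test
df = len(group_a) + len(group_b) - 2
eta_sq = t**2 / (t**2 + df)                   # proportion of DV variance explained by group membership
print(f"t={t:.2f}, p={p:.3f}, eta-squared={eta_sq:.2f}")
```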
a priori Contrasts: comparisons of means that are planned before the ANOVA is conducted; can include comparing the mean of one group to the combined mean of two or more other groups; a planned comparison test; a contrast that you decide to test prior to examining the data; rather than employing a data-driven approach and testing all possible pairwise comparisons with post hoc tests, we have specific hypotheses that we want to test; such a contrast is represented by a linear combination of means, usually the difference between two means or the difference between the averages of two sets of means; there are specific "rules" for determining the coefficients in a contrast (e.g., the coefficients must sum to zero within a contrast)
Between Group: refers to effects (e.g., variance, differences) that occur between the members of different groups in an ANOVA
F Value: the statistic used to indicate the average amount of difference between group means relative to the average amount of variance within each group
Within Group: refers to effects (e.g., variance, differences) that occur among the members of the same group in an ANOVA
Grand Mean: the statistical average for all of the cases in all of the groups on the dependent variable
Mean Square Between: the average squared deviation between the group means and the grand mean
Mean Square Error: the average squared deviation between each individual score and its respective group mean
Post Hoc Tests: statistical tests conducted after obtaining the overall F value from the ANOVA to examine whether each group mean differs significantly from each other group mean; sometimes referred to as a posteriori tests; a contrast that you decide to test only after observing the result of the omnibus F test; an exploratory data analysis strategy used when one does not have specific hypotheses about group differences before the analysis is conducted; most of these tests control the experiment-wise error rate (the likelihood of making a Type I error when multiple pairwise comparisons are made)
Random Error: refers to differences between individual scores and sample means that are presumed to occur simply because of the random effects inherent in selecting cases for the sample; more broadly, refers to differences between sample data or statistics and population data or parameters caused by random selection procedures
Studentized Range Statistic: a distribution used to determine the statistical significance of post hoc tests
Sum of Squares Between: the sum of the squared deviations between the group means and the grand mean
Sum of Squares Error: the sum of the squared deviations between individual scores and their group means on the dependent variable
Sum of Squares Total: the sum of the squared deviations between individual scores and the grand mean on the dependent variable; it is also the sum of the sum of squares between and the sum of squares error
One-Way Analysis of Variance (ANOVA): a test of the significance of group differences between two or more means; analyzes variation between and within each group; the purpose is to compare the means of two or more groups (the IV) on the DV to see whether the group means are significantly different from each other; to conduct this test, you need a categorical (or nominal) variable with at least two independent groups (the IV) and a continuous variable (the DV)
Eta-Squared (η2): the effect size statistic for a one-way ANOVA; ranges in value from 0 to 1; interpreted as the proportion of variance in the test variable (DV) that is a function of the grouping variable; values of .01, .06, and .14 are, by convention, interpreted as small, medium, and large effect sizes, respectively
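A sketch of a one-way ANOVA computed "by hand" from the sums of squares defined above, checked against SciPy (three hypothetical groups):

```python
import numpy as np
from scipy import stats

groups = [np.array([4, 5, 6, 5]), np.array([7, 8, 6, 7]), np.array([9, 10, 9, 11])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # sum of squares between
ss_error   = sum(((g - g.mean()) ** 2).sum() for g in groups)             # sum of squares error
ss_total   = ss_between + ss_error                                        # sum of squares total

df_between = len(groups) - 1
df_error   = len(all_scores) - len(groups)
F = (ss_between / df_between) / (ss_error / df_error)    # mean square between / mean square error
eta_sq = ss_between / ss_total                            # proportion of DV variance due to group
p = stats.f.sf(F, df_between, df_error)
print(f"F={F:.2f}, p={p:.4f}, eta-squared={eta_sq:.2f}")
print(stats.f_oneway(*groups))                            # SciPy's one-way ANOVA agrees
```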
