13553836559 | Categorical | Use bar graphs, pie graphs, or segmented bar charts | 0 | |
13553843695 | Marginal Distribution | In a two-way table, consider only one variable, using only the totals row/column of the table. | 1 | |
13553847319 | Conditional Distributions | describe the distribution of one variable for a specific value of the other (one row/column inside the table). | 2 | |
13553854864 | Quantitative Data | Use dotplots, stemplots, histograms, or boxplots for quantitative variables such as age or weight. | 3 | |
13554452346 | SOCS | Shape (skewed left, skewed right, symmetric, uniform, unimodal, bimodal); Outliers (discuss them if there are obvious ones); Center (mean or median); Spread (range, IQR, or standard deviation). Note: Also be on the lookout for gaps, clusters, or other unusual features of the data set. | 4 | |
13554457530 | Comparing Distributions | Address: Shape, Outliers, Center, Spread in context! YOU MUST USE comparison phrases like "is greater than" or "is less than" for Center & Spread | 5 | |
13554463208 | Outlier Rule | Upper Cutoff = Q3 + 1.5(IQR); Lower Cutoff = Q1 - 1.5(IQR); IQR = Q3 - Q1 | 6 | |
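As a sketch, the cutoffs can be computed with Python's standard library; the data list below is invented, and note that `statistics.quantiles` may place quartiles slightly differently than the median-of-halves method often taught in AP Statistics.

```python
# Flag outliers with the 1.5*IQR rule; the data set is made up.
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
q1, _, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
lower = q1 - 1.5 * iqr      # lower cutoff
upper = q3 + 1.5 * iqr      # upper cutoff
outliers = [x for x in data if x < lower or x > upper]
print(outliers)  # [50]
```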
13554467010 | Interpret Standard Deviation | measures spread by giving the "typical" distance that the observations (context) are away from the mean (context). | 7 | |
13554472372 | How does shape affect measures of center? | In general, Skewed Left (Mean < Median) Skewed Right (Mean > Median) Fairly Symmetric (Mean ≈ Median) | 8 | |
13554478172 | Interpret a z-score | describes how many standard deviations a value falls away from the mean of the distribution and in what direction. | 9 | |
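A quick numeric sketch with made-up values (a score of 86 in a distribution with mean 80 and SD 4):

```python
# Hypothetical values: observation 86, distribution mean 80, SD 4.
x, mean, sd = 86, 80, 4
z = (x - mean) / sd
print(z)  # 1.5 -> the value falls 1.5 standard deviations above the mean
```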
13554481989 | Percentiles | The kth percentile of a distribution is the point that has k% of the values less than that point. | 10 | |
13554489842 | Linear Transformations | Adding "a" to every member of a data set adds "a" to the measures of position, but does not change the measures of spread or the shape. Multiplying every member of a data set by "b" multiplies the measures of position by "b" and multiplies most measures of spread by |b|, but does not change the shape. | 11 | |
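Both facts can be checked numerically; the data set and the constants a and b below are arbitrary.

```python
import math
from statistics import mean, stdev

data = [10, 12, 14, 18]              # made-up data set
b, a = 2, 5                          # multiply every value by b, then add a
transformed = [b * x + a for x in data]

# Position (mean) is both scaled and shifted; spread (SD) is only scaled.
assert mean(transformed) == b * mean(data) + a
assert math.isclose(stdev(transformed), abs(b) * stdev(data))
```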
13554497418 | The Standard Normal Distribution | distribution with a mean of 0 and a standard deviation of 1. | 12 | |
13554504441 | Using Normalcdf | Using boundaries to find area: Normalcdf (min, max, mean, SD) | 13 | |
13554508840 | InvNorm | Using area to find boundary: Invnorm (area to the left as a decimal, mean, SD) | 14 | |
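Both calculator commands correspond to the Normal cdf and its inverse. Here is a sketch using Python's `statistics.NormalDist`, with a hypothetical distribution of heights (mean 64, SD 2.5):

```python
from statistics import NormalDist

heights = NormalDist(mu=64, sigma=2.5)        # hypothetical mean and SD

# Normalcdf(min, max, mean, SD): area between two boundaries
area = heights.cdf(66.5) - heights.cdf(61.5)  # within one SD of the mean
print(round(area, 4))  # 0.6827

# InvNorm(area to the left, mean, SD): boundary for a given left-tail area
cutoff = heights.inv_cdf(0.90)                # the 90th percentile
print(round(cutoff, 1))  # 67.2
```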
13554516848 | Describing an association in a scatterplot | Address the following, in context: Direction Outliers Form Strength | 15 | |
13554520718 | Interpret r | Correlation measures the strength and direction of the linear relationship between x and y. r is always between -1 and 1. Close to zero = very weak, Close to 1 or -1 = stronger Exactly 1 or -1 = perfectly straight line Positive r = positive correlation Negative r = negative correlation | 16 | |
13554524993 | Interpret LSRL Slope "b" | For every one unit change in the x variable (context) the y variable (context) is predicted to increase/decrease by ____ units (context). | 17 | |
13554533971 | Interpret LSRL y-intercept "a" | When the x variable (context) is zero, the y variable (context) is predicted to be ______. | 18 | |
13554537206 | What is a Residual? | y - y-hat measures the difference between the actual (observed) y-value in a scatterplot and the y-value that is predicted by the LSRL using its corresponding x value. | 19 | |
13554544260 | Interpreting a Residual Plot | If there is a leftover pattern, then the model used does not have the same form as the association (the model is not appropriate). If there is no leftover pattern in the residual plot, then the model is appropriate. | 20 | |
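A residual is simply observed minus predicted; the LSRL coefficients and the data point below are invented for illustration.

```python
slope, intercept = 0.8, 2.0          # hypothetical LSRL: y-hat = 2.0 + 0.8x
x_obs, y_obs = 10, 11.0              # one observed point
y_hat = intercept + slope * x_obs    # predicted y for x = 10
residual = y_obs - y_hat             # y - y-hat
print(residual)  # 1.0 -> the point lies 1.0 above the line
```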
13554732429 | Interpret LSRL "y-hat" | the "estimated" or "predicted" y-value (context) for a given x-value (context) | 21 | |
13554735122 | Extrapolation | Using a LSRL to predict outside the domain of the explanatory variable. | 22 | |
13554742498 | Interpret LSRL "s" | is the standard deviation of the residuals. It measures the typical distance between the actual y values (context) and their predicted y values (context) in a regression setting | 23 | |
13554751131 | Interpret r-squared | __% of the variation in y (context) is accounted for by the LSRL of y (context) on x (context) | 24 | |
13554757576 | Outliers in Regression | Any point that falls outside the pattern of the association should be considered an outlier. | 25 | |
13554765782 | Influential Points in Regression | A point that has a big effect on a calculation, such as the correlation or equation of the least-squares regression line. Points separated in the x-direction are often influential. | 26 | |
13554776427 | SRS | is a sample taken in such a way that every set of n individuals has an equal chance to be the sample actually selected. | 27 | |
13554784393 | Sampling Techniques | 1. SRS- Number the entire population, draw numbers from a hat (every set of n individuals has equal chance of selection) 2. Stratified - Split the population into homogeneous groups, select an SRS from each group. 3. Cluster - Split the population into heterogeneous groups called clusters, and randomly select whole clusters for the sample. Ex. Choosing a carton of eggs actually chooses a cluster (group) of 12 eggs. 4. Census - An attempt to reach the entire population 5. Convenience- Selects individuals easiest to reach 6. Voluntary Response - People choose themselves by responding to a general appeal. | 28 | |
13554787888 | Advantage of using a Stratified Random Sample Over an SRS | Stratified random sampling guarantees that each of the strata will be represented. When strata are chosen properly, a stratified random sample will produce better (less variable/more precise) information than an SRS of the same size. | 29 | |
13554791856 | Bias | A sampling method that consistently produces estimates that are too small or consistently produces estimates that are too large. | 30 | |
13554801564 | Experiment | researchers impose a treatment upon the experimental units. | 31 | |
13554822120 | Observational Study | researchers make no attempt to influence the results and cannot conclude cause- and-effect. | 32 | |
13554827914 | Confounding | occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other | 33 | |
13554830264 | Why use a control group? | gives the researchers a comparison group to be used to evaluate the effectiveness of the treatment(s). | 34 | |
13554843270 | Blinding | a technique where the subjects do not know whether they are receiving a treatment or a placebo | 35 | |
13554847906 | Experimental Designs | CRD (Completely Randomized Design) - Units are allocated at random among all treatments RBD (Randomized Block Design) -Units are put into homogeneous blocks and randomly assigned to treatments within each block. Matched Pairs - A form of blocking in which each subject receives both treatments in a random order or subjects are matched in pairs with one subject in each pair receiving each treatment, determined at random. | 36 | |
13554855967 | Benefit of Blocking | the reduction of the effect of variation within the experimental units. | 37 | |
13554868598 | Scope of Inference: Generalizing to a Larger Population | We can generalize the results of a study to a larger population if we used a random sample from that population. | 38 | |
13554874203 | Scope of Inference: Cause-and-Effect | We can make a cause-and-effect conclusion if we randomly assign treatments to experimental units in an experiment. Otherwise, Association is NOT Causation! | 39 | |
13558077452 | Interpreting Probability | the proportion of times the event would occur in a very large number of repetitions. | 40 | |
13558107639 | Law of Large Numbers | if we observe many repetitions of a chance process, the observed proportion of times that an event occurs approaches a single value, called the probability of that event. | 41 | |
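A minimal simulation of the law, flipping a fair coin many times (the seed is arbitrary, chosen only for reproducibility):

```python
import random

random.seed(1)                       # arbitrary seed
flips = [random.random() < 0.5 for _ in range(100_000)]
prop = sum(flips) / len(flips)
# The observed proportion of heads settles near the true probability, 0.5.
assert abs(prop - 0.5) < 0.01
print(prop)
```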
13558117870 | Conducting a simulation | State: Ask a question about some chance process. Plan: Describe how to use a random device to simulate one trial of the process and indicate what will be recorded at the end of each trial. Do: Do many trials. Conclude: Answer the question of interest. | 42 | |
13558121797 | Complementary Events | Two or more mutually exclusive events that together cover all possible outcomes. The sum of the probabilities of complementary events is 1. | 43 | |
13558125511 | Conditional Probability | the probability that one event happens given that another event is already known to have happened | 44 | |
13558131257 | Two Events are Independent If... | P(B) = P(B|A) Or P(B) = P(B|Ac) Meaning: Knowing that Event A has occurred (or not occurred) doesn't change the probability that event B occurs. | 45 | |
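The check is simple arithmetic on a two-way table of counts; the counts below are invented so that P(B) = P(B|A) holds exactly.

```python
# Hypothetical counts: A = owns a pet, B = plays a sport.
both, a_only, b_only, neither = 20, 30, 20, 30
total = both + a_only + b_only + neither     # 100 individuals

p_b = (both + b_only) / total                # P(B) = 40/100 = 0.4
p_b_given_a = both / (both + a_only)         # P(B|A) = 20/50 = 0.4
assert p_b == p_b_given_a                    # knowing A doesn't change P(B)
```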
13558136373 | Two Events are Mutually Exclusive If... | P(A and B) = 0. Events A and B are mutually exclusive if they share no outcomes. | 46 | |
13558143261 | Interpreting Expected Value/Mean | If we were to repeat the chance process (context) many times, the average value of _____ (context) would be about _______. | 47 | |
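For a discrete random variable, the expected value is the probability-weighted average of the outcomes; the raffle numbers below are made up.

```python
# Hypothetical raffle: win $10 with probability 0.1, otherwise win nothing.
values = [10, 0]
probs = [0.1, 0.9]
expected = sum(v * p for v, p in zip(values, probs))
print(expected)  # 1.0 -> over many plays, winnings average about $1 per play
```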
13558168057 | Binomial Setting and Random Variable | Binary? Each trial can be classified as success/failure Independent? Trials must be independent. Number? The number of trials (n) must be fixed in advance Success? The probability of success (p) must be the same for each trial. X = number of successes in n trials | 48 | |
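The binomial probability P(X = k) = C(n, k) p^k (1-p)^(n-k) can be computed directly; the values of n, p, and k below are arbitrary.

```python
from math import comb

n, p = 10, 0.3        # hypothetical: 10 independent trials, P(success) = 0.3
k = 4
prob = comb(n, k) * p**k * (1 - p)**(n - k)
print(round(prob, 4))  # 0.2001 = P(exactly 4 successes)
```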
13559406658 | Geometric Setting and Random Variable | Arises when we perform independent trials of the same chance process and record the number of trials it takes to get one success. On each trial, the probability p of success must be the same. X = number of trials needed to achieve one success | 49 | |
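The geometric probability P(X = k) = (1-p)^(k-1) p follows the same pattern; the values below are arbitrary.

```python
p = 0.2               # hypothetical per-trial success probability
k = 3                 # probability the first success comes on trial 3
prob = (1 - p) ** (k - 1) * p       # two failures, then one success
print(round(prob, 3))  # 0.128 = 0.8 * 0.8 * 0.2
```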
13559409705 | Parameter | measures a characteristic of a population, such as a population mean μ or population proportion p. | 50 | |
13559413648 | Statistic | measures a characteristic of a sample, such as a sample mean x̄ or sample proportion p̂. | 51 | |
13559416841 | What is a sampling distribution? | The distribution of a sample statistic in all possible samples of the same size. It describes the possible values of a statistic and how likely these values are. | 52 | |
13559419238 | What is the Central Limit Theorem (CLT)? | If the population distribution is not Normal the sampling distribution of the sample mean (x bar) will become more and more Normal as n increases. | 53 | |
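A sketch of the CLT by simulation, drawing many sample means from a strongly right-skewed (exponential) population with mean 1; the seed and sizes are arbitrary.

```python
import random
from statistics import mean

random.seed(2)                      # arbitrary seed for reproducibility

def sample_mean(n):
    """Mean of one sample of size n from a right-skewed population."""
    return mean(random.expovariate(1.0) for _ in range(n))

means = [sample_mean(50) for _ in range(2000)]
# The sampling distribution of x-bar is centered near the population mean, 1.
assert abs(mean(means) - 1.0) < 0.05
```

Plotting `means` as a histogram would show a roughly Normal shape even though the population itself is heavily skewed.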
13559425623 | Unbiased Estimator | if the mean of its sampling distribution equals the true value of the parameter being estimated. In other words, the sampling distribution of the statistic is centered in the right place. | 54 | |
13559428879 | 4-Step Process Confidence Intervals | STATE: What parameter do you want to estimate, and at what confidence level? PLAN: Choose the appropriate inference method. Check conditions. DO: If the conditions are met, perform calculations. CONCLUDE: Interpret your interval in the context of the problem. | 55 | |
13559429680 | Interpreting a Confidence Interval | I am ___% confident that the interval from ___ to ___ captures the true ____. | 56 | |
13559432322 | Interpreting a Confidence Level | If many similar samples were taken, _____% of them would result in intervals that contain the true mean/proportion. | 57 | |
13559451108 | What factors affect the Margin of Error? | The margin of error decreases when: -The sample size increases -The confidence level decreases | 58 | |
13559451861 | Inference for Means (Conditions) | Random: Data from a random sample(s) or randomized experiment Normal: Population distribution is normal or large sample(s) (n ≥ 30, or n1 ≥ 30 and n2 ≥ 30 for two samples) Independent: Independent observations and independent samples/groups; 10% condition if sampling without replacement | 59 | |
13559453569 | Inference for Proportions (Conditions) | Random: Data from a random sample(s) or randomized experiment Normal: At least 10 successes and failures (in both groups, for a two sample problem) Independent: Independent observations and independent samples/groups; 10% condition if sampling without replacement | 60 | |
13559457083 | 4-Step Process Significance Tests | State: What hypotheses do you want to test, and at what significance level? Define any parameters you use. Plan: Choose the appropriate inference method. Check conditions. Do: If the conditions are met, perform calculations. Compute the test statistic and find the P-value. Conclude: Interpret the result of your test in the context of the problem. | 61 | |
13559459292 | Explain a P-value | Assuming that the null is true (context) there is a ___ probability of observing a statistic (context) as large as or larger than the one actually observed by chance alone. | 62 | |
13559462918 | Type I Error | Rejecting H0 when H0 is actually true | 63 | |
13559467044 | Type II Error | Failing to reject H0 when Ha is true | 64 | |
13559470256 | Power | Probability of finding convincing evidence that Ha is true when in reality Ha is true. | 65 | |
13559474682 | Factors that Affect Power | 1. Sample Size: To increase power, increase sample size. 2. Increase α: A 5% test of significance will have a greater chance of rejecting the null than a 1% test. 3. Consider an alternative that is farther away from μ0: Values of μ that are in Ha, but lie close to the hypothesized value are harder to detect than values of μ that are far from μ0. | 66 | |
13559478629 | Chi-Square Tests (Conditions) | Random: Data from a random sample(s) or randomized experiment 10%: The sample must be ≤ 10% of the population. Large Counts: All expected counts are at least 5. | 67 | |
13559479823 | Types of Chi-Square Tests | 1. Goodness of Fit 2. Homogeneity 3. Independence | 68 | |
13559481582 | Goodness of Fit | Use to test the distribution of one group or sample as compared to a hypothesized distribution. | 69 | |
13559482932 | Homogeneity | Use when you have a sample from 2 or more independent populations or 2 or more groups in an experiment. Each individual must be classified based upon a single categorical variable. | 70 | |
13559485068 | Independence | Use when you have a single sample from a single population. Individuals in the sample are classified by two categorical variables. | 71 | |
13559487721 | Goodness of fit - degrees of freedom | df = k - 1 | 72 | |
13559494122 | Chi-Square Homogeneity/Independence - degrees of freedom | df = (rows - 1)(columns - 1) | 73 | |
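Both degrees-of-freedom formulas, applied to made-up sizes:

```python
k = 6                               # categories in a goodness-of-fit test
df_gof = k - 1                      # df = k - 1
rows, cols = 3, 4                   # a hypothetical 3x4 two-way table
df_table = (rows - 1) * (cols - 1)  # df = (rows - 1)(columns - 1)
print(df_gof, df_table)  # 5 6
```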
13559497063 | Inference for Regression (Conditions) | Linear: True relationship between the variables is linear. Independent: Independent observations; 10% condition if sampling without replacement Normal: Responses vary Normally around the regression line for all x-values Equal Variance: around the regression line for all x-values Random: Data from a random sample or randomized experiment | 74 | |
AP Statistics Vocabulary Review Flashcards