AP Statistics Vocabulary Flashcards
4443406119 | Symmetric | data in which both sides of the distribution have roughly the same shape and size. A "bell curve" is one example. | 0 | |
4443408477 | Parameter | value of a population (typically unknown) | 1 | |
4443410117 | Statistic | a value calculated from a sample(s), used to estimate a population parameter. | 2 | |
4443412238 | Median | the middle point of the data (50th percentile) when the data is in numerical order. | 3 | |
4443412239 | Variability | the spread of the data. Understanding it allows statisticians to distinguish between usual and unusual occurrences. | 4 | |
4443416170 | Standard Deviation | measures the typical or average deviation of observations from the mean | 5 | |
4443417891 | Skewed Right | the right tail is longer, so the mean is typically a larger value than the median. | 6 | |
4443420801 | Z-score/T-score | is a standardized score. This tells you how many standard deviations from the mean an observation is. | 7 | |
4443426482 | Normal Model | is a bell shaped and symmetrical curve. As σ increases the curve flattens. As σ decreases the curve thins. | 8 | |
4443433847 | Mutually Exclusive | A and B have no intersection. They cannot happen at the same time. | 9 | |
4443435622 | Independent | if knowing that one event has occurred does not change the probability of the other event. | 10 | |
4443437739 | Law of Large Numbers | as an experiment is repeated the experimental probability gets closer and closer to the true (theoretical) probability. | 11 | |
4443440085 | Correlation Coefficient (r) | is a quantitative assessment of the strength and direction of a linear relationship. | 12 | |
4443442655 | Least Squares Regression Line (LSRL) | is a line of mathematical best fit. Minimizes the sum of the squared deviations (residuals) from the line. Used with bivariate data. | 13 | |
4443454328 | Residual (error) | is the vertical difference of a point from the LSRL: the observed value minus the predicted value. The residuals add to zero. | 14 | |
4443463188 | Coefficient of Determination (r-squared) | gives the proportion of variation in y (response) that is explained by the linear relationship between x and y. | 15 | |
4443465418 | Extrapolation | using the LSRL to predict values outside of the range of the original data. Such predictions are unreliable and should be avoided. | 16 | |
4443465419 | Influential Points | are points that, if removed, significantly change the LSRL. | 17 | |
4443472341 | Census | an attempt at a complete count of the population. Disadvantages of this: can be inaccurate, expensive, and often impossible to carry out | 18 | |
4443477612 | Simple Random Sample | units are chosen so that each unit has an equal chance of being selected and every set of n units has an equal chance of being selected. | 19 | |
4443479151 | Stratified Sampling | divide the population into homogeneous groups, then take an SRS from every group. [Observational studies] | 20 | |
4443489828 | Cluster Sampling | Divide the population into heterogeneous groups (often based on location), take an SRS of a certain number of groups, then sample ALL members/things in each selected group. | 21 | |
4443491632 | Bias | systematically favors a certain outcome. Has to do with the center of the sampling distribution - if it is centered over the true parameter, the estimator is considered unbiased | 22 | |
4443491633 | Voluntary Response Bias | people choose themselves to participate. | 23 | |
4443492993 | Convenience Sampling | ask people who are easy to reach, friendly, or comfortable to ask. | 24 | |
4443494569 | Undercoverage | some group(s) are left out of the selection process. | 25 | |
4443494570 | Nonresponse Bias | someone cannot or does not want to be contacted or participate. | 26 | |
4443498262 | Control Group | a group used as a baseline for comparison to judge the effectiveness of the factor - does NOT have to receive a placebo | 27 | |
4443500393 | Single Blind | a method used so that the subjects are unaware of the treatment (who gets a placebo or the real treatment). | 28 | |
4443500394 | Double Blind | neither the subjects nor the evaluators know which treatment is being given. | 29 | |
4443505140 | Replication | A MUST for EVERY experimental design. Uses many subjects to quantify the natural variation in the response. | 30 | |
4443507346 | Completely Randomized Design | all units are randomly allocated among the treatments [Experiment] | 31 | |
4443513634 | Randomized Block | units are separated into groups (blocks) based on a KNOWN factor. Treatments are then randomly assigned within each block - reduces variation | 32 | |
4443517589 | Matched-Pair Design | Once one member of a pair receives a certain treatment, the other member automatically receives the second treatment. OR each individual does both treatments in random order (before/after or pretest/post-test). Assignment is dependent | 33 | |
4443523911 | Confounding Variables | are variables whose effect on the response cannot be separated from the effects of the factor being tested - common in observational studies - random assignment to treatments guards against this! | 34 | |
4443526121 | Randomization | reduces bias by spreading extraneous variables to all groups in the experiment. MUST have in EVERY experiment | 35 | |
4443529865 | Binomial Probability | Trials have two outcomes; Trials are independent; probability (p) of success is the same for all trials; and most importantly, the number of trials is fixed! | 36 | |
4443532830 | Geometric Probability | two mutually exclusive outcomes, each trial is independent, probability (p) of success is the same for all trials. (NOT a fixed number of trials: count the trials until the first success) | 37 | |
4443538875 | Sampling Distribution | is the distribution of all possible values of a statistic from all possible samples of the same size. When it is approximately normal, use normalcdf to calculate probabilities | 38 | |
4443541166 | Standard Error (SE) | estimate of the standard deviation of the statistic | 39 | |
4443545143 | Central Limit Theorem | when n is sufficiently large (rule of thumb: n > 30) the sampling distribution of the sample mean is approximately normal even if the population distribution is not normal. | 40 | |
4443548664 | Confidence Interval | used to estimate the unknown population parameter by providing a range of plausible values for it | 41 | |
4443552647 | Hypothesis Test | tells us whether an observed result could plausibly have occurred by random chance alone. If it is unlikely to occur by random chance then it is statistically significant. | 42 | |
4443553900 | P-Value | assuming the null is true, the probability of obtaining the observed result or one more extreme | 43 | |
4443558098 | Level of Significance | is the amount of evidence necessary before rejecting the null hypothesis. [Alpha - the probability of a Type I error when the null is true] | 44 | |
4443559692 | Type I Error | is when one rejects H0 when H0 is actually true. | 45 | |
4443559693 | Type II Error | is when you fail to reject H0, and H0 is actually false. | 46 | |
4443565578 | Power (of the test) | is the probability that the test will reject the null hypothesis when the null hypothesis is false. [The chances you make the right decision!] | 47 | |
4443567107 | Chi-Square | is used to test counts of categorical data. | 48 | |
4443570332 | T-Test | is used when your test involves sample means/averages | 49 | |
4443570333 | Z-Test | is used when your test involves proportions/percents. (3 out of 100) | 50 | |
4443577286 | Goodness of Fit | is for univariate categorical data from a single sample. Do the observed counts "fit" what we expect? Must use list to perform | 51 | |
4443580805 | Confidence level | In repeated sampling, ______% of all the possible intervals that can be constructed by this method will capture the true parameter. | 52 | |
4443589059 | Low P-Value | Conclusion "reject the null" and "there is enough evidence to support the HA" | 53 | |
4443592083 | High P-Value | Conclusion "fail to reject" | 54 | |
4443792835 | Lurking Variable | is a variable that is not included as an explanatory or response variable in the analysis but can affect the interpretation of relationships between variables. It can falsely identify a strong relationship between variables or it can hide the true relationship. | 55 | |
4445221831 | Systematic Sampling | Use a random number generator to select the first person. Then select every "third" or "fourth" or "fifth", etc., person after that | 56 | |
4445226151 | Simulation | is a way to model random events, such that simulated outcomes closely match real-world outcomes | 57 | |
4445230266 | Placebo effect | A remarkable phenomenon in which a fake treatment can sometimes improve a patient's condition simply because the person expects that it will be helpful | 58 | |
4445237746 | Factors | are explanatory variables manipulated by the experimenter. Combinations of their levels determine the number of treatments | 59 | |
4445242002 | Histogram | A graphical display that represents a frequency distribution by means of rectangles whose widths represent class intervals or "bins" and whose heights represent the frequencies | 60 | |
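
Several of the regression cards (Z-score, Correlation Coefficient, LSRL, Residual, Coefficient of Determination) can be tied together with one small calculation. The following is a minimal sketch in Python, not part of the original flashcard set. The data values and every function name are hypothetical, chosen only for illustration, and the sample standard deviation (dividing by n - 1) is assumed.

```python
from math import sqrt

# Hypothetical bivariate data (not from the flashcards).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    # Sample standard deviation: typical deviation from the mean (divides by n - 1).
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    # Z-score: how many standard deviations each observation is from the mean.
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    # r: sum of the products of paired z-scores, divided by n - 1.
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

def lsrl(xs, ys):
    # LSRL: slope = r * (s_y / s_x); the line passes through (x-bar, y-bar).
    r = correlation(xs, ys)
    slope = r * sample_sd(ys) / sample_sd(xs)
    intercept = mean(ys) - slope * mean(xs)
    return intercept, slope

r = correlation(x, y)
intercept, slope = lsrl(x, y)

# Residual = observed y minus predicted y.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# Coefficient of determination: proportion of variation in y explained by the line.
r_squared = r ** 2

print(f"r = {r:.4f}, LSRL: y-hat = {intercept:.2f} + {slope:.2f}x")
print(f"sum of residuals = {sum(residuals):.10f}, r-squared = {r_squared:.2f}")
```

For this made-up data the fitted line works out to y-hat = 2.2 + 0.6x, the residuals sum to zero (up to floating-point rounding), and r-squared is 0.6, so about 60% of the variation in y is explained by the linear relationship with x.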