AP Statistics (Set 3) Flashcards

AP Statistics vocabulary.

Terms
context: ideally tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed
data: systematically recorded information, whether numbers or labels, together with its context
data table: an arrangement of data in which each row represents a case and each column represents a variable
case: an individual about whom or which we have data
variable: holds information about the same characteristic for many cases
categorical variable: a variable that names categories (whether with words or numerals)
quantitative variable: a variable in which the numbers act as numerical values; always has units
units: a quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams
frequency table: lists the categories in a categorical variable and gives the count or percentage of observations for each category
distribution: gives the possible values of the variable and the relative frequency of each value
area principle: in a statistical display, each data value should be represented by the same amount of area
bar chart: shows a bar representing the count of each category in a categorical variable
pie chart: shows how a "whole" divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category
contingency table: displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables; categorizes the individuals on all variables at once, to reveal possible patterns in one variable that may be contingent on the category of the other
marginal distribution: the distribution of either variable alone in a contingency table; the counts or percentages are the totals found in the margins (last row or column) of the table
conditional distribution: the distribution of a variable restricting the who to consider only a smaller group of individuals
independence: variables are said to be this if the conditional distribution of one variable is the same for each category of the other
Simpson's paradox: when averages are taken across different groups, they can appear to contradict the overall averages
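The contingency-table terms above (marginal distribution, conditional distribution, independence) can be made concrete with a small sketch. The counts, the category names, and the helper functions below are all hypothetical, chosen only for illustration.

```python
# Hypothetical contingency table: rows are class year, columns are how
# students get to school. All counts are made up for illustration.
counts = {
    "freshman": {"walk": 30, "bus": 50, "car": 20},
    "senior": {"walk": 15, "bus": 25, "car": 60},
}


def marginal_by_row(table):
    """Distribution of the row variable alone (row totals / grand total)."""
    total = sum(sum(row.values()) for row in table.values())
    return {name: sum(row.values()) / total for name, row in table.items()}


def marginal_by_column(table):
    """Distribution of the column variable alone (column totals / grand total)."""
    total = sum(sum(row.values()) for row in table.values())
    column_totals = {}
    for row in table.values():
        for column, n in row.items():
            column_totals[column] = column_totals.get(column, 0) + n
    return {column: n / total for column, n in column_totals.items()}


def conditional_given_row(table, row_name):
    """Distribution of the column variable, restricted to one row's cases."""
    row = table[row_name]
    row_total = sum(row.values())
    return {column: n / row_total for column, n in row.items()}


def looks_independent(table, tol=1e-9):
    """True if every row has the same conditional distribution."""
    dists = [conditional_given_row(table, name) for name in table]
    first = dists[0]
    return all(abs(d[c] - first[c]) <= tol for d in dists for c in first)


print(marginal_by_column(counts))
print(conditional_given_row(counts, "freshman"))
print(conditional_given_row(counts, "senior"))
print(looks_independent(counts))
```

In this made-up table the two conditional distributions differ, so the variables would not be called independent. With real data the conditional distributions are rarely exactly equal, so an exact-match check like `looks_independent` is only a teaching device.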
distribution: gives the possible values of the variable and the frequency or relative frequency of each value
histogram: uses adjacent bars to show the distribution of values in a quantitative variable; each bar represents the frequency (or relative frequency) of values falling in an interval of values
stem-and-leaf display: shows quantitative data values in a way that sketches the distribution of the data
dotplot: graphs a dot for each case against a single axis
shape: to describe this aspect of a distribution, look for single vs. multiple modes, and symmetry vs. skewness
center: a value that attempts the impossible by summarizing the entire distribution with a single number, a "typical" value
spread: a numerical summary of how tightly the values are clustered around the "center"
mode: a hump or local high point in the shape of the distribution of a variable; the apparent locations of these can change as the scale of a histogram is changed
unimodal: having one mode; this is a useful term for describing the shape of a histogram when it's generally mound-shaped
bimodal: distributions with two modes
multimodal: distributions with more than two modes
uniform: a distribution that's roughly flat
symmetric: a distribution is this if the two halves on either side of the center look approximately like mirror images of each other
tails: the parts of a distribution that typically trail off on either side; they can be characterized as long or short
skewed: a distribution is this if it's not symmetric and one tail stretches out farther than the other
outliers: extreme values that don't appear to belong with the rest of the data
timeplot: displays data that change over time
center: summarized with the mean or the median
median: the middle value with half of the data above and half below it
spread: summarized with the standard deviation, interquartile range, and range
range: the difference between the lowest and highest values in a data set
quartile: the lower of this is the value with a quarter of the data below it; the upper of this has a quarter of the data above it
interquartile range: the difference between the first and third quartiles
percentile: the ith ___ is the number that falls above i% of the data
5-number summary: consists of the minimum and maximum, the quartiles Q1 and Q3, and the median
boxplot: displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values
mean: found by summing all the data values and dividing by the count
variance: the sum of squared deviations from the mean, divided by the count minus one
standard deviation: the square root of the variance
comparing distributions: when doing this, consider their shape, center, and spread
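The summaries of center and spread defined above can be computed with Python's standard `statistics` module. This is a minimal sketch on made-up data. The quartile rule used here (medians of the lower and upper halves, leaving out the middle value when the count is odd) is an assumption: it is one common textbook convention, and software packages often use slightly different rules.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least two values.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves of the sorted data, excluding the middle value when n is odd.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data values

minimum, q1, median, q3, maximum = five_number_summary(data)
data_range = maximum - minimum         # highest value minus lowest value
iqr = q3 - q1                          # interquartile range
mean = statistics.mean(data)           # sum of the values / count
variance = statistics.variance(data)   # squared deviations / (count - 1)
std_dev = statistics.stdev(data)       # square root of the variance

print(five_number_summary(data))
print(mean, variance, std_dev, data_range, iqr)
```

`statistics.variance` and `statistics.stdev` divide by the count minus one, which matches the definition of variance given above.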
shifting: adding a constant to each data value adds the same constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR
rescaling: multiplying each data value by a constant multiplies both the measures of position and the measures of spread by that constant
standardizing: done to eliminate units; values can be compared and combined even if the original variables had different units and magnitudes
standardized value: value found by subtracting the mean and dividing by the standard deviation
normal model: useful family of models for unimodal, symmetric distributions
parameter: numerically valued attribute of a model
statistic: value calculated from data to summarize aspects of the data
z-score: tells how many standard deviations a value is from the mean; z-scores have a mean of zero and a standard deviation of one
standard normal model: a normal model with a mean of 0 and a standard deviation of 1
68-95-99.7 rule: in a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean
normal percentile: this corresponding to a z-score gives the percentage of values in a standard normal distribution found at that z-score or below
normal probability plot: a display to help assess whether a distribution of data is approximately normal; if it is nearly straight, the data satisfy the nearly normal condition
changing center and spread: changing the center and spread of a variable is equivalent to changing its units
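Standardizing and the 68-95-99.7 rule can be checked numerically. The sketch below uses made-up heights and the standard library's `statistics.NormalDist` for the standard normal model. The variable and function names are illustrative assumptions.

```python
import statistics
from statistics import NormalDist


def z_scores(values):
    """Subtract the mean, then divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


heights_cm = [150.0, 160.0, 165.0, 170.0, 180.0]  # hypothetical data
heights_in = [h / 2.54 for h in heights_cm]       # same data, rescaled units

z_cm = z_scores(heights_cm)
z_in = z_scores(heights_in)  # standardizing removes the units

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal model within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def normal_percentile(z):
    """Percentage of a standard normal model at or below the z-score."""
    return 100 * standard_normal.cdf(z)


print(z_cm)
print([round(share_within(k), 4) for k in (1, 2, 3)])
print(normal_percentile(0.0))
```

The z-scores come out the same in centimeters and in inches, which illustrates why standardized values can be compared across variables with different units.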
scatterplots: show the relationship between two quantitative variables measured on the same cases
direction: a positive ____ or association means that, in general, as one variable increases, so does the other; when increases in one variable generally correspond to decreases in the other, the association is negative
form: the ____ we care about most is straight
strength: a scatterplot shows an association that is this if there is little scatter around the underlying relationship
correlation: a numerical measure of the direction and strength of a linear association
outlier: a point that does not fit the overall pattern seen in the scatterplot
lurking variable: a variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two
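The card for correlation gives no formula. A standard one, stated here as background rather than taken from the cards, is r = (sum of zx * zy) / (n - 1), where zx and zy are the standardized values of x and y. A minimal sketch with made-up data:

```python
import statistics


def correlation(xs, ys):
    """Correlation as the sum of products of z-scores, divided by n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    products = (
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    )
    return sum(products) / (len(xs) - 1)


xs = [1, 2, 3, 4, 5]  # hypothetical x-values
ys = [2, 4, 5, 4, 5]  # hypothetical y-values

r = correlation(xs, ys)
r_rescaled = correlation([10 * x + 3 for x in xs], ys)  # change x's units
r_flipped = correlation(xs, [-y for y in ys])           # reverse direction

print(r, r_rescaled, r_flipped)
```

Because the formula is built from standardized values, shifting or rescaling either variable leaves r unchanged, while reversing the direction of one variable only flips its sign.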
model: an equation or formula that simplifies and represents reality
linear model: an equation of the form y-hat = b0 + b1x
residuals: the differences between data values and the corresponding values predicted by the regression model; ____ = observed value - predicted value
predicted value: found by substituting the x-value in the regression equation; they're the values on the fitted line
slope: gives a value in "y-units per x-unit"; changes of one unit in x are associated with changes of b1 units in predicted values of y
regression to the mean: each predicted y-hat tends to be fewer standard deviations from its mean than its corresponding x was from its mean
regression line: the linear equation y-hat = b0 + b1x that satisfies the least squares criterion
intercept: this, b0, gives a starting value in y-units; it's the y-hat-value when x is 0
least squares: this criterion specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals
r²: the square of the correlation between y and x; gives the fraction of the variability of y accounted for by the least squares linear regression on x; an overall measure of how successful the regression is in linearly relating y to x
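The regression terms above (slope, intercept, predicted value, residuals, r²) fit together in a few lines of arithmetic. The sketch below uses the standard closed-form least squares solution, which the cards do not spell out, on made-up data. The function name `fit_least_squares` is hypothetical.

```python
import statistics


def fit_least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1*x."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx              # slope, in y-units per x-unit
    b0 = mean_y - b1 * mean_x   # intercept: y-hat when x is 0
    return b0, b1


xs = [1, 2, 3, 4, 5]  # hypothetical x-values
ys = [2, 4, 5, 4, 5]  # hypothetical y-values

b0, b1 = fit_least_squares(xs, ys)
predicted = [b0 + b1 * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]  # observed - predicted

mean_y = statistics.mean(ys)
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_resid = sum(e ** 2 for e in residuals)
r_squared = 1 - ss_resid / ss_total  # fraction of variability accounted for

print(b0, b1)
print(residuals)
print(r_squared)
```

For these values the fitted line is roughly y-hat = 2.2 + 0.6x, the residuals sum to about zero, and r² is about 0.6.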
subset: if data consist of two or more groups that have been thrown together, it is usually better to fit a separate linear model to each group than to try to fit a single model to all of the data
extrapolation: although linear models provide an easy way to predict values of y for a given value of x, it is unsafe to predict for values of x far from the ones used to find the linear model equation; such predictions should not be trusted
outlier: any data point that stands away from the others; can be extraordinary by having a large residual or by having high leverage
leverage: data points whose x-values are far from the mean of x are said to exert ____ on a linear model; with high enough ____, residuals can appear to be deceptively small
influential point: when omitting a point from the data results in a very different regression model, the point is an ____
lurking variable: a variable that is not explicitly part of a model but affects the way the variables in the model appear to be related
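One way to see leverage and influence is to fit the line with and without a suspect point and compare the slopes. This is a rough sketch on made-up data. The point (20, 2) is a hypothetical case whose x-value lies far from the mean of x.

```python
def slope(points):
    """Least squares slope for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


base_points = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]  # hypothetical data
suspect = (20, 2)                                       # far from the mean of x

slope_without = slope(base_points)
slope_with = slope(base_points + [suspect])

print(slope_without)  # positive slope
print(slope_with)     # the single added point drags the slope negative
```

Here one high-leverage point is enough to reverse the sign of the slope, so it would be called influential for this data set.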
re-express data: we do this by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values in the data set
ladder of powers: places in order the effects that many re-expressions have on the data
random: an event is this if we know what outcomes could happen, but not which particular values will happen
random numbers: these are hard to generate, but several websites offer an unlimited supply of equally likely random values
simulation: models random events by using random numbers to specify event outcomes with relative frequencies that correspond to the true real-world relative frequencies we are trying to model
simulation component: the most basic situation in a simulation in which something happens at random
outcome: an individual result of a component of a simulation
trial: the sequence of several components representing events that we are pretending will take place
response variable: values of this record the results of each trial with respect to what we were interested in
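The simulation vocabulary above maps directly onto code. The sketch below assumes a hypothetical scenario: each cereal box contains one of three equally likely pictures, and we want to know how many boxes it takes to collect all three. The component is opening one box, the outcome is the picture found, a trial is opening boxes until the set is complete, and the response variable is the number of boxes opened.

```python
import random

PICTURES = ["A", "B", "C"]  # hypothetical, assumed equally likely


def run_trial(rng):
    """One trial: open boxes until every picture has been seen."""
    seen = set()
    boxes_opened = 0
    while len(seen) < len(PICTURES):
        outcome = rng.choice(PICTURES)  # one component, one outcome
        seen.add(outcome)
        boxes_opened += 1
    return boxes_opened  # response variable for this trial


def simulate(num_trials, seed=0):
    """Run many trials with a seeded generator so results are repeatable."""
    rng = random.Random(seed)
    return [run_trial(rng) for _ in range(num_trials)]


results = simulate(10_000)
average_boxes = sum(results) / len(results)
print(average_boxes)
```

Python's `random` module produces pseudorandom numbers, which is usually adequate for classroom simulations. The fixed seed is only there to make the run reproducible.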
population: the entire group of individuals or instances about whom we hope to learn
sample: a representative subset of a population, examined in hope of learning about the population
sample survey: a study that asks questions of a sample drawn from some population in the hope of learning something about the entire population
bias: any systematic failure of a sampling method to represent its population; common errors are voluntary response, undercoverage, nonresponse ____, and response ____
randomization: the best defense against bias, in which each individual is given a fair, random chance of selection
matching: any attempt to force a sample to resemble specified attributes of the population
sample size: the number of individuals in a sample
census: a sample that consists of the entire population
population parameter: a numerically valued attribute of a model for a population
representative: a sample is this if the statistics computed from it accurately reflect the corresponding population parameters
simple random sample: this of sample size n is one in which each set of n elements in the population has an equal chance of selection
sampling frame: a list of individuals from whom the sample is drawn
sampling variability: the natural tendency of randomly drawn samples to differ
stratified random sample: a sampling design in which the population is divided into several subpopulations, and random samples are then drawn from each stratum
cluster sample: a sampling design in which entire groups are chosen at random
multistage sample: sampling schemes that combine several sampling methods
systematic sample: a sample drawn by selecting individuals systematically from a sampling frame
voluntary response bias: bias introduced to a sample when individuals can choose on their own whether to participate in the sample
convenience sample: consists of the individuals who are conveniently available
undercoverage: a sampling scheme that biases the sample in a way that gives a part of the population less representation than it has in the population
nonresponse bias: bias introduced to a sample when a large fraction of those sampled fails to respond
response bias: anything in a survey design that influences response
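The sampling designs defined above (simple random, stratified, cluster, systematic) differ mainly in where the randomness is applied. The sketch below shows each one using the standard `random` module. The sampling frame, the strata, and the clusters are all hypothetical.

```python
import random


def simple_random_sample(frame, n, rng):
    """Every set of n individuals has the same chance of selection."""
    return rng.sample(frame, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate random sample from each stratum."""
    return {
        name: rng.sample(members, n_per_stratum)
        for name, members in strata.items()
    }


def cluster_sample(clusters, num_clusters, rng):
    """Choose whole groups at random and keep everyone in them."""
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [person for name in chosen for person in clusters[name]]


def systematic_sample(frame, step, rng):
    """Random starting point, then every step-th individual in the frame."""
    start = rng.randrange(step)
    return frame[start::step]


rng = random.Random(42)  # seeded only so the example is repeatable

# Hypothetical sampling frame of 100 students.
frame = [f"student{i:03d}" for i in range(100)]
strata = {"lower_grades": frame[:50], "upper_grades": frame[50:]}
clusters = {f"homeroom{k}": frame[10 * k:10 * (k + 1)] for k in range(10)}

srs = simple_random_sample(frame, 10, rng)
stratified = stratified_sample(strata, 5, rng)
clustered = cluster_sample(clusters, 2, rng)
systematic = systematic_sample(frame, 10, rng)

print(srs)
print(stratified)
print(len(clustered), systematic)
```

A multistage sample could be sketched by chaining these helpers, for example choosing clusters first and then drawing a simple random sample within each chosen cluster.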
observational study: a study based on data in which no manipulation of factors has been employed
retrospective study: an observational study in which subjects are selected and then their previous conditions or behaviors are determined
prospective study: an observational study in which subjects are followed to observe future outcomes
experiment: manipulates factor levels to create treatments, randomly assigns subjects to these treatment levels, and then compares the responses of the subject groups across treatment levels
random assignment: to be valid, an experiment must assign experimental units to treatment groups at random
factor: a variable whose levels are controlled by the experimenter
response: a variable whose values are compared across different treatments
experimental units: individuals on whom an experiment is performed
level: the specific values that the experimenter chooses for a factor
treatment: the process, intervention, or other controlled circumstance applied to randomly assigned experimental units
principles of experimental design: control, randomize, replicate, block
statistically significant: when an observed difference is too large for us to believe that it is likely to have occurred naturally
control group: the experimental units assigned to a baseline treatment level, typically either the default treatment, which is well understood, or a null, placebo treatment
blinding: an individual associated with an experiment is blinded if they are not aware of how subjects have been allocated to treatment groups
single-blind: when either those who could influence the results or those who evaluate the results (but not both) are blinded
double-blind: when both those who could influence the results and those who evaluate the results are blinded
placebo: a treatment known to have no effect, administered so that all groups experience the same conditions
placebo effect: the tendency of many human subjects (often 20% or more of experiment subjects) to show a response even when administered a placebo
block: when groups of experimental units are similar, it is a good idea to gather them together into these
matched: in a retrospective or prospective study, subjects who are similar in ways not under study may be ____ and then compared with each other on the variables of interest
randomized block design: randomization occurring within blocks
completely randomized design: all experimental units have an equal chance of receiving any treatment
confounded: when the levels of one factor are associated with the levels of another factor so their effects cannot be separated
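The difference between a completely randomized design and a randomized block design comes down to where the shuffling happens. In this minimal sketch the subjects, the blocks, and the treatment names are hypothetical, and dealing shuffled units out in turn is just one simple way to get equal-sized groups.

```python
import random


def completely_randomized(units, treatments, rng):
    """Shuffle all units, then deal them out to the treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


def randomized_block(blocks, treatments, rng):
    """Randomize separately within each block of similar units."""
    groups = {t: [] for t in treatments}
    for members in blocks.values():
        within_block = completely_randomized(members, treatments, rng)
        for t in treatments:
            groups[t].extend(within_block[t])
    return groups


rng = random.Random(7)  # seeded only so the example is repeatable

# Hypothetical experimental units, blocked by a trait thought to matter.
blocks = {
    "younger": [f"y{i}" for i in range(6)],
    "older": [f"o{i}" for i in range(6)],
}
all_units = blocks["younger"] + blocks["older"]
treatments = ["drug", "placebo"]

crd_groups = completely_randomized(all_units, treatments, rng)
rbd_groups = randomized_block(blocks, treatments, rng)

print(crd_groups)
print(rbd_groups)
```

In the blocked version each treatment group is guaranteed three younger and three older units, whereas the completely randomized version leaves that balance to chance.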
