AP Stat Midterm Review Flashcards

614947945	the five-number summary	consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest. In symbols, the five-number summary is Minimum Q1 M Q3 Maximum	0
614947946	Minimum and Q1	about 25% of the data fall between these two values	1
614947947	Q1 and Median	about 25% of the data fall between these two values	2
614947948	Median and Q3	about 25% of the data fall between these two values	3
614947949	Q3 and maximum	about 25% of the data fall between these two values	4
614947950	Normal distribution	described by a normal density curve. any particular normal distribution is completely specified by two numbers: its mean µ and standard deviation σ. the mean of a normal distribution is at the center of the symmetric normal curve. the standard deviation is the distance from the center to the change-of-curvature points on either side. we abbreviate the normal distribution with mean µ and standard deviation σ as N(µ,σ).	5
614947951	convenience sample	choosing individuals who are easiest to reach	6
614947952	voluntary response sample	consists of people who choose themselves by responding to a general appeal. Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond.	7
614947953	nonresponse	occurs when an individual chosen for the sample can't be contacted or refuses to participate	8
614947954	response bias	another type of nonsampling error occurs when someone gives an incorrect response. a systematic pattern of incorrect responses in a sample survey leads to this	9
614947955	z-score formula	z = (x − µ)/σ. if x has the N(µ,σ) distribution, the standardized variable z has the standard normal distribution with mean 0 and standard deviation 1.	10
614947956	positive association	when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.	11
614947957	negative association	when above-average values of one variable tend to accompany below-average values of the other, and vice versa.	12
614947958	lurking variable	a variable that is not among the explanatory or response variables in a study but that may influence the response variable.	13
614947959	outlier	an important kind of departure: an individual value that falls outside the overall pattern of the other observations. points that are outliers in the y-direction but not the x-direction of a scatterplot have large residuals. other outliers may not have large residuals.	14
614947960	experiment	deliberately imposes some treatment on individuals to measure their responses.	15
614947961	simple random sample (SRS)	of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected.	16
614947962	stratified random sample	to select, first classify the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample.	17
614947963	completely randomized design	the treatments are assigned to all the experimental units completely by chance.	18
614947964	matched pairs design	a common form of blocking for comparing just two treatments. In some matched pairs designs, each subject receives both treatments in a random order. In others, the subjects are matched in pairs as closely as possible, and each subject in a pair receives one of the treatments.	19
614947965	double-blind design	what many behavioral and medical experiments are. that is, neither the subjects nor those interacting with them and measuring their responses know who is receiving which treatment.	20
614947966	table of random digits	a long string of the digits 0,1,2,3,4,5,6,7,8,9 with these properties: each entry in the table is equally likely to be any of the 10 digits 0 through 9. the entries are independent of each other. that is, knowledge of one part of the table gives no information about any other part.	21
614947967	placebo	a fake treatment that some experiments give to a control group.	22
614947968	treatment	a specific condition applied to the individuals in an experiment. if an experiment has several explanatory variables, a treatment is a combination of specific values of these variables.	23
615086015	correlation	r measures the strength and direction of the linear association between two quantitative variables x and y. Although you can calculate a correlation for any scatterplot, r measures only straight-line relationships. it indicates the direction of a linear relationship by its sign: r > 0 for a positive association and r < 0 for a negative association. Correlation always satisfies -1 ≤ r ≤ 1 and indicates the strength of a relationship by how close it is to -1 or 1. perfect correlation, r = ±1, occurs only when the points on a scatterplot lie exactly on a straight line. correlation ignores the distinction between explanatory and response variables. the value of r is not affected by changes in the unit of measurement of either variable. correlation is not resistant, so outliers can greatly change the value of r.	24
615086016	regression line	a line that describes how a response variable y changes as an explanatory variable x changes. we often use a regression line to predict the value of y for a given value of x. ŷ = a + bx	25
615086017	ŷ ("y hat")	the predicted value of the response variable y for a given value of the explanatory variable x	26
615086018	b	the slope, the amount by which y is predicted to change when x increases by one unit.	27
615086019	a	the y-intercept, the predicted value of y when x=0.	28
615086020	residual	the difference between an observed value of the response variable and the value predicted by the regression line. that is, residual = observed y − predicted y = y − ŷ	29
615086021	least-squares regression line	of y on x is the line that makes the sum of the squared residuals as small as possible. the most common method of fitting a line to a scatterplot. the straight line ŷ = a + bx that minimizes the sum of the squares of the vertical distances of the observed points from the line.	30
615086022	equation of the least-squares regression line	we have data on an explanatory variable x and a response variable y for n individuals. from the data, calculate the means x̄ and ȳ and the standard deviations sx and sy of the two variables and their correlation r. the least-squares regression line is the line ŷ = a + bx with slope b = r(sy/sx) and y-intercept a = ȳ − b·x̄. this line always passes through the point (x̄, ȳ).	31
615086023	categorical variable	places an individual into one of several groups or categories.	32
615086024	quantitative variable	takes numerical values for which it makes sense to find an average.	33
615086025	placebo effect	when some patients get better because they expect the treatment to work even though they have received an inactive treatment.	34
615086026	control group	its primary purpose is to provide a baseline for comparing the effects of the other treatments	35
615086027	random assignment	uses chance to assign subjects to the treatments. creates treatment groups that are similar (except for chance variation) before the treatments are applied.	36
615086028	subjects	when the experimental units are human beings, this is what they are often called	37
615086029	probability model	a description of some chance process that consists of two parts: a sample space S and a probability for each outcome	38
615086030	independent	when the chance that event B occurs is not affected by whether event A occurs, we say that events A and B are this. For independent events A and B, P(B|A) = P(B) and P(A|B) = P(A). If two events A and B are mutually exclusive (disjoint) and each has nonzero probability, they cannot be independent.	39
615086031	the general addition rule	can be used to find P(A or B): P(A ∪ B)=P(A)+P(B)-P(A ∩ B)	40
615086032	probability	of any outcome of a chance process is a number between 0 and 1 that describes the proportion of times the outcome would occur in a very long series of repetitions.	41
615086033	mutually exclusive (disjoint)	when two events have no outcomes in common and so can never occur together.	42
615086034	dotplot	used to show the distribution of a quantitative variable, displays individual values on a number line.	43
615086035	the 1.5 × IQR rule for outliers	call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile; that is, if it is less than Q1 − 1.5 × IQR or greater than Q3 + 1.5 × IQR.	44
615086036	calculating quartiles and IQR	to calculate the quartiles: 1. arrange the observations in increasing order and locate the median M in the ordered list of observations. 2. the first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the median. 3. the third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the median. The IQR is defined as IQR = Q3 − Q1	45
615086037	boxplots	based on the five-number summary are useful for comparing distributions. the box spans the quartiles and shows the spread of the central half of the distribution. the median is marked within the box. lines extend from the box to the smallest and the largest observations that are not outliers. outliers are plotted as isolated points.	46
615086038	SOCS	when examining any graph, look for an overall pattern and for notable departures from that pattern. shape, center, and spread describe the overall pattern of the distribution of a quantitative variable. outliers are observations that lie outside the overall pattern of a distribution. always look for outliers and try to explain them. don't forget your....	47
615086039	two way table/venn diagram	can be used to display the sample space for a chance process. two-way tables and venn diagrams can also be used to find probabilities involving events A and B, like the union (A ∪ B) and intersection (A ∩ B). The event A ∪ B ("A or B") consists of all outcomes in event A, event B, or both. The event A ∩ B ("A and B") consists of outcomes in both A and B.	48
615086040	complement rule	P(A^C) = 1 − P(A), where A^C is the complement of event A; that is, the event that A does not happen.	49
615109803	randomized block design	the random assignment of experimental units to treatments is carried out separately within each block	50
615109804	block	a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments	51
615109805	undercoverage	occurs when some groups in the population are left out of the process of choosing the sample	52
615109806	cluster sample	first divide the population into smaller groups, called clusters. ideally, these clusters should mirror the characteristics of the population. then choose an SRS of the clusters. all individuals in the chosen clusters are included in the sample.	53
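
Worked sketch for the computational cards (five-number summary, calculating quartiles and IQR, the 1.5 × IQR rule, the z-score formula, and the equation of the least-squares regression line). This is only an illustration under stated assumptions: the function names and the sample numbers are made up for this example and are not part of the card set, and the quartiles follow the median-of-halves rule from the cards, which can differ from what a calculator or statistics library reports.

```python
from statistics import mean, median, stdev


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # observations to the left of the median
    upper = xs[half + n % 2:]    # observations to the right of the median
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def outliers_by_iqr(data):
    """Flag values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low or x > high]


def z_score(x, mu, sigma):
    """Standardize x: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def least_squares_line(xs, ys):
    """Return (a, b, r) for the line y-hat = a + b*x, with b = r*(sy/sx), a = y-bar - b*x-bar."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    n = len(xs)
    # r is the average product of standardized values, dividing by n - 1
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x
    a = y_bar - b * x_bar
    return a, b, r


if __name__ == "__main__":
    scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # made-up data
    print(five_number_summary(scores))          # (1, 3, 5.5, 8, 30)
    print(outliers_by_iqr(scores))              # [30]
    print(z_score(130, 100, 15))                # 2.0

    # Made-up points lying exactly on y = 1 + 2x, so r should be very close to 1
    a, b, r = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
    residuals = [y - (a + b * x) for x, y in zip([1, 2, 3, 4], [3, 5, 7, 9])]
    print(a, b, r, residuals)
```

In the first example, IQR = 8 − 3 = 5, so the outlier fences are 3 − 7.5 = −4.5 and 8 + 7.5 = 15.5, which is why only 30 is flagged.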