AP Stat Midterm Review Flashcards
614947945 | the five-number summary | consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest. In symbols, the five-number summary is Minimum Q1 M Q3 Maximum | 0 | |
614947946 | Minimum and Q1 | about 25% of the data fall between here | 1 | |
614947947 | Q1 and Median | about 25% of the data fall between here | 2 | |
614947948 | Median and Q3 | about 25% of the data fall between here | 3 | |
614947949 | Q3 and maximum | about 25% of the data fall between here | 4 | |
614947950 | Normal distribution | described by a normal density curve. any particular normal distribution is completely specified by two numbers: its mean µ and standard deviation σ. the mean of a normal distribution is at the center of the symmetric normal curve. the standard deviation is the distance from the center to the change-of-curvature points on either side. we abbreviate the normal distribution with mean µ and standard deviation σ as N(µ,σ). | 5 | |
614947951 | convenience sample | choosing individuals who are easiest to reach | 6 | |
614947952 | voluntary response sample | consists of people who choose themselves by responding to a general appeal. Voluntary response samples show bias because people with strong opinions (often in the same direction) are most likely to respond. | 7 | |
614947953 | nonresponse | occurs when an individual chosen for the sample can't be contacted or refuses to participate | 8 | |
614947954 | response bias | another type of nonsampling error occurs when someone gives an incorrect response. a systematic pattern of incorrect responses in a sample survey leads to this | 9 | |
614947955 | z-formula | z = (x - µ)/σ. if x has the normal distribution N(µ,σ), the standardized variable z has the standard normal distribution with mean 0 and standard deviation 1. | 10 | |
614947956 | positive association | when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together. | 11 | |
614947957 | negative association | when above-average values of one variable tend to accompany below-average values of the other, and vice versa. | 12 | |
614947958 | lurking variable | a variable that is not among the explanatory or response variables in a study but that may influence the response variable. | 13 | |
614947959 | outlier | an important kind of departure from the overall pattern: an observation that lies outside the overall pattern of the other observations. points that are outliers in the y-direction but not the x-direction of a scatterplot have large residuals. other outliers may not have large residuals. | 14 | |
614947960 | experiment | deliberately imposes some treatment on individuals to measure their responses. | 15 | |
614947961 | simple random sample (SRS) | of size n consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected. | 16 | |
614947962 | stratified random sample | to select, first classify the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the full sample. | 17 | |
614947963 | completely randomized design | the treatments are assigned to all the experimental units completely by chance. | 18 | |
614947964 | matched pairs design | a common form of blocking for comparing just two treatments. In some matched pairs designs, each subject receives both treatments in a random order. In others, the subjects are matched in pairs as closely as possible, and each subject in a pair receives one of the treatments. | 19 | |
614947965 | double-blind design | what many behavioral and medical experiments are. that is, neither the subjects nor those interacting with them and measuring their responses know who is receiving which treatment. | 20 | |
614947966 | table of random digits | a long string of the digits 0,1,2,3,4,5,6,7,8,9 with these properties: each entry in the table is equally likely to be any of the 10 digits 0 through 9. the entries are independent of each other. that is, knowledge of one part of the table gives no information about any other part. | 21 | |
614947967 | placebo | a fake treatment that some experiments give to a control group. | 22 | |
614947968 | treatment | a specific condition applied to the individuals in an experiment. if an experiment has several explanatory variables, a treatment is a combination of specific values of these variables. | 23 | |
615086015 | correlation | r measures the strength and direction of the linear association between two quantitative variables x and y. Although you can calculate a correlation for any scatterplot, r measures only straight-line relationships. it indicates the direction of a linear relationship by its sign: r > 0 for a positive association and r < 0 for a negative association. Correlation always satisfies -1 ≤ r ≤ 1 and indicates the strength of a relationship by how close it is to -1 or 1. perfect correlation, r = ±1, occurs only when the points on a scatterplot lie exactly on a straight line. correlation ignores the distinction between explanatory and response variables. the value of r is not affected by changes in the unit of measurement of either variable. correlation is not resistant, so outliers can greatly change the value of r. | 24 | |
615086016 | regression line | a line that describes how a response variable y changes as an explanatory variable x changes. we often use a regression line to predict the value of y for a given value of x. ŷ = a + bx | 25 | |
615086017 | ŷ ("y hat") | the predicted value of the response variable y for a given value of the explanatory variable x | 26 | |
615086018 | b | the slope, the amount by which y is predicted to change when x increases by one unit. | 27 | |
615086019 | a | the y-intercept, the predicted value of y when x=0. | 28 | |
615086020 | residual | the difference between an observed value of the response variable and the value predicted by the regression line. that is, residual = observed y - predicted y = y - ŷ | 29 | |
615086021 | least-squares regression line | of y on x is the line that makes the sum of the squared residuals as small as possible. the most common method of fitting a line to a scatterplot. the straight line ŷ = a + bx that minimizes the sum of the squares of the vertical distances of the observed points from the line. | 30 | |
615086022 | equation of the least-squares regression line | we have data on an explanatory variable x and a response variable y for n individuals. from the data, calculate the means x̄ and ȳ and the standard deviations sx and sy of the two variables and their correlation r. the least-squares regression line is the line ŷ = a + bx with slope b = r(sy/sx) and y-intercept a = ȳ - b x̄. this line always passes through the point (x̄, ȳ). | 31 | |
615086023 | categorical variable | places an individual into one of several groups or categories. | 32 | |
615086024 | quantitative variable | takes numerical values for which it makes sense to find an average. | 33 | |
615086025 | placebo effect | when some patients get better because they expect the treatment to work even though they have received an inactive treatment. | 34 | |
615086026 | control group | its primary purpose is to provide a baseline for comparing the effects of the other treatments | 35 | |
615086027 | random assignment | uses chance to assign subjects to the treatments. creates treatment groups that are similar (except for chance variation) before the treatments are applied. | 36 | |
615086028 | subjects | when the experimental units are human beings, this is what they are often called | 37 | |
615086029 | probability model | a description of some chance process that consists of two parts: a sample space S and a probability for each outcome | 38 | |
615086030 | independent | when the chance that event B occurs is not affected by whether event A occurs, we say that events A and B are this. For independent events A and B, P(B|A)=P(B) and P(A|B)=P(A). If two events A and B with nonzero probabilities are mutually exclusive (disjoint), they cannot be independent, because knowing that one occurred means the other did not. | 39 | |
615086031 | the general addition rule | can be used to find P(A or B): P(A ∪ B)=P(A)+P(B)-P(A ∩ B) | 40 | |
615086032 | probability | of any outcome of a chance process is a number between 0 and 1 that describes the proportion of times the outcome would occur in a very long series of repetitions. | 41 | |
615086033 | mutually exclusive (disjoint) | when two events have no outcomes in common and so can never occur together. | 42 | |
615086034 | dotplot | used to show the distribution of a quantitative variable, displays individual values on a number line. | 43 | |
615086035 | the 1.5 × IQR rule for outliers | call an observation an outlier if it falls more than 1.5 × IQR above the third quartile or below the first quartile, that is, below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR | 44 | |
615086036 | calculating quartiles and IQR | to calculate the quartiles: 1. arrange the observations in increasing order and locate the median M in the ordered list of observations. 2. the first quartile Q1 is the median of the observations whose position in the ordered list is to the left of the median. 3. The third quartile Q3 is the median of the observations whose position in the ordered list is to the right of the median. The IQR is defined as IQR = Q3 - Q1 | 45 | |
615086037 | boxplots | based on the five-number summary are useful for comparing distributions. the box spans the quartiles and shows the spread of the central half of the distribution. the median is marked within the box. lines extend from the box to the smallest and the largest observations that are not outliers. outliers are plotted as isolated points. | 46 | |
615086038 | SOCS | when examining any graph, look for an overall pattern and for notable departures from that pattern. shape, center, and spread describe the overall pattern of the distribution of a quantitative variable. outliers are observations that lie outside the overall pattern of a distribution. always look for outliers and try to explain them. don't forget your.... | 47 | |
615086039 | two-way table/venn diagram | can be used to display the sample space for a chance process. two-way tables and venn diagrams can also be used to find probabilities involving events A and B, like the union (A ∪ B) and intersection (A ∩ B). The event A ∪ B ("A or B") consists of all outcomes in event A, event B, or both. The event A ∩ B ("A and B") consists of outcomes in both A and B. | 48 | |
615086040 | complement rule | P(Aᶜ) = 1 - P(A), where Aᶜ is the complement of event A; that is, the event that A does not happen. | 49 | |
615109803 | randomized block design | the random assignment of experimental units to treatments is carried out separately within each block | 50 | |
615109804 | block | a group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments | 51 | |
615109805 | undercoverage | occurs when some groups in the population are left out of the process of choosing the sample | 52 | |
615109806 | cluster sample | first divide the population into smaller groups. ideally, these clusters should mirror the characteristics of the population. then choose an SRS of the clusters. all individuals in the chosen clusters are included in the sample. | 53 | |
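
Worked example for the cards "the five-number summary", "calculating quartiles and IQR", and "the 1.5 × IQR rule for outliers": a minimal Python sketch, assuming the quartile method described on the card (Q1 and Q3 are the medians of the observations to the left and right of the median, so the median itself is left out when n is odd). The function names and the sample data are made up for illustration, and calculators or software may use slightly different quartile conventions.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for at least two observations."""
    xs = sorted(data)                 # step 1: arrange in increasing order
    n = len(xs)
    half = n // 2
    lower = xs[:half]                 # observations to the left of the median
    upper = xs[half + (n % 2):]       # observations to the right of the median
    return xs[0], median(lower), median(xs), median(upper), xs[-1]

def iqr_outliers(data):
    """Return the observations flagged by the 1.5 x IQR rule."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr        # anything below this is an outlier
    high_fence = q3 + 1.5 * iqr       # anything above this is an outlier
    return [x for x in data if x < low_fence or x > high_fence]

sample = [1, 3, 4, 5, 6, 7, 8, 9, 30]     # hypothetical data
summary = five_number_summary(sample)
outliers = iqr_outliers(sample)
print(summary)    # (1, 3.5, 6, 8.5, 30)
print(outliers)   # [30]
```

Here IQR = 8.5 - 3.5 = 5, so the fences sit at 3.5 - 7.5 = -4 and 8.5 + 7.5 = 16, and only the value 30 falls outside them.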
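
Worked example for the cards "z-formula", "correlation", "residual", and "equation of the least-squares regression line": a short Python sketch under stated assumptions. The slope and intercept formulas (b = r·sy/sx, a = ȳ - b·x̄) come from the card; the way r is computed here (the sum of the products of standardized x and y values, divided by n - 1) is the usual textbook formula and is not stated on the cards. The function names and data are hypothetical.

```python
from statistics import mean, stdev

def z_score(value, center, spread):
    """Standardize a value: (value - center) / spread."""
    return (value - center) / spread

def correlation(xs, ys):
    """r = sum of products of standardized values, divided by n - 1."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)       # sample standard deviations
    products = (z_score(x, x_bar, s_x) * z_score(y, y_bar, s_y)
                for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x using b = r*(s_y/s_x), a = y_bar - b*x_bar."""
    b = correlation(xs, ys) * stdev(ys) / stdev(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

xs = [1, 2, 3, 4, 5]          # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]          # hypothetical response values
a, b = least_squares_line(xs, ys)

# residual = observed y - predicted y
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(round(a, 3), round(b, 3))
```

For this data the slope is 0.6 and the intercept is 2.2 (up to floating-point rounding). The line passes through (x̄, ȳ) = (3, 4), and the residuals sum to zero, which is a quick check on any least-squares fit.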
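
Worked example for the cards "the general addition rule", "complement rule", "independent", and "two-way table/venn diagram": a small Python sketch that reads probabilities off a two-way table of counts. The counts are invented for illustration, and the conditional probability is computed as P(B|A) = P(A ∩ B)/P(A), a standard definition that the cards use but do not spell out.

```python
from fractions import Fraction

# Hypothetical two-way table of counts: (row event, column event) -> count
counts = {
    ("A", "B"): 12,
    ("A", "not B"): 18,
    ("not A", "B"): 28,
    ("not A", "not B"): 42,
}
total = sum(counts.values())

p_a_and_b = Fraction(counts[("A", "B")], total)
p_a = Fraction(counts[("A", "B")] + counts[("A", "not B")], total)
p_b = Fraction(counts[("A", "B")] + counts[("not A", "B")], total)

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Complement rule: P(not A) = 1 - P(A)
p_not_a = 1 - p_a

# Independence check: A and B are independent when P(B | A) = P(B)
p_b_given_a = p_a_and_b / p_a
independent = p_b_given_a == p_b

print(p_a_or_b, p_not_a, p_b_given_a, independent)
```

With these counts, P(A or B) = 58/100, which matches adding the three cells that lie in A, in B, or in both (12 + 18 + 28). P(B|A) = 12/30 = 2/5 equals P(B) = 40/100, so A and B are independent in this made-up table. Exact fractions are used so the equality test is not affected by floating-point rounding.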