AP Statistics Vocabulary Flashcards

4443406119	Symmetric	a distribution in which both sides are roughly the same shape and size (mirror images about the center). Example: the "Bell Curve"	0
4443408477	Parameter	a numerical value that describes a population (typically unknown)	1
4443410117	Statistic	a value calculated from a sample, used to estimate a population parameter.	2
4443412238	Median	the middle point of the data (50th percentile) when the data is in numerical order.	3
4443412239	Variability	the spread of the data. Understanding it allows statisticians to distinguish between usual and unusual occurrences.	4
4443416170	Standard Deviation	measures the typical or average deviation of observations from the mean	5
4443417891	Skewed Right	the distribution has a long tail to the right; the mean is typically larger than the median.	6
4443420801	Z-score/T-score	is a standardized score. This tells you how many standard deviations from the mean an observation is.	7
4443426482	Normal Model	is a bell shaped and symmetrical curve. As σ increases the curve flattens and widens. As σ decreases the curve becomes taller and narrower.	8
4443433847	Mutually Exclusive	A and B have no intersection. They cannot happen at the same time.	9
4443435622	Independent	two events are independent if knowing that one occurred does not change the probability of the other.	10
4443437739	Law of Large Numbers	as an experiment is repeated the experimental probability gets closer and closer to the true (theoretical) probability.	11
4443440085	Correlation Coefficient (r)	is a quantitative assessment of the strength and direction of a linear relationship.	12
4443442655	Least Squares Regression Line (LSRL)	is a line of mathematical best fit. Minimizes the sum of the squared deviations (residuals) from the line. Used with bivariate data.	13
4443454328	Residual (error)	is the vertical distance of a point from the LSRL: the observed value minus the predicted value. The residuals from the LSRL add to zero.	14
4443463188	Coefficient of Determination (r-squared)	gives the proportion of variation in y (response) that is explained by the linear relationship between x and y.	15
4443465418	Extrapolation	using the LSRL to predict values outside of the range of the original data. Such predictions are unreliable, so the LSRL should not be used this way.	16
4443465419	Influential Points	are points that if removed significantly change the LSRL.	17
4443472341	Census	a complete count of the population. Disadvantages: expensive, time-consuming, often impossible to carry out, and not necessarily more accurate than a well-designed sample	18
4443477612	Simple Random Sample	a sample chosen so that each unit has an equal chance of being selected and every set of units of the same size has an equal chance of being selected.	19
4443479151	Stratified Sampling	divide the population into homogeneous groups then SRS from every group. [Observational studies]	20
4443489828	Cluster Sampling	Divide the population into heterogeneous groups (often based on location), take an SRS of the groups, and sample ALL members/things in each selected group.	21
4443491632	Bias	systematically favors a certain outcome. Has to do with the center of the sampling distribution: if it is centered over the true parameter, the statistic is considered unbiased	22
4443491633	Voluntary Response Bias	people choose themselves to participate.	23
4443492993	Convenience Sampling	asking people who are easy to reach, friendly, or whom one is comfortable asking.	24
4443494569	Undercoverage	some group(s) are left out of the selection process.	25
4443494570	Nonresponse Bias	someone selected for the sample cannot be contacted or does not want to participate.	26
4443498262	Control Group	a baseline group against which the treatment groups are compared to judge effectiveness - does NOT have to be placebo	27
4443500393	Single Blind	a method used so that the subjects are unaware of the treatment (who gets a placebo or the real treatment).	28
4443500394	Double Blind	neither the subjects nor the evaluators know which treatment is being given.	29
4443505140	Replication	A MUST for EVERY experimental design. Uses many subjects to quantify the natural variation in the response.	30
4443507346	Completely Randomized Design	all units are randomly allocated among the treatments [Experiment]	31
4443513634	Randomized Block	units are separated into blocks based on a KNOWN factor. Treatments are then randomly assigned within each block - reduces variation	32
4443517589	Matched-Pair Design	Once one member of a pair receives a certain treatment, the other member automatically receives the second treatment. OR individuals do both treatments in random order (before/after or pretest/post-test). Assignment is dependent	33
4443523911	Confounding Variables	are variables whose effect on the response cannot be separated from the effects of the factor being tested - happens in observational studies - random assignment to treatments guards against this!	34
4443526121	Randomization	reduces bias by spreading extraneous variables to all groups in the experiment. MUST have in EVERY experiment	35
4443529865	Binomial Probability	Trials have two outcomes; Trials are independent; the probability (p) of success is the same for all trials; and most importantly, the number of trials is fixed!	36
4443532830	Geometric Probability	two mutually exclusive outcomes, each trial is independent, probability (p) of success is the same for all trials. (NOT a fixed number of trials)	37
4443538875	Sampling Distribution	is the distribution of all possible values of a statistic from all possible samples of the same size. When it is approximately normal, use normalcdf to calculate probabilities	38
4443541166	Standard Error (SE)	estimate of the standard deviation of the statistic	39
4443545143	Central Limit Theorem	when n is sufficiently large (n > 30) the SAMPLING distribution of the sample mean is approximately normal even if the population distribution is not normal.	40
4443548664	Confidence Interval	used to estimate the unknown population parameter by providing a range of plausible values for it	41
4443552647	Hypothesis Test	tells us whether an observed result could plausibly occur by random chance alone. If it is unlikely to occur by random chance then it is statistically significant.	42
4443553900	P-Value	assuming the null is true, the probability of obtaining the observed result or more extreme	43
4443558098	Level of Significance	is the amount of evidence necessary before rejecting the null hypothesis. [Alpha - Chances of Type I error occurring]	44
4443559692	Type I Error	is when one rejects H0 when H0 is actually true.	45
4443559693	Type II Error	is when you fail to reject H0, and H0 is actually false.	46
4443565578	Power (of the test)	is the probability that the test will reject the null hypothesis when the null hypothesis is false. [The chances you make the right decision when H0 is false!]	47
4443567107	Chi-Square	is used to test counts of categorical data.	48
4443570332	T-Test	is used when your test involves sample means/averages	49
4443570333	Z-Test	is used when your test involves proportions/percents. (e.g., 3 out of 100)	50
4443577286	Goodness of Fit	is for univariate categorical data from a single sample. Does the observed count "fit" what we expect? Must use lists to perform	51
4443580805	Confidence level	In repeated sampling, ______% of all the possible intervals that can be constructed by this method will capture the true parameter.	52
4443589059	Low P-Value	Conclusion "reject the null" and "there is enough evidence to support the HA"	53
4443592083	High P-Value	Conclusion "fail to reject the null" and "there is not enough evidence to support the HA"	54
4443792835	Lurking Variable	is a variable that is not included as an explanatory or response variable in the analysis but can affect the interpretation of relationships between variables. It can falsely identify a strong relationship between variables or it can hide the true relationship.	55
4445221831	Systematic Sampling	Use a random number generator to select the first person. Then select every "third" or "fourth" or "fifth", etc., person after that	56
4445226151	Simulation	is a way to model random events, such that simulated outcomes closely match real-world outcomes	57
4445230266	Placebo effect	A remarkable phenomenon in which a fake treatment can sometimes improve a patient's condition simply because the person has the expectation that it will be helpful	58
4445237746	Factors	are explanatory variables manipulated by the experimenter. Combinations of their levels determine the number of treatments	59
4445242002	Histogram	A graphical display that represents a frequency distribution by means of rectangles whose widths represent class intervals or "bins" and whose heights represent the frequencies	60
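
A few of the cards above (Z-score, LSRL, Residual, r-squared) can be checked numerically. The following is a minimal Python sketch, not part of the original flashcards. The data values and the helper names `z_score` and `lsrl` are made up purely for illustration, and the z-score uses the population standard deviation (an assumption; a course may use the sample standard deviation instead).

```python
from statistics import mean, pstdev

# Hypothetical bivariate data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def z_score(value, data):
    """How many standard deviations `value` lies from the mean of `data`.

    Assumption: uses the population standard deviation (pstdev).
    """
    return (value - mean(data)) / pstdev(data)


def lsrl(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = lsrl(x, y)

# Residual = observed y minus predicted y.
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]

# r-squared = proportion of the variation in y explained by the line.
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((b - mean(y)) ** 2 for b in y)
r_squared = 1 - ss_res / ss_tot

print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("sum of residuals:", round(sum(residuals), 9))
print("r-squared:", round(r_squared, 3))
print("z-score of x = 5:", round(z_score(5.0, x), 3))
```

The sum of the residuals comes out to zero up to floating-point rounding, which matches the Residual card; that property holds for any least squares line that includes an intercept.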