AP Statistics Flashcards

6804127035	5 number summary	The minimum value, lower quartile (Q1), median, upper quartile (Q3), and maximum value of a data set. Together these five values summarize the center, spread, and shape of the distribution and are used to make box plots.	0
6804127036	z score	The number of standard deviations a value lies above or below the mean (positive above, negative below).	1
6804127037	standard deviation	A measure of spread: roughly, the typical distance of the data points from the mean.	2
6804127038	population	The entire collection of individuals or items from which samples can be drawn; the group that the sample in an experiment or study is meant to represent.	3
6804127039	categorical data	Data that can be placed into categories: labels or names used to identify categories of like items. For example, "gender" is a categorical variable whose categories are "male" and "female". If you asked people in which month they were born or what their favorite class is, they would answer with names, which would be categorical data. However, if you asked them how many siblings they have, they would answer with numbers, not categories.	4
6804127040	quantitative data	Numerical information describing how much, how little, how big, how tall, how fast, etc. For example, age is quantitative.	5
6804127041	bar graph	A graph that uses horizontal or vertical bars whose lengths represent and compare data in categories.	6
6804127042	parameter	In statistics, a number that describes a characteristic of a population (such as the population mean). More generally, a characteristic or constant factor that shapes an outcome or determines the limits of certain data values.	7
6804127043	sample	A relatively small part of a population, chosen so as to be representative of the whole. Example: a survey in Star City used to represent the entire state of Arkansas.	8
6804127044	random	Assigning participants to experimental and control conditions by chance, thus minimizing preexisting differences between those assigned to the different groups. Example: pulling names or numbers out of a hat.	9
6804127045	bias	Any systematic failure of a sampling method to represent its population; anything that systematically distorts the accuracy of the sample.	10
6804127046	Undercoverage	A sampling scheme that gives some part of the population less representation in the sample than it has in the population; some groups are left out of the process of choosing the sample.	11
6804127047	nonresponse	Bias introduced to a sample when a large fraction of those sampled fail to respond.	12
6804127048	voluntary response bias	Bias introduced to a sample when individuals can choose on their own whether to participate in the sample.	13
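6804127049	statistic	A number computed from sample data that describes the sample (such as the sample mean). Not to be confused with statistics, the application of mathematics to describing and analyzing data.	14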
6804127050	independent	(statistics) a variable whose values are independent of changes in the values of other variables	15
6804127051	histogram	A graphical representation of the frequency distribution of a quantitative variable using vertical bars; the bars touch each other because the intervals are adjacent on a continuous scale.	16
6804127052	box plot	A display that shows the distribution of values in a data set separated into four groups that each contain about a quarter of the data. A box plot is constructed from the five number summary of the data.	17
6804127053	scatterplot	A graphed cluster of dots, each of which represents the values of two variables. The slope of the points suggests the direction of the relationship between the two variables. The amount of scatter suggests the strength of the correlation (little scatter indicates high correlation).	18
6804127054	correlation	A measure of the extent to which two factors vary together, and thus of how well either factor predicts the other. The correlation coefficient is the mathematical expression of the relationship, ranging from -1 to +1	19
6804127055	skewness	The extent to which cases are clustered more at one or the other end of the distribution of a quantitative variable rather than in a symmetric pattern around its center	20
6804127056	variance	A common measure of spread about the mean as center, based on the squared deviations from the mean; it is the square of the standard deviation.	21
6804127057	statistical significance	A statement of how unlikely it is that an obtained result occurred by chance alone; the condition that exists when the probability that the observed findings are due to chance is very low.	22
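6804127058	P-value	A measure of statistical significance: the probability of obtaining results at least as extreme as those observed if the null hypothesis were true. The lower the P-value, the less plausible it is that the results of an experiment occurred simply by chance.	23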
6804127059	empirical rule	The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve.	24
6804127060	lurking variable	A variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied	25
6804127061	null hypothesis	Hypothesis that predicts NO relationship between variables. The aim of research is to reject this hypothesis	26
6804127062	alternate hypothesis	The hypothesis (Ha) to be considered as an alternative to the null hypothesis. The null hypothesis will be rejected in favor of Ha only if the sample data strongly indicate that the null hypothesis is false.	27
6804127064	probability	A number with a value from 0 to 1 that describes the likelihood that an event will occur. For example, if a bag contains one red marble, one white marble, and one blue marble, then the probability of selecting a red marble is 1/3.	28
6804127065	descriptive statistics	Mathematical procedures for organizing collections of data, such as determining the mean, the median, the range, the variance, and the correlation coefficient	29
6804127066	mean	A measure of center in a set of numerical data, computed by adding the values in a list and then dividing by the number of values in the list.	30
6804127067	median	A measure of center in a set of numerical data. The median of a list of values is the value appearing at the center of a sorted version of the list - or the mean of the two central values if the list contains an even number of values.	31
6804127068	mode	The measure of central tendency given by the most frequently occurring score.	32
6804127069	range	Distance between highest and lowest scores in a set of data.	33
6804127071	Q1	A location measure such that one fourth (25%) of the data are smaller than it. Found by dividing the ordered data set in half (excluding the middle observation if n is odd) and finding the median of the lower half of the data.	34
6804127072	Q3	A location measure analogous to the median, except that 75% of the sorted data (rather than 50%) lie below it. Found as the median of the upper half of the ordered data.	35
6804127073	minimum	The smallest value in a set of data.	36
6804127074	outlier	A value much greater or much less than the others in a data set	37
6804127075	margin of error	In statistical research, the range of outcomes we expect for a population, given the data revealed by a sample drawn from that population	38
6804127077	simple random sample	A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected. As a consequence, every element in the population or sampling frame has an equal probability of being chosen.	39
6804127078	sampling distribution	The distribution of a sample statistic (for example, a sample proportion) from sample to sample: for a fixed sample size n, the distribution of that statistic over every possible sample of size n from a given population.	40
6804127079	stratified random sample	A sampling design in which the population is divided into homogeneous subgroups (strata) and a simple random sample is then drawn from each stratum.	41
6804127080	systematic sample	A sample drawn by selecting individuals systematically from a sampling frame. When there is no relationship between the order of the sampling frame and the variables of interest, a systematic sample can be representative.	42
6804127081	cluster sample	A sample obtained by selecting all individuals within a randomly selected collection or group (cluster) of individuals.	43
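6804127082	10% rule	When sampling without replacement, the sample should be less than 10% of the whole population so that the observations can be treated as approximately independent.	44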
6804127083	Interpolation	The estimation of an unknown number between known numbers; for example, approximating a value that falls between the entries listed in a table.	45
6804127084	Qualitative	Data in the form of recorded descriptions rather than numerical measurements.	46
6804127085	theoretical probability	A probability obtained by analyzing a situation. If all of the outcomes are equally likely, you can find the theoretical probability of an event by listing all of the possible outcomes and then finding the ratio of the number of outcomes producing the desired event to the total number of outcomes. For example, there are 36 possible equally likely outcomes (number pairs) when two fair number cubes are rolled. Of these, six have a sum of 7, so the probability of rolling a sum of 7 is 6/36, or 1/6.	47
6804127086	experimental probability	the ratio of the number of times an event occurs to the total number of trials or times the activity is performed.	48
6804127087	block design	The subjects in an experiment are first divided into groups (called "blocks") based on some common characteristic (such as gender) that is hypothesized to have an effect on the response. Randomization of treatments then happens within each block (each block is like its own mini-experiment).	49
6804127088	blinding	The practice of concealing group assignment from study subjects, investigators, and/or those who assess subject outcomes, typically in the context of a randomized controlled trial. For example, study subjects may receive capsules with identical appearance and taste; however, the treatment group receives the active drug, whereas the control group receives the placebo.	50
6804127089	double blind	An experiment in which neither the subjects nor the people who work with them know which treatment each subject is receiving.	51
6804127090	placebo	A fake treatment: a chemically inert substance that can produce real medical benefits because the patient believes it will help.	52
6804127091	least squares regression line	the line with the smallest sum of squared residuals	53
6804127092	type I error	An error that occurs when a researcher concludes that the independent variable had an effect on the dependent variable when no such relation exists (rejecting a true null hypothesis); a "false positive".	54
6804127093	type II error	An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable when in truth it did (failing to reject a false null hypothesis); a "false negative".	55
6804127094	joint frequency		56
6804127095	matched pairs	A design technique that involves matching each participant in the experimental group with a similar participant in the control group in order to eliminate the possibility that a third variable (and not the independent variable) caused changes in the dependent variable.	57
6804127096	conditional probability	The probability of an event given that another event has already occurred.	58
6804127097	sample space	Set of all possible outcomes of an experiment	59
6804127098	confounded variable	A variable whose effect on the response variable cannot be separated from the effect of the explanatory variable on the response variable. (Note: Usually confounded variables are lurking variables but only a few lurking variables are also confounded.)	60
6804127099	marginal frequency	In a two-way table, the total count for a single row or a single column, written in the margins of the table.	61
6804127100	coefficient of determination	The statistic determined by squaring the correlation coefficient. Represents the proportion of variance accounted for by that correlation.	62
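6804127101	binomial	In statistics, describes a setting with a fixed number of independent trials, each having two possible outcomes (success or failure) and the same probability of success.	63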
6804127102	unimodal	Having one mode. This is a useful term for describing the shape of a histogram when it is generally mound-shaped; a normal distribution, for example, has only one mode.	64
6804127103	bimodal	A distribution with two modes: two values or categories share the highest count, or the histogram shows two clear peaks.	65
6804127104	experiment	A kind of research in which the researcher controls the conditions and directly manipulates the independent variable in order to test a hypothesis.	66
6804127105	law of large numbers	(statistics) law stating that a large number of items taken at random from a population will (on the average) have the population statistics	67
6804127106	extrapolation	calculation of the value of a function outside the range of known values	68
6804127107	snowball	A sampling technique in which each participant is asked to name further participants. Example: Huyen wanted to conduct market research to find out why students were unhappy with Marketing 431, so she needed to find people who were unhappy with the course. Figuring that these people would talk to each other, she found one person who was unhappy with the course and, after asking her research questions, asked this person for the name of another person who was unhappy with the course.	69
6804127108	IQR	A measure of variability based on dividing a data set into quartiles: the difference between the upper and lower quartiles (Q3 - Q1), which is the length of the box in a boxplot.	70
6804127109	Confidence interval	A range of plausible values for a population parameter, computed from sample data. The specified probability is called the confidence level, and the end points of the confidence interval are called the confidence limits. Interpreted with statements of the form "we are 95% confident that ...".	71
6804127110	Standard Error	An estimate of the standard deviation of a statistic's sampling distribution, indicating the possible magnitude of error. The larger the standard error, the less reliable the estimate.	72
6804127111	Residual		73
6804127112	Convenience sample	A sample taken from a conveniently available group rather than one chosen to represent the population, which tends to give misleading results.	74
6804127113	simulation	A representation of a situation or problem with a similar but simpler model or a more easily manipulated model in order to determine experimental results.	75
6804127114	degrees of freedom	The number of individual scores that can vary without changing the sample mean. Statistically written as 'N-1' where N represents the number of subjects.	76
6804127115	two way table	A table containing counts for two categorical variables, a row variable and a column variable. It has r rows and c columns.	77
6804127116	spread	The visible variation in a sample distribution	78
6804127117	center	The location of the middle of a distribution, typically summarized by the mean or the median.	79
6804127118	shape		80
6804127119	discrete random variable		81
6804127120	central limit theorem		82
6804127121	standardized value		83
6804127122	mutually exclusive		84
6804127123	wording bias	Bias created in a sample when the wording of a survey question favors one response.	85
6804127124	causation		86
6804127125	z test		87
6804127126	t test		88
6804127127	chi squared goodness of fit	Tests how close the observed data are to what would be expected under the model. If a significant difference is found between the two, the observed data are unlikely to have been generated by chance under that model. Used with nominal data to determine whether counts for one variable match the expectations for that distribution. Example: a gambler who lost $1,000 in a dice game suspects his opponent of loading the dice. He chooses one die and rolls it 36 times. If the die is fair, each side has an equal chance of landing face up, so he looks for an outcome that departs from this. Given the observed counts, is there evidence that the die is loaded?	89
6804127128	frequency table	A grouping of qualitative data into mutually exclusive classes showing the number of observations in each class. A chart showing the number of times a specific event happens.	90
6804127129	area principle	the area occupied by a part of the graph should correspond to the magnitude of the value it represents	91
6804127130	simpsons paradox		92
6804127131	contingency table	A two-variable table with cross-tabulated data. It displays counts and, sometimes, percentages of individuals falling into named categories on two or more variables. The table categorizes the individuals on all variables at once to reveal possible patterns in one variable that may be contingent on the category of the other.	93
6804127132	stem and leaf display	A multiple-column table depicting the individual digits of the scores. A score of 95 would have a stem of 9 and a leaf of 5; a score of 62 would have a stem of 6 and a leaf of 2. If a particular stem has more than one leaf, such as the scores 54, 58, and 51, the stem of 5 has three leaves, in this case 4, 8, and 1. It shows the range of values of the variable.	94
6804127133	multimodal	Describes a graph of quantitative data with more than two clear peaks. A distribution with more than two modes	95
6804127134	uniform	A distribution whose histogram does not appear to have any mode and in which all the bars are approximately the same height.	96
6804127135	symmetric	A distribution whose two sides are approximately mirror images of each other, as in a normal distribution.	97
6804127136	time plot	Displays data that change over time. Often, successive values are connected with lines to show trends more clearly. Sometimes a smooth curve is added to the plot to help show long-term patterns and trends.	98
6804127137	se	standard deviation of residuals	99
6804127138	r2	An overall measure of how successful the regression is in linearly relating y to x.	100
6804127139	leverage		101
6804127140	influential point	A point that, when omitted, gives very different results.	102
6804127141	census	A survey that, instead of taking a sample, tests or surveys the entire population.	103
6804127142	multistage sample		104
6804127143	pilot	A small trial run of a survey to see whether the questions are clear.	105
6804127144	convenience sample	Choosing a sample because it is convenient, which often fails to give a proper representation of the population. If you survey everyone on your soccer team who attends tonight's practice, you are surveying a convenience sample.	106
6804127145	response bias	Anything in a survey design that influences or changes responses. One typical response bias arises from the wording of questions, which may suggest a favored response. Voters, for example, are more likely to express support of "the president" than support of the particular person holding that office at the moment. Another example: a police officer asking teenagers about drug use.	107
6804127146	observational study	A study based on data in which no manipulation of factors has been employed; it observes characteristics of an existing population, often by means of a survey.	108
6804127147	retrospective study	An observational study that examines whether a past association exists between an exposure of interest and the development of a present condition. Data are collected from the past by looking back in time.	109
6804127148	prospective study	an observational study in which subjects are followed to observe future outcomes	110
6804127149	statistic factor	A multifactor model in which statistical methods are applied to a set of historical returns to determine portfolios that best explain either historical return covariances or variances.	111
6804127150	control group	In an experiment, the group that is not exposed to the treatment; contrasts with the experimental group and serves as a comparison for evaluating the effect of the treatment.	112
6804127151	blinding	The practice of concealing group assignment from study subjects, investigators, and/or those who assess subject outcomes, typically in the context of a randomized controlled trial. For example, study subjects may receive capsules with identical appearance and taste; however, the treatment group receives the active drug, whereas the control group receives the placebo.	113
6804127152	placebo effect	Experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which is assumed to be an active agent.	114
6804127153	trial	A performed experiment based upon the hypothesis you made.	115
6804127154	maximum	The largest value in a set of data.	116
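
Several of the cards above (5 number summary, Q1, Q3, IQR, range, z score) describe calculations, so a minimal Python sketch may help. It is only an illustration under stated assumptions: the data set `scores` and the function names are hypothetical, the quartiles use the median-of-halves method from the Q1 card, and the z score assumes the sample standard deviation (n - 1 divisor), which the cards do not specify.

```python
from statistics import mean, median, stdev


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # lower half; middle value excluded when n is odd
    upper = data[n - half:]    # upper half; middle value excluded when n is odd
    return (data[0], median(lower), median(data), median(upper), data[-1])


def z_score(x, values):
    """Number of sample standard deviations x lies above (+) or below (-) the mean."""
    # Assumption: sample standard deviation (divides by n - 1).
    return (x - mean(values)) / stdev(values)


scores = [2, 4, 4, 5, 7, 9, 11]  # hypothetical data, not from the flashcards

minimum, q1, med, q3, maximum = five_number_summary(scores)
iqr = q3 - q1                    # interquartile range: Q3 - Q1
data_range = maximum - minimum   # range: highest minus lowest value

print("five-number summary:", (minimum, q1, med, q3, maximum))
print("IQR:", iqr, "range:", data_range)
print("z score of the maximum:", round(z_score(maximum, scores), 2))
```

For this data set the five-number summary is (2, 4, 5, 9, 11), so the IQR is 5 and the range is 9. Other quartile conventions (for example, those used by spreadsheet software) can give slightly different Q1 and Q3 values.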