Statistics Flashcards
2012583830 | Standardizing | We ________ to eliminate units | 0 | |
2012583831 | Standardized Value | Value found by subtracting the mean and dividing by the standard deviation. | 1 | |
2012583832 | Shifting | Adding a constant to each data value adds that constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR. | 2 | |
2012583833 | Rescaling | Multiplying each data value by a constant multiplies both the measures of position and the measures of spread by that constant. | 3 | |
2012583834 | Normal Model | A useful family of models for unimodal, symmetric distributions. | 4 | |
2012583835 | Parameter | A numerically valued attribute of a model. | 5 | |
2012583836 | Statistic | A value calculated from data to summarize aspects of the data. | 6 | |
2012583837 | Z-score | Tells how many standard deviations a value is from the mean. | 7 | |
2012583838 | Boxplot | Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values. | 8 | |
2012583839 | Far Outlier | If a point is more than 3.0 IQR from either end of the box in a boxplot. | 9 | |
2012583840 | Comparing Distributions | Consider: shape, center, spread | 10 | |
2012583841 | Comparing Boxplots | Compare Shapes; Compare Medians; Compare IQRs; Check for outliers | 11 | |
2012583842 | Timeplot | Displays data that change over time. | 12 | |
2012583843 | Standard Deviation | Square root of the variance. | 13 | |
2012583844 | Variance | The sum of squared dev. from the mean, divided by the count minus 1. | 14 | |
2012583845 | Resistant | A calculated summary is said to be ________ if outliers have only a small effect on it. | 15 | |
2012583846 | Mean | Found by summing all the data values and dividing by the count. | 16 | |
2012583847 | 5 Number Summary | Reports the min., Q1, the median, Q3 and the max. | 17 | |
2012583848 | Percentile | The ith percentile is the # that falls above i% of the data. | 18 | |
2012583849 | Interquartile Range (IQR) | The difference between the 1st and 3rd Quartiles. | 19 | |
2012583850 | Range | The difference between the lowest and highest values in a data set. Range = Max - Min | 20 | |
2012583851 | Median | The middle value; if the count is an even #, you take the average of the 2 middle #'s. | 21 | |
2012583852 | Outliers | Extreme values that don't appear to belong with the rest of the data. Any point more than 1.5 IQR from either end of the box in a Boxplot. | 22 | |
2012583853 | Skewed | Distribution is _________ if it's not symmetric and 1 tail stretches out farther than the other. | 23 | |
2012583854 | Tails | The parts that typically trail off on either side. | 24 | |
2012583855 | Symmetric | 2 Halves on either side of the center look approximately like mirror images of each other. | 25 | |
2012583856 | Uniform | A distribution that's roughly flat. | 26 | |
2012583857 | Unimodal | 1 mode | 27 | |
2012583858 | Bimodal | 2 modes | 28 | |
2012583859 | Multimodal | More than 2 modes | 29 | |
2012583860 | Mode | A hump or local high point in the shape of the distribution of a var. | 30 | |
2012583861 | Spread | A numerical summary of how tightly the values are clustered around the center. Measures: IQR, Standard Dev. | 31 | |
2012583862 | Center | The place in the distribution of a variable that you'd point to if you wanted to attempt the impossible by summarizing the entire distribution with a single #. Measures: Mean, Median | 32 | |
2012583863 | Shape | To describe the _____ of a distribution, look for: single vs. mult. modes; symmetry vs skewness; outliers and gaps. | 33 | |
2012583864 | Dotplot | Graphs a dot for each case against a single axis. | 34 | |
2012583865 | Stem and Leaf Display | Shows quantitative data values in a way that sketches the distribution of the data. | 35 | |
2012583866 | Gap | A region of the distribution where there are no values. | 36 | |
2012583867 | Histogram | Uses adjacent bars to show the distribution of a quantitative var. | 37 | |
2012583868 | Frequency Table (Relative Frequency Table) | Lists the categories in a categorical var. and gives the count or percentage of observations for each category. | 38 | |
2012583869 | Distribution | The _____________ of a var. gives: possible values of the variable; the relative frequency of each value. | 39 | |
2012583870 | Area Principle | In a statistical display, each data value should be represented by the same amount of area. | 40 | |
2012583871 | Bar Chart | Shows a bar whose area represents the count (or percentage) of observations for each category of a categorical variable. | 41 | |
2012583872 | Pie Chart | Shows how a "whole" divides into categories by showing a wedge of a circle whose area corresponds to the proportion in each category. | 42 | |
2012583873 | Contingency Table | Displays counts and, sometimes, percentages of individuals falling into named categories on 2 or more var. | 43 | |
2012583874 | Marginal Distribution | In a contingency table, the distribution of either var. alone. | 44 | |
2012583875 | Conditional Distribution | The distribution of a var. restricting the who to consider only a smaller group of individuals. | 45 | |
2012583876 | Independence | Variables are ________ if the conditional distribution of one variable is the same for each category of the other. | 46 | |
2012583877 | Segmented Bar Chart | Displays the conditional distribution of a categorical var. within each category of another var. | 47 | |
2012583878 | Simpson's paradox | When averages are taken across different groups, they can appear to contradict the overall averages. | 48 | |
2012583879 | Context | Tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed. | 49 | |
2012583880 | Data | Systematically recorded info., whether #'s or labels, together with its context. | 50 | |
2012583881 | Data Table | An arrangement of data in which each row represents a case and each column represents a variable. | 51 | |
2012583882 | Case | Individual about whom or which we have data. | 52 | |
2012583883 | Population | All the cases we wish we knew about. | 53 | |
2012583884 | Sample | The cases we actually examine in seeking to understand the much larger population. | 54 | |
2012583885 | Variable | Holds info about the same characteristic for many cases. | 55 | |
2012583886 | Units | A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams. | 56 | |
2012583887 | Categorical Variable | A variable that names categories (words/numbers) | 57 | |
2012583888 | Quantitative Variable | A variable in which the numbers act as numerical values - always have units. | 58 | |
2027682233 | Random Phenomenon | A phenomenon is random if we know what outcomes could happen, but not which particular values will happen. | 59 | |
2027682234 | Trial | A single attempt or realization of a random phenomenon. | 60 | |
2027682235 | Outcome | The value measured, observed, or reported for an individual instance of that trial. | 61 | |
2027682236 | Event | A collection of outcomes. | 62 | |
2027682237 | Sample Space | The collection of all possible outcome values. | 63 | |
2027682238 | Law of Large Numbers | States that the long-run relative frequency of repeated independent events gets closer and closer to the true relative frequency as the number of trials increases. | 64 | |
2027682239 | Independence | Two events are independent if the occurrence of one does not change the probability that the other occurs. | 65 | |
2027682240 | Empirical Probability | When the probability comes from the long-run relative frequency of the event's occurrence. | 66 | |
2027682241 | Theoretical Probability | When the probability comes from a model. | 67 | |
2027682242 | Personal Probability | When the probability is subjective and represents your personal degree of belief. | 68 | |
2027682243 | Observational Study | A study based on data in which no manipulation of factors has been employed. | 69 | |
2027682244 | Retrospective Study | An observational study in which subjects are selected and then their previous conditions or behaviors are determined. | 70 | |
2027682245 | Prospective Study | An observational study in which subjects are followed to observe future outcomes. | 71 | |
2027682246 | Experiment | Manipulates factor levels to create treatments. Randomly assigns subjects to these treatment levels. Compares the responses of the subject groups across treatment levels. | 72 | |
2027682247 | Factor | A variable whose levels are manipulated by the experimenter. | 73 | |
2027682248 | Response | A variable whose values are compared across different treatments. | 74 | |
2027682249 | Experimental Units | Individuals on whom an experiment is performed. | 75 | |
2027682250 | Level | The specific values that the experimenter chooses for a factor. | 76 | |
2027682251 | Treatment | The process, intervention, or other controlled circumstance applied to randomly assigned experimental units. | 77 | |
2027682252 | Principles of Experimental Design | Control; Randomize; Replicate; Block | 78 | |
2027682253 | Control Group | The experimental units assigned to a baseline treatment level. | 79 | |
2027682254 | Placebo Effect | The tendency of many human subjects to show a response even when administered a placebo. | 80 | |
2027682255 | Blinding | Any individual associated with an experiment who is not aware of how subjects have been allocated to treatment groups is said to be blinded. | 81 | |
2027682256 | Placebo | A treatment known to have no effect. | 82 | |
2027682257 | Confounding | Levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated. | 83 | |
2027682258 | Sample Survey | A study that asks questions of a sample drawn from some population in the hope of learning something about the entire population. | 84 | |
2027682259 | Bias | Any systematic failure of a sampling method. | 85 | |
2027682260 | Randomization | The best defense against bias; each individual is given a fair, random chance of selection. | 86 | |
2027682261 | Sample Size | The number of individuals in a sample; it determines how well the sample represents the population. | 87 | |
2027682262 | Census | Sample that consists of the entire population. | 88 | |
2027682263 | Population Parameter | Numerically valued attribute of a model for a population. | 89 | |
2027682264 | Representative | A sample is said to be ___________ if the stats computed from it accurately reflect the corresponding population parameters. | 90 | |
2027682265 | Simple Random Sample (SRS) | A sample in which each set of "n" elements in the population has an equal chance of selection. | 91 | |
2027682266 | SRS | Simple Random Sample | 92 | |
2027682267 | Sampling Frame | List of individuals from whom the sample is drawn. | 93 | |
2027682268 | Sampling Variability | The natural tendency of randomly drawn samples to differ, one from another. | 94 | |
2027682269 | Stratified Random Sample | A sampling design in which the population is divided into several subpopulations, or strata, and random samples are then drawn from each stratum. | 95 | |
2027682270 | Cluster Sample | A sampling design in which entire groups are chosen at random. | 96 | |
2027682271 | Multistage Sample | Sampling schemes that combine several sampling methods. | 97 | |
2027682272 | Systematic Sample | A sample drawn by selecting individuals systematically from a sampling frame. | 98 | |
2027682273 | Pilot | A small trial run of a survey to check whether questions are clear. | 99 | |
2027682274 | Voluntary Response Bias | Bias introduced to a sample when individuals can choose on their own whether to participate in the sample. | 100 | |
2027682275 | Convenience Sample | Consists of the individuals who are conveniently available to sample. | 101 | |
2027682276 | Undercoverage | A sampling scheme that biases the sample in a way that gives a part of the population less representation. | 102 | |
2027682277 | Nonresponse Bias | Bias introduced when a large fraction of those sampled fails to respond. | 103 | |
2027682278 | Response Bias | Anything in a survey design that influences response. | 104 | |
2027682279 | Random | If we know the possible values it can have, but not which particular value it takes. | 105 | |
2027682280 | Simulation | Models a real-world situation by using random-digit outcomes to mimic the uncertainty of a response variable of interest. | 106 | |
2027682281 | Simulation Component | A component uses equally likely random digits to model simple random occurrences whose outcomes may not be equally likely. | 107 | |
2027682282 | Trial (Chapter 11) | The sequence of several components representing events that we are pretending will take place. | 108 | |
2027682283 | Re-expression | We _______ data by taking the logarithm, the square root, the reciprocal, or some other mathematical operation on all values of a variable. | 109 | |
2027682284 | Ladder of Powers | Places in order the effects that many re-expressions have on the data. | 110 | |
2027682285 | Correlation Coefficient | Numerical measure of the direction and strength of a linear association. | 111 | |
2027682286 | Scatterplot | Shows relationship between two quantitative variables measured on the same cases. | 112 | |
2027682287 | Lurking Variable | A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two. | 113 | |
2027682288 | Model | An equation or formula that simplifies and represents reality. | 114 | |
2027682289 | Linear Model | An equation of a line. To interpret a linear model, we need to know the variables and their units. | 115 | |
2027682290 | Predicted Value | The value of y^ found for a given x-value in the data. This is found by substituting the x-value into the regression equation. | 116 | |
2027682291 | Residuals | Difference between data values and the corresponding values predicted by the regression model. Observed Value minus predicted value (e= y-y^) | 117 | |
2027682292 | Least Squares | Specifies the unique line that minimizes the variance of the residuals or, equivalently, the sum of the squared residuals. | 118 | |
2027682293 | Regression to the mean | Because correlation is always less than 1.0 in magnitude, each predicted y^ tends to be fewer standard deviations from its mean than its corresponding x was from its mean. | 119 | |
2027682294 | Intercept | The intercept, b0 (b-sub-zero), gives a starting value in y-units. It's the y^-value when x = 0. | 120 | |
2027682295 | Extrapolation | Although linear models provide an easy way to predict values of y for a given value of x, it is unsafe to predict for values of x far from the ones used to find the linear model equation. | 121 | |
2027682296 | Leverage | Data points whose x-values are far from the mean of x are said to exert _____________ on a linear model. | 122 | |
2027682297 | Influential Point | If omitting a point from the data results in a very different regression model. | 123 | |
2031495160 | Disjoint (mutually exclusive) | 2 events share no outcomes in common. | 124 |
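
A few of the cards above (Standardized Value, Shifting, Rescaling, Outliers) can be checked numerically. The following is a minimal sketch, assuming Python's standard `statistics` module; the function names and the sample data are illustrative and not part of the deck. Quartile conventions differ between textbooks and software, so the outlier fences here may differ slightly from a hand calculation.

```python
import statistics

def z_scores(values):
    """Standardize: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (divides by count minus 1)
    return [(v - mean) / sd for v in values]

def iqr_outliers(values):
    """Return points more than 1.5 IQRs beyond either quartile."""
    # statistics.quantiles uses its default "exclusive" method here (an assumption
    # of this sketch; other quartile rules can give different fences).
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sample data, chosen only for illustration.
data = [2, 4, 4, 5, 6, 7, 9, 30]

shifted = [v + 10 for v in data]   # shifting: add a constant to every value
rescaled = [v * 3 for v in data]   # rescaling: multiply every value by a constant

z = z_scores(data)
flagged = iqr_outliers(data)
```

In this sketch, shifting moves the mean by 10 but leaves the standard deviation alone, rescaling multiplies both by 3, and the standardized values have mean 0 and standard deviation 1.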