Statistics 101 - Discovering Statistics
1047419210 | Mean | This "measure of center" is the AVERAGE of the values in a data set. (Mean is sensitive to extreme values.) | 0 | |
1047419211 | Median | A measure of center in a set of numerical data. The median of a list of values is the value appearing at the center of a sorted version of the list - or the mean of the two central values if the list contains an even number of values. (Median is NOT sensitive to extreme values) | 1 | |
1047419212 | Mode | The value that occurs most often in a set of data. | 2 | |
1047419213 | Skewness Effect - Right-Skewed Distribution | Typically Mean > Median > Mode. (The long right tail pulls the mean upward.) | 3 | |
1047419214 | Skewness Effect - Left-Skewed Distribution | Typically Mean < Median < Mode. (The long left tail pulls the mean downward.) | 4 | |
1047419215 | Skewness Effect - Symmetric Unimodal Distribution | Mean = Median = Mode. Unimodal means it has ONE MODE. The NORMAL DISTRIBUTION is an example of a symmetric unimodal distribution. | 5 | |
1047419216 | Range | The difference between the largest value and smallest value of a data set. (Range = Largest Value - Smallest Value) (A larger range is an indication of greater VARIABILITY, or greater spread, in the data set) | 6 | |
1047419217 | Deviation | The difference between a data value and the mean of the data set. (The distance between the data value and the mean) If data value x > mean, deviation will be positive. If data value x < mean, deviation will be negative. If data value x = mean, deviation will be zero. | 7 | |
1047967015 | Population Variance σ² | The mean of the squared deviations in the population. | 8 | |
1047967016 | Population Standard Deviation σ | The positive square root of the population variance σ². | 9 | |
1047998646 | Sample Variance s² | Approximately the mean of the squared deviations in the sample: the sum of the squared deviations divided by n - 1 rather than n. | 10 | |
1047998647 | Sample Standard Deviation s | The positive square root of the sample variance s². | 11 | |
1068291515 | Standard Deviation | A common measure of the variability, or spread, of a data set. It is a typical deviation from the mean. | 12 | |
1068441343 | z-Score | Indicates how many standard deviations a particular data value is from the mean. If the z-score is positive, the data value is above the mean, and if the z-score is negative, the data value is below the mean. | 13 | |
1068441344 | Outlier | An extremely large or extremely small data value relative to the rest of the data set. | 14 | |
1068441345 | Detecting Outliers - z-Score Method | Identify an outlier by determining if it is farther than 3 standard deviations from the mean, i.e., its z-score is less than -3 or greater than 3. | 15 | |
1068441346 | Percentile | The location of a data value relative to other values in the data set. For example, a score at the 90th percentile means that 90% of all scores are at or below that score, and 10% are higher. | 16 | |
1068441347 | Percentile Calculation | i = (P/100)n. MORE TO THIS.... | 17 | |
1070722932 | Percentile Rank | Percentage of scores falling at or below a specific score. A percentile rank of 95 means that 95% of all of the scores fall at or below this point. In other words, the score is as good as or better than 95% of the scores. | 18 | |
1070722933 | Quartiles | The 25th, 50th, and 75th percentiles, referred to as the first quartile, the second quartile (median), and third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each part containing approximately 25% of the data. | 19 | |
1070722934 | Interquartile Range (IQR) | A robust measure of variability, calculated as IQR = Q3 - Q1. It is interpreted as the spread of the middle 50% of the data, and it is NOT affected by outliers since it ignores the highest 25% and the lowest 25% of the data set. | 20 | |
1070722935 | Five-Number Summary | An exploratory data analysis technique that uses five numbers to summarize the data: 1. smallest value, 2. first quartile, 3. median (second quartile), 4. third quartile, and 5. largest value. | 21 | |
1070722936 | Boxplot | A graphic display that represents the distribution of data by focusing on five key measures: Min, Q1, Q2, Q3, Max. | 22 | |
1070722937 | Boxplot Upper and Lower Fences | Lower Fence = Q1 - 1.5(IQR) Upper Fence = Q3 + 1.5(IQR) | 23 | |
1070722938 | Detecting Outliers - IQR Method | A data value is an outlier if a. it is located 1.5(IQR) or more below Q1, or b. it is located 1.5(IQR) or more above Q3. | 24 | |
1070722939 | Chebyshev's Rule | The proportion (or fraction) of any set of data lying within k standard deviations of the mean is always at least 1-1/k^2, where k is any positive number greater than 1. | 25 | |
1070722940 | The Empirical Rule | This says that, in a normal bell-shaped curve, about 68% of the data fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three. | 26 | |
1070722941 | The Empirical Rule in terms of z-Scores | About 68% of the data will have z-scores between -1 and 1, about 95% between -2 and 2, and about 99.7% between -3 and 3. | 27 | |
1070745942 | Scatterplot | A graphed cluster of dots, each of which represents the values of two variables. The slope of the points suggests the direction of the relationship between the two variables. The amount of scatter suggests the strength of the correlation (little scatter indicates high correlation). | 28 | |
1072935679 | Scatterplot Variables x and y | x is horizontal axis, and y is vertical axis. x is the "predictor" variable, and y is the "response" variable. | 29 | |
1072935680 | Correlation Coefficient | A statistic, r, that summarizes the strength and direction of the linear relationship between two variables. It always takes on a value between -1 and 1, inclusive. | 30 | |
1072935681 | Comparison Test for Linear Correlation | 1. Find the absolute value of the correlation coefficient r, |r|. For example, |0.5| = 0.5 and |-0.4| = 0.4. 2. Use the Table of Critical Values for the Correlation Coefficient and select the row corresponding to sample size n. 3. Compare the absolute value |r| from Step 1 to the critical value from the table in Step 2. a.) If |r| is greater than the critical value, you can conclude that x and y are LINEARLY CORRELATED. i.) If r > 0, then x and y are POSITIVELY CORRELATED. ii.) If r < 0, then x and y are NEGATIVELY CORRELATED. b.) If |r| is not greater than the critical value, you cannot conclude that x and y are linearly correlated. | 31 |
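
Worked example (not part of the original card set). Several of the cards above can be checked with a short Python sketch using only the standard library. The function names (`summarize`, `z_scores`, `iqr_fences`, `iqr_outliers`, `chebyshev_bound`) and the sample data are made up for illustration. The quartiles come from `statistics.quantiles` with `method="inclusive"`, which is an assumption: textbooks differ in how they locate quartiles, so a course's hand method may give slightly different Q1 and Q3.

```python
import statistics


def summarize(data):
    """Measures of center and spread from the cards above."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "range": max(data) - min(data),
        "population_variance": statistics.pvariance(data),  # divides by n
        "sample_variance": statistics.variance(data),       # divides by n - 1
        "sample_sd": statistics.stdev(data),
    }


def z_scores(data):
    """z = (x - mean) / s, using the sample standard deviation."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return [(x - mean) / sd for x in data]


def iqr_fences(data):
    """Return (lower fence, upper fence) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    # Assumption: the "inclusive" quartile convention; others exist.
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def iqr_outliers(data):
    """Values located 1.5*IQR or more below Q1 or above Q3."""
    lower, upper = iqr_fences(data)
    return [x for x in data if x <= lower or x >= upper]


def chebyshev_bound(k):
    """Minimum proportion of data within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Rule requires k > 1")
    return 1 - 1 / k**2


if __name__ == "__main__":
    scores = [2, 4, 4, 5, 6, 7, 8, 9, 10, 45]  # hypothetical data set
    print(summarize(scores))
    print(iqr_fences(scores))
    print(iqr_outliers(scores))
    print(max(z_scores(scores)))
    print(chebyshev_bound(2))
```

With this made-up data set the mean (10) sits well above the median (6.5), which matches the right-skewed card. The IQR method flags 45 as an outlier, while its z-score is only about 2.79, so the z-score method does not flag it: the extreme value inflates the standard deviation that the z-score depends on.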