The Median is the middle value of a series of numbers that have been ordered.
Example:
\(\left(x\right)\) = [1, 2, 3, 4, 5]
the median of \(\left(x\right)\) is 3.
If a series has an even number of values, then the median is the average of the two middle values.
Example:
\(\left(x\right)\) = [1, 2, 3, 4]
the median of \(\left(x\right)\) is \(\frac{2+3}{2}=2.5\).
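As a quick check, here are both cases in Python. This is a minimal sketch using only the standard library's statistics module:

```python
import statistics

print(statistics.median([1, 2, 3, 4, 5]))  # 3, the middle value
print(statistics.median([1, 2, 3, 4]))     # 2.5, the average of the two middle values
```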
The Mode is the value that appears most frequently in a series of numbers, i.e., the value with the highest frequency.
Example:
\(\left(x\right)\) = [4, 3, 3, 7, 6]
the mode of \(\left(x\right)\) is 3.
If several values share the highest frequency, then the series has more than one mode.
Example:
\(\left(x\right)\) = [6, 2, 3, 4, 9, 23, 5, 3, 11, 2]
the modes of \(\left(x\right)\) are 2 and 3.
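Both cases can be reproduced with the standard library's statistics module (multimode requires Python 3.8+):

```python
import statistics

print(statistics.mode([4, 3, 3, 7, 6]))                        # 3
print(statistics.multimode([6, 2, 3, 4, 9, 23, 5, 3, 11, 2]))  # [2, 3]
```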
A Percentile describes the value below which a given percentage of the values in a series fall.
Example:
If the 25th percentile of a series is 100, it means that 25% of the values in that series are equal to or lower than 100.
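Here is a minimal sketch of one common convention, the nearest-rank method. Note that libraries such as NumPy interpolate between values by default, so their results can differ slightly, and the data below is made up for illustration:

```python
import math

def percentile(series, percent):
    # Nearest-rank method: the smallest value such that at least
    # `percent` percent of the sorted series is at or below it.
    ordered = sorted(series)
    rank = max(math.ceil(percent / 100 * len(ordered)), 1)
    return ordered[rank - 1]

values = [50, 100, 120, 150, 200, 300, 500, 900]
print(percentile(values, 25))  # 100 -> 25% of the values are <= 100
```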
The Sum is the result of adding two or more numbers.
Mathematically we can write as follows,
$$Sum\left(x\right)=\sum_i^n x_i$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
Example:
\(\left(x\right)=[2, 3, 4]\)
\(Sum\left(x\right)=2+3+4=9\)
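In Python this is simply the built-in sum:

```python
x = [2, 3, 4]
print(sum(x))  # 9
```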
The Mean is the average of a series of numbers or the sum of a series divided by its length.
Mathematically we can write as follows,
$$Mean\left(x\right)=\frac{\sum_i^n x_i}{n}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
Example:
\(\left(x\right)=[1, 3, 2]\)
\(Mean\left(x\right)=\frac{1+3+2}{3}=2\)
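In Python, the sum of the series divided by its length:

```python
x = [1, 3, 2]
print(sum(x) / len(x))  # 2.0
```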
The Harmonic Mean is the reciprocal of the mean of the reciprocals of all values in a series.
Mathematically we can write as follows,
$$\text{Harmonic Mean}\left(x\right)=\frac{n}{\sum_i^n \frac{1}{x_i}}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
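A quick sketch, once as a direct translation of the formula and once with the standard library (the example series is ours):

```python
import statistics

x = [1, 4, 4]
print(len(x) / sum(1 / v for v in x))  # 2.0, the formula written out
print(statistics.harmonic_mean(x))     # 2.0, the same result from the standard library
```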
Reference:
Stojiljkovic, M.
Python Statistics Fundamentals: How to Describe Your Data.
Real Python.
The Square is the result of multiplying a value by itself.
Mathematically we can write as follows,
$$Square\left(x\right)=x^2$$
Where:
\(x\) = the value being squared.
The Cubic is the result of multiplying a value by itself and then by itself again.
Mathematically we can write as follows,
$$Cubic\left(x\right)=x^3$$
Where:
\(x\) = the value being cubed.
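Both are one-liners with Python's exponent operator:

```python
x = 3
print(x ** 2)  # 9, the square
print(x ** 3)  # 27, the cubic
```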
The Square Sum is the square of the sum of a series.
Mathematically we can write as follows,
$$\text{Square Sum}\left(x\right)=\left(\sum_i^n x_i\right)^2$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
The Sum Square is the sum of the squares of each value in a series.
Mathematically we can write as follows,
$$\text{Sum Square}\left(x\right)=\sum_i^n x_i^2$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
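The order of operations is the only difference between the two; a small example makes it visible:

```python
x = [1, 2, 3]
print(sum(x) ** 2)             # Square Sum: (1 + 2 + 3)^2 = 36
print(sum(v ** 2 for v in x))  # Sum Square: 1 + 4 + 9 = 14
```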
The Mean Square is the mean of the squares of each value in a series.
Mathematically we can write as follows,
$$\text{Mean Square}\left(x\right)=\frac{\sum_i^n x_i^2}{n}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
The Root Mean Square is the square root of the mean of the squares of each value in a series.
Mathematically we can write as follows,
$$RMS\left(x\right)=\sqrt{\frac{\sum_i^n x_i^2}{n}}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series.
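Continuing the same example series:

```python
import math

x = [1, 2, 3]
mean_square = sum(v ** 2 for v in x) / len(x)
print(mean_square)             # 14 / 3 ≈ 4.667, the Mean Square
print(math.sqrt(mean_square))  # ≈ 2.160, the Root Mean Square
```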
The Variance reflects the differences we see in a distribution. Although the variance is an exceptionally important concept and one of the most commonly used statistics, it does not have the direct intuitive interpretation we would like. Because it is based on squared deviations, the result is in terms of squared units.
Mathematically we can write as follows,
$$S^2\left(x\right)={\sum_i^n\left(x_i-\left({\sum_i^nx_i\over n}\right)\right)^2\over n-1}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series (the inner fraction is the mean of the series).
Squared units in variance are awkward things to talk about and have little intuitive meaning with respect to the data. Fortunately, the solution to this problem is simple: Take the square root of the variance. The Standard Deviation is defined as the positive square root of the variance and, for a sample, is symbolized as \(S\) (with a subscript identifying the variable, if necessary).
Mathematically we can write as follows,
$$S\left(x\right)=\sqrt{S^2}$$
Where:
\(S^2\) = the variance of the series.
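The standard library computes both with the same \(n-1\) denominator used above (the example series is made up):

```python
import statistics

x = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.variance(x))  # ≈ 4.571, sample variance (n - 1 denominator)
print(statistics.stdev(x))     # ≈ 2.138, its positive square root
```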
It can be difficult to compare the value 1.0 with the value 790, but if we scale them both into comparable values, we can easily see how the values relate to each other. There are different methods for scaling data; here we will use a method called standardization.
Mathematically we can write as follows,
$$z\left(x_i\right)=\frac{x_i-\left(\frac{\sum_i^n x_i}{n}\right)}{S}$$
Where:
\(x_i\) = the value being scaled,
\(n\) = the number of values in the series,
\(S\) = the standard deviation of the series.
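A minimal sketch of standardization with made-up values on very different scales:

```python
import statistics

x = [1.0, 790.0, 2.5, 1180.0, 0.5]
mean = sum(x) / len(x)
s = statistics.stdev(x)           # sample standard deviation
z = [(v - mean) / s for v in x]   # z-scores: unit-free, directly comparable
print([round(zi, 2) for zi in z])
```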
Reference:
w3schools.
Machine Learning - Scale.
The Skewness is a measure of the degree to which a series is asymmetrical. Negatively skewed means a distribution that trails off to the left. Positively skewed means a distribution that trails off to the right.
Mathematically we can write as follows,
$$Skewness\left(x\right)=\frac{n\left(\sum_i^n\left(\left(x_i-\left(\frac{\sum_i^n x_i}{n}\right)\right)^3\right)\right)}{\left(n-1\right)\left(n-2\right)S^3}$$
Where:
\(x_i\) = each value in the series,
\(n\) = the number of values in the series,
\(S\) = the standard deviation of the series.
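A direct translation of the formula into Python (the example series is made up):

```python
import statistics

def skewness(x):
    # Sample skewness, translated directly from the formula above.
    n = len(x)
    mean = sum(x) / n
    s = statistics.stdev(x)
    return n * sum((v - mean) ** 3 for v in x) / ((n - 1) * (n - 2) * s ** 3)

print(skewness([2, 8, 0, 4, 1, 9, 9, 0]))  # ≈ 0.33, a slight trail to the right
```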
The Covariance is basically a number that reflects the degree to which two variables vary together. If, for example, high scores on one variable tend to be paired with high scores on the other, the covariance will be large and positive. When high scores on one variable are paired about equally often with both high and low scores on the other, the covariance will be near zero, and when high scores on one variable are generally paired with low scores on the other, the covariance is negative.
It is possible to show that the covariance will be at its positive maximum whenever X and Y are perfectly positively correlated \(\left(r=+1.00\right)\) and at its negative maximum whenever they are perfectly negatively correlated \(\left(r=-1.00\right)\). When there is no relationship \(\left(r=0\right)\), the covariance will be zero.
Mathematically we can write as follows,
$$cov\left(x, y\right)={\sum_i^n\left(x_i-\left({\sum_i^nx_i\over n}\right)\right)\left(y_i-\left({\sum_i^ny_i\over n}\right)\right)\over n-1}$$
Where:
\(x_i, y_i\) = each pair of values from the two series,
\(n\) = the number of pairs (both series must have the same length).
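A direct translation into Python; the two made-up series rise together, so the covariance comes out positive:

```python
def covariance(x, y):
    # Sample covariance of two equal-length series (n - 1 denominator).
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y)
               for xi, yi in zip(x, y)) / (n - 1)

print(covariance([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 5.0, large and positive
```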
When we are dealing with the relationship between two variables, we are concerned with correlation, and our measure of the degree or strength of this relationship is the correlation coefficient. We can use a number of different correlation coefficients, depending primarily on the underlying nature of the measurements, but the most common is the Pearson product-moment correlation coefficient \(\left(r\right)\).
It is important to note that the sign of the correlation coefficient has no meaning other than to denote the direction of the relationship. A negative relationship is a relationship in which increases in one variable are associated with decreases in the other. A positive relationship is a relationship in which increases in one variable are associated with increases in the other. The correlation coefficient is simply a point on the scale between −1.00 and +1.00, and the closer it is to either of those limits, the stronger the relationship between the two variables.
Mathematically we can write as follows,
$$r\left(x, y\right)={n\left(\sum_i^n x_iy_i\right)-\left(\sum_i^nx_i\right)\left(\sum_i^ny_i\right)\over\sqrt{\left(n\left(\sum_i^nx_i^2\right)-\left(\sum_i^nx_i\right)^2\right)\left(n\left(\sum_i^ny_i^2\right)-\left(\sum_i^ny_i\right)^2\right)}}$$
Where:
\(x_i, y_i\) = each pair of values from the two series,
\(n\) = the number of pairs.
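A sketch of the same formula in Python; the made-up series are perfectly positively correlated, so \(r\) comes out at +1.0:

```python
import math

def pearson_r(x, y):
    # Pearson product-moment correlation, following the formula above.
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))  # 1.0, perfectly positive
```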
The squared correlation coefficient, or coefficient of determination \(\left(R^2\right)\), is a very important statistic for describing the strength of the relationship between two variables.
Unlike the correlation coefficient, the coefficient of determination carries no sign: it is a point on the scale between 0 and 1 and can be read as the proportion of the variation in one variable that is accounted for by the other. The closer it is to 1, the stronger the relationship between the two variables.
Mathematically we can write as follows,
$$R^2\left(x, y\right) = r^2\left(x, y\right)$$
Where:
\(r\left(x, y\right)\) = the Pearson correlation coefficient between the two series.
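Squaring discards the sign, which is exactly why \(R^2\) has no direction:

```python
r = -0.8       # a correlation coefficient; the sign only shows direction
print(r ** 2)  # 0.64: 64% of the variation is shared, whatever the direction
```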
The \(Y=aX+b\) equation is the regression equation, or the equation that predicts Y from X, and the values of the slope (a) and the intercept (b) are called the regression coefficients, although the term often refers only to the slope. The interpretation of this equation is straightforward.
Mathematically we can write as follows,
$$Y=aX+b$$
$$a={\sum_i^n\left(x_i-\left({\sum_i^nx_i\over n}\right)\right)\left(y_i-\left({\sum_i^ny_i\over n}\right)\right)\over \sum_i^n\left(x_i-\left({\sum_i^nx_i\over n}\right)\right)^2}$$
$$b={\sum_i^ny_i-\left(a\right)\left(\sum_i^nx_i\right)\over n}$$
Where:
\(x_i, y_i\) = each pair of values from the two series,
\(n\) = the number of pairs,
\(a\) = the slope,
\(b\) = the intercept.
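A minimal sketch of both coefficient formulas (the example data is made up):

```python
def linear_regression(x, y):
    # Least-squares fit of y = a*x + b; a is the slope, b the intercept.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    a = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
    b = mean_y - a * mean_x  # same as (sum(y) - a * sum(x)) / n
    return a, b

a, b = linear_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)  # slope 0.6, intercept 2.2
```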