Statistics is a branch of mathematics that deals with the collection, analysis, interpretation, presentation, and organization of data. In applying statistics to, for example, a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model process to be studied. The population can be a diverse group of topics such as "all people living in a country" or "every atom composing a crystal". Statistics deals with all aspects of data, including the planning of data collection in terms of the design of surveys and experiments. See the glossary of probability and statistics.
When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine whether the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and from each other. Inferences in mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
A standard statistical procedure involves testing the relationship between two statistical data sets, or between a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between the two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (the null hypothesis is falsely rejected, giving a "false positive") and Type II errors (the null hypothesis fails to be rejected and an actual difference between populations is missed, giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis.
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of error (e.g., blunders, such as when an analyst reports incorrect units) can also be important. The presence of missing data or censoring may result in biased estimates, and specific techniques have been developed to address these problems.
Statistics can be said to have begun in ancient civilization, going back at least to the 5th century BC, but it was not until the 18th century that it started to draw more heavily from calculus and probability theory. In more recent years statistics has relied more on statistical software to produce output such as descriptive analyses.
Scope
Some definitions are:
- The Merriam-Webster dictionary defines statistics as "a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data."
- The statistician Sir Arthur Lyon Bowley defines statistics as "Numerical statements of facts in any department of inquiry placed in relation to each other."
Statistics is a mathematical body of science that pertains to the collection, analysis, interpretation or explanation, and presentation of data, or as a branch of mathematics. Some consider statistics to be a distinct mathematical science rather than a branch of mathematics. While many scientific investigations make use of data, statistics is concerned with the use of data in the context of uncertainty and decision making in the face of uncertainty.
Mathematical statistics
Mathematical statistics is the application of mathematics to statistics. Mathematical techniques used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure-theoretic probability theory.
Overview
In applying statistics to a problem, it is common practice to start with a population or process to be studied. Populations can be diverse topics such as "all people living in a country" or "every atom composing a crystal".
Ideally, statisticians compile data about the entire population (an operation called a census). This may be organized by governmental statistical institutes. Descriptive statistics can be used to summarize the population data. Numerical descriptors include the mean and standard deviation for continuous data types (such as income), while frequency and percentage are more useful for describing categorical data (such as race).
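For concreteness, here is a minimal sketch (not drawn from any real census) of these descriptors in Python, with hypothetical income and category values:

```python
# Descriptive statistics for hypothetical continuous and categorical data,
# using only the Python standard library.
import statistics
from collections import Counter

incomes = [32_000, 45_500, 51_200, 28_750, 39_900, 61_000]   # continuous data
groups = ["A", "B", "A", "C", "B", "A"]                       # categorical data

# Continuous data: mean and standard deviation.
mean_income = statistics.mean(incomes)
sd_income = statistics.stdev(incomes)          # sample standard deviation
print(f"mean income = {mean_income:.2f}, sd = {sd_income:.2f}")

# Categorical data: frequencies and percentages.
counts = Counter(groups)
for category, count in counts.items():
    print(f"{category}: n = {count}, {100 * count / len(groups):.1f}%")
```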
When a census is not feasible, a chosen subset of the population, called a sample, is studied. Once a sample that is representative of the population is determined, data are collected for the sample members in an observational or experimental setting. Again, descriptive statistics can be used to summarize the sample data. However, the drawing of the sample has been subject to an element of randomness, so the numerical descriptors from the sample are also subject to uncertainty. To still draw meaningful conclusions about the entire population, inferential statistics is needed. It uses patterns in the sample data to draw inferences about the population represented, accounting for randomness. These inferences may take the form of answering yes/no questions about the data (hypothesis testing), estimating numerical characteristics of the data (estimation), describing associations within the data (correlation) and modeling relationships within the data (for example, using regression analysis). Inference can extend to forecasting, prediction and estimation of unobserved values either in or associated with the population being studied; it can include extrapolation and interpolation of time series or spatial data, and can also include data mining.
Data collection
Sampling
When full census data cannot be collected, statisticians collect sample data by developing specific experiment designs and survey samples. Statistics itself also provides tools for prediction and forecasting through statistical models. The idea of making inferences based on sampled data began around the mid-1600s in connection with estimating populations and developing precursors of life insurance.
To use a sample as a guide to an entire population, it is important that it truly represents the overall population. Representative sampling assures that inferences and conclusions can safely extend from the sample to the population as a whole. A major problem lies in determining the extent to which the sample chosen is actually representative. Statistics offers methods to estimate and correct for any bias within the sample and data collection procedures. There are also methods of experimental design that can lessen these issues at the outset of a study, strengthening its capability to discern truths about the population.
Sampling theory is part of the mathematical discipline of probability theory. Probability is used in mathematical statistics to study the sampling distributions of sample statistics and, more generally, the properties of statistical procedures. The use of any statistical method is valid only when the system or population under consideration satisfies the assumptions of the method. The difference in point of view between classic probability theory and sampling theory is, roughly, that probability theory starts from the given parameters of a total population to deduce probabilities that pertain to samples, whereas statistical inference moves in the opposite direction, inductively inferring from samples to the parameters of a larger or total population.
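A small simulation can make the two directions concrete; the sketch below assumes a hypothetical normal population with known parameters purely for illustration:

```python
# Deduction (probability theory): fix population parameters and study how the
# sample mean behaves. Induction (inference): use one observed sample to
# estimate the unknown parameter. The population here is assumed, not real.
import random
import statistics

random.seed(0)
population_mean, population_sd = 50.0, 10.0   # assumed "true" parameters

# Sampling distribution of the sample mean for samples of size 25.
sample_means = [
    statistics.mean(random.gauss(population_mean, population_sd) for _ in range(25))
    for _ in range(10_000)
]
print("spread of sample means:", statistics.stdev(sample_means))  # ~ 10 / sqrt(25) = 2

# Inductive direction: a single observed sample estimates the population mean.
one_sample = [random.gauss(population_mean, population_sd) for _ in range(25)]
print("estimate of the population mean:", statistics.mean(one_sample))
```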
Experimental and observational studies
A common goal for a statistical research project is to investigate causality, and in particular to draw a conclusion about the effect of changes in the values of predictors or independent variables on dependent variables. There are two major types of causal statistical studies: experimental studies and observational studies. In both types of studies, the effect of differences of an independent variable (or variables) on the behavior of the dependent variable is observed. The difference between the two types lies in how the study is actually conducted. Each can be very effective. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine whether the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation. Instead, data are gathered and correlations between predictors and response are investigated. While the tools of data analysis work best on data from randomized studies, they are also applied to other kinds of data - such as natural experiments and observational studies - for which a statistician would use modified, more structured estimation methods (e.g., difference-in-differences estimation and instrumental variables, among many others) that produce consistent estimators.
Experiments
The basic steps of a statistical experiment are:
- Planning the research, including finding the number of replicates of the study, using the following information: preliminary estimates regarding the size of treatment effects, alternative hypotheses, and the estimated experimental variability. Consideration of the selection of experimental subjects and the ethics of research is necessary. Statisticians recommend that experiments compare (at least) one new treatment with a standard treatment or control, to allow an unbiased estimate of the difference in treatment effects.
- Design of experiments, using blocking to reduce the influence of confounding variables, and randomized assignment of treatments to subjects to allow unbiased estimates of treatment effects and experimental error. At this stage, the experimenters and statisticians write the experimental protocol that will guide the performance of the experiment and that specifies the primary analysis of the experimental data.
- Performing the experiment following the experimental protocol and analyzing the data following that protocol.
- Further examining the data set in secondary analyses, to suggest new hypotheses for future study.
- Documenting and presenting the results of the study.
Experiments on human behavior have special concerns. The famous Hawthorne study examined changes to the working environment at the Hawthorne plant of the Western Electric Company. The researchers were interested in determining whether increased illumination would increase the productivity of the assembly line workers. The researchers first measured productivity in the plant, then modified the illumination in an area of the plant and checked whether the changes in illumination affected productivity. It turned out that productivity indeed improved (under the experimental conditions). However, the study is heavily criticized today for errors in experimental procedures, specifically for the lack of a control group and blindness. The Hawthorne effect refers to finding that an outcome (in this case, worker productivity) changed because of the observation itself. Those in the Hawthorne study became more productive not because the lighting was changed but because they were being observed.
Observational study
An example of an observational study is one that explores the association between smoking and lung cancer. This type of study typically uses a survey to collect observations about the area of interest and then performs statistical analysis. In this case, the researchers would collect observations of both smokers and non-smokers, perhaps through a cohort study, and then look for the number of cases of lung cancer in each group. A case-control study is another type of observational study in which people with and without the outcome of interest (e.g., lung cancer) are invited to participate and their exposure histories are collected.
Data types
Various attempts have been made to produce a taxonomy of levels of measurement. The psychophysicist Stanley Smith Stevens defined nominal, ordinal, interval, and ratio scales. Nominal measurements do not have a meaningful rank order among values, and permit any one-to-one transformation. Ordinal measurements have imprecise differences between consecutive values, but have a meaningful order to those values, and permit any order-preserving transformation. Interval measurements have meaningful distances between measurements defined, but the zero value is arbitrary (as in the case of longitude and temperature measurements in Celsius or Fahrenheit), and permit any linear transformation. Ratio measurements have both a meaningful zero value and the distances between different measurements defined, and permit any rescaling transformation.
Because variables conforming only to nominal or ordinal measurements cannot be reasonably measured numerically, they are sometimes grouped together as categorical variables, whereas ratio and interval measurements are grouped together as quantitative variables, which can be either discrete or continuous, due to their numerical nature. Such distinctions can often be loosely correlated with data types in computer science, in that dichotomous categorical variables may be represented with the Boolean data type, polytomous categorical variables with arbitrarily assigned integers in the integral data type, and continuous variables with the real data type involving floating point computation. But the mapping of computer science data types to statistical data types depends on which categorization of the latter is being implemented.
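As a rough illustration of this loose correspondence (the variable names and codings below are invented for the example), one might represent the four scales in Python as follows:

```python
# Invented variables illustrating how Stevens' scales are often coded.
smoker: bool = True            # dichotomous nominal  -> Boolean
blood_type: str = "AB"         # polytomous nominal   -> label (or enum)
pain_level: int = 2            # ordinal (0=none .. 3=severe) -> ordered integer codes
temperature_c: float = 36.6    # interval (arbitrary zero)    -> floating point
income: float = 42_000.0       # ratio (meaningful zero)      -> floating point

# For ordinal codes only the order is meaningful, not the differences:
assert pain_level > 0          # a sensible comparison
# (pain_level - 1) has no guaranteed quantitative meaning, unlike income - 1.
```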
Other categorizations have been proposed. For example, Mosteller and Tukey (1977) distinguished grades, ranks, counted fractions, counts, amounts, and balances. Nelder (1990) described continuous counts, continuous ratios, count ratios, and categorical modes of data. See also Chrisman (1998) and van den Berg (1991).
The issue of whether or not it is appropriate to apply different kinds of statistical methods to data obtained from different kinds of measurement procedures is complicated by issues concerning the transformation of variables and the precise interpretation of research questions. "The relationship between the data and what they describe merely reflects the fact that certain kinds of statistical statements may have truth values which are not invariant under some transformations. Whether or not a transformation is sensible to contemplate depends on the question one is trying to answer." (Hand, 2004, p. 82).
Terminology and theory of inferential statistics
Statistics, estimators and pivotal quantities
Consider independent identically distributed (IID) random variables with a given probability distribution: standard statistical inference and estimation theory defines a random sample as the random vector given by the column vector of these IID variables. The population being examined is described by a probability distribution that may have unknown parameters.
A statistic is a random variable that is a function of the random sample, but not a function of unknown parameters. The probability distribution of the statistic, though, may have unknown parameters.
Consider now a function of the unknown parameter: an estimator is a statistic used to estimate that function. Commonly used estimators include the sample mean, the unbiased sample variance and the sample covariance.
A random variable that is a function of the random sample and of the unknown parameter, but whose probability distribution does not depend on the unknown parameter, is called a pivotal quantity or pivot. Widely used pivots include the z-score, the chi-square statistic and Student's t-value.
Between two estimators of a given parameter, the one with the lower mean squared error is said to be more efficient. Furthermore, an estimator is said to be unbiased if its expected value is equal to the true value of the unknown parameter being estimated, and asymptotically unbiased if its expected value converges in the limit to the true value of the parameter.
Other desirable properties for estimators include: UMVUE estimators, which have the lowest variance for all possible values of the parameter to be estimated (this is usually an easier property to verify than efficiency), and consistent estimators, which converge in probability to the true value of the parameter.
This still leaves the question of how to obtain estimators in a given situation and carry out the computation; several methods have been proposed: the method of moments, the maximum likelihood method, the least squares method and the more recent method of estimating equations.
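As an illustration of bias and the maximum likelihood method, the following simulation (with an assumed normal population, not taken from the text) contrasts the maximum-likelihood variance estimator with the unbiased sample variance:

```python
# Compare two estimators of a normal population's variance by simulation:
# the maximum-likelihood estimator (divide by n, biased downward) and the
# usual unbiased sample variance (divide by n - 1).
import random

random.seed(1)
true_variance = 4.0              # assumed population: Normal(mean=0, sd=2)
n, trials = 10, 20_000

mle_total = unbiased_total = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 2.0) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    mle_total += ss / n              # maximum-likelihood estimate
    unbiased_total += ss / (n - 1)   # unbiased sample variance

print("true variance:        ", true_variance)
print("avg MLE estimate:     ", round(mle_total / trials, 3))       # below 4 (biased)
print("avg unbiased estimate:", round(unbiased_total / trials, 3))  # close to 4
```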
The null hypothesis and alternative hypothesis
Interpretation of statistical information can often involve the development of a null hypothesis, which is usually (but not necessarily) that no relationship exists among variables or that no change occurred over time.
The best illustration for a novice is the predicament encountered by a criminal trial. The null hypothesis, H0, asserts that the defendant is innocent, whereas the alternative hypothesis, H1, asserts that the defendant is guilty. The indictment comes because of suspicion of guilt. H0 (the status quo) stands in opposition to H1 and is maintained unless H1 is supported by evidence "beyond a reasonable doubt". However, "failure to reject H0" in this case does not imply innocence, but merely that the evidence was insufficient to convict. So the jury does not necessarily accept H0 but fails to reject H0. While one cannot "prove" a null hypothesis, one can test how close it is to being true with a power test, which tests for type II errors.
What statisticians call an alternative hypothesis is simply a hypothesis that contradicts the null hypothesis.
Error
Working from the null hypothesis, two basic forms of error are recognized:
- Type I errors, where the null hypothesis is falsely rejected, giving a "false positive".
- Type II errors, where the null hypothesis fails to be rejected and an actual difference between populations is missed, giving a "false negative".
Standard deviation refers to the extent to which individual observations in a sample differ from a central value, such as the sample or population mean, while standard error refers to an estimate of the difference between the sample mean and the population mean.
A statistical error is the amount by which an observation differs from its expected value; a residual is the amount an observation differs from the value the estimator of the expected value assumes on a given sample (also called a prediction).
Mean squared error is used for obtaining efficient estimators, a widely used class of estimators. Root mean square error is simply the square root of the mean squared error.
Many statistical methods seek to minimize the residual sum of squares, and these are called "methods of least squares", in contrast to least absolute deviations. The latter gives equal weight to small and large errors, while the former gives more weight to large errors. The residual sum of squares is also differentiable, which provides a handy property for doing regression. Least squares applied to linear regression is called the ordinary least squares method and least squares applied to nonlinear regression is called non-linear least squares. Also, in a linear regression model the non-deterministic part of the model is called the error term, disturbance or, more simply, noise. Both linear regression and non-linear regression are addressed in polynomial least squares, which also describes the variance in a prediction of the dependent variable (y axis) as a function of the independent variable (x axis) and the deviations (errors, noise, disturbances) from the estimated (fitted) curve.
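A minimal sketch of ordinary least squares on invented data, showing the fitted line and the residuals whose squared sum is minimized:

```python
# Fit a straight line by ordinary least squares and inspect the residuals.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])        # hypothetical observations

slope, intercept = np.polyfit(x, y, deg=1)     # fits y ~ slope * x + intercept
predictions = slope * x + intercept
residuals = y - predictions                    # observation minus fitted value

print(f"fitted line: y = {slope:.3f} x + {intercept:.3f}")
print("residual sum of squares:", np.sum(residuals ** 2))
```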
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of error (e.g., blunders, such as when an analyst reports incorrect units) can also be important. The presence of missing data or censoring may result in biased estimates, and specific techniques have been developed to address these problems.
Interval estimation
Most studies sample only part of a population, so the results do not fully represent the whole population. Any estimates obtained from the sample only approximate the population value. Confidence intervals allow statisticians to express how closely the sample estimate matches the true value in the whole population. Often they are expressed as 95% confidence intervals. Formally, a 95% confidence interval for a value is a range where, if the sampling and analysis were repeated under the same conditions (yielding a different dataset), the interval would include the true (population) value in 95% of all possible cases. This does not imply that the probability that the true value lies in the confidence interval is 95%. From the frequentist perspective, such a claim does not even make sense, since the true value is not a random variable. Either the true value is or is not within the given interval. However, it is true that, before any data are sampled and given a plan for how to construct the confidence interval, the probability is 95% that the yet-to-be-calculated interval will cover the true value: at this point, the limits of the interval are yet-to-be-observed random variables. One approach that does yield an interval that can be interpreted as having a given probability of containing the true value is to use a credible interval from Bayesian statistics: this approach depends on a different way of interpreting what is meant by "probability", that is, as a Bayesian probability.
In principle, confidence intervals can be symmetrical or asymmetrical. An interval can be asymmetrical because it works as a lower or upper bound for a parameter (left-sided or right-sided interval), but it can also be asymmetrical because the two-sided interval is built violating symmetry around the estimate. Sometimes the bounds for a confidence interval are reached asymptotically, and these are used to approximate the true bounds.
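For illustration, the following sketch computes a two-sided 95% confidence interval for a mean from a hypothetical sample, assuming SciPy is available:

```python
# 95% confidence interval for a population mean, based on the t distribution.
import numpy as np
from scipy import stats

sample = np.array([5.1, 4.9, 5.6, 5.2, 4.7, 5.4, 5.0, 5.3])   # hypothetical data

mean = sample.mean()
sem = stats.sem(sample)                        # standard error of the mean
low, high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"sample mean = {mean:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```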
Significance
Statistics rarely give a simple yes/no answer to the question under analysis. Interpretation often comes down to the level of statistical significance applied to the numbers and often refers to the probability of a value accurately rejecting the null hypothesis (sometimes referred to as the p-value).
The standard approach is to test a null hypothesis against an alternative hypothesis. A critical region is the set of values of the estimator that leads to rejecting the null hypothesis. The probability of type I error is therefore the probability that the estimator belongs to the critical region given that the null hypothesis is true (statistical significance), and the probability of type II error is the probability that the estimator does not belong to the critical region given that the alternative hypothesis is true. The statistical power of a test is the probability that it correctly rejects the null hypothesis when the null hypothesis is false.
Referring to statistical significance does not necessarily mean that the overall result is significant in real-world terms. For example, a large study of a drug may show that the drug has a statistically significant but very small beneficial effect, such that the drug is unlikely to help patients noticeably.
Although in principle the acceptable level of statistical significance may be subject to debate, the p-value is the smallest significance level that allows the test to reject the null hypothesis. This is logically equivalent to saying that the p-value is the probability, assuming the null hypothesis is true, of observing a result at least as extreme as the test statistic. Therefore, the smaller the p-value, the lower the probability of committing a type I error.
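As a worked illustration (on invented data, assuming SciPy is available), a two-sample t-test returns exactly such a p-value:

```python
# Two-sample t-test: the p-value is the probability, under the null hypothesis
# of equal means, of a test statistic at least as extreme as the one observed.
import numpy as np
from scipy import stats

group_a = np.array([12.1, 11.8, 12.4, 12.0, 11.9, 12.3])
group_b = np.array([12.6, 12.9, 12.5, 13.0, 12.7, 12.8])

t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:                 # conventional significance level
    print("reject the null hypothesis of equal means")
```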
Some problems are usually associated with this framework (see criticism of hypothesis testing):
- A difference that is highly statistically significant can still be of no practical significance, but it is possible to properly formulate tests to account for this. One response involves going beyond reporting only the significance level to include the p-value when reporting whether a hypothesis is rejected or accepted. The p-value, however, does not indicate the size or importance of the observed effect and can also seem to exaggerate the importance of minor differences in large studies. A better and increasingly common approach is to report confidence intervals. Although these are produced from the same calculations as those of hypothesis tests or p-values, they describe both the size of the effect and the uncertainty surrounding it.
- Fallacy of the transposed conditional, also known as the prosecutor's fallacy: criticisms arise because the hypothesis testing approach forces one hypothesis (the null hypothesis) to be favored, since what is being evaluated is the probability of the observed result given the null hypothesis and not the probability of the null hypothesis given the observed result. An alternative to this approach is offered by Bayesian inference, although it requires establishing a prior probability.
- Rejecting the null hypothesis does not automatically prove an alternative hypothesis.
- As everything in inferential statistics relies on the sample size, under fat tails p-values may be seriously miscomputed.
Examples
Some of the well-known statistical tests and procedures are:
Misuse
Misuse of statistics can produce subtle but serious errors in description and interpretation - subtle in the sense that even experienced professionals make such errors, and serious in the sense that they can lead to devastating decision errors. For instance, social policy, medical practice, and the reliability of structures like bridges all rely on the proper use of statistics.
Even when statistical techniques are correctly applied, the results can be difficult to interpret for those lacking expertise. The statistical significance of a trend in the data - which measures the extent to which a trend could be caused by random variation in the sample - may or may not agree with an intuitive sense of its significance. The set of basic statistical skills (and skepticism) that people need to deal with information in their everyday lives properly is referred to as statistical literacy.
There is a general perception that statistical knowledge is all too frequently misused by finding ways to interpret only the data that are favorable to the presenter. Mistrust and misunderstanding of statistics are associated with the quotation, "There are three kinds of lies: lies, damned lies, and statistics". Misuse of statistics can be both inadvertent and intentional, and the book How to Lie with Statistics outlines a range of considerations. In an attempt to shed light on the use and misuse of statistics, reviews of the statistical techniques used in particular fields are conducted (e.g., Warne, Lazo, Ramos, and Ritter (2012)).
Ways to avoid misuse of statistics include using proper diagrams and avoiding bias. Misuse can occur when conclusions are overgeneralized and claimed to be representative of more than they really are, often by either deliberately or unconsciously overlooking sampling bias. Bar graphs are arguably the easiest diagrams to use and understand, and they can be made either by hand or with simple computer programs. Unfortunately, most people do not look for bias or errors, so these go unnoticed. Thus, people may often believe that something is true even if it is not well represented. To make data gathered from statistics believable and accurate, the sample taken must be representative of the whole. According to Huff, "The dependability of a sample can be destroyed by [bias]... allow yourself some degree of skepticism."
To help understand statistics, Huff posed a series of questions to ask in each case:
- Who says so? (Does he have an axe to grind?)
- How does he know? (Does he have the resources to know the facts?)
- What's missing? (Did he give us the full picture?)
- Did someone change the subject? (Does he offer us the right answer for the wrong problem?)
- Does that make sense? (Is the conclusion logical and consistent with what we already know?)
Misinterpretation: correlation
The concept of correlation is particularly noteworthy for the potential confusion it can cause. Statistical analysis of a data set often reveals that two variables (properties) of the population under consideration tend to vary together, as if they were connected. For example, a study of annual income that also looks at age of death might find that poor people tend to have shorter lives than affluent people. The two variables are said to be correlated; however, they may or may not be the cause of one another. The correlation could be caused by a third, previously unconsidered phenomenon, called a lurking variable or confounding variable. For this reason, there is no way to immediately infer the existence of a causal relationship between the two variables. (See Correlation does not imply causation.)
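A toy simulation can illustrate a lurking variable: in the sketch below, the invented variables x and y are both driven by a third variable z, so they are strongly correlated without either causing the other:

```python
# Two variables correlated only through an unobserved common cause z.
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=1_000)                       # unobserved lurking variable
x = 2.0 * z + rng.normal(scale=0.5, size=1_000)
y = -1.5 * z + rng.normal(scale=0.5, size=1_000)

print("corr(x, y) =", round(np.corrcoef(x, y)[0, 1], 3))   # strongly negative
# Neither x nor y influences the other; the association comes entirely from z.
```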
History of statistics
Some scholars pinpoint the origin of statistics to 1663, with the publication of Natural and Political Observations upon the Bills of Mortality by John Graunt. Early applications of statistical thinking revolved around the needs of states to base policy on demographic and economic data, hence its stat- etymology. The scope of the discipline of statistics broadened in the early 19th century to include the collection and analysis of data in general. Today, statistics is widely employed in government, business, and the natural and social sciences.
Its mathematical foundations were laid in the 17th century with the development of probability theory by Gerolamo Cardano, Blaise Pascal and Pierre de Fermat. Mathematical probability theory arose from the study of games of chance, although the concept of probability was already examined in medieval law and by philosophers such as Juan Caramuel. The method of least squares was first described by Adrien-Marie Legendre in 1805.
The modern field of statistics emerged in the late 19th and early 20th century in three stages. The first wave, at the turn of the century, was led by the work of Francis Galton and Karl Pearson, who transformed statistics into a rigorous mathematical discipline used for analysis, not just in science but in industry and politics as well. Galton's contributions included introducing the concepts of standard deviation, correlation and regression analysis, and the application of these methods to the study of a variety of human characteristics - height, weight, eyelash length among others. Pearson developed the Pearson product-moment correlation coefficient, defined as a product-moment, the method of moments for the fitting of distributions to samples and the Pearson distribution, among many other things. Galton and Pearson founded Biometrika as the first journal of mathematical statistics and biostatistics (then called biometry), and the latter founded the world's first university statistics department at University College London.
Ronald Fisher coined the term "null hypothesis" during the Lady tasting tea experiment, which "is never proved or established, but is possibly disproved, in the course of experimentation".
The second wave of the 1910s and 20s was initiated by William Sealy Gosset, and reached its culmination in the insights of Ronald Fisher, who wrote the textbooks that were to define the academic discipline in universities around the world. Fisher's most important publications were his 1918 paper The Correlation between Relatives on the Supposition of Mendelian Inheritance, which was the first to use the statistical term variance, his classic 1925 work Statistical Methods for Research Workers and his 1935 The Design of Experiments, where he developed rigorous design of experiments models. He originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information. In his 1930 book The Genetical Theory of Natural Selection he applied statistics to various biological concepts, such as Fisher's principle about the sex ratio (which A. W. F. Edwards has called "probably the most celebrated argument in evolutionary biology") and Fisherian runaway, a concept in sexual selection about a positive feedback runaway effect found in evolution.
The final wave, which mainly saw the refinement and extension of earlier developments, emerged from the collaborative work between Egon Pearson and Jerzy Neyman in the 1930s. They introduced the concepts of "Type II" error, power of a test and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling was in general a better method of estimation than purposive (quota) sampling.
Today, statistical methods are applied in all fields that involve decision making, for making accurate inferences from a collated body of data and for making decisions in the face of uncertainty based on statistical methodology. The use of modern computers has expedited large-scale statistical computations, and has also made possible new methods that are impractical to perform manually. Statistics continues to be an area of active research, for example on the problem of how to analyze Big data.
Applications
Applied statistics, theoretical statistics and mathematical statistics
Applied statistics comprises descriptive statistics and the application of inferential statistics. Theoretical statistics concerns the logical arguments underlying the justification of approaches to statistical inference, and encompasses mathematical statistics. Mathematical statistics includes not only the manipulation of probability distributions necessary for deriving results related to methods of estimation and inference, but also various aspects of computational statistics and the design of experiments.
Machine learning and data mining
There are two applications for machine learning and data mining: data management and data analysis. Statistical tools are required for data analysis.
Statistics in the community
Statistics is applicable to a wide variety of academic disciplines, including the natural and social sciences, government, and business. Statistical consultants can help organizations and companies that do not have in-house expertise relevant to their particular questions.
Statistical computing
The rapid and sustained increase in computing power starting from the second half of the 20th century has had a substantial impact on the practice of statistics. Early statistical models were almost always from the class of linear models, but powerful computers, coupled with suitable numerical algorithms, caused an increased interest in nonlinear models (such as neural networks) as well as the creation of new types, such as generalized linear models and multilevel models.
Increased computing power has also led to the growing popularity of computationally intensive methods based on resampling, such as permutation tests and the bootstrap, while techniques such as Gibbs sampling have made Bayesian models more feasible to use. The computer revolution has implications for the future of statistics, with a new emphasis on "experimental" and "empirical" statistics. A large number of both general and special-purpose statistical software packages are now available. Examples of software capable of complex statistical computation include programs such as Mathematica, SAS, SPSS, and R.
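As an illustration of a resampling method, the sketch below bootstraps a 95% percentile interval for a median from hypothetical data:

```python
# Bootstrap: resample the observed data with replacement to approximate the
# sampling variability of the median, a statistic with no simple closed-form
# standard error. The data values are hypothetical.
import numpy as np

rng = np.random.default_rng(42)
data = np.array([3.2, 4.1, 4.8, 5.0, 5.5, 6.1, 6.4, 7.0, 7.7, 9.3])

boot_medians = np.array([
    np.median(rng.choice(data, size=data.size, replace=True))
    for _ in range(5_000)
])
low, high = np.percentile(boot_medians, [2.5, 97.5])
print(f"bootstrap 95% interval for the median: ({low:.2f}, {high:.2f})")
```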
Statistics applied to mathematics or the arts
Traditionally, statistics was concerned with drawing inferences using a semi-standardized methodology that was "required learning" in most sciences. This has changed with the use of statistics in non-inferential contexts. What was once considered a dry subject, taken in many fields as a degree requirement, is now viewed enthusiastically. Initially derided by some mathematical purists, it is now considered essential methodology in certain areas.
- In number theory, scatter plots of data generated by a distribution function may be transformed with familiar tools used in statistics to reveal underlying patterns, which may then lead to hypotheses.
- Methods of statistics, including predictive methods in forecasting, are combined with chaos theory and fractal geometry to create video works that are considered to have great beauty.
- The process art of Jackson Pollock relied on artistic experiments whereby underlying distributions in nature were artistically revealed. With the advent of computers, statistical methods were applied to formalize such distribution-driven natural processes to make and analyze moving video art.
- Methods of statistics may be used predicatively in the performing arts, as in a card trick based on a Markov process that only works some of the time, the occasion of which can be predicted using statistical methodology.
- Statistics can be used to predicatively create art, as in the statistical or stochastic music invented by Iannis Xenakis, where the music is performance-specific. Though this type of artistry does not always come out as expected, it does behave in ways that are predictable and can be tuned using statistics.
Specialized disciplines
Statistical techniques are used in a wide range of scientific and social research, including biostatistics, computational biology, computational sociology, network biology, social science, sociology and social research. Some fields of inquiry use applied statistics so extensively that they have specialized terminology. These disciplines include:
In addition, there are particular types of statistical analysis that have also developed their own specialized terminology and methodology:
Statistics is a key basis tool in business and manufacturing as well. It is used to understand measurement system variability, to control processes (as in statistical process control or SPC), to summarize data, and to make data-driven decisions. In these roles, it is a key tool, and perhaps the only reliable tool.
See also
- Foundations and major areas of statistics
References
Further reading
- Barbara Illowsky; Susan Dean (2014). Introductory Statistics. OpenStax CNX. ISBN 9781938168208.
- David W. Stockburger, Introductory Statistics: Concepts, Models, and Applications, 3rd Web Ed. Missouri State University.
- Stephen Jones (2010). Statistics in Psychology: Explanations without Equations. Palgrave Macmillan. ISBN 9781137282392.
- Cohen, J. (1990). "Things I have learned (so far)". American Psychologist, 45, 1304-1312.
- Gigerenzer, G. (2004). "Mindless statistics". The Journal of Socio-Economics, 33, 587-606. doi:10.1016/j.socec.2004.09.033
- Ioannidis, J. P. A. (2005). "Why most published research findings are false". PLoS Medicine, 2, 696-701. doi:10.1371/journal.pmed.0040168
External links
- (Electronic Version): StatSoft, Inc. (2013). Electronic Statistics Textbook. Tulsa, OK: StatSoft.
- Online Statistics Education: An Interactive Multimedia Study Course . Developed by Rice University (Main Developers), University of Houston Clear Lake, Tufts University, and National Science Foundation.
- UCLA Statistics Computing Resources