For example, profit by definition can be broken down into total revenue and total cost. In turn, total revenue can be analyzed by its components, such as the revenue of divisions A, B, and C, which are mutually exclusive of each other and should add up to the total revenue (collectively exhaustive). Analysts may use robust statistical measurements to solve certain analytical problems. Hypothesis testing is used when the analyst makes a particular hypothesis about the true state of affairs and gathers data to determine whether that state of affairs is true or false.
For example, the hypothesis might be that "Unemployment has no effect on inflation", which relates to an economics concept called the Phillips Curve.
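Hypothesis testing involves considering the likelihood of Type I and Type II errors, which relate to whether the data supports accepting or rejecting the hypothesis. A Type I error is rejecting a hypothesis that is in fact true; a Type II error is failing to reject a hypothesis that is in fact false.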
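To make the two error types concrete, the following minimal sketch computes them for a one-sided z-test of a mean. It assumes a known standard deviation and normally distributed sample means, neither of which is specified in the text; the function name `type_ii_error_rate` and all numbers are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_rate(mu0, mu1, sigma, n, alpha=0.05):
    """Probability of failing to reject H0: mean == mu0 when the true mean is mu1.

    Assumes a one-sided z-test (H1: mean > mu0) with known sigma.
    alpha is the Type I error rate the analyst chooses to tolerate.
    """
    standard_error = sigma / sqrt(n)
    # Reject H0 when the sample mean exceeds this cutoff.
    cutoff = NormalDist(mu0, standard_error).inv_cdf(1 - alpha)
    # Type II error: the sample mean falls below the cutoff although mu1 is true.
    return NormalDist(mu1, standard_error).cdf(cutoff)


if __name__ == "__main__":
    for n in (25, 100):
        beta = type_ii_error_rate(mu0=0.0, mu1=0.5, sigma=1.0, n=n)
        print(f"n={n}: Type I rate = 0.05, Type II rate = {beta:.3f}, power = {1 - beta:.3f}")
```

Under these assumptions the Type I rate is fixed by the analyst, while the Type II rate shrinks as the sample grows.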
Regression analysis may be used when the analyst is trying to determine the extent to which independent variable X affects dependent variable Y. This is an attempt to model or fit an equation line or curve to the data, such that Y is a function of X. Necessary condition analysis (NCA) may be used when the analyst is trying to determine the extent to which independent variable X allows variable Y.
Whereas multiple regression analysis uses additive logic, where each X-variable can produce the outcome and the X's can compensate for each other (they are sufficient but not necessary), necessary condition analysis (NCA) uses necessity logic, where one or more X-variables allow the outcome to exist but may not produce it (they are necessary but not sufficient).
Each single necessary condition must be present and compensation is not possible. Users may have particular data points of interest within a data set, as opposed to general messaging outlined above.
Such low-level user analytic activities are presented in the following table. The taxonomy can also be organized by three poles of activities.
Barriers to effective analysis may exist among the analysts performing the data analysis or among the audience. Distinguishing fact from opinion, cognitive biases, and innumeracy are all challenges to sound data analysis.
Effective analysis requires obtaining relevant facts to answer questions, support a conclusion or formal opinion, or test hypotheses. Facts by definition are irrefutable, meaning that any person involved in the analysis should be able to agree upon them. When everyone can verify what a source such as the Congressional Budget Office (CBO) reported, the content of that report is a fact; whether persons agree or disagree with the CBO is their own opinion. As another example, the auditor of a public company must arrive at a formal opinion on whether financial statements of publicly traded corporations are "fairly stated, in all material respects."
When making the leap from facts to opinions, there is always the possibility that the opinion is erroneous.
There are a variety of cognitive biases that can adversely affect analysis. For example, confirmation bias is the tendency to search for or interpret information in a way that confirms one's preconceptions. In addition, individuals may discredit information that does not support their views. Analysts may be trained specifically to be aware of these biases and how to overcome them. In his book Psychology of Intelligence Analysis, retired CIA analyst Richards Heuer wrote that analysts should clearly delineate their assumptions and chains of inference and specify the degree and source of the uncertainty involved in the conclusions.
He emphasized procedures to help surface and debate alternative points of view.
Effective analysts are generally adept with a variety of numerical techniques. However, audiences may not have such literacy with numbers (numeracy); they are said to be innumerate.
Persons communicating the data may also be attempting to mislead or misinform, deliberately using bad numerical techniques. For example, whether a number is rising or falling may not be the key factor. More important may be the number relative to another number, such as the size of government revenue or spending relative to the size of the economy (GDP) or the amount of cost relative to revenue in corporate financial statements.
This numerical technique is referred to as normalization or common-sizing. There are many such techniques employed by analysts, such as adjusting for inflation.
Analysts apply a variety of techniques to address the various quantitative messages described in the section above.
Analysts may also analyze data under different assumptions or scenarios. For example, when analysts perform financial statement analysis, they will often recast the financial statements under different assumptions to help arrive at an estimate of future cash flow, which they then discount to present value based on some interest rate, to determine the valuation of the company or its stock.
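The discounting step in a scenario analysis can be sketched as follows. This is a minimal illustration, assuming cash flows arrive at the end of each period and a single constant discount rate; the function name `present_value` and the cash-flow figures are hypothetical.

```python
def present_value(cash_flows, rate):
    """Discount a list of future cash flows to today.

    cash_flows[0] is assumed to arrive one period from now, cash_flows[1]
    two periods from now, and so on. rate is the per-period discount rate.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


if __name__ == "__main__":
    projected = [100.0, 110.0, 121.0]  # hypothetical cash flows under one scenario
    for rate in (0.05, 0.10):
        print(f"rate={rate:.0%}: present value = {present_value(projected, rate):.2f}")
```

Running the same projection under several rates, or several projections under one rate, gives the kind of alternative scenarios the text describes.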
Similarly, the CBO analyzes the effects of various policy options on the government's revenue, outlays and deficits, creating alternative future scenarios for key measures.
A data analytics approach can be used in order to predict energy consumption in buildings.
Analytics is the "extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions."
In education, most educators have access to a data system for the purpose of analyzing student data.
The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analysis that is aimed at answering the original research question.
The initial data analysis phase is guided by four questions.
The quality of the data should be checked as early as possible. Data quality can be assessed in several ways, using different types of analysis. The choice of analyses to assess the data quality during the initial data analysis phase depends on the analyses that will be conducted in the main analysis phase.
The quality of the measurement instruments should only be checked during the initial data analysis phase when this is not the focus or research question of the study.
One should check whether the structure of the measurement instruments corresponds to the structure reported in the literature.
After assessing the quality of the data and of the measurements, one might decide to impute missing data, or to perform initial transformations of one or more variables, although this can also be done during the main analysis phase.
One should check the success of the randomization procedure, for instance by checking whether background and substantive variables are equally distributed within and across groups.
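One way to check whether a background variable is equally distributed across groups is to compare group means on a common scale. The sketch below uses the standardized mean difference, which is one possible balance measure and not one named in the text; the function name and the ages are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(group_a, group_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # Hypothetical ages of participants after random assignment.
    treatment_ages = [34, 41, 29, 50, 38, 45]
    control_ages = [36, 40, 31, 48, 37, 44]
    smd = standardized_mean_difference(treatment_ages, control_ages)
    print(f"standardized mean difference in age: {smd:.3f}")
```

A value near zero suggests the groups are similar on that variable; how close to zero counts as balanced is a judgment the analyst has to make and document.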
If the study did not need or use a randomization procedure, one should check the success of the non-random sampling, for instance by checking whether all subgroups of the population of interest are represented in the sample. Other possible data distortions should be checked as well.
In any report or article, the structure of the sample must be accurately described. It is especially important to determine exactly the structure of the sample, and specifically the size of the subgroups, when subgroup analyses will be performed during the main analysis phase.
The characteristics of the data sample can then be assessed.
During the final stage, the findings of the initial data analysis are documented, and necessary, preferable, and possible corrective actions are taken.
Also, the original plan for the main data analyses can and should be specified in more detail or rewritten. In order to do this, several decisions about the main data analyses can and should be made.
Several analyses can be used during the initial data analysis phase. It is important to take the measurement levels of the variables into account for the analyses, as special statistical techniques are available for each level.
Nonlinear analysis will be necessary when the data is recorded from a nonlinear system.
Nonlinear systems can exhibit complex dynamic effects, including bifurcations, chaos, harmonics, and subharmonics, that cannot be analyzed using simple linear methods. Nonlinear data analysis is closely related to nonlinear system identification.
In the main analysis phase, analyses aimed at answering the research question are performed, as well as any other relevant analysis needed to write the first draft of the research report.
In the main analysis phase, either an exploratory or a confirmatory approach can be adopted.
Usually the approach is decided before data is collected. In an exploratory analysis, no clear hypothesis is stated before analysing the data, and the data is searched for models that describe the data well. In a confirmatory analysis, clear hypotheses about the data are tested.
Exploratory data analysis should be interpreted carefully. When testing multiple models at once, there is a high chance of finding at least one of them to be significant, but this can be due to a Type I error.
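The size of this multiple-testing risk can be worked out directly. The sketch below assumes the tests are independent, which real model comparisons often are not, and shows how a Bonferroni-style per-test threshold (one common remedy) brings the overall rate back under the chosen level. Function names are hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_threshold(alpha, num_tests):
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    alpha = 0.05
    for m in (1, 5, 20):
        unadjusted = familywise_error_rate(alpha, m)
        adjusted = familywise_error_rate(bonferroni_threshold(alpha, m), m)
        print(f"{m:>2} tests: unadjusted = {unadjusted:.3f}, with Bonferroni = {adjusted:.3f}")
```

The unadjusted rate climbs quickly with the number of tests, while the adjusted rate stays at or below the nominal level.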
It is important to always adjust the significance level when testing multiple models with, for example, a Bonferroni correction. Also, one should not follow up an exploratory analysis with a confirmatory analysis in the same dataset. An exploratory analysis is used to find ideas for a theory, but not to test that theory as well. When a model is found through exploratory analysis of a dataset, following up that analysis with a confirmatory analysis in the same dataset could simply mean that the results of the confirmatory analysis are due to the same Type I error that resulted in the exploratory model in the first place.
The confirmatory analysis therefore will not be more informative than the original exploratory analysis.
It is important to obtain some indication about how generalizable the results are. Are the results reliable and reproducible? There are two main ways of checking this.
Many statistical methods have been used for statistical analyses. A very brief list of four of the more popular methods follows.
For example, an outlying data point may represent the input from your most critical supplier or your highest selling product.
The nature of a regression line, however, tempts you to ignore these outliers.
The trick is to determine the right size for a sample to be accurate. Using proportion and standard deviation methods, you can determine the sample size you need to make your data collection statistically significant.
When studying a new, untested variable in a population, your proportion equations might need to rely on certain assumptions. However, these assumptions might be completely inaccurate. This error is then passed along to your sample size determination and then onto the rest of your statistical data analysis.
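How an assumed proportion feeds into the required sample size can be sketched with the usual normal-approximation formula n = z² · p(1 − p) / e². This assumes simple random sampling from a large population, which the text does not state; the function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Smallest n so that the margin of error for a proportion is at most `margin`.

    p is the analyst's assumed proportion; if that guess is wrong,
    the resulting n is wrong too.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    for assumed_p in (0.1, 0.3, 0.5):
        n = sample_size_for_proportion(assumed_p, margin=0.05)
        print(f"assumed proportion {assumed_p}: required sample size = {n}")
```

Because p(1 − p) peaks at p = 0.5, assuming 0.5 gives the most conservative (largest) sample size when nothing is known about the variable.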
Hypothesis testing, of which the t-test is a common example, assesses whether a certain premise is actually true for your data set or population. Hypothesis tests are used in everything from science and research to business and economics.
To be rigorous, hypothesis tests need to watch out for common errors. For example, the placebo effect occurs when participants falsely expect a certain result and then perceive or actually attain that result. Another common error is the Hawthorne effect, or observer effect, which happens when participants skew results because they know they are being studied. However, avoiding the common pitfalls associated with each method is just as important.
Standard deviation. The standard deviation, often represented with the Greek letter sigma, is the measure of the spread of data around the mean.
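As a small worked example of this measure, the sketch below computes the mean and both the population and sample standard deviations using Python's standard `statistics` module. The helper name `describe_spread` and the data values are hypothetical.

```python
from statistics import mean, pstdev, stdev


def describe_spread(values):
    """Return (mean, population standard deviation, sample standard deviation)."""
    return mean(values), pstdev(values), stdev(values)


if __name__ == "__main__":
    observations = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
    avg, population_sd, sample_sd = describe_spread(observations)
    print(f"mean = {avg}")
    print(f"population standard deviation = {population_sd:.3f}")
    print(f"sample standard deviation = {sample_sd:.3f}")
```

The population form divides by n and suits a complete data set; the sample form divides by n − 1 and is the usual choice when the data is a sample from a larger population.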
Regression. Regression models the relationships between dependent and explanatory variables, which are usually charted on a scatterplot.
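A straight-line regression fit can be sketched with ordinary least squares, here written out by hand so that no external library is assumed. The function name `fit_line` and the data are hypothetical; the second fit adds one outlying point to show how strongly a single observation can move the line.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5]
    ys = [2.1, 3.9, 6.2, 8.1, 9.8]  # hypothetical, roughly y = 2x
    print("without outlier: slope = %.2f, intercept = %.2f" % fit_line(xs, ys))
    # Add one outlying observation and refit.
    print("with outlier:    slope = %.2f, intercept = %.2f" % fit_line(xs + [6], ys + [30.0]))
```

Comparing the two fits is a quick way to see whether an outlier deserves a closer look before it is kept or set aside.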
Data analysis methods in the absence of primary data collection can involve discussing common patterns, as well as controversies, within secondary data.
Before we look at the methods and techniques of data analysis, let's first define what data analysis is. Data analysis is the collecting and organizing of data so that a researcher can come to a conclusion. Data analysis allows one to answer questions, solve problems, and derive important information.
15 Methods of Data Analysis in Qualitative Research, compiled by Donald Ratcliff
1. Typology - a classification system, taken from patterns, themes, or other kinds of groups of data (Patton; John Lofland & Lyn Lofland). Ideally, categories should be mutually exclusive and exhaustive if possible; often they aren't.
A source of confusion for many people is the belief that qualitative research generates just qualitative data (text, words, opinions, etc.) and that quantitative research generates just quantitative data (numbers).
6 Methods of data collection and analysis
In the process of developing a research question, you are likely to think of a number of different research questions.
Data integration is a precursor to data analysis, and data analysis is closely linked to data visualization and data dissemination. The term data analysis is sometimes used as a synonym for data modeling.