Thursday, June 16, 2011

Basic statistical concepts: Data collection by sampling method

Data sampling is used to learn more about people (such as consumers), events (such as consumer spending), and quantities (such as how much a consumer spends). In other words, data sampling is the acquisition of recorded numbers, information, facts or events in the form of one or more data sets. The resulting data set is the data sample used to answer statistical questions.

Data can be collected in a number of ways, one of which is sampling. Other methods of data collection include data mining, aggregate data gathering, random sampling, and qualitative research. This article discusses data collection by sampling in terms of 1) its objectives, 2) how it can be done, and 3) how the resulting data is analyzed.

Objectives of data sampling

Data sampling aims to achieve a few core objectives relevant to the statistical study in question. Sample data is used in many disciplines to learn more about a subject. In marketing, for example, sample data may be used to learn more about people's behavior, choices, and actions in order to improve marketing techniques and goals. Sample data can be used to determine, among other things:
• Information about a larger sample or population group
• Relationships between events, things or actions with other things, events or actions
• Identification of trends and patterns
• Grouping of sample data for categorization
• Analysis for research evidence and/or verification 
• Functionality of software analysis tools

An illustration of data sampling

Sample data can be gathered and analyzed in a number of ways, so an illustration of one approach can be helpful. In the following hypothetical example, 1000 people are asked to take part in a statistical study. The study is market research into the juice flavor preferences of these 1000 people, who are of different ages, genders and cultural backgrounds. To do this, the following steps are taken:
• Create a simple survey with a few questions that can be used in the sample analysis
• Locate a suitable population sample from which to gather the data
• Recruit 1000 members of that population to participate in the survey study
• Perform the research by having the population sample fill out the survey
• Analyze the data using statistical tools, software and/or techniques
• Draw conclusions based on the evidence presented by the sampling data
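The steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the population of 50,000 people, the field name preferred_juice, and the randomly assigned preferences are all invented here and are not part of the original study. The sample is drawn by simple random sampling without replacement, which is one of several ways participants could be recruited.

```python
import random
from collections import Counter

random.seed(42)  # fixed seed so the sketch gives the same result on every run

# Hypothetical sampling frame: 50,000 people, each with a preferred juice.
# Preferences are assigned at random purely for illustration.
FLAVORS = ["apple", "tomato", "grape"]
population = [
    {"id": i, "preferred_juice": random.choice(FLAVORS)}
    for i in range(50_000)
]

# Recruit 1000 participants by simple random sampling without replacement.
sample = random.sample(population, k=1000)

# "Survey" the participants by tallying their answers.
counts = Counter(person["preferred_juice"] for person in sample)

# Summarize the answers as shares of the sample.
shares = {flavor: counts[flavor] / len(sample) for flavor in FLAVORS}
for flavor in FLAVORS:
    print(f"{flavor}: {counts[flavor]} respondents ({shares[flavor]:.1%})")
```

Because every person in the hypothetical population has the same chance of being selected, the shares in the sample should be close to the shares in the population, which is what makes conclusions about the larger group possible.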

To further illustrate, data sampling helps answer questions like the following: If 1000 people drink juice every day and have a choice of apple, tomato and grape juice, what will be the effect on that sample population if orange juice is added to the selection? In this example, the dependent variable is the number of people selecting each type of juice, for example 700 apple, 100 tomato and 200 grape. The independent variable is the availability of orange juice. The data sample is the number of people who drink each type of juice after the addition of orange juice.
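The juice example can be made concrete with a short Python sketch. The "before" counts (700 apple, 100 tomato, 200 grape) come from the example; the "after" counts are invented here for illustration only, since the article does not report any. The helper share_with_margin is a hypothetical name, and its margin of error uses the common normal approximation for a sample proportion, which assumes a simple random sample.

```python
import math

# Counts from the example, before orange juice is offered.
before = {"apple": 700, "tomato": 100, "grape": 200}
# Hypothetical counts after orange juice is added (invented for illustration).
after = {"apple": 550, "tomato": 80, "grape": 120, "orange": 250}


def share_with_margin(count, n, z=1.96):
    """Return the sample share and its approximate 95% margin of error."""
    p = count / n
    return p, z * math.sqrt(p * (1 - p) / n)


n_before = sum(before.values())
n_after = sum(after.values())

for flavor in after:
    # A flavor that was not offered before has a "before" count of 0.
    p0, _ = share_with_margin(before.get(flavor, 0), n_before)
    p1, margin = share_with_margin(after[flavor], n_after)
    print(f"{flavor}: {p0:.1%} -> {p1:.1%} (+/- {margin:.1%}), change {p1 - p0:+.1%}")
```

Comparing the change in each share with its margin of error gives a rough sense of whether a shift is larger than what sampling variation alone might produce.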

Analysis of sample data

Data sets are used in statistics to deduce relationships and characteristics within the sample data, including relationships between independent and dependent variables. Dependent variables are the values recorded in a data set that are influenced by a change in the independent variable. Sample data can be acquired in a number of ways, for example by primary or secondary research. It can be applied and analyzed in many ways, including double-blind studies, controlled experiments, population surveys and hypothesis testing.

In the example study of juice drinkers above, the sample data captures the effect of introducing orange juice to their choice of juices. The data was collected with surveys rather than by observation, and the study was conducted at one fixed point in time rather than over a period of time. Studies can thus be limited or extensive depending on what sample data is collected and how it is analyzed. How data is gathered and analyzed can have a significant effect on the quality of the data and the extent of its use.

For example, the data can be analyzed simply for summary statistics, such as X people would drink orange juice if given the choice of 4 juices and Y people would drink grape juice. The data can also be analyzed further using regression analysis, in which coefficients (often written alpha and beta) and p-values are estimated to help judge the validity of the statistics drawn from the sample data. These values help assess correlation, whether a pre-existing belief (hypothesis) under test should be rejected, and how likely the result is to have arisen by chance. Another way the information can be used is in cluster analysis, in which patterns of information are grouped for the purpose of assessing additional relationships, for example the number of grape juice drinkers who would choose orange juice if grape juice were not available.
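The grouping question about grape juice drinkers can be sketched as a simple cross-tabulation in Python. This is a much simpler grouping than formal cluster analysis and is shown only to make the idea concrete. It assumes each respondent was asked for a first choice and for a fallback choice if the first were unavailable. The fallback counts are invented for illustration; only the first-choice totals (700 apple, 100 tomato, 200 grape) come from the example. The names answer_counts and fallback_share are hypothetical.

```python
from collections import Counter

# Hypothetical survey answers: (first choice, choice if the first is unavailable).
# The individual counts are invented for illustration.
answer_counts = {
    ("apple", "orange"): 400, ("apple", "grape"): 250, ("apple", "tomato"): 50,
    ("tomato", "orange"): 30, ("tomato", "apple"): 60, ("tomato", "grape"): 10,
    ("grape", "orange"): 90, ("grape", "apple"): 80, ("grape", "tomato"): 30,
}

# Expand the counts into one record per respondent, as raw survey data would be.
responses = [pair for pair, n in answer_counts.items() for _ in range(n)]

# Group respondents by (first choice, fallback) and by first choice alone.
crosstab = Counter(responses)
first_choice_totals = Counter(first for first, _ in responses)


def fallback_share(first, fallback):
    """Share of people with a given first choice who would fall back to `fallback`."""
    return crosstab[(first, fallback)] / first_choice_totals[first]


grape_to_orange = crosstab[("grape", "orange")]
print(f"Grape drinkers who would choose orange: {grape_to_orange} "
      f"({fallback_share('grape', 'orange'):.0%} of grape drinkers)")
```

The same table answers the corresponding question for any other pair of juices, which is the kind of additional relationship that grouping the data is meant to expose.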

Sources:

1. http://home.ubalt.edu/ntsbarsh/Business-stat/opre504.htm#rrTopiinbuSt
2. http://brent.tvu.ac.uk/dissguide/hm1u3/hm1u3text3.htm
3. http://www.businessdictionary.com/
4. http://www.investorwords.com/
5. http://dictionary.babylon.com/
