2016
COMMERCE
Paper: 203
(Research Methodology and Statistical Analysis)
Full Marks: 80
Time: 3 hours
The figures in the margin indicate full marks for the questions
1. (a) What is Business Research? What factors should be taken into consideration while selecting a sample from a population? (6+10=16)
-> Business Research can be defined as the process of gathering comprehensive data and information on all areas of a business and incorporating this information to maximise sales and profit. In other words, Business Research is a systematic management activity that helps a company determine which products will be most profitable for it to produce. Research involves multiple steps, each of which is thoroughly reviewed to ensure that the best decision is made for the company as a whole.
Advantages and Disadvantages of Business Research
There are certain pros and cons of business research that you must know about. Here are the advantages and disadvantages of Business Research.
Advantages:-
· Business Research plays the role of a catalyst in identifying potential threats, issues and opportunities.
· It provides a detailed analysis of customers and the target audience, thus helping to build better relationships with one’s audience and to capture areas that might otherwise be missed.
· It also anticipates future problems, so the enterprise is able to tackle those uncertainties and prepare for them beforehand.
· It keeps a continuous track of competition in the market and gives businesses the scope to come up with better strategies to tackle their competitors.
· Business Research also conducts a thorough cost analysis thus helping the company efficiently manage resources and allocate them in an optimal manner.
· It keeps you updated with the latest trends and competitor analysis.
Disadvantages:
· Business Research can be expensive and time-consuming.
· It also runs the risk of being assumptive and imprecise at times, because focus groups may be small or findings may rest heavily on assumptions.
· The market is ever-changing and ever-evolving and capturing the right trends or anticipating them can constitute a complicated process for business research.
Factors that are taken into consideration while selecting a sample from a population:-
I. Probability Sampling
a. Simple random sampling
Simple random sampling is a probability sampling technique wherein each member of the population is assigned a number and the desired sample is determined by generating random numbers appropriate for the relevant sample size. This ensures that each sampling unit has a known, equal and nonzero chance of getting selected into the sample.
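As a minimal sketch, the numbering-and-drawing procedure described above can be expressed in Python; the population size and sample size here are hypothetical.

```python
# A sketch of simple random sampling: every member is assigned a number
# and the sample is drawn so each has an equal, nonzero chance of selection.
import random

population = list(range(1, 10_001))      # hypothetical population of 10,000 members
sample = random.sample(population, 100)  # draw 100 members without replacement
print(sorted(sample)[:10])
```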
b. Systematic random sampling
In systematic random sampling the sample is chosen by selecting a random starting point and then picking every successive element from the sampling frame at a fixed interval. The sampling interval i is determined by dividing the population size N by the sample size n and rounding to the nearest integer. For example, if there were 10,000 owners of a washing machine and a sample of 100 were desired, the sampling interval i would be 100. The researcher then selects a number between 1 and 100. If, for example, the number 50 is chosen, the sample will consist of members 50, 150, 250, 350 and so on.
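A sketch of this interval-based procedure in Python, reusing the washing-machine example above (N = 10,000, n = 100):

```python
# A sketch of systematic random sampling: pick a random start, then take
# every i-th member, where i = N / n is the sampling interval.
import random

N, n = 10_000, 100
i = N // n                             # sampling interval: 10,000 / 100 = 100
start = random.randint(1, i)           # random starting point, e.g. 50
sample = list(range(start, N + 1, i))  # e.g. 50, 150, 250, ...
print(sample[:5], "... sample size:", len(sample))
```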
c. Stratified sampling
Stratified sampling is distinguished by the two-step procedure it involves. In the first step the population is divided into mutually exclusive and collectively exhaustive sub-populations, called strata. In the second step, a simple random sample of members is chosen independently from each stratum. This technique is used when there is considerable diversity among the population elements.
In proportionate stratified sampling, the sample size from each stratum is dependent on that stratum’s size relative to the defined target population. Therefore, the larger strata are sampled more heavily using this method as they make up a larger percentage of the target population. On the other hand, in disproportionate stratified sampling, the sample selected from each stratum is independent of that stratum’s proportion of the total defined target population.
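A minimal sketch of proportionate allocation, assuming two hypothetical strata making up 60% and 40% of the target population:

```python
# A sketch of proportionate stratified sampling: each stratum contributes
# to the sample in proportion to its share of the population.
import random

strata = {"urban": list(range(6_000)), "rural": list(range(4_000))}  # hypothetical
population_size = sum(len(members) for members in strata.values())
n = 100  # desired overall sample size

sample = []
for name, members in strata.items():
    k = round(n * len(members) / population_size)  # stratum's proportional share
    sample.extend(random.sample(members, k))       # simple random sample per stratum
print(len(sample))  # 60 urban + 40 rural = 100
```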
d. Cluster sampling
Cluster sampling is quite similar to stratified sampling and the major difference is that in stratified sampling, all the subpopulations (strata) are selected for further sampling whereas in cluster sampling only a sample of subpopulations (clusters) is chosen.
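The difference can be made concrete with a small sketch, using hypothetical city blocks as clusters: only a random sample of clusters is selected, and members are then drawn from those clusters alone.

```python
# A sketch of cluster sampling: choose a random sample of clusters,
# then take the members of the chosen clusters into the sample.
import random

# hypothetical clusters of 100 members each, e.g. city blocks
clusters = {f"block_{i}": list(range(i * 100, (i + 1) * 100)) for i in range(20)}
chosen = random.sample(sorted(clusters), 4)             # sample of subpopulations
sample = [m for name in chosen for m in clusters[name]]
print(chosen, "sample size:", len(sample))
```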
II. Non-Probability Sampling
The choice between probability and non-probability sampling is based on various considerations, including the nature of the research, variability in the population, statistical considerations and operational efficiency. Non-probability sampling is mainly used in product testing, name testing and advertising testing, where researchers and managers want a rough idea of the population’s reaction rather than a precise understanding.
a. Convenience sampling
As the name implies, in convenience sampling the selection of the respondent sample is left entirely to the researcher. Many mall-intercept studies use convenience sampling. The researcher makes the assumption that the target population is homogeneous and that the individuals interviewed are similar to the overall defined target population. This in itself leads to considerable sampling error, as there is no way to judge the representativeness of the sample. Furthermore, the results generated are hard to generalize to a wider population. However, it is the most cost-effective and least time-consuming of all the methods.
b. Judgment sampling
Judgment sampling is an extension of convenience sampling. In this procedure, respondents are selected according to an experienced researcher’s belief that they will meet the requirements of the study. This method also incorporates a great deal of sampling error, since the researcher’s judgment may be wrong; however, it tends to be used quite regularly in industrial markets where small, well-defined populations are to be researched.
c. Quota sampling
Quota sampling is a procedure that restricts the selection of the sample by controlling the number of respondents according to one or more criteria. The restriction generally involves quotas regarding respondents’ demographic characteristics (e.g. age, race, income) or specific behaviors (e.g. frequency of purchase, usage patterns).
d. Snowball sampling
In snowball sampling, an initial group of respondents is selected, usually at random. After being interviewed, however, these respondents are asked to identify others who belong to the target population of interest. Subsequent respondents are then selected on the basis of referral, so this procedure is also called referral sampling. Snowball sampling is used in research situations where the defined target population is rare and unique and identifying target respondents is a difficult task. For example, if the target respondents are owners of second-hand washing machines, they will be extremely difficult to identify, and snowball sampling may provide a way forward.
(b) Detail out the measurement of scaling techniques. (16)
-> A scaling technique is a method of placing respondents along a continuum of pre-assigned values, symbols or numbers, based on the features of a particular object and according to defined rules. All scaling techniques rest on four pillars: order, description, distance and origin.
Marketing research is highly dependent upon scaling techniques, without which no market analysis can be performed.
Types of Scaling Techniques
The researchers have identified many scaling techniques; today, we will discuss some of the most common scales used by business organizations, researchers, economists, experts, etc.
These techniques can be classified as primary scaling techniques and other scaling techniques.
Let us now study each of these methods in-depth below:
Primary Scaling Techniques
The major four scales used in statistics for market research consist of the following:
Nominal Scale
Nominal scales are adopted for labeling non-quantitative variables (those carrying no numerical implication) which are unique and distinct from one another.
Types of Nominal Scales:
1. Dichotomous: A nominal scale that has only two labels is called ‘dichotomous’; for example, Yes/No.
2. Nominal with Order: labels on a nominal scale arranged in ascending or descending order are termed ‘nominal with order’; for example, Excellent, Good, Average, Poor, Worst.
3. Nominal without Order: a nominal scale which has no sequence is called ‘nominal without order’; for example, Black, White.
Ordinal Scale
The ordinal scale functions on the concept of the relative position of the objects or labels based on the individual’s choice or preference.
Interval Scale
An interval scale, also called a cardinal scale, uses numerical labels with the same difference between consecutive measurement units. With the help of this scaling technique, researchers can obtain a better comparison between the objects.
Ratio Scale
The ratio scale is the most powerful of the measurement techniques. Similar to an interval scale, a ratio scale is an abstract number system. It allows measurement with proper intervals, order, categorization and distance, with the added property of originating from a fixed zero point. Here, comparisons can be made in terms of the acquired ratio.
Other Scaling Techniques
Scaling can be used for a comparative study of two or more objects (products, services, brands, events, etc.), or it can be carried out on a single object to understand the consumer’s behavior and response towards it.
Following are the two categories under which other scaling techniques are placed based on their comparability:
Comparative Scales
For comparing two or more variables, a comparative scale is used by the respondents. Following are the different types of comparative scaling techniques:
Paired Comparison
A paired comparison presents two variables from which the respondent needs to select one. This technique is mainly used at the time of product testing, to present consumers with a comparative analysis of the two major products in the market.
To compare more than two objects say comparing P, Q and R, one can first compare P with Q and then the superior one (i.e., one with a higher percentage) with R.
Rank Order
In rank order scaling the respondent needs to rank or arrange the given objects according to his or her preference.
Constant Sum
It is a scaling technique in which respondents allocate a constant sum of units, such as dollars, points, chits or chips, across the features or attributes of a particular product or service in proportion to their importance.
Q-Sort Scaling
Q-sort scaling is a technique used for sorting the most appropriate objects out of a large number of given variables. It emphasizes ranking the given objects in descending order to form similar piles based on specific attributes.
Non-Comparative Scales
A non-comparative scale is used to analyse the performance of an individual product or object on different parameters. Following are some of its most common types:
Continuous Rating Scales
It is a graphical rating scale where the respondents are free to place the object at a position of their choice. It is done by selecting and marking a point along the vertical or horizontal line which ranges between two extreme criteria.
Itemized Rating Scale
The itemized scale is another essential technique under the non-comparative scales. Respondents choose a particular category from among the various categories given, and each category is briefly defined by the researchers to facilitate that selection.
The three most commonly used itemized rating scales are as follows:
- Likert Scale: the researcher provides some statements and asks the respondents to mark their level of agreement or disagreement with these statements by selecting one of the five given alternatives.
- Semantic Differential Scale: a bipolar seven-point non-comparative rating scale on which the respondent marks one of the seven points for each given attribute of the object, as per personal choice, thereby depicting the respondent’s attitude or perception towards the object.
- Stapel Scale: an itemized rating scale which measures the response, perception or attitude of the respondents towards a particular object through a unipolar rating. A Stapel scale ranges from −5 to +5, excluding 0, thus comprising 10 units.
2. (a) How does regression enable prediction of relationship between variables? (16)
(b) Explain the relationship between mean, median and mode. (16)
-> Mean
Most people learn the mean in middle-school math as ‘the average’. Given a set of values, the mean is what you get by adding all the values together and dividing the total by the number of values. In math notation, for a population of values X1, …, Xn:
Mean = (X1 + X2 + … + Xn) / n
The mean is very helpful: it describes a property of the group. It is important to understand that the mean is not an entity; in reality, there may be no member whose value matches the mean. The mean is a summarized representation of the population.
Median
Often the median is a better representative of a typical group member. If you take all the values in a list and arrange them in increasing order, the median is the number located at the centre. Unlike the mean, the median is a value actually held by a member of the group. Depending on the distribution of values, the mean may not be particularly close to the value of any group member. The mean is also subject to skewing: as few as one value significantly different from the rest of the group can change the mean dramatically. The median gives you a central group member without the skew introduced by outliers. In a normal distribution, the median value will represent a typical member of the population.
Mode
The mode is the group’s most common value. It does not matter whether it is the largest or smallest value in the group; whatever value occurs most often is the mode. Of these three measures, the mode is the least commonly used, because it is usually the least meaningful, but it is helpful once in a while. If your data were perfectly regular, the mean, median and mode would all be the same; in real life this almost never happens.
The Relation between Mean, Median and Mode:
The relationship between mean, median and mode can be expressed using Karl Pearson’s formula as:
(Mean – Median) = ⅓ (Mean – Mode)
3 (Mean – Median) = (Mean – Mode)
Mode = Mean – 3(Mean – Median)
Mode = 3 Median – 2 Mean
Thus, the above equation can be used when any two of the values are given and you need to find the third.
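As a quick numerical check of the formula, with hypothetical values for the mean and median:

```python
# Verifying Karl Pearson's empirical relationship on hypothetical values.
mean, median = 50.0, 48.0
mode = 3 * median - 2 * mean   # Mode = 3 Median - 2 Mean
print(mode)                    # 44.0
# Check: Mean - Mode = 3 * (Mean - Median)  ->  50 - 44 == 3 * (50 - 48)
print(mean - mode == 3 * (mean - median))  # True
```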
4. (a) What are the errors in hypothesis testing? State the general procedure for testing a hypothesis. (10+6=16)
-> In statistical hypothesis testing, a type I error is the rejection of a true null hypothesis (also known as a “false positive” finding or conclusion; example: “an innocent person is convicted”), while a type II error is the non-rejection of a false null hypothesis (also known as a “false negative” finding or conclusion; example: “a guilty person is not convicted”). Much of statistical theory revolves around minimizing one or both of these errors, though the complete elimination of either is a statistical impossibility. By selecting a low threshold (cut-off) value and modifying the alpha level, the quality of the hypothesis test can be increased. [2] Knowledge of Type I and Type II errors is widely used in medical science, biometrics and computer science.
Intuitively, type I errors can be thought of as errors of commission: the researcher wrongly concludes that something is a fact. For instance, consider a study where researchers compare a drug with a placebo. If the patients who are given the drug get better than the patients given the placebo by chance, it may appear that the drug is effective, but in fact the conclusion is incorrect. Conversely, type II errors are errors of omission. In the example above, if the patients who got the drug did not get better at a higher rate than the ones who got the placebo, but this was a random fluke, that would be a type II error. The consequence of a type II error depends on the size and direction of the missed determination and the circumstances. An expensive cure for one in a million patients may be inconsequential even if it truly is a cure.
In statistical test theory, the notion of a statistical error is an integral part of hypothesis testing. The test involves choosing between two competing propositions: the null hypothesis, denoted by H0, and the alternative hypothesis, denoted by H1. This is conceptually similar to the judgement in a court trial. The null hypothesis corresponds to the position of the defendant: just as he is presumed innocent until proven guilty, the null hypothesis is presumed true until the data provide convincing evidence against it. The alternative hypothesis corresponds to the position against the defendant. Typically, the null hypothesis states the absence of a difference or the absence of an association; thus, the null hypothesis can never be that there is a difference or an association.
If the result of the test corresponds with reality, then a correct decision has been made. However, if the result of the test does not correspond with reality, then an error has occurred. There are two situations in which the decision is wrong: the null hypothesis may be true while we reject H0, or the alternative hypothesis H1 may be true while we do not reject H0. Two types of error are distinguished: type I error and type II error. [3]
Type I error
The first kind of error is the rejection of a true null hypothesis as the result of a test procedure. This kind of error is called a type I error (false positive) and is sometimes called an error of the first kind.
In terms of the courtroom example, a type I error corresponds to convicting an innocent defendant.
Type II error
The second kind of error is the failure to reject a false null hypothesis as the result of a test procedure. This sort of error is called a type II error (false negative) and is also referred to as an error of the second kind.
In terms of the courtroom example, a type II error corresponds to acquitting a criminal.
Crossover error rate
The crossover error rate (CER) is the point at which Type I and Type II errors are equal; it represents the best way of measuring a biometric system’s effectiveness. A system with a lower CER value provides greater accuracy than a system with a higher CER value.
False positive and false negative
In terms of false positives and false negatives, a positive result corresponds to rejecting the null hypothesis, while a negative result corresponds to failing to reject the null hypothesis; “false” means the conclusion drawn is incorrect. Thus, a type I error is equivalent to a false positive, and a type II error is equivalent to a false negative.
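These definitions can be illustrated with a small simulation, a sketch assuming two groups drawn from the same distribution (so the null hypothesis is true): at a significance level of 0.05, roughly 5% of tests should still reject it, and each such rejection is a Type I error.

```python
# A sketch simulating the Type I error rate when H0 is actually true.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, rejections, trials = 0.05, 0, 2_000
for _ in range(trials):
    drug = rng.normal(0, 1, 30)      # "drug" group: no real effect
    placebo = rng.normal(0, 1, 30)   # "placebo" group: same distribution
    if stats.ttest_ind(drug, placebo).pvalue < alpha:
        rejections += 1              # a false positive (Type I error)
print(rejections / trials)           # close to alpha = 0.05
```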
There are 5 main steps in hypothesis testing:
1. State your research hypothesis as a null (Ho) and alternate (Ha) hypothesis.
2. Collect data in a way designed to test the hypothesis.
3. Perform an appropriate statistical test.
4. Decide whether the null hypothesis is supported or refuted.
5. Present the findings in your results and discussion section.
Though the specific details might vary, the procedure you use when testing a hypothesis will always follow some version of these steps.
Step 1: State your null and alternate hypothesis
After developing your initial research hypothesis (the prediction that you want to investigate), it is important to restate it as a null (Ho) and alternate (Ha) hypothesis so that you can test it mathematically.
The alternate hypothesis is usually your initial hypothesis that predicts a relationship between variables . The null hypothesis is a prediction of no relationship between the variables you are interested in.
Step 2: Collect data
For a statistical test to be valid, it is important to perform sampling and collect data in a way that is designed to test your hypothesis. If your data are not representative, then you cannot make statistical inferences about the population you are interested in.
Step 3: Perform a statistical test
There are a variety of statistical tests available, but they are all based on the comparison of within-group variance (how spread out the data is within a category) versus between-group variance (how different the categories are from one another).
If the between-group variance is large enough that there is little or no overlap between groups, then your statistical test will reflect that by showing a low p-value. This means it is unlikely that the differences between these groups came about by chance.
Alternatively, if there is high within-group variance and low between-group variance, then your statistical test will reflect that with a high p-value. This means it is likely that any difference you measure between groups is due to chance.
Your choice of statistical test will be based on the type of data you collected.
Step 4: Decide whether the null hypothesis is supported or refuted
Based on the outcome of your statistical test, you will have to decide whether your null hypothesis is supported or refuted.
In most cases you will use the p-value generated by your statistical test to guide your decision. And in most cases, your cutoff for refuting the null hypothesis will be 0.05 – that is, when there is a less than 5% chance that you would see these results if the null hypothesis were true.
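A minimal sketch of Steps 3 and 4, using a two-sample t-test from SciPy on hypothetical data; the groups and values are invented for illustration only.

```python
# A sketch of performing a statistical test and deciding on H0 at alpha = 0.05.
from scipy import stats

group_a = [5.1, 4.9, 6.2, 5.8, 5.5]   # hypothetical measurements
group_b = [6.8, 7.1, 6.5, 7.4, 6.9]

t_stat, p_value = stats.ttest_ind(group_a, group_b)
if p_value < 0.05:
    print(f"p = {p_value:.4f} < 0.05: reject the null hypothesis")
else:
    print(f"p = {p_value:.4f} >= 0.05: fail to reject the null hypothesis")
```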
Step 5: Present your findings
The results of hypothesis testing will be presented in the results and discussion sections of your research paper.
In the results section you should give a brief summary of the data and a summary of the results of your statistical test (for example, the estimated difference between group means and associated p-value). In the discussion, you can discuss whether your initial hypothesis was supported or refuted.
In the formal language of hypothesis testing, we talk about rejecting or failing to reject the null hypothesis.
(b) Define large and small sampling tests (Parametric). (16)
-> Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. The methodology used to sample from a larger population depends on the type of analysis being performed, but it may include simple random sampling or systematic sampling.
In statistics, a sampling distribution or finite-sample distribution is the probability distribution of a given random-sample-based statistic. If an arbitrarily large number of samples, each involving multiple observations (data points), were separately used to compute one value of a statistic (such as the sample mean or sample variance) for each sample, then the sampling distribution is the probability distribution of the values that the statistic takes on. In many contexts, only one sample is observed, but the sampling distribution can be found theoretically.
Sampling distributions are important in statistics because they provide a major simplification en route to statistical inference. More specifically, they allow analytical considerations to be based on the probability distribution of a statistic, rather than on the joint probability distribution of all the individual sample values.
The sampling distribution of a statistic is the distribution of that statistic, considered as a random variable, when derived from a random sample of size n. For example, consider a normal population with mean μ and variance σ²: the sample mean computed from n observations then has a sampling distribution that is itself normal, with mean μ and variance σ²/n.
The mean of a sample from a population having a normal distribution is an example of a simple statistic taken from one of the simplest statistical populations . For other statistics and other populations the formulas are more complicated, and often they do not exist in closed-form . In such cases the sampling distributions may be approximated through Monte-Carlo simulations , bootstrap methods, or asymptotic distribution theory.
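A short sketch of the Monte-Carlo approach mentioned above, approximating the sampling distribution of the sample mean for a normal population with hypothetical parameters μ = 10, σ = 2 and n = 25:

```python
# A sketch of simulating a sampling distribution: draw many samples,
# compute the statistic for each, and inspect the resulting distribution.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10.0, 2.0, 25
sample_means = [rng.normal(mu, sigma, n).mean() for _ in range(10_000)]

print(np.mean(sample_means))   # close to mu = 10
print(np.std(sample_means))    # close to sigma / sqrt(n) = 0.4
```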
6. (a) What are the assumptions underlying the use of the χ² test? Also explain the Chi-square test for goodness of fit. (6+10)
-> The Chi-square statistic is a non-parametric (distribution-free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation both of dichotomous independent variables and of multiple-group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others.
The Chi-square is a significance statistic and should be followed by a strength statistic. Cramer’s V is the most common strength test used when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to the distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two-group and multiple-group studies. Limitations include its sample size requirements, the difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and the tendency of Cramer’s V to produce relatively low correlation measures even for highly significant results.
As with parametric tests, the non-parametric tests, including the χ², assume the data were obtained through random selection. However, it is not uncommon to find inferential statistics used when data are from convenience samples rather than random samples. (To have confidence in the results when the random sampling assumption is violated, several replication studies should be performed with essentially the same result obtained.) Each non-parametric test has its own specific assumptions as well. The assumptions of the Chi-square include:
1. The data in the cells should be frequencies, or counts of cases rather than percentages or some other transformation of the data.
2. The levels (or categories) of the variables are mutually exclusive. That is, a particular subject fits into one and only one level of each of the variables.
3. Each subject may contribute data to one and only one cell in the χ². If, for example, the same subjects are tested over time such that the comparisons are of the same subjects at Time 1, Time 2, Time 3, etc., then the χ² may not be used.
4. The study groups must be independent. This means that a different test must be used if the two groups are related. For example, a different test must be used if the researcher’s data consists of paired samples, such as in studies in which a parent is paired with his or her child.
5. There are two variables, and both are measured as categories, usually at the nominal level. However, the data may also be ordinal; interval or ratio data that have been collapsed into ordinal categories may be used as well. While the Chi-square has no rule limiting the number of cells (by limiting the number of categories for each variable), a very large number of cells (over 20) can make it difficult to meet assumption 6 below and to interpret the meaning of the results.
6. The expected value of a cell should be 5 or more in at least 80% of the cells, and no cell should have an expected value of less than one. This assumption is most likely to be met if the sample size is at least the number of cells multiplied by 5. Essentially, this assumption specifies the number of cases (sample size) needed to use the χ² for any given number of cells. This requirement is fully explained in the calculation of the statistic in the case study example.
The Chi-Square goodness of fit test is a non-parametric test used to find out whether the observed values of a given phenomenon differ significantly from the expected values. In this test, the term ‘goodness of fit’ refers to comparing the observed sample distribution with the expected probability distribution. The test determines how well a theoretical distribution (such as the normal, binomial or Poisson) fits the empirical distribution. In the Chi-Square goodness of fit test, sample data are divided into intervals; the number of points that fall into each interval is then compared with the expected number of points in that interval.
Procedure for Chi-Square Goodness of Fit Test:
Set up the hypothesis for Chi-Square goodness of fit test:
A. Null hypothesis: In the Chi-Square goodness of fit test, the null hypothesis assumes that there is no significant difference between the observed and the expected values.
B. Alternative hypothesis: In the Chi-Square goodness of fit test, the alternative hypothesis assumes that there is a significant difference between the observed and the expected values.
For the goodness of fit test, we need one variable. We also need an idea, or hypothesis, about how that variable is distributed. Here are a couple of examples:
· We have bags of candy with five flavors in each bag. The bags should contain an equal number of pieces of each flavor. The idea we’d like to test is that the proportions of the five flavors in each bag are the same.
· For a group of children’s sports teams, we want children with a lot of experience, some experience and no experience shared evenly across the teams. Suppose we know that 20 percent of the players in the league have a lot of experience, 65 percent have some experience and 15 percent are new players with no experience. The idea we’d like to test is that each team has the same proportion of children with a lot, some or no experience as the league as a whole.
To apply the goodness of fit test to a data set we need:
· Data values that are a simple random sample from the full population.
· Categorical or nominal data. The Chi-square goodness of fit test is not appropriate for continuous data.
· A data set that is large enough so that at least five values are expected in each of the observed data categories.
Chi-square goodness of fit test example
Let’s use the bags of candy as an example. We collect a random sample of ten bags. Each bag has 100 pieces of candy and five flavors. Our hypothesis is that the proportions of the five flavors in each bag are the same.
Let’s start by answering: Is the Chi-square goodness of fit test an appropriate method to evaluate the distribution of flavors in bags of candy?
· We have a simple random sample of 10 bags of candy. We meet this requirement.
· Our categorical variable is the flavors of candy. We have the count of each flavor in 10 bags of candy. We meet this requirement.
· Each bag has 100 pieces of candy. Each bag has five flavors of candy. We expect to have equal numbers for each flavor. This means we expect 100 / 5 = 20 pieces of candy in each flavor from each bag. For 10 bags in our sample, we expect 10 x 20 = 200 pieces of candy in each flavor. This is more than the requirement of five expected values in each category.
Based on the answers above, yes, the Chi-square goodness of fit test is an appropriate method to evaluate the distribution of the flavors in bags of candy.
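A sketch of the computation with SciPy’s chi-square goodness of fit function; the observed flavor counts below are hypothetical, while the expected count of 200 per flavor follows from the example above.

```python
# A sketch of the Chi-square goodness of fit test for the candy example.
from scipy import stats

observed = [180, 250, 120, 225, 225]   # hypothetical counts, 1,000 pieces total
expected = [200, 200, 200, 200, 200]   # equal proportions across 5 flavors

chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
# A small p-value would suggest the flavors are not equally distributed.
```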
(b) The scores of 7 boys and 8 girls in a physical test are given below. Use the median test to find out whether the test scores differ significantly between the two sexes: (16)
Boys:  | 22 | 25 | 27 | 40 | 42 | 33 | 28 |
Girls: | 20 | 32 | 34 | 31 | 39 | 29 | 28 | 33 |
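No worked solution is given in the source, but as a sketch, Mood’s median test on these scores can be run with SciPy: the median of the combined 15 scores is found, each group is split into counts above and at-or-below it, and a chi-square statistic is computed on the resulting table.

```python
# A sketch of the median test on the given scores using SciPy.
from scipy import stats

boys = [22, 25, 27, 40, 42, 33, 28]
girls = [20, 32, 34, 31, 39, 29, 28, 33]

stat, p_value, grand_median, table = stats.median_test(boys, girls)
print("grand median:", grand_median)   # median of the combined 15 scores
print("contingency table:\n", table)   # counts above / at-or-below the median
print(f"chi2 = {stat:.3f}, p = {p_value:.4f}")
# A p-value above 0.05 would suggest the scores do not differ significantly.
```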