By Aparna
1.
Analysis of variance, often abbreviated as ANOVA, is a statistical technique used to test the significance of differences among more than two sample means. Using this technique, it is possible to draw inferences about whether the different samples drawn have the same mean. For example, this method may be used in studies such as comparing the intelligence of students from different schools.
When using analysis of variance, it is assumed that each of the samples is drawn from a population having a normal distribution with the same variance. The assumption of normality matters less when the sample size is large.
The analysis of variance is carried out in the following three steps:
1. The population variance is estimated from the variance among the sample means.
2. A second estimate of the variance is made from the variance within the samples.
3. The two estimates of variance are compared. If they are approximately equal in value, it is inferred that the means are not significantly different.
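The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the samples are of equal size, the scores for the three schools are hypothetical, and the function name one_way_anova_f is made up for this example. The ratio of the two estimates is the F-value; judging its significance still requires a critical value from the F distribution, which is not shown here.

```python
from statistics import mean, variance

def one_way_anova_f(samples):
    """Return (between_estimate, within_estimate, F) for equal-sized samples."""
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("this sketch assumes equal sample sizes")
    # Step 1: estimate the population variance from the variance among sample means.
    # The variance of the means estimates sigma^2 / n, so it is multiplied by n.
    between = n * variance([mean(s) for s in samples])
    # Step 2: estimate the population variance from the variance within samples
    # (with equal sizes, the pooled estimate is the average of the sample variances).
    within = mean([variance(s) for s in samples])
    # Step 3: compare the two estimates as a ratio.
    return between, within, between / within

# Hypothetical test scores for students from three schools (illustrative only).
schools = [[4, 5, 6], [5, 6, 7], [6, 7, 8]]
between, within, f_ratio = one_way_anova_f(schools)
print(between, within, f_ratio)
```

A ratio close to 1 means the two estimates are approximately equal, which is the situation in which the means are inferred not to differ significantly.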
Analysis of variance for two-way classifications:
A two-way analysis considers two factors (i.e., two independent variables, or IVs) simultaneously. These two IVs can both be between-groups designs, both be repeated-measures designs, or form a mixed design, which has one between-groups IV and one repeated-measures IV. Each IV can also be a true experimental manipulation or a quasi-experimental grouping (i.e., one in which there was no random assignment and only pre-existing groups are compared).
If a significant F-value is found for one IV, this is referred to as a significant main effect. However, when two or more IVs are considered simultaneously, there is also always an interaction between the IVs, which may or may not be significant.
An interaction may be defined as:
There is an interaction between two factors if the effect of one factor depends on the levels of the second factor. When the two factors are identified as A and B, the interaction is identified as the A X B interaction.
Often the best way of interpreting and understanding an interaction is by a graph. A two-factor ANOVA with a non-significant interaction can be represented by two approximately parallel lines, whereas a significant interaction results in a graph with non-parallel lines. Because two lines will rarely be exactly parallel, the significance test on the interaction is also a test of whether the two lines diverge significantly from being parallel.
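The idea of parallel versus non-parallel lines can be made concrete for a 2 x 2 design by taking a "difference of differences" of the cell means. The sketch below is illustrative only: the cell means are hypothetical, and interaction_contrast is a name invented for this example. It describes the pattern in the means and is not itself a significance test.

```python
def interaction_contrast(cell_means):
    """Difference of differences for a 2 x 2 table of cell means.

    cell_means[a][b] is the mean for level a of factor A and level b of factor B.
    """
    effect_of_b_at_a1 = cell_means[0][1] - cell_means[0][0]
    effect_of_b_at_a2 = cell_means[1][1] - cell_means[1][0]
    # Zero means the effect of B is the same at both levels of A (parallel lines).
    return effect_of_b_at_a2 - effect_of_b_at_a1

# Hypothetical cell means (illustrative only).
parallel = [[10, 14], [12, 16]]   # B adds 4 at both levels of A
crossing = [[10, 14], [16, 12]]   # B adds 4 at A1 but subtracts 4 at A2

print(interaction_contrast(parallel))
print(interaction_contrast(crossing))
```

A contrast of zero corresponds to exactly parallel lines; the further it is from zero, the more the effect of one factor depends on the level of the other.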
If only two IVs (A and B, say) are being tested in a factorial ANOVA, then there is only one interaction (A X B). If there are three IVs being tested (A, B and C, say), then this would be a three-way ANOVA, and there would be three two-way interactions (A X B, A X C, and B X C) and one three-way interaction (A X B X C). The complexity of the analysis increases markedly as the number of IVs increases beyond three. Only rarely will you come across factorial ANOVAs with more than four IVs.
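How quickly the number of interactions grows can be seen by listing them. The following is a minimal sketch using the standard library; interaction_terms is a hypothetical helper name, not part of any statistics package.

```python
from itertools import combinations

def interaction_terms(factors):
    """List every interaction (two-way and higher) among the given factors."""
    terms = []
    for order in range(2, len(factors) + 1):
        for combo in combinations(factors, order):
            terms.append(" X ".join(combo))
    return terms

three_way = interaction_terms(["A", "B", "C"])
four_way = interaction_terms(["A", "B", "C", "D"])
print(three_way)
print(len(four_way))
```

With three IVs this gives the four interactions named above; with four IVs there are already eleven (six two-way, four three-way, and one four-way).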
A word on interpreting interactions and main effects in ANOVA. Many texts, including Ray (p. 198), stipulate that you should interpret the interaction first. If the interaction is not significant, you can then examine the main effects without needing to qualify them because of the interaction. If the interaction is significant, you cannot examine the main effects because they do not tell the complete story. Most statistics texts follow this line.
But I will explain my pet grievance against this! It seems to me that it makes more sense to tell the simple story first and then the more complex story. The explanation of the results ends at the level of complexity which you wish to convey to the reader. In the two-way case, I prefer to examine each of the main effects first and then the interaction. If the interaction is not significant, the most complete story is told by the main effects. If the interaction is significant, then the most complete story is told by the interaction. In a two-way ANOVA this is the story you would most use to describe the results (because a two-way interaction is not too difficult to understand).
One consequence of the difference between the two approaches is that if, for example, you did run a four-way ANOVA and the four-way interaction (i.e., A X B X C X D) was significant, you would not be able to examine any of the lower-order interactions even if you wanted to! The most complex significant interaction would tell the most complete story, and so this is the one you would have to describe. Describing a four-way interaction is exceedingly difficult, would most likely not represent the relationships you were intending to examine, and would not hold the reader's attention for very long. With the other approach, you would describe the main effects first, then the first-order interactions (i.e., A X B, A X C, A X D, B X C, B X D, C X D), and then the higher-order interactions only if you were interested in them!
You can stop at the level of complexity you wish to convey to the reader.
Another exception to the rule of always describing the most complex relationship first is if you have a specific research question about the main effects. In your analysis and discussion you need to address the particular hypotheses you made about the research scenario, and if these are main effects, then so be it! However, not all texts appear to agree with this approach either!
For the sake of your peace of mind, and assessment, in this unit there will be no examination questions or assignment marks riding on whether you interpret the interaction first or the main effects first. You do need to realise that in a two-way ANOVA, if there is a significant interaction, then this is the story most representative of the research results (i.e., it tells the most complete story and is not too complex to understand).
2.
Secondary data are data that have already been collected by, and are readily available from, other sources. Such data are cheaper and more quickly obtainable than primary data, and may also be available when primary data cannot be obtained at all.
Advantages of Secondary data
1. It is economical. It saves effort and expense.
2. It saves time.
3. It helps to make primary data collection more specific, since with the help of secondary data we are able to make out what the gaps and deficiencies are and what additional information needs to be collected.
4. It helps to improve the understanding of the problem.
5. It provides a basis for comparison with the data collected by the researcher.
Disadvantages of Secondary Data
1. Secondary data seldom fit the framework of the marketing research factors under study. Reasons for the poor fit are:
a. Unit of secondary data collection: Suppose you want information on disposable income, but the data are available only on gross income. The information may not be the same as what you require.
b. Class boundaries may be different even when the units are the same.
   Before 5 Years    After 5 Years
   2500-5000         5000-6000
   5001-7500         6001-7000
   7500-10000        7001-10000
c. As a result, the data collected earlier may be of no use to you.
2. Accuracy of secondary data is not known.
3. Data may be outdated.
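The class-boundary problem in point 1(b) can be illustrated with a short check. This sketch reads the ranges in the example table as 2500-5000, 5001-7500, 7500-10000 (before) and 5000-6000, 6001-7000, 7001-10000 (after); the function name overlapping is made up for this example.

```python
def overlapping(old_class, new_classes):
    """Return the new classes that share at least one value with old_class."""
    low, high = old_class
    return [(lo, hi) for (lo, hi) in new_classes if lo <= high and low <= hi]

before = [(2500, 5000), (5001, 7500), (7500, 10000)]
after = [(5000, 6000), (6001, 7000), (7001, 10000)]

# For each earlier class, find which of the later classes it spreads across.
spread = {old: overlapping(old, after) for old in before}
for old, new in spread.items():
    print(old, "->", new)
```

The earlier class 5001-7500 spreads across all three of the later classes, so counts recorded under the old boundaries cannot simply be reassigned to the new ones.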
Evaluation of Secondary Data
Because of the disadvantages mentioned above, secondary data must be evaluated before use. Evaluation means that the following four requirements must be satisfied:
1. Availability: It has to be seen whether the kind of data you want is available or not. If it is not available, then you have to go for primary data.
2. Relevance: The data should meet the requirements of the problem. For this we have two criteria:
a. Units of measurement should be the same.
b. The concepts used must be the same, and the data must be current rather than outdated.
3. Accuracy: In order to find out how accurate the data are, the following points must be considered:
a. The specification and methodology used;
b. The margin of error should be examined;
c. The dependability of the source must be seen.
4. Sufficiency: Adequate data should be available.
Robert W. Joselyn has given a detailed procedure for evaluating secondary data. He has classified the above discussion into eight steps, which are sub-classified into three categories:
1. Applicability to the research objective.
2. Cost of acquisition.
3. Accuracy of data.
3.
Measurement is at the core of doing research. Measurement is the assignment of numbers to things. In almost all research, everything has to be reduced to numbers eventually. Precision and exactness in measurement are vitally important. The measures are what are actually used to test the hypotheses. A researcher needs good measures for both independent and dependent variables.
Measurement consists of two basic processes called conceptualization and operationalization, then an advanced process called determining the levels of measurement, and then even more advanced methods of measuring reliability and validity.
Conceptualization is the process of taking a construct or concept and refining it by giving it a conceptual or theoretical definition. Operationalization is the process of taking a conceptual definition and making it more precise by linking it to one or more specific, concrete indicators or operational definitions.
A level of measurement is the precision by which a variable is measured. For 50 years, with few detractors, science has used the Stevens (1951) typology of measurement levels. There are three things to remember about this typology:
(1) anything that can be measured falls into one of the four types;
(2) the higher the type, the more precision in measurement; and
(3) every level up contains all the properties of the previous level.
The four levels of measurement, from lowest to highest, are:
• Nominal
• Ordinal
• Interval
• Ratio
The nominal level of measurement describes variables that are categorical in nature. The characteristics of the data you're collecting fall into distinct categories. If there are a limited number of distinct categories (usually only two), then you're dealing with a discrete variable. If there are an unlimited or infinite number of distinct categories, then you're dealing with a continuous variable. Nominal variables include demographic characteristics like sex, race, and religion.
The ordinal level of measurement describes variables that can be ordered or ranked in some order of importance. It describes most judgments about things, such as big or little, strong or weak. Most opinion and attitude scales or indexes in the social sciences are ordinal in nature.
The interval level of measurement describes variables that have more or less equal intervals, or meaningful distances between their ranks. For example, if you were to ask somebody if they were first, second, or third generation immigrant, the assumption is that the distance, or number of years, between each generation is the same. All crime rates in criminal justice are interval level measures, as is any kind of rate.
The ratio level of measurement describes variables that have equal intervals and a fixed zero (or reference) point. It is possible to have zero income, zero education, and no involvement in crime, but rarely do we see ratio level variables in social science since it's almost impossible to have zero attitudes on things, although "not at all", "often", and "twice as often" might qualify as ratio level measurement.
Advanced statistics require at least interval level measurement, so the researcher always strives for this level, accepting ordinal level (which is the most common) only when they have to. Variables should be conceptually and operationally defined with levels of measurement in mind since it's going to affect how well you can analyze your data later on.
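The point that every level up contains all the properties of the previous level can be sketched as a small lookup. The property labels below are paraphrased from the descriptions above, and the names Level, ADDED_PROPERTY, and properties are hypothetical, chosen only for this illustration.

```python
from enum import IntEnum

class Level(IntEnum):
    """The four levels of measurement, from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# The one property that each level adds to those of the levels below it.
ADDED_PROPERTY = {
    Level.NOMINAL: "distinct categories",
    Level.ORDINAL: "rank order",
    Level.INTERVAL: "equal intervals",
    Level.RATIO: "fixed zero point",
}

def properties(level):
    """Return all properties of a level, including those inherited from lower levels."""
    return [ADDED_PROPERTY[lower] for lower in Level if lower <= level]

interval_props = properties(Level.INTERVAL)
print(interval_props)
```

Because the levels are ordered, a comparison such as Level.ORDINAL < Level.INTERVAL also expresses the idea that a higher type gives more precision in measurement.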
5B
Thurstone scale
In an attempt to approximate an interval level of measurement, psychologist Robert Thurstone developed the method of equal-appearing intervals. This technique for developing an attitude scale compensates for the limitation of the Likert scale in that the strength of the individual items is taken into account in computing the attitude score. It also can accommodate neutral statements.
Constructing the scale
Step 1. Collect statements on the topic from people holding a wide range of attitudes, from extremely favorable to extremely unfavorable. For this example, we will use attitude toward the use of marijuana. Example statements are
It has its place.
Its use by an individual could be the beginning of a sad situation.
It is perfectly healthy; it should be legalized.
Step 2. Duplicates and irrelevant statements are omitted. The rest are typed on 3 x 5 cards and given to a group of people who will serve as judges.
Step 3. Originally, judges were asked to sort the statements into eleven (11) stacks representing the entire range of attitudes from extremely unfavorable (1) to extremely favorable (11). The middle stack (6) is for statements which are neither favorable nor unfavorable. Only the end points (extremely favorable and extremely unfavorable) and the midpoint are labeled. The assumption is that the intervening stacks will represent equal steps along the underlying attitude dimension. With a large number of judges, for example, when using a class or some other group to do the preliminary ratings, it is easier to create a paper-and-pencil version.
Rate each of the following statements, indicating the degree to which the statement is unfavorable or favorable to marijuana use. Do not respond in terms of your own agreement or disagreement with the statements; rather, respond in terms of the judged degree of favorableness or unfavorableness. Place an X in the interval that best reflects your judgment. For example: Marijuana is OK for most people, but a few people may have problems with it.
1. If marijuana is taken safely, its effect can be quite enjoyable.
2. I think it is horrible and corrupting.
3. It is usually the drug people start on before addiction.
Remind the judges to rate favorability with regard to the target (marijuana), not to give their opinion as to whether they agree or disagree with the statement.
Interval scale: a level of measurement that provides information about size or direction and has equal intervals between scale points.
Step 4. Each statement will have a numerical rating (1 to 11) from each judge, based on the stack in which it was placed. The number or weight assigned to the statement is the average of the ratings it received from the judges.
Statement                                                          Average rating from 20 judges (11 = extremely favorable)
If marijuana is taken safely, its effect can be quite enjoyable.   8.9
I think it is horrible and corrupting.                             1.6
It is usually the drug people start on before addiction.           4.9
If the judges cannot rate the item on its favorability, or show a high degree of variability in their judgments, the item is eliminated. For example, the statement "Marijuana use should be taxed heavily" was rejected because it was ambiguous. Some judges thought it was pro-marijuana as it implied legalization; others thought it was anti-marijuana because it advocated a heavy tax.
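Step 4 and the elimination rule can be sketched as follows. Everything specific here is an assumption made for illustration: the judges' ratings are hypothetical, the item labels are placeholders, and the cutoff of 2.0 on the standard deviation is an arbitrary stand-in for "a high degree of variability", since no threshold is given above.

```python
from statistics import mean, stdev

def scale_value(ratings):
    """Weight of a statement: the average of the judges' ratings (1 to 11)."""
    return round(mean(ratings), 1)

def keep_item(ratings, max_spread=2.0):
    """Keep a statement only if the judges broadly agree on where it belongs.

    max_spread is a hypothetical cutoff on the standard deviation.
    """
    return stdev(ratings) <= max_spread

# Hypothetical ratings from five judges (illustrative only).
judge_ratings = {
    "clearly favorable item": [9, 9, 8, 10, 9],
    "ambiguous item": [2, 10, 3, 9, 1],
}

for statement, ratings in judge_ratings.items():
    print(statement, scale_value(ratings), keep_item(ratings))
```

In this sketch the ambiguous item averages near the middle of the scale only because the judges split between the two ends, which is why the spread of the ratings is checked as well as the average.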
Administering the scale
Here is the final form. The respondents check only the statements with which they agree. The average ratings by the judges are shown in parentheses. These would not be included on the actual form given to respondents. Note that the more positive statements have a higher weight.
This is a scale to measure your attitude toward marijuana. It does not deal with any other drug, so please consider that the items pertain to marijuana exclusively. We want to know how students feel about this topic. In order to get honest answers, the questionnaires are to be filled out anonymously. Do not sign your name.
Please check all those statements with which you agree.
___ 1. I don't approve of something that puts you out of a normal state of mind. (3.0)
___ 2. It has its place. (7.1)
___ 3. It corrupts the individual. (2.2)
___ 4. Marijuana does some people a lot of good. (7.9)
___ 5. Having never tried marijuana, I can't say what effects it would have. (6.0)
___ 6. If marijuana is taken safely, its effect can be quite enjoyable. (8.9)
___ 7. I think it is horrible and corrupting. (1.6)
___ 8. It is usually the drug people start on before addiction. (4.9)
___ 9. It is perfectly healthy and should be legalized. (10.0)
___ 10. Its use by an individual could be the beginning of a sad situation. (4.1)
Scoring
The weights (favorability rating) for the checked statements are summed and divided by the number of statements checked.
A respondent who selected #3, #7, and #8 would have an attitude score of (2.2 + 1.6 + 4.9) / 3 = 8.7 / 3 = 2.9. Dividing by the number of statements checked (3) puts the score on the 1-11 scale. A score of 2.9 indicates an attitude that is definitely unfavorable to marijuana.
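The scoring rule can be written as a short function. The weights are the judges' average ratings shown in parentheses on the form above; the names WEIGHTS and attitude_score are chosen only for this sketch.

```python
# Judges' average ratings for statements 1 to 10, taken from the form above.
WEIGHTS = {
    1: 3.0, 2: 7.1, 3: 2.2, 4: 7.9, 5: 6.0,
    6: 8.9, 7: 1.6, 8: 4.9, 9: 10.0, 10: 4.1,
}

def attitude_score(checked):
    """Sum the weights of the checked statements and divide by how many were checked."""
    if not checked:
        raise ValueError("the respondent checked no statements")
    return sum(WEIGHTS[number] for number in checked) / len(checked)

score = attitude_score([3, 7, 8])
print(round(score, 1))
```

For the respondent in the example, who checked statements 3, 7, and 8, this reproduces the score of 2.9.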
