Chapter 1 Introduction to Research Methods

Contents:
1.1 Role of Research in Business Decisions
1.2 Research Process
1.2.1 Selecting a Topic
1.2.2 Literature Search
1.2.3 Discussion with "Informants and Interested Parties"
1.2.4 Sampling
1.2.5 Formulating your Hypothesis
1.2.6 Questionnaire Design
1.2.7 Fieldwork
1.2.8 Data Processing
1.2.9 Statistical Analysis (Hypotheses Testing)
1.2.10 Assembly of Results
1.2.11 Writing up the Results

1.3 Types of Research
1.3.1 Exploratory Research
1.3.2 Descriptive Research
1.3.3 Analytical Research
1.3.4 Causal Research
1.3.5 Quantitative Research
1.3.6 Qualitative Research
1.3.7 Conceptual Research
1.3.8 Modelling Research

1.4 Criteria of Good Research
1.5 Ethics of Research

1.1 Role of Research in Business Decisions

Research is the process of applying the methods of science to the art of management for decision-making. Every organization operates under some degree of uncertainty. This uncertainty cannot be eliminated completely, although it can be minimized with the help of research methods. Research is particularly important in the decision-making process of business organizations. To choose the best course of action in the light of growing competition and increasing uncertainty, one must be able to gather the relevant data, analyze them and reach appropriate decisions.

Research in its common sense refers to a search for knowledge. It can also be defined as a scientific and systematic search for information and knowledge on a specific topic or phenomenon. In management, research is used extensively in many areas, and it provides the base for sound business decision-making. Any systematic finding involves three parts: an implicit question posed, an explicit answer proposed, and the collection, analysis and interpretation of the information leading from the question to the answer. To illustrate: "Research comprises defining and redefining problems; formulating hypotheses or suggested solutions; making deductions and reaching conclusions; and at last carefully testing the conclusions to determine whether they fit the formulated hypotheses."

Marketing research has become an important part of management decision-making. It is a critical part of a market intelligence system: it improves management decision-making by providing relevant, accurate and timely information, and every decision poses unique needs for information gathered through marketing research. Thus, marketing research is the function that links the consumer, the customer and the public to the marketer through information, information used to identify and define marketing opportunities and problems; to generate, refine and evaluate marketing actions; to monitor marketing performance; and to improve understanding of marketing as a process.

1.2 Research Process

1.2.1 Selecting a Topic: The topic is related to the researcher's area of interest.
1.2.2 Literature Search: A researcher should be aware of current research in the related area and of the further scope for expansion.
1.2.3 Discussion with "Informants and Interested Parties"
1.2.4 Sampling (described in Chapter VI)
1.2.5 Formulating Your Hypothesis (described in Chapter VII)


1.2.6 Questionnaire Design: Translating the broad objectives of the study into questions that will obtain the necessary information.
1.2.7 Fieldwork: Collection of data through questionnaires or interviews.
1.2.8 Data Processing: Coding and inputting the responses.
1.2.9 Statistical Analysis (Hypotheses Testing)
1.2.10 Assembly of Results
1.2.11 Writing up the Results: Drawing conclusions and interpretations and relating the findings to other research. You will have been given separate notes on report writing.

1.3 Types of Research

Research can be classified as follows:
1.3.1 Exploratory Research
1.3.2 Descriptive Research
1.3.3 Analytical Research
1.3.4 Causal Research
1.3.5 Quantitative Research
1.3.6 Qualitative Research
1.3.7 Conceptual Research
1.3.8 Modelling Research

1.3.1 Exploratory Research: Exploratory research structures and identifies new problems. It is initial research, commonly unstructured and "informal", undertaken to gain background information about the general nature of the research problem without any specific end-objective. It is usually conducted when the researcher does not know much about the problem and needs additional or more recent information. Such research analyzes the data and explores the possibility of obtaining as many relationships as possible between the different variables of the study. Examples: literature survey, experience survey.

1.3.2 Descriptive Research: Descriptive research is more rigid than exploratory research; it is carried out with specific objectives and hence leads to definite conclusions. Descriptive research is undertaken to provide answers to questions of who, what, where, when and how, but not why. For example, it describes the users of a product, determines the proportion of the population that uses a product, predicts future demand for a product, or describes the occurrence of a certain phenomenon. As opposed to exploratory research, in descriptive research you should define the questions, the people to be surveyed and the method of analysis before beginning data collection.

1.3.3 Analytical Research: This type of research is used where information is already available; it analyzes this information to make a critical evaluation of the material. Analytical research takes descriptive research one stage further by seeking to explain the reasons behind a particular occurrence through the discovery of causal relationships. Once causal relationships have been discovered, the search shifts to factors that can be changed (variables) in order to influence the chain of causality. A typical question in analytical research is: what factors might account for the high drop-out rate on a particular degree programme? Typical methods used in analytical research include:
• Case studies
• Observation
• Historical analysis
• Attitude surveys
• Statistical surveys

1.3.4 Causal Research: Causal research seeks to find cause-and-effect relationships between variables. It accomplishes this goal through laboratory and field experiments.

1.3.5 Quantitative Research: This research answers questions about data that can be measured in terms of quantity or amount. It is applicable to phenomena that can be expressed in terms of quantity.

1.3.6 Qualitative Research: This research involves the analysis of data such as words (e.g., from interviews), pictures (e.g., video) or objects (e.g., an artifact). It answers questions about the nature of phenomena in order to describe them and understand them from the participant's point of view.

1.3.7 Conceptual Research: This type of research is related to some idea or theory and is generally used by philosophers.

1.3.8 Modelling Research: This type of research is related to business situations, where a business situation is formulated into some type of model, for example a mathematical model or a simulation model.


1.4 Criteria of Good Research

Whatever the type of research, all research work and studies meet on the common ground of the scientific method. One expects scientific research to satisfy the following criteria:
1. The purpose of the research should be clearly defined and common concepts should be used.
2. The research procedure used should be described in sufficient detail to permit another researcher to repeat the research for further advancement.
3. The procedural design of the research should be carefully planned to yield results that are as objective as possible.
4. The researcher should report, with complete frankness, flaws in the procedural design and estimate their effects upon the findings.
5. The analysis of the data should be sufficiently adequate to reveal its significance, and the methods of analysis used should be appropriate.
6. Conclusions should be confined to those justified by the data of the research and limited to those for which the data provide an adequate basis.
7. Greater confidence in the research is warranted if the researcher is experienced and has a good reputation in research.

In other words, the qualities of good research can be stated as follows:
1. Good research is systematic: research is structured with specified steps to be taken in a specific sequence, in accordance with a well-defined set of rules.
2. Good research is logical: research is guided by the rules of logical reasoning, and the logical processes of induction and deduction are of great value in carrying out research.
3. Good research is empirical: research relates basically to one or more aspects of a real situation and deals with concrete data, which provides a basis for the external validity of research results.
4. Good research is replicable: this characteristic allows research results to be verified by replicating the study, thereby building a sound basis for decisions.

1.5 Ethics of Research

Research is a profoundly social activity: it connects us to those who will use our research, to those whose research we used and, through them, to the research that our sources used. Hence, beyond technique, we need to think about the ethics of civil communication. Besides building bonds within a community, ethics deals with a range of moral and immoral choices, and research challenges us to define our own moral principles; academic researchers may be less tempted than commercial researchers to sacrifice principle for gain, but the temptation exists for all. Plagiarism, claiming credit for the results of others, misreporting sources, inventing results, using data of questionable accuracy, and destroying or concealing sources and data important for those who follow: all of these go beyond the simple morality of what we must not do to what we should affirmatively do, namely show concern for the integrity of the work of the research community, combining narrow moral standards with the larger ethical dimension. Research done in the best interests of others is also in your own interest.


Chapter-II Research Problem and Research Design

Contents:
2.1 Introduction
2.2 What is a Research Problem?
2.3 How to Select the Problem
2.3.1 Sub-problem(s)
2.3.2 Statement of the Problem
2.3.3 Steps Involved in Defining a Problem

2.4 Checklist for Testing the Feasibility of the Research Problem
2.5 Meaning, Need and Features of a Research Design
2.6 Different Research Designs
2.6.1 Research Design in case of Exploratory Research
2.6.2 Research Design in case of Descriptive Research
2.6.2.1 Longitudinal Studies
2.6.2.2 Cross-sectional Studies
2.6.3 Research Design in case of Causal Research

2.1 Introduction

Research forms a cycle: it starts with a problem and ends with a solution to the problem. The problem statement is therefore the axis around which the whole research revolves, because it explains in short the aim of the research.

2.2 What is a Research Problem?

A research problem is a situation that causes the researcher to feel apprehensive, confused and ill at ease. In other words, it refers to some difficulty which a researcher experiences in the context of a situation and for which he or she wants to obtain a solution. It is the demarcation of a problem area within a certain context involving the WHO or WHAT, the WHERE, the WHEN and the WHY of the problem situation. There are many problem situations that may give rise to research, and three sources usually contribute to problem identification. One's own experience or the experience of others may supply a problem. A second source is the scientific literature: you may read about certain findings and notice that a certain field was not covered, and this could lead to a research problem. Theories are a third source, since shortcomings in theories can be researched.

2.3 How to Select the Problem

The prospective researcher should think about what caused the need to do the research (problem identification). The question that he or she should ask is: are there questions about this problem to which answers have not yet been found? Research originates from a need that arises. A clear distinction between the PROBLEM and the PURPOSE should be made. The problem is the aspect the researcher worries about, thinks about and wants to find a solution for. The purpose is to solve the problem, i.e., to find answers to the question(s). If there is no clear problem formulation, the purpose and methods are meaningless. Keep the following in mind:
• Outline the general context of the problem area.
• Highlight key theories, concepts and ideas current in this area.
• What appear to be some of the underlying assumptions of this area?
• Why are the issues identified important?
• What needs to be solved?
• Read around the subject to get to know the background, to identify unanswered questions or controversies, and/or to identify the most significant issues for further exploration.

The research problem should be stated in such a way that it leads to analytical thinking on the part of the researcher, with the aim of arriving at possible solutions to the stated problem. Research problems can be stated in the form of either questions or statements.
• The research problem should always be formulated grammatically correctly and as completely as possible. You should bear in mind the wording (expressions) you use and avoid meaningless words. There should be no doubt in the mind of the reader as to what your intentions are.
• Demarcating the research field into manageable parts by dividing the main problem into sub-problems is of the utmost importance.

2.3.1 Sub-problem(s)

Sub-problems are problems related to the main problem identified. Sub-problems flow from the main problem and together make up the main problem. They are the means of reaching the set goal in a manageable way and contribute to solving the main problem.

2.3.2 Statement of the Problem

The statement of the problem involves the demarcation and formulation of the problem, i.e., the WHO/WHAT, WHERE, WHEN and WHY. It usually includes the statement of the hypothesis.

2.3.3 Steps Involved in Defining a Problem

1) State the problem in a broad, general way. For example, in the case of social research it is advisable to perform some field operations, collect survey data, study them, and then phrase the problem in operational terms.
2) Understand the origin and nature of the problem clearly. It is essential to know the point of origin of the problem and to discuss the problem with those who have better knowledge of the concerned area.
3) Survey all the available literature and examine it before defining the research problem.
4) Finally, rephrase the research problem into a working proposition.

2.4 Checklist for Testing the Feasibility of the Research Problem

Answer YES or NO to each question:

1. Is the problem of current interest? Will the research results have social, educational or scientific value?
2. Will it be possible to apply the results in practice?
3. Does the research contribute to the science of education?
4. Will the research open up new problems and lead to further research?
5. Is the research problem important? Will you be proud of the result?
6. Is there enough scope left within the area of research (field of research)?
7. Can you find an answer to the problem through research? Will you be able to handle the research problem?
8. Will it be practically possible to undertake the research?
9. Is the research free of any ethical problems and limitations?
10. Will it have any value?
11. Do you have the necessary knowledge and skills to do the research? Are you qualified to undertake the research?
12. Is the problem important to you and are you motivated to undertake the research?
13. Is the research viable in your situation? Do you have enough time and energy to complete the project?
14. Do you have the necessary funds for the research?
15. Will you be able to complete the project within the time available?
16. Do you have access to the administrative, statistical and computer facilities the research necessitates?

TOTAL:

2.5 Meaning, Need and Features of a Research Design

A research design is the plan or strategy that helps in arranging the resources required for the research. It acts as a path or blueprint for the researcher. In other words, it is the advance planning of the steps to be adopted for the collection of relevant data and of the techniques to be used in their analysis, keeping the time and budget constraints in mind. Along with the population to be surveyed, the size of the sample, the tools for analyzing data and the interpretation of data, it also covers the budget and time constraints. The design decisions are in respect of the following questions:
• What is the study about?
• Why study this particular topic?
• Where will the study be conducted?
• What techniques will be used to collect the relevant data?
• What will be the sample design?
• How will the data be analyzed?
• What is the time required?
• What is the allocated budget?

Need for Research Design: It helps the various research operations to run smoothly, thereby making the research efficient, gaining maximum information with the minimum expenditure of time, effort and money.


The research design is divided into the following parts (sub-divisions of a research design): the sampling design, the observational design, the statistical design and the operational design.

Sampling Design: It deals with the method of selecting the samples to be collected/observed for a given study.
Observational Design: It deals with the conditions and constraints under which the observations are to be made.
Statistical Design: It deals with the editing, coding and analysis of the data gathered.
Operational Design: It deals with the techniques by which the procedures specified in the above designs can be carried out.

Features of a Good Design
• It should define the objective of the problem to be studied.
• It should minimize bias and maximize the reliability of the data.
• It should give the smallest experimental error.
• It should be flexible enough to permit the consideration of many different aspects of a phenomenon.

Elements of a Research Design: The important elements of a research design are:


• Introduction: The research proposal should define the research problem and the researcher's precise interest in studying it. In other words, it deals with the scope of the study.
• Statement of the Problem: It includes the formulation of the problem, which actually explains the objective of the research.
• Literature Review: It includes a review of the different literature and articles related to the objective of the study. It is performed to gather all the information and research done on the topic earlier.
• Scope of Study: A complete study of any problem is difficult, as it would entail an overwhelming amount of data. Therefore, the scope and dimensions of the study should be delimited with reference to its depth, length, the geographical area to be covered, the reference period, the respondents to be studied and many other issues. We should consider the time frame decided for the study and finish it within the same time slot.
• Objective of Study: The questions to which the researcher proposes to seek answers through the study come under the objectives. They should be stated clearly. For example:
  I. To study the nature of …
  II. To investigate the impact of …
  III. To examine the nature of the relation between … and …
  IV. To identify the causes of …
  The objective statements should not be vague, like "to explore unemployment in India".
• Conceptual Model: After completing the above steps, the researcher formulates and develops the structure of relationships among the variables under investigation.
• Hypotheses: A hypothesis is a specific statement of prediction. Hypotheses refer to the different possible outcomes.
• Operational Definition of Concepts: It involves defining the concepts used in exploratory and descriptive research in operational terms.
• Significance of Study: It is a careful statement of the value of the study and the possible applications of its findings, which helps to justify the purpose of the study, its importance and its social relevance.
• Geographical Area to be Covered: The territorial area to be covered depends on the purpose and nature of the study and the availability of resources. It should be decided and specified in the research plan.
• Reference Period: This refers to the time period over which the data are analyzed. It also depends on the availability of data.
• Sampling Plan: If the study requires the collection of data from the field, we should decide on the population to be selected for the study and the sampling design.
• Tools for Gathering Data: Personal and telephonic interviews, questionnaires and checklists are different tools for data collection.
• Plan of Analysis: This includes the statistical techniques to be used for the editing, coding and analysis of the data.
• Chapter Scheme: The chapter scheme of the report or dissertation should be prepared to give the outline of the research conducted.
• Time Budget: The time period of the research should be decided in advance, and the research work should not exceed the time limits; otherwise resources are lost and extra cost is involved.
• Financial Budget: The cost of the project includes major categories like salary, printing, stationery, postage, travel expenses, etc.

2.6 Different Research Designs:

2.6.1 Research Design in case of Exploratory Research: It is also termed a formulative research study. In this case we do not have enough understanding of the problem; the main purpose of the study is a more precise investigation of the objective of study. It is particularly useful when researchers lack a clear idea of the problems they will meet during the study. Through it the researcher develops clearer concepts, establishes priorities and develops operational definitions. This means that a general study is conducted without any end-objective except to establish as many relationships as possible between the variables of the study. The research design in such studies must have inbuilt flexibility, because the research problem, broadly defined initially, is transformed into one with more precise meaning. This type of research lays the foundation for formulating the different hypotheses of research problems. It involves the study of secondary data, and it rarely involves structured questionnaires, large samples or probability sampling plans.

Different types of exploratory research:

• Literature Survey: It is a study involving the collection of literature in the selected area, in which the researcher has limited experience, and its critical examination and comparison to gain better understanding. It helps in updating the past data related to the topic of research, and it also helps in the formulation of a relevant hypothesis if one has not yet been formed.

• Experience Survey: It is a survey of the experiences of experts/specialists related to the field of research, which acts as a database for future research. It helps in generating ideas with minimum data collection. Decision-making in probabilistic situations is a complex process, so the experiences of executives/researchers can be studied using an experience survey. Bidding for tenders, technology forecasting, manpower and materials planning, production scheduling and portfolio decisions are examples of areas where experience surveys are used.

2.6.2 Research Design in case of Descriptive Research: It is carried out with specific objectives and hence has a definite end-result. It is structured research with clearly stated hypotheses or investigative questions. It deals with describing the characteristics associated with the population chosen for research, estimating the proportions of the population that have these characteristics, and discovering relationships among several variables. It is based on large, representative samples. The design in such studies must be rigid and focus attention on the following:
• What is the study about and why is it being done?
• Designing the methods of data collection.
• Selecting the sample.
• Processing and analysis of the data.
• Interpretation of the results.
• Budget and time constraints.


For example: to describe the characteristics of consumers, sales people, market areas or organizations.

2.6.2.1 Longitudinal Studies
Longitudinal studies are time-series analyses that make repeated measurements of the same individuals, thus allowing you to monitor behavior such as brand switching. However, longitudinal studies are not necessarily representative, since many people may refuse to participate because of the commitment required.

2.6.2.2 Cross-sectional Studies
Cross-sectional studies sample the population to make measurements at a specific point in time. A special type of cross-sectional analysis is a cohort analysis, which tracks an aggregate of individuals who experience the same event within the same time interval. Cohort analyses can be used for long-term forecasting of product demand.

2.6.3 Research Design in case of Causal Research: When it is necessary to determine whether one variable determines the values of other variables, a causal research design is used; the relationship between the different variables is thus established. It is a research design in which the major emphasis is on determining a cause-and-effect relationship. When we start research work it is not necessary that only one type of research be used; we can use a combination of two or all three types. Also, research is an unending process, so there may be a clue left which can initiate a research objective for other researchers.

Chapter-III Methods of Data Collection

Contents:
3.1 Data: Definition
3.1.1 Primary Data
3.1.2 Secondary Data
3.2 Collection of Primary Data
3.2.1 Observation Method
3.2.2 Interview Method
3.2.3 Collection of Data through Questionnaires
3.3 Collection of Secondary Data

3.1 Data: Definition

Data are a collection of any number of related observations. Statistical data are the basic material needed to make an effective decision in a particular situation. Data collection is a continuous process of measuring, counting and observing. It is necessary because:
• It provides important inputs on the topic under study.
• It measures the performance of an ongoing process and of the situations under study.
• Hidden facts can be discovered.
• It helps in decision-making and in estimating costs.

The work of data collection starts once the research problem and the research design have been planned. Data can be classified into (i) qualitative data and (ii) quantitative data.

Qualitative data: data which cannot be expressed numerically, i.e., which can only be expressed in terms of attributes.
Quantitative data: data which can be expressed numerically, i.e., whose characteristics are expressed in terms of numbers.

Example: when people are grouped according to their heights, we can find their average height; but if they are classified according to their occupation, it is not possible to find anything like an average occupation. Thus height is quantitative data and occupation is qualitative data. Religion, language, beauty and behavior also belong to qualitative data.

3.1.1 Primary Data: Primary data are gathered by the researcher for the purpose of the project/research at hand. They are collected by the researcher for the first time for a specific purpose. Advantages: original in character, reliable information.

3.1.2 Secondary Data: Secondary data have already been collected by someone else and have already been passed through the statistical process. Advantages: easy to collect; involve less time and cost; deficiencies can be identified easily.

3.2 Collection of Primary Data

3.2.1 Observation Method

The observation method is a common method of data collection, used primarily in the behavioral sciences. It becomes a scientific tool when it serves a formulated research purpose, is systematically planned and recorded, and is subjected to checks and controls on validity and reliability. Here the information is sought by way of the investigator's own direct observation, without asking the respondent. We should keep in mind the following points:
I. What should be observed?
II. How should the observations be recorded?
III. How can the accuracy of the observations be ensured?

Structured observation: here the observation is characterized by a definition of the units to be observed, the steps for recording the observed information and standardized conditions of observation. It is appropriate in descriptive studies.
Unstructured observation: here the observation takes place without taking specific characteristics into consideration. It is appropriate in exploratory studies.
Participant observation: here the observer observes the situation by making himself a member of the group he is observing. It helps to record the natural behavior of the group, and the observer can verify the truth of statements made in response to a questionnaire. But if the observer becomes emotionally involved, he may narrow down his range of experience as a researcher.
Non-participant observation: here the observer observes as a detached emissary; he does not experience what the respondents feel.
Controlled observation: here the observation takes place according to a definite pre-arranged plan, involving experimental procedure. It usually takes place in laboratories.
Uncontrolled observation: here the observations take place in natural settings. The main aim is to get a spontaneous picture of real-life situations. It is resorted to in exploratory studies.

Advantages:
I. If the observation is done accurately, it helps to eliminate subjective bias.
II. The current information is neither affected by past information nor by future intentions.
III. It is independent of the willingness of the respondent to respond, and hence it is suitable for situations where a verbal report from the respondent is not required.


Limitations:
I. It is an expensive method.
II. The information obtained is very limited.
III. Since we do not talk to people, some unforeseen factors may interfere with the observational task.

3.2.2 Interview Method

An interview is a discussion between two or more people for a definite purpose. It is the most powerful method of data collection and helps us gather valid and reliable data related to the research objective. It is divided into two types: the personal interview and the telephonic interview.

Personal interviews: this method requires two persons sitting face to face; the one who initiates and asks the questions is called the interviewer, and the respondent is called the interviewee.
I. Structured interviews: a rigid approach involving a set of predetermined questions and highly standardized techniques of recording. It is used in descriptive studies. This method is often used because the responses given by several interviewees are easy to generalize and because it requires less skill on the part of the interviewer. For example, if a company is conducting a survey before launching a product, its questionnaire consists of a set of pre-defined questions.
II. Unstructured interviews: characterized by a flexible approach to questioning. The interviewer may ask the questions in any order, ask extra questions or drop some questions. It requires the interviewer to be highly skilled and to have deep knowledge of the subject. It helps in collecting information in exploratory research studies, but this flexibility results in a lack of comparability between one interview and another, and it is time-consuming. Examples: a faculty interview for a specific subject, or corporate interviews for a particular domain.
III. Focused interviews: meant to focus attention on a given experience of the respondent. The interviewer decides the sequence of the questions and also has the freedom to explore reasons and motives, and the respondent is given sufficient time to express his or her thoughts and observations. It deals with situations that have been analyzed prior to the interview and takes place with persons known to have been in a particular situation. For example, if a person has witnessed an incident at first hand, reporters would talk to that person about the incident and allow him to express his thoughts; an interview with a well-known sports person on issues related to sports and its areas of development is another example of a focused interview.
IV. Non-directive interviews: the interviewer allows the respondent to speak on a particular topic and relate concrete experiences with little or no direction from the interviewer.


3.2.3 Collection of Data through Questionnaires

A questionnaire is a research instrument consisting of a series of questions and other prompts for the purpose of gathering information from respondents. Questionnaire-based surveys are one of the most common tools used by market researchers to establish consumer preferences. Bad questionnaires are misleading and likely to yield meaningless data, so an awareness of the techniques of questionnaire design is essential to any student or researcher wanting to establish opinions on their subject of specialization. There are two main objectives in designing a questionnaire:
• To maximize the proportion of subjects answering the questionnaire, that is, the response rate. Response error should be avoided, and we should try to obtain accurate, relevant information for the survey.
• To develop questions which the respondent can and will answer. Two apparently similar ways of posing a question may yield different information.

Guidelines for constructing a questionnaire

The researcher must pay attention to the following points in constructing an appropriate and effective questionnaire:
1. The researcher must keep in view the problem he is to study, for it provides the starting point for developing the questionnaire. He must be clear about the various aspects of the research problem to be dealt with in the course of his research project.
2. The appropriate form of the questions depends on the nature of the information sought, the sampled respondents and the kind of analysis intended. The researcher must decide whether to use closed or open-ended questions. Questions should be simple and must be constructed so that they form a logical part of a well thought out tabulation plan. The units of enumeration should also be defined precisely to ensure accurate and full information.
3. A rough draft of the questionnaire should be prepared, giving due thought to the appropriate sequence of the questions. Questionnaires drafted previously may also be looked at during this stage.
4. The researcher must invariably re-examine, and if needed revise, the rough draft. Technical defects must be minutely scrutinized and removed.
5. A pilot study should be undertaken to pre-test the questionnaire, which may then be edited in the light of the results of the pilot study.
6. The questionnaire must contain simple but straightforward directions for the respondents so that they do not feel any difficulty in answering the questions.

3.3 Collection of Secondary Data

Data that are already available are called secondary data: they have already been collected and analysed by someone else. Secondary data may be published or unpublished. Published data are usually available in books, magazines, reports and publications of various associations, and reports prepared by research scholars, economists, universities, etc. Unpublished data may be found in diaries, letters and unpublished biographies, and may also be available with research scholars, trade associations and other public/private individuals and organizations. The researcher must scrutinize secondary data minutely, because they may be unsuitable or inadequate in the context of the problem the researcher wants to study.


Chapter-IV Data Processing and Analysis

Contents:
4.1 Introduction
4.2 Data Entry
4.2.1 Decision on File Format
4.2.2 Devise Code for Analysis
4.3 Processing of Data
4.3.1 Frequency Distribution
4.3.2 Cumulative Frequency Distribution
4.3.3 Relative Frequency Distribution
4.4 Presenting Data
4.4.1 Histograms
4.4.2 Ogive
4.4.2.1 A Less-than Ogive
4.4.2.2 A More-than Ogive
4.4.3 A Less-than Cumulative Frequency Polygon
4.4.4 A More-than Cumulative Frequency Polygon
4.4.5 Pie Chart
4.4.6 Pareto Chart
4.4.7 Time Series Graph
4.5 Measures of Central Tendency
4.5.1 The Mode
4.5.2 The Median
4.5.3 Arithmetic Mean
4.5.4 Standard Deviation

4.1 Introduction

After collecting data, they must be classified and presented in meaningful forms to gain better insight into the research problem. Once the information is tabulated, it is easy to perform various statistical tests of validity, accuracy and significance. The gathered information should be presented in such a manner that even a layman understands the what, why, when and how of the information.

4.2 Data Entry

Data entry is the process of taking completed questionnaires/surveys and putting them into a form that can readily be analyzed. A series of options need to be considered when you enter the information you have gathered: you will first have to decide on a file format and then devise a code for analysis.

4.2.1 Decision on File Format
This comprises decisions regarding:
1. The way the data will be organized in a file
2. The order of the information collected
3. How each subject is referenced
4. Constructing individual records
5. Application to statistics programs

4.2.2 Devise Code for Analysis
The main points to remember while devising the code for analysis are:
1. A set of rules that translates answers into discrete values
2. Alphabetical or numerical codes, depending on the measurement scale
3. Preserve the level of measurement of each item
4. General considerations (closed questions)
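For instance, such a set of rules can be written down as a small codebook. The sketch below (Python) assumes a hypothetical closed question on income category; the labels, numeric codes and missing-value code are illustrative, not taken from any particular survey.

# Hypothetical codebook for a closed question on income category.
# The category labels, numeric codes and missing-value code are illustrative.
INCOME_CODES = {
    "Low Income": 1,
    "Medium Income": 2,
    "High Income": 3,
}
MISSING = 9  # a conventional code for blank or unusable answers

def code_response(answer):
    """Translate a questionnaire answer into its discrete numeric code."""
    return INCOME_CODES.get(answer.strip(), MISSING)

responses = ["Low Income", "High Income", "", "Medium Income"]
print([code_response(r) for r in responses])  # [1, 3, 9, 2]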

4.3 Processing of Data

4.3.1 Frequency Distribution
If the data are of a repeating nature, they should be presented in the form of the number of occurrences of each value of the data of a particular type. The following are the steps in constructing a frequency distribution:
1. Specify the number of class intervals. A class is a group (category) of interest. No totally accepted rule tells us how many intervals are to be used; between 5 and 15 class intervals are generally recommended. Note that the classes must be both mutually exclusive and all-inclusive. Mutually exclusive means that the classes must be selected such that an item cannot fall into two classes; all-inclusive classes are classes that together contain all the data.
2. When all intervals are to be of the same width, the following rule may be used to find the required class interval width:
   W = (L - S) / K
   where W = class width, L = the largest data value, S = the smallest data value and K = the number of classes.
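As an illustration of these two steps, the sketch below (Python) groups a hypothetical list of monthly incomes into K = 6 equal-width classes; the income figures and variable names are assumptions made for the example only.

# Sketch: constructing a continuous frequency distribution from raw data,
# using the class-width rule W = (L - S) / K described above.
# The income figures below are hypothetical.
incomes = [1200, 3100, 4500, 7800, 9900, 12500, 14100, 15500,
           16000, 17200, 18900, 19800, 21000, 23500, 26700, 28000]

K = 6                      # chosen number of classes (between 5 and 15)
L = max(incomes)           # largest value
S = min(incomes)           # smallest value
W = (L - S) / K            # class width

frequencies = [0] * K
for x in incomes:
    k = min(int((x - S) // W), K - 1)   # last class is closed on both ends
    frequencies[k] += 1                 # count the item in its class

for k, f in enumerate(frequencies):
    lower = S + k * W
    print(f"{lower:8.0f} - {lower + W:8.0f}: {f}")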


The frequency distribution can be classified into discrete frequency distribution and continuous frequency distribution, which are demonstrated in Tables 1 and 2, respectively.

Table 1: Discrete Frequency Distribution
Income Category     Number of Respondents (frequency)
Low Income          300
Medium Income       200
High Income         100

Table 2: Continuous Frequency Distribution
Monthly income (in rupees) (class interval)    Number of Respondents (frequency)
0-5000          20
5000-10000      30
10000-15000     40
15000-20000     60
20000-25000     30
25000-30000     20

The frequency distribution:
1. Shows how the observations cluster around a central value.
2. Shows the degree of difference between observations.
For example, in the above problem we know that no person has a monthly income greater than 30,000, and that the largest number of people have a monthly income between 15,000 and 20,000. This descriptive analysis provides us with an image of the monthly income of the population which is not available from the raw data.

4.3.2 Cumulative Frequency Distribution
The cumulative frequency distribution is a modified form of the frequency distribution, as shown in Table 3. In a given row, the value in the last column is the cumulative total of the frequencies in the preceding column up to and including that row.

Table 3: Cumulative Frequency Distribution
Monthly income (in rupees) (class interval)    Number of Respondents (frequency)    Cumulative frequency
0-5000          20      20
5000-10000      30      50
10000-15000     40      90
15000-20000     60      150
20000-25000     30      180
25000-30000     20      200

4.3.3 Relative Frequency Distribution
The relative frequency distribution is a modified form of the frequency distribution, as shown in Table 4. In a given row, the value in the last column is the ratio between the frequency of that row and the total frequency.

Table 4: Relative Frequency Distribution
Monthly income (in rupees) (class interval)    Number of Respondents (frequency)    Relative frequency
0-5000          20      20/200 = 0.10
5000-10000      30      30/200 = 0.15
10000-15000     40      40/200 = 0.20
15000-20000     60      60/200 = 0.30
20000-25000     30      30/200 = 0.15
25000-30000     20      20/200 = 0.10
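Both derived columns follow mechanically from the frequency column of Table 2. The short sketch below (Python) reproduces the cumulative column of Table 3 and the relative column of Table 4.

# Sketch: deriving the cumulative (Table 3) and relative (Table 4) frequency
# columns from the frequency column of Table 2.
intervals   = ["0-5000", "5000-10000", "10000-15000",
               "15000-20000", "20000-25000", "25000-30000"]
frequencies = [20, 30, 40, 60, 30, 20]

total = sum(frequencies)           # 200 respondents in all
cumulative = 0
for interval, f in zip(intervals, frequencies):
    cumulative += f                # running total up to this class
    relative = f / total           # share of this class in the whole
    print(f"{interval:>12}  f={f:3d}  cum={cumulative:3d}  rel={relative:.2f}")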

4.4 Presenting Data

Graphs, curves and charts are used to present data. Bar charts are used to graph qualitative data; the bars do not touch, indicating that the attributes are qualitative categories and the variables are discrete, not continuous.

4.4.1 Histograms are used to graph absolute, relative and cumulative frequencies.

4.4.2 Ogive: an ogive is a cumulative frequency curve, classified into the less-than ogive and the more-than ogive. An ogive is constructed by placing a point corresponding to the upper end of each class at a height equal to the cumulative frequency of the class; these points are then connected. An ogive can also show the relative cumulative frequency distribution on the right-hand axis.

4.4.2.1 A less-than ogive shows how many items in the distribution have a value less than the upper limit of each class.

4.4.2.2 A more-than ogive shows how many items in the distribution have a value greater than or equal to the lower limit of each class.
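For instance, a less-than ogive for the income data of Table 3 could be sketched as below; matplotlib is assumed to be available, and the upper class limits are those of Table 3.

# Sketch: a less-than ogive for the cumulative frequencies of Table 3.
# Assumes matplotlib is installed; the upper limits are those of Table 3.
import matplotlib.pyplot as plt

upper_limits = [5000, 10000, 15000, 20000, 25000, 30000]
cumulative   = [20, 50, 90, 150, 180, 200]

plt.plot(upper_limits, cumulative, marker="o")   # one point at the upper end of each class
plt.xlabel("Monthly income (upper class limit, rupees)")
plt.ylabel("Cumulative number of respondents")
plt.title("Less-than ogive")
plt.show()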


4.4.3 A less-than cumulative frequency polygon is constructed by using the upper true limits and the cumulative frequencies.

4.4.4 A more-than cumulative frequency polygon is constructed by using the lower true limits and the cumulative frequencies.

4.4.5 Pie chart: the pie chart is often used in newspapers and magazines to depict budgets and other economic information. A complete circle (the pie) represents the total number of measurements, and the size of a slice is proportional to the relative frequency of a particular category. For example, since a complete circle equals 360 degrees, if the relative frequency for a category is 0.40, the slice assigned to that category is 40% of 360, or (0.40)(360) = 144 degrees.

4.4.6 Pareto chart: the Pareto chart is a special case of the bar chart, often used in quality control. The purpose of this chart is to show the key causes of unacceptable quality; each bar shows the degree of the quality problem for each variable measured.

4.4.7 Time series graph: a graph in which the X axis shows time periods and the Y axis shows the values related to those time periods.
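The pie-chart arithmetic generalises directly: every category's slice is its relative frequency multiplied by 360 degrees. A minimal sketch (Python), reusing the income categories of Table 1:

# Sketch: converting the counts of Table 1 into pie-chart slice angles,
# generalising the (0.40)(360) = 144 degrees example above.
counts = {"Low Income": 300, "Medium Income": 200, "High Income": 100}

total = sum(counts.values())
for category, n in counts.items():
    relative = n / total              # relative frequency of the category
    degrees = relative * 360          # size of the slice in degrees
    print(f"{category:>14}: {relative:.2f} of the pie = {degrees:.0f} degrees")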

4.5 Measures of Central Tendency

Central tendency is defined as the central point around which data revolve. The following techniques can be employed:

4.5.1 The Mode
The mode is defined as the score (value or category) of the variable that is observed most frequently. For example, in the series 3 7 5 8 6 4 5 9 5, the mode equals 5, because 5 is the most frequent score among all the numbers (it occurs 3 times).

4.5.2 The Median
The median indicates the middle value of a series of sequentially ordered scores. Because the median divides the frequencies into two equal parts, it can also be described as the fiftieth percentile. Consider the scores:

10 13 14 15 18 19 22 25 25

The median of this series is the fifth score, which is 18; there are 4 scores on either side of the value 18. In cases where you have an even number of scores, for instance:

10 13 14 15 18 19 22 25 26 29

two numerical values indicate the median: the fifth score (18) and the sixth score (19) lie in the middle of the sequentially ordered scores. The median is found by adding them and dividing the result by 2, so the median for these scores is (18 + 19)/2 = 18.5.

Note: for moderately skewed distributions, Mode ≈ 3 × Median − 2 × Mean.

4.5.3 Arithmetic Mean
The arithmetic mean is a measure of central tendency found by adding all the scores and dividing the total by the number of scores:

mean = (sum total of scores) / N

For example, for the scores 5, 2, 6, 1, 6 we have 5 + 2 + 6 + 1 + 6 = 20; there are 5 scores, so N = 5 and the sum total of the scores (20) is divided by 5, giving a mean of 4. For grouped data,

m = (∑fx) / ∑f, with ∑f = n

where f = frequency, x = mid value of the class interval and n = total frequency.

4.5.4 Standard Deviation
The standard deviation is a measure of the spread or dispersion of a distribution of scores. The deviation of each score from the mean is squared; the squared deviations are summed; the result is divided by n; and the square root is taken. It is denoted by σ.

Ungrouped data:
σ = √( ∑(x − m)² / n )

Grouped data:
σ = √( ∑f(x − m)² / n )

where m = (∑fx) / ∑f, ∑f = n, f = frequency, x = the mid value of the class interval (or the score itself for ungrouped data) and n = total frequency.
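The measures above can be checked with a few lines of Python. The sketch below uses the standard library statistics module for the ungrouped scores of the mode example, and then applies the grouped-data formulas to the class intervals of Table 2; the mid values are the ones implied by those intervals.

# Sketch: central tendency and dispersion for the ungrouped scores used in
# the mode example (3 7 5 8 6 4 5 9 5).
import math
import statistics

scores = [3, 7, 5, 8, 6, 4, 5, 9, 5]
print(statistics.mode(scores))     # 5   (most frequent score)
print(statistics.median(scores))   # 5   (middle value of the ordered scores)
print(statistics.mean(scores))     # arithmetic mean
print(statistics.pstdev(scores))   # population standard deviation (divides by n)

# Grouped data: m = (sum of f*x) / n and sigma = sqrt(sum of f*(x - m)^2 / n),
# where x is the mid value of each class interval of Table 2.
mids  = [2500, 7500, 12500, 17500, 22500, 27500]
freqs = [20, 30, 40, 60, 30, 20]

n = sum(freqs)
m = sum(f * x for f, x in zip(freqs, mids)) / n
sigma = math.sqrt(sum(f * (x - m) ** 2 for f, x in zip(freqs, mids)) / n)
print(m, round(sigma, 1))          # grouped mean and standard deviation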


Chapter-V Measurement and Scaling Techniques

Contents:
5.1 Levels of Measurement
5.1.1 Nominal Scale
5.1.2 Ordinal Scale
5.1.3 Interval Scale
5.1.4 Ratio Scale
5.2 Important Scaling Techniques
5.2.1 Rating Scales
5.2.2 Ranking Scales
5.3 Scale Construction Techniques
5.3.1 Arbitrary Scales
5.3.2 Likert Scales
5.3.2.1 Procedure for Developing a Likert-type Scale
5.3.2.2 Advantages
5.3.2.3 Limitations
5.3.3 Cumulative Scale
5.3.3.1 Procedure for Cumulative Scales
5.3.3.2 Advantages of Cumulative Scale
5.3.3.3 Disadvantage


5.1 Levels of Measurement

The level of measurement is the scale by which a variable is measured. For over 50 years, with few detractors, science has used the Stevens (1951) typology of measurement levels (scales). There are three things to remember about this typology: anything that can be measured falls into one of the four types; the higher the level of measurement, the more precision in measurement; and every level up contains all the properties of the previous level. The four levels of measurement, from lowest to highest, are:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

(Fig. 4: levels of measurement)

Ordinal and nominal data are always discrete; continuous data must be at either the ratio or the interval level of measurement.

5.1.1 Nominal Scale
A nominal scale is a system of assigning number symbols to events in order to label the data. It includes demographic characteristics like sex, race and religion, and therefore plays a major role in surveys and other ex-post-facto research where we classify the data by major sub-groups of the population. Thus the nominal level of measurement describes variables that are categorical in nature; the characteristics of the data you are collecting fall into distinct categories:
1. If there are a limited number of distinct categories (usually only two), you are dealing with a dichotomous variable.
2. If there are an unlimited or infinite number of distinct categories, you are dealing with a continuous variable.
The nominal scale is the least powerful level of measurement. It indicates no order and no relationship and has no arithmetic origin. For example:

Which of the following food items do you tend to buy at least once per month? (Please tick)
Okra
Palm Oil
Milled Rice
Peppers
Prawns
Pasteurized milk

The numbers have no arithmetic properties and act only as labels. The only measure of average that can be used is the mode, because this is simply a set of frequency counts.

5.1.2 Ordinal Scale
The ordinal scale is the lowest level of the ordered scales. The ordinal level of measurement describes variables that can be ordered or ranked in some order of importance. It describes most judgments about things, such as big or little, strong or weak, and most opinion and attitude scales or indexes in the social sciences are ordinal in nature. An ordinal scale determines, for example, a student's rank in his class. Thus its use implies a statement of "greater than" or "lesser than" without being able to state how much greater or less. An example of an ordinal scale used to determine farmers' preferences among five brands of pesticide:

Order of preference    Brand
1                      Rambo
2                      R.I.P.
3                      Killalot
4                      D.O.A.
5                      Bugdeath

From such a table the researcher knows the order of preference but nothing about how much more one brand is preferred to another; that is, there is no information about the interval between any two brands. All of the information a nominal scale would have given is available from an ordinal scale. In addition, positional statistics such as the median, quartiles and percentiles can be determined.

5.1.3 Interval Scale
Interval scales have more or less equal intervals, or meaningful distances, between their ranks. For example, if you were to ask somebody whether they were a first, second or third generation immigrant, the assumption is that the distance, or number of years, between each generation is the same. Interval scales may have an arbitrary zero, but it is not possible for them to determine what may be called an "absolute zero" or "unique origin". The Fahrenheit scale is an example of an interval scale.

Figure 3.3: Examples of interval scales in numeric and semantic formats

(a) Please indicate your views on Balkan Olives by scoring them on a scale of 5 down to 1 (i.e. 5 = Excellent; 1 = Poor) on each of the criteria listed. Circle the appropriate score on each line.
Balkan Olives are:
Succulent               5 4 3 2 1
Fresh tasting           5 4 3 2 1
Free of skin blemish    5 4 3 2 1
Good value              5 4 3 2 1
Attractively packaged   5 4 3 2 1

(b) Please indicate your views on Balkan Olives by ticking the appropriate responses below (Excellent / Very Good / Good / Fair / Poor):
Succulence
Freshness
Freedom from skin blemish
Value for money
Attractiveness of packaging

Most of the common statistical methods of analysis require only interval scales in order that they might be used.

5.1.4 Ratio Scale
The ratio level of measurement describes variables that have equal intervals and a fixed zero (or reference) point. It is possible to have zero income, zero education and no involvement in crime, but we rarely see ratio-level variables in social science, since it is almost impossible to have zero attitudes on things, although "not at all", "often" and "twice as often" might qualify as ratio-level measurement. The ratio scale helps in the measurement of physical dimensions such as weight, height, distance, etc. It allows comparison of both differences in scores and the relative magnitude of scores; multiplication, division and all statistical techniques are generally usable with ratio scales.


5.2. Important Scaling Techniques


We now take up some scaling techniques that are often used in the context of social or business research.

5.2.1 Rating Scales: A rating scale involves a qualitative description of a limited number of aspects of a thing or of traits of a person. In this we judge the properties of objects against specified criteria, without reference to other similar objects. In practice, three- to seven-point scales are generally used, for the reason that more points on a scale provide an opportunity for greater sensitivity of measurement. A rating scale may be either a graphic rating scale or an itemized rating scale. The graphic rating scale is quite simple and is commonly used in practice: the various points are put along a line to form a continuum, and the rater indicates his rating by simply making a mark at the appropriate point on a line that runs from one extreme to the other. The itemized rating scale presents a series of statements from which a respondent selects the one that best reflects his evaluation; these statements are ordered progressively in terms of more or less of some property.

5.2.2 Ranking Scales: In ranking scales we make relative judgments against other similar objects. The respondents under this method directly compare two or more objects and make choices among them.

How do you like the product? (Please check)
Like very much   Like somewhat   Neutral   Dislike somewhat   Dislike very much

5.3 Scale Construction Techniques
In social science studies, while measuring the attitudes of people, we generally follow the technique of preparing an opinionnaire (or attitude scale) in such a way that the score of an individual's responses assigns him a place on the scale. Under this approach the respondent expresses his agreement or disagreement with a number of statements relevant to the issue. While developing such statements, the researcher must note the following two points:
1. The statements must elicit responses which are psychologically related to the attitude being measured.
2. The statements need to be such that they discriminate not merely between extremes of attitude but also among individuals who differ slightly.

5.3.1 Arbitrary Scales: These scales are developed on an ad hoc basis and are designed largely through the researcher's own subjective selection of items. The researcher first collects a few statements or items which he believes are unambiguous and appropriate to the given topic; some of these are selected for inclusion in the measuring instrument, and people are then asked to check in a list the statements with which they agree. The chief merit of such scales is that they can be developed very easily, quickly and with relatively little expense. They can also be designed to be highly specific and adequate. Because of these benefits, such scales are widely used in practice. At the same time there are some limitations. The most important one is that we have no objective evidence that such scales measure the concepts for which they have been developed; we simply have to rely on the researcher's insight and competence.

5.3.2 Likert Scales: A Likert scale is what is termed a summated instrument scale. This means that the items making up a Likert scale are summed to produce a total score. In fact, a Likert scale is a composite of itemised scales. Typically, each scale item has 5 categories, with scale values ranging from -2 to +2 with 0 as the neutral response. This explanation may be clearer from the example in Figure 3.12.

Figure 3.12: The Likert scale
(Respondents circle 1 = Strongly Agree, 2 = Agree, 3 = Neither, 4 = Disagree, 5 = Strongly Disagree against each statement.)

If the price of raw materials fell, firms would reduce the price of their food products.            1 2 3 4 5
Without government regulation the firms would exploit the consumer.                                 1 2 3 4 5
Most food companies are so concerned about making profits they do not care about quality.           1 2 3 4 5
The food industry spends a great deal of money making sure that its manufacturing is hygienic.      1 2 3 4 5
Food companies should charge the same price for their products throughout the country.              1 2 3 4 5

Likert scales are treated as yielding interval data by the majority of marketing researchers. The scales described in this chapter are among the most commonly used in marketing research. Whilst there are a great many more forms which scales can take, students who are familiar with those described here will be well equipped to deal with most types of survey problem. A typical set of response categories is: strongly agree, agree, undecided, disagree, strongly disagree.

5.3.2.1 Procedure for Developing a Likert-type Scale:
1. The researcher collects a large number of statements which are relevant to the attitude being studied, each of which expresses definite favourableness or unfavourableness to a particular point of view.
2. After the statements have been gathered, a trial test should be administered to a number of subjects.
3. The responses to the various statements are scored in such a way that a response indicative of the most favourable attitude is given the highest score of five and that with the most unfavourable attitude is given the lowest score of one.
4. The total score of each respondent is then obtained by adding the scores he receives for the separate statements.
5. The next step is to array these total scores and find out those statements which have a high discriminatory power. For this purpose, the researcher may select some part of the highest and the lowest total scores, say the top 25% and the bottom 25%.
6. Only those statements that correlate with the total test should be retained in the final instrument; all others must be discarded. A short sketch of the scoring in steps 3 and 4 follows.
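A minimal sketch (Python) of that scoring, assuming a hypothetical five-statement instrument and three respondents; the answer lists and respondent labels are illustrative only.

# Sketch of steps 3 and 4: score each Likert response (5 = most favourable,
# 1 = least favourable) and sum the scores per respondent.
# The respondents and their answers are hypothetical.
SCORE = {"strongly agree": 5, "agree": 4, "undecided": 3,
         "disagree": 2, "strongly disagree": 1}

respondents = {
    "R1": ["strongly agree", "agree", "agree", "undecided", "strongly agree"],
    "R2": ["disagree", "strongly disagree", "undecided", "disagree", "agree"],
    "R3": ["agree", "agree", "strongly agree", "agree", "agree"],
}

totals = {rid: sum(SCORE[answer] for answer in answers)
          for rid, answers in respondents.items()}
print(totals)   # {'R1': 21, 'R2': 12, 'R3': 21}
# The highest and lowest totals (e.g. the top and bottom 25%) would then be
# used in step 5 to identify statements with high discriminatory power.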

5.3.2.2 Advantages: 1. A Likert-type scale is considered more reliable because under it respondents answer each statement included in the instrument; as such it also provides more information. 2. A Likert-type scale can easily be used in both respondent-centered and stimulus-centered studies.

5.3.2.3 Limitations: There are several limitations to the Likert-type scale. One important limitation is that, with this scale, we can simply examine whether respondents are more or less favorable to a topic, but we cannot tell how much more or less they are. There is no basis for believing that the five positions indicated on the scale are equally spaced.

5.3.3 Cumulative scale: A cumulative scale, like other scales, consists of a series of statements to which a respondent expresses his agreement or disagreement. The special feature of this type of scale is that the statements form a cumulative series. In other words, the statements are related to one another in such a way that an individual who replies favorably to item 3 also replies favorably to items 2 and 1, and one who replies favorably to item 4 also replies favorably to items 3, 2 and 1, and so on.

5.3.3.1 Procedure for cumulative scales:
1. We must lay down in clear terms the issue we want to deal with in our study.
2. The next step is to develop a number of items relating to the issue and to eliminate by inspection the items that are ambiguous, irrelevant or too extreme.
3. This step consists in pre-testing the items to determine whether the issue at hand is scalable. In a pre-test the respondents are asked to record their opinions on all the selected items using a Likert-type 5-point scale, ranging from 'strongly agree' to 'strongly disagree'. The strongest favorable response is scored as 5, whereas the strongest unfavorable response is scored as 1. The total score can thus range, if there are 15 items in all, from 75 for the most favorable to 15 for the least favorable.
4. The next step is to total the scores for the various opinions and to rearray them to reflect any shift in order resulting from reducing the items, say, from 15 in the pre-test to, say, 5 for the final scale.

5.3.3.2 Advantages of cumulative scales: It assures that only a single dimension of attitude is being measured. The researcher's subjective judgment is not allowed to creep into the development of the scale, since the scale is determined by the replies of respondents.

5.3.3.3 Disadvantage: The main difficulty in using this scaling technique is that, in practice, perfectly cumulative or unidimensional scales are very rarely found, and we have to work with an approximation, testing it through the coefficient of reproducibility or examining it on the basis of some other criteria. This method is not frequently used, for the simple reason that its development procedure is tedious and complex.
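The coefficient of reproducibility mentioned above can be approximated as one minus the proportion of responses that depart from the ideal cumulative pattern implied by each respondent's total score. A minimal sketch with hypothetical agree/disagree (1/0) data, the items assumed to be ordered from least to most extreme:

import numpy as np

# Hypothetical responses: rows = respondents, columns = items ordered from least to most extreme
responses = np.array([
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],   # breaks the cumulative pattern
    [1, 1, 1, 1],
    [0, 1, 0, 0],   # breaks the cumulative pattern
])

n_items = responses.shape[1]
errors = 0
for row in responses:
    k = row.sum()                                       # respondent's total score
    ideal = np.array([1] * k + [0] * (n_items - k))     # ideal cumulative response pattern
    errors += int(np.sum(row != ideal))                 # deviations from that pattern

reproducibility = 1 - errors / responses.size
print(f"Coefficient of reproducibility: {reproducibility:.2f}")   # 0.90 or above is usually taken as scalable

This is only one of several conventions for counting errors; it is meant to show the idea rather than a definitive formula.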


Chapter-VI Sampling Design

Contents:
6.1 Introduction
    6.1.1 Need for Sampling
    6.1.2 Concept of Population and Sample
    6.1.3 Sampling Frame
6.2 Census and Sample Survey
6.3 Types of Sampling
    6.3.1 Non-Probability Sampling Methods
        6.3.1.1 Convenience Sampling
        6.3.1.2 Judgement Sampling
        6.3.1.3 Quota Sampling
        6.3.1.4 Snowball Sampling
        6.3.1.5 Advantages of Non-probability Sampling
        6.3.1.6 Disadvantages of Non-probability Sampling
    6.3.2 Probability Sampling Methods
        6.3.2.1 Simple Random Sampling
        6.3.2.2 Stratified Random Sampling
        6.3.2.3 Systematic Random Sampling
6.4 Sample Size and its Determination
6.5 Sampling Distributions
6.6 Important Sampling Distributions

6.1 Introduction

In this lesson, we shall describe the basics of how to collect data. We shall also discuss a variety of methods of selecting the sample, called sampling designs, which can be used to generate our sample data sets.

6.1.1 Need for Sampling
Sampling is used in practice for a variety of reasons:
1. Sampling can save time and money. A sample study is usually less expensive than a census study and produces results at a relatively faster speed.
2. Sampling may enable more accurate measurements, because a sample study is generally conducted by trained and experienced investigators.
3. Sampling remains the only way when the population contains infinitely many members.
4. Sampling remains the only choice when a test involves the destruction of the items under study.
5. Sampling usually enables us to estimate the sampling errors and, thus, assists in obtaining information concerning some characteristic of the population.

6.1.2 Concept of Population and Sample
Statisticians commonly separate statistical techniques into two broad categories: descriptive and inferential. Descriptive statistics deals with collecting, summarizing and simplifying complicated data; it also helps in understanding the data and in report making. Inferential statistics deals with the methods used for drawing inferences about the totality of observations on the basis of the knowledge gained from a sample.
A population is roughly defined as the collection of all elements taken into consideration and about which conclusions have to be drawn. For example, if a study is being conducted to determine the average salary of the workers of a factory, then the population will consist of the workers in the factory. Similarly, if we investigate the fertility of land in a region, then the population will consist of all lands under cultivation. Thus population refers to all items under investigation. A sample can be defined as a collection of some elements of the population; in other words, a part of the totality on which information is collected and analyzed for the purpose of understanding some aspect of the population. The part of the population considered for selection is called a sampling unit. For example, a doctor examines a few drops of blood to draw conclusions about the nature of a disease or the blood constitution of the whole body.


When the sampling unit is a single element of the population, it may be viewed as an elementary sampling unit. For example, in the textile industry, the workers of one department whose wages are studied may form a sample, while all the workers of the company are the population. The total number of units in the population is known as the population size, and the total number of units in the sample is known as the sample size. Any characteristic of the population is called a parameter and that of a sample is called a statistic.

6.1.3 Sampling Frame
To select a random sample of sampling units, we need a list of all sampling units contained in the population. Such a list is called a sampling frame.

6.2 Census and Sample Survey
It is possible to examine every element of the population. For example, if we want to calculate the average wage of the workers in a factory, we may study every worker; in that case all the elements of the population are the primary sampling units, and we call such a study a complete enumeration, or CENSUS. The census method is not very popular in practice, since the effort, money and time required for carrying out a complete enumeration are generally extremely large and, in many cases, it involves huge cost. The standard deviation of a sampling distribution is called the standard error; the larger the sample size, the lower the standard error. Various sources of sampling and non-sampling error, along with the principles of sampling, also have to be kept in mind. For the process of statistical inference to be valid, we must ensure that we take a representative sample of our population. Whatever method of sample selection we use, it is vital that the method is described. How do we know if the characteristics of a sample we take match the characteristics of the population we are sampling? The short answer is that we don't. We can, however, take steps that make it as likely as possible that the sample will be representative of the population. Two simple and effective methods of doing this are making sure that the sample size is large and making sure that it is randomly selected. A large sample size is more likely to be representative of a population than a small one. Think of extreme cases: if we want to know the average height of the population and we select just one person and measure their height, it is unlikely to be close to the population


average. If we took 1,000,000 people, measured their heights and took the average, this figure would be likely to be close to the population average.

6.3 Types of Sampling
The type of enquiry you want to conduct and the nature of data that you want to collect fundamentally determine the technique or method of selecting a sample. The procedure of selecting a sample may be broadly classified under the following three heads:
— Non-Probability Sampling Methods
— Probability Sampling Methods
— Mixed Sampling
Now let us discuss these in detail. We will start with non-probability sampling and then move on to probability sampling.

6.3.1 Non-Probability Sampling Methods: The common feature of non-probability sampling methods is that subjective judgments are used to determine the population elements that are contained in the sample. We classify non-probability sampling into four groups:
1. Convenience Sampling
2. Judgement Sampling
3. Quota Sampling
4. Snowball Sampling

6.3.1.1 Convenience Sampling
This type of sampling is used primarily for reasons of convenience. It is used for exploratory research and speedy situations. It is often used for new product formulations or to provide gross sensory evaluations by using employees, students, peers, etc. Convenience sampling is extensively used in marketing studies. This would be clear from the following examples:

1. Suppose a marketing research study aims at estimating the proportion of Pan (betel leaf) shops in Delhi which stock a particular drink, Maaza. It is decided to take a sample of size 150. What the investigator does is to visit the 150 Pan shops nearest to his office, as this is very convenient for him, and observe whether each Pan shop stocks Maaza or not. This is definitely not a representative sample, as most Pan shops in Delhi have no chance of being selected; only those Pan shops which are near the office of the investigator have a chance of being selected.
2. A ball pen manufacturing company is interested in knowing the opinions about the ball pen (such as smooth flow of ink, resistance to breakage of the cover, etc.) it is presently manufacturing, with a view to modifying it to suit customers'


needs. The job is given to a marketing researcher who visits a college near his place of residence and asks a few students (a convenient sample) their opinion about the ball pen in question.

6.3.1.2 Judgement Sampling
It is that sampling in which the selection criteria are based upon the researcher's personal judgment that the members of the sample are representative of the population under study. It is used for most test markets and many product tests conducted in shopping malls. If personal biases are avoided, then the relevant experience and the acquaintance of the investigator with the population may help to choose a relatively representative sample from the population. It is not possible to make an estimate of sampling error, as we cannot determine how precise our sample estimates are. Judgement sampling is used in a number of cases, some of which are:

1. Suppose we have a panel of experts to decide about the launching of a new product in the next year. If, for some reason or the other, a member drops out from the panel, the chairman of the panel may suggest the name of another person whom he thinks has the same expertise and experience to be a member of the said panel. This new member was chosen deliberately – a case of judgement sampling.
2. The method could be used in a study involving the performance of salesmen. The salesmen could be grouped into top-grade and low-grade performers according to certain specified qualities. Having done so, the sales manager may indicate who, in his opinion, would fall into which category. Needless to mention, this is a biased method. However, in the absence of any objective data, one might have to resort to this type of sampling.

6.3.1.3 Quota Sampling
This is a very commonly used sampling method in marketing research studies. Here the sample is selected on the basis of certain basic parameters such as age, sex, income and occupation that describe the nature of a population, so as to make it representative of the population. The investigators or field workers are instructed to choose a sample that conforms to these parameters. The field workers are assigned quotas of the number of units satisfying the required characteristics on which data should be collected. However, before collecting data on these units, the investigators are supposed to verify that the units possess these characteristics. Suppose we are conducting a survey to study the buying behaviour of a product and it is believed that the buying behaviour is greatly influenced by the income level of the consumers. We assume that it is possible to divide our population into three income strata: a high-income group, a middle-income group and a low-income group. Further, it is known that 20% of the population is in the high-income group, 35% in the middle-income group and 45% in the low-income group. Suppose it is decided to select a sample of size 200 from the population. Therefore, samples of size 40, 70 and 90 should come from the high-income, middle-income and low-income groups respectively. Now the


various field workers are assigned quotas to select the sample from each group in such a way that a total sample of 200 is selected in the same proportion as mentioned above.

6.3.1.4 Snowball Sampling
— The sampling in which the selection of additional respondents (after the first small group of respondents is selected) is based upon referrals from the initial set of respondents.
— It is used to sample low-incidence or rare populations.
— It is done for the efficiency of finding the additional, hard-to-find members of the sample.

6.3.1.5 Advantages of Non-probability Sampling
— It is much cheaper than probability sampling.
— It is acceptable when the level of accuracy of the research results is not of utmost importance.
— Less research time is required than for probability samples.
— It often produces samples quite similar to the population of interest when conducted properly.

6.3.1.6 Disadvantages of Non-probability Sampling
— You cannot calculate the sampling error. Thus, the minimum required sample size cannot be calculated, which means that you (the researcher) may sample too few or too many members of the population of interest.
— You do not know the degree to which the sample is representative of the population from which it was drawn.
— The research results cannot be projected (generalized) to the total population of interest with any degree of confidence.
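The quota arithmetic in the income-group example above (a sample of 200 allocated in the assumed proportions of 20%, 35% and 45%) can be reproduced mechanically, as in this small sketch:

# Proportional quota allocation for the hypothetical income-group example
sample_size = 200
proportions = {"high income": 0.20, "middle income": 0.35, "low income": 0.45}

quotas = {group: round(sample_size * share) for group, share in proportions.items()}
print(quotas)   # {'high income': 40, 'middle income': 70, 'low income': 90}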

6.3.2 Probability Sampling Methods
Probability sampling is the scientific method of selecting samples according to some laws of chance, in which each unit in the population has some definite, pre-assigned probability of being selected in the sample. The different types of probability sampling are those in which:
1. each unit has an equal chance of being selected;
2. sampling units have different probabilities of being selected;
3. the probability of selection of a unit is proportional to its size.

6.3.2.1 Simple Random Sampling
It is the technique of drawing a sample in such a way that each unit of the population has an equal and independent chance of being included in the sample. In this method an equal probability of selection is assigned to each unit of the population at the first draw. It also implies an equal probability of selection in the subsequent draws. Thus, in a simple random sample from a population of size N, the probability of drawing any unit at the first draw is 1/N, and the conditional probability of drawing any one of the remaining units at the second draw is 1/(N - 1).


The probability of selecting a specified unit of the population at any given draw is equal to the probability of its being selected at the first draw.

Selection of a Simple Random Sample: As we know, a simple random sample refers to that method of selecting a sample in which each and every unit of the population is given an independent and equal chance of being included in the sample. But random sampling does not depend only upon the selection of units; it also depends on the size and nature of the population. One procedure may be good and simple for a small sample but may not be good for a large population. Generally, the method of selecting a sample must be independent of the properties of the sampled population. Proper precautions should be taken to ensure that the selected sample is random. Although human bias is inherent in any sampling scheme administered by human beings, random selection is best for two reasons: it eliminates bias, and statistical theory is based on the idea of random sampling. We can select a simple random sample through the use of tables of random numbers, a computerized random number generator or the lottery method. Thus, the three common methods of drawing a simple random sample are the lottery (sealed envelope) method, the use of tables of random numbers and the use of a computerized random number generator.

Lottery Method: This is the simplest method of selecting a random sample. We will illustrate it by means of an example for better understanding. Suppose we want to select 'r' candidates out of 'n'. We assign the numbers from 1 to n, i.e., to each and every candidate we assign only one exclusive number. These numbers are then written on n slips, which are made as homogeneous as possible in shape, size, colour, etc. These slips are then put in a bag and thoroughly shuffled, and then 'r' slips are drawn one by one. The 'r' candidates corresponding to the numbers on the slips drawn constitute a random sample. This method of selecting a simple random sample is independent of the properties of the population. Generally, in place of slips you can also use cards. We make one card corresponding to one unit of the population by writing on it the number assigned to that particular unit. The pack of cards is a miniature of the population for sampling purposes. The cards are shuffled a number of times and then a card is drawn at random from them. This is one of the most reliable methods of selecting a random sample. (A computerized sketch of this kind of selection is given at the end of this sub-section.)

Merits and Limitations of Simple Random Sampling
Merits
1. Since sample units are selected at random, providing an equal chance to each and every unit of the population to be selected, the element of subjectivity or personal bias is completely eliminated. Therefore, we can say that a simple random sample is more representative of the population than purposive or judgement sampling.


2. You can ascertain the efficiency of the estimates of the parameters by considering the sampling distribution of the statistic (the estimates). For example, one determinant of precision is the sample size: the sample mean is an unbiased estimate of the population mean and becomes a more efficient estimate of it as the sample size increases.
Limitations
1. The selection of a simple random sample requires an up-to-date frame of the population from which samples are to be drawn. It may be impossible to have knowledge about each and every unit of the population if the population happens to be very large; this restricts the use of simple random sampling.
2. A simple random sample may result in the selection of sampling units which are widely spread geographically, and in such a case the administrative cost of collecting the data may be high in terms of time and money.
3. For a given precision, a simple random sample usually requires a larger sample size as compared to stratified random sampling, which we will be studying next.
In practice, some randomly selected samples may also turn out to be quite unrepresentative. This type of problem can be reduced by the use of stratified random sampling, in which the population is divided into different strata. Now, we will move into details of stratified random sampling.
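Before doing so, the lottery-style selection of a simple random sample described earlier can be sketched in a few lines; the sampling frame of 500 workers and the sample size of 25 are hypothetical.

import random

population = list(range(1, 501))    # hypothetical frame: workers numbered 1 to 500
sample_size = 25

random.seed(42)                     # fixed seed only so the illustration is repeatable
sample = random.sample(population, sample_size)   # every unit has an equal chance of selection
print(sorted(sample))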

6.3.2.2 Stratified Random Sampling
We have understood that in simple random sampling the variance of the sample estimate of the population mean is (a) inversely proportional to the sample size, and (b) directly proportional to the variability of the sampling units in the population. We also know that precision is defined as the reciprocal of the sampling variance; therefore, as the sample size increases, precision increases. Apart from increasing the sample size or the sampling fraction n/N, the only way of increasing the precision of the sample mean is to devise a sampling technique which will effectively reduce the population heterogeneity (variance). One such technique is stratified sampling.

Stratification Means Division into Layers: Past data or some other information related to the character under study may be used to divide the population into various groups such that
i. units within each group are as homogeneous as possible, and


ii. the group means are as widely different as possible.
Thus, if we have a population consisting of N sampling units, it is divided into k relatively homogeneous, mutually disjoint (non-overlapping) sub-groups, termed strata, of sizes N1, N2, ..., Nk, such that N = N1 + N2 + ... + Nk. A simple random sample of size ni (i = 1, 2, ..., k) is then drawn from each stratum. This technique of drawing a sample is called stratified random sampling and the sample is called a stratified random sample. There are two points which you have to keep in mind while drawing a stratified random sample:
— proper classification of the population into various strata, and
— a suitable sample size from each stratum.
Both these points are important, because faulty stratification cannot be compensated for by taking large samples.
Advantages of Stratified Random Sampling
1. More Representative: In a non-stratified random sample some strata may be over-represented, others may be under-represented, while some may be excluded altogether. Stratified sampling ensures any desired representation of the various strata of the population in the sample. It rules out the possibility of any essential group of the population being completely excluded from the sample. Stratified sampling thus provides a more representative cross-section of the population and is frequently regarded as the most efficient system of sampling.
2. Greater Accuracy: Stratified sampling provides estimates with increased precision. Moreover, it enables us to obtain results of known precision for each stratum.
3. Administrative Convenience: As compared with a simple random sample, stratified random samples are more concentrated geographically. Accordingly, the time and money involved in collecting the data and interviewing the individuals may be considerably reduced, and the supervision of the field work can be carried out with greater ease and convenience.
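A minimal sketch of this procedure under proportional allocation, i.e. ni = n * Ni / N, followed by a simple random draw within each stratum (all strata sizes and labels are hypothetical):

import random

random.seed(1)
strata = {
    "stratum_A": [f"A{i}" for i in range(1, 201)],    # N1 = 200
    "stratum_B": [f"B{i}" for i in range(1, 301)],    # N2 = 300
    "stratum_C": [f"C{i}" for i in range(1, 501)],    # N3 = 500
}
n_total = 100                                         # desired overall sample size
N = sum(len(units) for units in strata.values())      # N = N1 + N2 + ... + Nk

sample = {}
for name, units in strata.items():
    n_i = round(n_total * len(units) / N)             # proportional allocation
    sample[name] = random.sample(units, n_i)          # simple random sample within the stratum

print({name: len(drawn) for name, drawn in sample.items()})   # {'stratum_A': 20, 'stratum_B': 30, 'stratum_C': 50}

Proportional allocation is only one choice; where strata differ greatly in variability, allocating in proportion to Ni times the stratum standard deviation is often preferred.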


6.3.2.3 Systematic Random Sampling
If a complete and up-to-date list of sampling units is available, you can also employ a common technique of sample selection known as systematic sampling. In systematic sampling you select the first unit at random, the rest being automatically selected according to some predetermined pattern involving regular spacing of units. Now let us assume that the population size is N. We number all the sampling units from 1 to N in some order, and a sample of size n is drawn in such a way that N = nk, i.e. k = N/n, where k, usually called the sampling interval, is an integer. In systematic random sampling we draw a number at random, say i, between 1 and k, select the unit corresponding to this number, and then select every kth unit subsequently. Thus the systematic sample of size n will consist of the units i, i + k, i + 2k, ..., i + (n - 1)k. The random number i is called the random start and its value determines the whole sample. (A short sketch of this selection rule is given at the end of this section.)
Merits and Demerits of Systematic Random Sampling
Merits
I. Systematic sampling is operationally more convenient than simple random sampling or stratified random sampling. It saves time and work.
II. This sampling is more efficient than simple random sampling, provided the frame (the list from which the sample units are drawn) is arranged wholly at random.
Demerits
I. The main disadvantage of systematic sampling is that systematic samples are not in general random samples, since the requirement stated in merit II is rarely fulfilled.
II. If N is not a multiple of n, then the actual sample size may differ from that required, and the sample mean may not be an unbiased estimate of the population mean.
Cluster Sampling
In this type of sampling you divide the total population, depending upon the problem under study, into some recognizable sub-divisions, termed clusters, and a simple random sample of n blocks (clusters) is drawn. The individuals in the selected blocks constitute the sample.
Notes
— Clusters should be as small as possible, consistent with the cost and limitations of the survey.
— The number of sampling units in each cluster should be approximately the same. Thus cluster sampling is not to be recommended if we have sampling areas in cities where there are private residential houses, business and industrial complexes, apartment buildings, etc., with widely varying numbers of persons or households.
Multistage Sampling


One better way of selecting a sample is to resort to sub-sampling within the clusters, instead of enumerating all the sampling units in the selected clusters. This technique is called two-stage sampling, the clusters being termed primary units and the units within the clusters being termed secondary units. The technique can be generalized to multistage sampling: we regard the population as composed of a number of primary units, each of which is further composed of secondary-stage units, and so on, till we ultimately reach a stage where the desired sampling units are obtained. In multistage sampling each stage reduces the sample size.
Merits and Limitations
Merits:
i. Multistage sampling is more flexible as compared to other methods. It is simple to carry out and results in administrative convenience by permitting the field work to be concentrated while still covering a large area.
ii. It saves a lot of operational cost, as we need the second-stage frame only for those units which are selected in the first-stage sample.
Limitation: It is generally less efficient than a suitable single-stage sample of the same size.
This brings our discussion of sampling techniques to an end. In a nutshell, we can say that non-probability sampling methods such as convenience sampling, judgement sampling and quota sampling are sometimes used, although the representativeness of such samples cannot be ensured, whereas probability sampling gives each unit of the population a known chance of being included in the sample and in this sense yields a representative sample of the population.
Points to Ponder
Sampling is based on two premises. One is that there is enough similarity among the elements in a population that a few of these elements will adequately represent the characteristics of the total population. The second premise is that while some elements in a sample underestimate the population value, others overestimate it. The result of these tendencies is that a sample mean is generally a good estimate of the population mean. A good sample has both accuracy and precision. An accurate sample is one in which there is little or no bias or systematic variance. A sample with adequate precision is one that has a sampling error within acceptable limits. A variety of sampling techniques is available, of which probability sampling is based on random selection – a controlled procedure that ensures that each population element is given a known non-zero chance of selection.


In contrast, non-probability selection is not random. When each sample element is drawn individually from the population at large, it is unrestricted sampling.
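As noted in the discussion of systematic random sampling above, the selection rule (a random start i followed by every kth unit) is easy to mechanise; the population size N = 1000 and the sample size n = 50 below are hypothetical.

import random

N, n = 1000, 50
k = N // n                        # sampling interval k = N / n
random.seed(7)
start = random.randint(1, k)      # random start i, chosen between 1 and k
sample = [start + j * k for j in range(n)]    # units i, i + k, i + 2k, ..., i + (n - 1)k
print(sample[:5], "...", sample[-1])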

6.4 Sample Size and its Determination
In sample analysis the most ticklish question is: what should be the size of the sample, or how large or small should 'n' be? If the sample size ('n') is too small, it may not serve to achieve the objectives, and if it is too large, we may incur huge cost and waste resources. As a general rule, the sample should be of an optimum size, i.e., it should neither be excessively large nor too small. Technically, the sample size should be large enough to give a confidence interval of the desired width, and as such the size of the sample must be chosen by some logical process before the sample is taken from the universe. The size of the sample should be determined by the researcher keeping in view the following points:
1. Nature of the universe: The universe may be either homogeneous or heterogeneous in nature. If the items of the universe are homogeneous, a small sample can serve the purpose; but if they are heterogeneous, a large sample is required. Technically, this can be termed the dispersion factor.
2. Number of classes proposed: If many class-groups (groups and sub-groups) are to be formed, a large sample is required, because a small sample might not be able to give a reasonable number of items in each class-group.
3. Nature of study: If items are to be intensively and continuously studied, the sample should be small. For a general survey the size of the sample should be large, but a small sample is considered appropriate in technical surveys.
4. Type of sampling: The sampling technique plays an important part in determining the size of the sample. A small random sample is apt to be much superior to a larger but badly selected sample.
5. Standard of accuracy and acceptable confidence level: If the standard of accuracy or the level of precision is to be kept high, we shall require a relatively larger sample. For doubling the accuracy at a fixed significance level, the sample size has to be increased fourfold.
6. Availability of finance: In practice, the size of the sample depends upon the amount of money available for the study, since large samples increase the cost of the sampling estimates.
7. Other considerations: The nature of the units, the size of the population, the size of the questionnaire, the availability of trained investigators, the conditions under which the sample survey is being conducted and the time available for completion of the study are a few other considerations to which a researcher must pay attention while selecting the size of the sample.
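Point 5 above can be made concrete with a commonly used starting formula for estimating a mean to within a tolerable error E at a given confidence level, n = (z * sigma / E) squared; the figures below are purely hypothetical, and sigma would in practice come from a pilot study or past data.

import math

z = 1.96          # critical value for 95% confidence
sigma = 12.0      # assumed population standard deviation (e.g., from a pilot study)
E = 2.0           # maximum tolerable error in the estimate of the mean

n = math.ceil((z * sigma / E) ** 2)
print(n)          # 139; halving E quadruples (z * sigma / E) ** 2, illustrating point 5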

6.5 Sampling Distributions
The process of generalizing the sample results to the population is referred to as statistical inference. Here, we shall use certain sample statistics (such as the sample mean, the sample proportion, etc.) in order to estimate and draw inferences about the true population parameters. For example, in order to be able to use the sample mean to estimate the population mean, we should examine every possible sample (and its mean) that could


have occurred in the process of selecting one sample of a certain size. If this selection of all possible samples actually were to be done, the distribution of the results would be referred to as a sampling distribution. Although, in practice, only one such sample is actually selected, the concept of sampling distributions must be examined so that probability theory and its distributions can be used in making inferences about the population parameter values. Sampling theory has made it possible to deal effectively with these problems. However, before we discuss them in detail from the standpoint of sampling theory, it is necessary to understand the central limit theorem and the following three probability distributions, their characteristics and relations: (1) the population (universe) distribution, (2) the sample distribution, and (3) the sampling distribution.
Central Limit Theorem: The Central Limit Theorem, first introduced by De Moivre during the early eighteenth century, happens to be the most important theorem in statistics. According to this theorem, if we select a large number of simple random samples of a given size from any population distribution and determine the mean of each sample, the distribution of these sample means will tend to be described by the normal probability distribution with mean µ and variance σ²/n. This is true even if the population distribution itself is not normal. In other words, the sampling distribution of sample means approaches a normal distribution, irrespective of the distribution of the population from which the samples are taken, and the approximation becomes increasingly close as the sample size increases. Symbolically, the theorem can be explained as follows: given n independent random variables X1, X2, ..., Xn which have the same distribution (no matter what that distribution is), their sum X = X1 + X2 + ... + Xn is approximately a normal variate. The mean µ and variance σ² of X are
µ = µ1 + µ2 + ... + µn = n µi
σ² = σ1² + σ2² + ... + σn² = n σi²
where µi and σi² are the mean and variance of each Xi. The utility of this theorem is that it requires virtually no conditions on the distribution patterns of the individual random variables being summed. As a result, it furnishes a practical method of computing approximate probability values associated with sums of arbitrarily distributed independent random variables. This theorem helps to explain why a vast number of phenomena show approximately a normal distribution. Let us consider a case when the


population is skewed: the skewness of the sampling distribution of means is inversely proportional to the square root of the sample size. When n = 16, the sampling distribution of means will exhibit only one-fourth as much skewness as the population has; when n = 100, the skewness becomes one-tenth as much, i.e., as the sample size increases, the skewness decreases. As a practical consequence, the normal curve will serve as a satisfactory model when samples are small and the population is close to a normal distribution, or when samples are large even if the population is markedly skewed. Because of its theoretical and practical significance, this theorem is considered the most remarkable theoretical formulation of all probability laws.
The Population (Universe) Distribution
When we talk of the population distribution, we assume that we have investigated the population and have full knowledge of its mean and standard deviation. For example, a company might have manufactured 1,00,000 car tyres in the year 2004. Suppose it contacts all those who had bought these tyres and gathers information about the life of these tyres. On the basis of the information obtained, the mean of the population, which is also called the true mean, symbolized by µ, and its standard deviation, symbolized by σ, can be worked out. These Greek letters µ and σ are used for these measures to emphasise their difference from the corresponding measures taken from a sample. It may be noted that such measures characterizing a population are called population parameters. The shape of the distribution of the life of tyres may be as follows:

Distribution of the Life of Tyres

It is clear from the above that, though the distribution shows slight skewness, it does not depart radically from a normal distribution. However, this should not lead one to the conclusion that, for sampling theory to apply, the distribution must necessarily be normal.
The Sample Distribution
When we talk of a sample distribution, we take a sample from the population. A sample distribution may take any shape. The mean and standard deviation of the sample


distribution are symbolized by x̄ and s respectively. A measure characterizing a sample, such as x̄ or s, is called a sample statistic. It may be noted that several sample distributions are possible from a given population. Suppose, in the above illustration, the manufacturer takes a sample of 500 tyres. He contacts the buyers and enquires about the life of the tyres. The shape of the distribution of these tyres may be as follows:

Sample distribution of 500 tyres

The mean life of the tyres can be expected to differ somewhat from one sample to another. The sample means constitute the raw material out of which a sampling distribution is constructed.
The Sampling Distribution
Sampling distributions constitute the theoretical basis of statistical inference and are of considerable importance in business decision making. If we take numerous different samples of equal size from the same population, the probability distribution of all the possible values of a given statistic from all the distinct possible samples of equal size is called a sampling distribution. It is interesting to note that sampling distributions of means closely approximate a normal distribution. It can be seen that the mean of a sampling distribution of sample means is the same as the mean of the population distribution from which the samples are taken. The mean of the sampling distribution is designated by the same symbol as the mean of the population, namely µ. However, the standard deviation of the sampling distribution of means is given a special name, the standard error of the mean, and is symbolized by σx̄; the subscript indicates that in this case we are dealing with a sampling distribution of means. The greatest importance of sampling distributions is the assistance that they give us in revealing the patterns of sampling errors and their magnitude in terms of the standard error. Even in sampling with replacement, the sample means fluctuate far less than the values in the actual population. The fact that the sample means are less variable than the population data follows logically from an understanding of the averaging process. A particular sample mean averages together all the values in the sample. A population (universe) may consist of individual outcomes that can take on a wide range of values, from extremely small to extremely large. However, if an extreme value falls


into the sample, although it will have an effect on the mean, the effect will be reduced since it is being averaged in with all the other values in the sample. Moreover, as the sample size increases, the effect of a single extreme value gets even smaller, since it is being averaged in with more observations. This phenomenon is expressed statistically in the value of the standard deviation of the sample mean. This is the measure of variability of the mean from sample to sample, referred to as the standard deviation of the sampling distribution of the sample mean or the standard error of the mean, denoted by σx̄ and calculated by
σx̄ = σ/√n
This formula holds only when the population is infinite or the samples are drawn from a finite population with replacement. It may be noted that in deducing a sampling distribution, we must first make an assumption about the appropriate parameter. Inasmuch as any value can be assumed for a parameter, depending upon our knowledge or a guess about the population, there is no theoretical limit to the number of sampling distributions of the same sample size that can be derived for the population. There is a sampling distribution for each assumed value of a parameter. Also, given the assumed value of a parameter, there is a different sampling distribution of the statistic for each specific sample size. Further, under the same assumptions about a population and the same sample size, the distribution of one statistic differs from that of another statistic. For example, the pattern of the distribution of x̄ will differ from that of s², even though both measures are computed from the same sample.
Relationship between Population, Sample and Sampling Distribution
It will be interesting to note that the mean of the sampling distribution is the same as the mean of the population. It is possible that many sample means may differ from the population mean. However, the sample information can be used as an estimate of population values. It has also been established that the observed standard deviation of a sample is close to the standard deviation of the population values. In fact, the standard deviation of the sample is usually so good an approximation that it can safely be used as an estimate of the corresponding population measure. In order to use s of the sample to estimate σ of the population, we make a slight adjustment which has been found to contribute to greater accuracy of the estimate. The adjustment consists of using (n - 1) instead of n in the formula for the standard deviation of a sample, i.e., we use
s = √{ Σ(x - x̄)² / (n - 1) }


The adjustment decreases the denominator and, therefore, gives a larger result. Thus, the estimated standard deviation of the population is slightly larger than the observed standard deviation of the sample.
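A small simulation makes both the central limit theorem and the standard-error formula above concrete; the exponential population used here is hypothetical and deliberately skewed.

import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=10.0, size=100_000)    # strongly skewed population
n, n_samples = 36, 5_000

# Draw many samples of size n and record their means
sample_means = np.array([rng.choice(population, size=n).mean() for _ in range(n_samples)])

print("Population mean:", round(population.mean(), 2))
print("Mean of the sample means:", round(sample_means.mean(), 2))                      # close to the population mean
print("Theoretical standard error sigma/sqrt(n):", round(population.std() / np.sqrt(n), 3))
print("Observed std. deviation of the sample means:", round(sample_means.std(), 3))    # close to sigma/sqrt(n)

A histogram of sample_means would also look roughly normal even though the population itself is heavily skewed.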

6.6 Important Sampling Distributions
Some important sampling distributions which are commonly used are: (1) the sampling distribution of the mean; (2) the sampling distribution of the proportion; (3) Student's t distribution; (4) the F distribution; and (5) the Chi-square distribution. A brief mention of the sampling distribution of the mean is given below.
Sampling distribution of the mean: The sampling distribution of the mean refers to the probability distribution of all the possible means of random samples of a given size that we take from a population. If samples are taken from a normal population N(µ, σ), the sampling distribution of the mean will also be normal, with mean µx̄ = µ and standard deviation σx̄ = σ/√n, where µ is the mean of the population, σ is the standard deviation of the population and n is the number of items in a sample. But when sampling is from a population which is not normal (it may be positively or negatively skewed), even then, as per the central limit theorem, the sampling distribution of the mean tends quite close to the normal distribution, provided the number of sample items is large, i.e., more than 30. In case we want to reduce the sampling distribution of the mean to the unit normal distribution, i.e., N(0, 1), we can write the normal variable z = (x̄ - µ)/(σ/√n) for the sampling distribution of the mean. This characteristic of the sampling distribution of the mean is very useful in several decision situations for accepting or rejecting hypotheses.
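As a quick numerical illustration with hypothetical figures: if µ = 50, σ = 10 and n = 25, a sample mean of x̄ = 53 gives z = (53 - 50)/(10/√25) = 3/2 = 1.5, i.e., the observed sample mean lies 1.5 standard errors above the hypothesized population mean.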


Chapter-VII Testing of Hypotheses

Contents:
7.1 Introduction
7.2 What is a Hypothesis?
7.3 Procedure for Hypothesis Testing
    7.3.1 Set up a hypothesis
    7.3.2 Set up a suitable significance level
    7.3.3 Determine a suitable test statistic
    7.3.4 Determine the critical region
    7.3.5 Doing computations
    7.3.6 Making decisions
7.4 Type I and Type II Errors
7.5 Important Parametric Tests
    7.5.1 Large Sample Test: z-test
        7.5.1.2 Testing Hypothesis about the Difference between Two Means
    7.5.2 Small Sample Test
        7.5.2.1 t-test
    7.5.3 χ²-test
        7.5.3.1 Degrees of Freedom
        7.5.3.2 Properties of the Chi-Square Distribution
        7.5.3.3 Uses of the χ²-test
        7.5.3.4 Conditions for Applying the Chi-Square Test
        7.5.3.5 Working Rule for the χ²-Test
        7.5.3.6 χ² Test for Goodness of Fit
        7.5.3.7 χ² Test as a Test of Independence


7.1 Introduction
A hypothesis is an assumption about the population parameter, to be tested on the basis of sample information. The statistical testing of hypotheses is the most important technique in statistical inference. Hypothesis tests are widely used in business and industry for making decisions. It is here that probability and sampling theory play an ever-increasing role in constructing the criteria on which business decisions are made. Very often in practice we are called upon to make decisions about a population on the basis of sample information. For example, we may wish to decide on the basis of sample data whether a new medicine is really effective in curing a disease, whether one training procedure is better than another, etc. Such decisions are called statistical decisions. In other words, a hypothesis is the assumption that we make about the population parameter. This can be any assumption about a population parameter, not necessarily based on statistical data. For example, it can also be based on the gut feel of a manager. Managerial hypotheses are based on intuition; the market place decides whether the manager's intuitions were in fact correct. In fact, managers propose and test hypotheses all the time. For example:
1. A manager's statement 'if we drop the price of this car model by Rs 15,000, we'll increase sales by 25,000 units' is a hypothesis. To test it in reality we have to wait until the end of the year and count sales.
2. A manager's estimate that sales per territory will grow on average by 30% in the next quarter is also an assumption or hypothesis. How would the manager go about testing this assumption? Suppose he has 70 territories under him. One option for him is to audit the results of all 70 territories and determine whether the average growth is greater than or less than 30%. This is a time-consuming and expensive procedure. Another way is to take a sample of territories and audit sales results for them. Once we have our sales growth figure, it is likely that it will differ somewhat from our assumed rate. For example, we may get a sample rate of 27%. The manager is then faced with the problem of determining whether his assumed or hypothesized rate of growth of sales is correct or the sample rate of growth is more representative. To test the validity of our assumption about the population, we collect sample data and determine the sample value of the statistic. We then determine whether the sample data support our hypothesized assumption regarding the average sales growth.

7.2 What is a Hypothesis?
In attempting to reach decisions, it is useful to make assumptions or guesses about the populations involved. Such assumptions, which may or may not be true, are called statistical hypotheses and are, in general, statements about the probability distributions of the populations. A hypothesis is made about the value of some parameter, but the only facts available for estimating the true parameter are those provided by a sample. If the sample statistic differs from the hypothesis made about the population parameter, a decision must be made as to whether or not this difference is significant. If it is, the hypothesis is rejected. If not, it must be accepted. Hence the term "tests of hypothesis".


Now, if θ is the parameter of the population and θ̂ is its estimate obtained from a random sample drawn from the population, then the difference between θ and θ̂ should be small. In fact, there will be some difference between θ and θ̂ because θ̂ is based on sample observations and is different for different samples. Such a difference is known as a difference due to sampling fluctuations. If the difference between θ and θ̂ is large, then the probability that it is exclusively due to sampling fluctuations is small. A difference which is caused by sampling fluctuations is called an insignificant difference, and a difference due to some other reason is known as a significant difference. A significant difference arises due to the fact that either the sampling procedure is not purely random or the sample is not from the given population.

7.3 Procedure for Hypothesis Testing
The general procedure followed in testing a hypothesis comprises the following steps:
7.3.1 Set up a hypothesis. The first step in hypothesis testing is to establish the hypothesis to be tested. Since statistical hypotheses are usually assumptions about the value of some unknown parameter, the hypothesis specifies a numerical value or range of values for the parameter. The conventional approach to hypothesis testing is not to construct a single hypothesis about the population parameter, but rather to set up two different hypotheses. These hypotheses are normally referred to as (i) the null hypothesis, denoted by Ho, and (ii) the alternative hypothesis, denoted by H1. The null hypothesis asserts that there is no true difference between the sample statistic and the population parameter under consideration (hence the word "null", which means invalid, void or amounting to nothing) and that the difference found is accidental, arising out of fluctuations of sampling. A hypothesis which states that there is no difference between the assumed and actual value of the parameter is the null hypothesis, and the hypothesis that is different from the null hypothesis is the alternative hypothesis. If the sample information leads us to reject Ho, then we will accept the alternative hypothesis H1. Thus, the two hypotheses are constructed so that if one is true, the other is false, and vice versa. The rejection of the null hypothesis indicates that the differences have statistical significance, and the acceptance of the null hypothesis indicates that the differences are due to chance. As against the null hypothesis, the alternative hypothesis specifies those values that the researcher believes to hold true. The alternative hypothesis may embrace a whole range of values rather than a single point.
7.3.2 Set up a suitable significance level. Having set up a hypothesis, the next step is to select a suitable level of significance. The confidence with which an experimenter rejects or retains the null hypothesis depends on the significance level adopted. The level of significance, usually denoted by α, is generally specified before any samples are drawn, so that the results obtained will not influence our choice. Though any level of significance can be adopted, in practice we take either the 5 per cent or the 1 per cent level of significance. When we take the 5 per cent level of significance, there are about 5 chances out of 100 that we would reject the null hypothesis when it should be accepted, i.e., we are about 95% confident that we have made the right decision. When we test a hypothesis at the 1 per cent level of significance, there is only one chance out of 100 that we would reject the null hypothesis when it should be accepted, i.e., we are about 99% confident that we have made the right decision. When the null hypothesis is rejected at α = 0.05, the test result is said to be "significant". When the null hypothesis is rejected at α = 0.01, the test result is said to be "highly significant".


7.3.3 Determination of a suitable test statistic. The third step is to determine a suitable test statistic and its distribution. Many of the test statistics that we shall encounter will be of the following form:
test statistic = (observed sample statistic - hypothesized value of the parameter) / standard error of the sample statistic

7.3.4 Determine the critical region. It is important to specify, before the sample is taken, which values of the test statistic will lead to a rejection of Ho and which will lead to acceptance of Ho. The former set of values is called the critical region. The value of α, the level of significance, indicates the importance that one attaches to the consequences associated with incorrectly rejecting Ho. It can be shown that when the level of significance is α, the optimal critical region for a two-sided test consists of the α/2 portion of the area in the right-hand tail of the distribution plus the α/2 portion in the left-hand tail. Thus, establishing a critical region is similar to determining a 100(1 - α)% confidence interval. In general, one uses a level of significance of α = 0.05, indicating that one is willing to accept a 5 per cent chance of incorrectly rejecting Ho.
7.3.5 Doing computations. The fifth step in testing a hypothesis is the performance of the various computations, from a random sample of size n, necessary for the test statistic obtained in step 7.3.3. Then we need to see whether the sample result falls in the critical region or in the acceptance region.
7.3.6 Making decisions. Finally, we may draw statistical conclusions and the management may take decisions. A statistical decision or conclusion comprises either accepting the null hypothesis or rejecting it. The decision will depend on whether the computed value of the test criterion falls in the region of rejection or the region of acceptance. If the hypothesis is being tested at the 5 per cent level of significance and the observed set of results has a probability of less than 5 per cent, we reject the null hypothesis and the difference between the sample statistic and the hypothetical population parameter is considered significant. On the other hand, if the test statistic falls in the region of non-rejection, the null hypothesis is accepted and the difference between the sample statistic and the hypothetical population parameter is not regarded as significant, i.e., it can be explained by chance variations.
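The six steps can be illustrated with a small sketch (all figures hypothetical): a large-sample, two-tailed z-test of Ho: µ = 30 against H1: µ ≠ 30 at α = 0.05.

import math

# Hypothetical sample results
n, x_bar, s = 100, 31.2, 5.0
mu_0, alpha = 30.0, 0.05

se = s / math.sqrt(n)          # standard error of the mean (s used in place of sigma for a large sample)
z = (x_bar - mu_0) / se        # step 7.3.3: the test statistic
z_critical = 1.96              # step 7.3.4: two-tailed critical value at alpha = 0.05

print(f"z = {z:.2f}, critical values = -{z_critical} and +{z_critical}")
if abs(z) > z_critical:        # step 7.3.6: the decision
    print("Reject Ho: the difference is statistically significant.")
else:
    print("Accept Ho: the difference can be attributed to chance.")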

7.4 Type I and Type II Errors
When a statistical hypothesis is tested, there are four possible results: (1) the hypothesis is true but our test rejects it; (2) the hypothesis is false but our test accepts it; (3) the hypothesis is true and our test accepts it; (4) the hypothesis is false and our test rejects it. Obviously, the first two possibilities lead to errors. If we reject a hypothesis when it should be accepted (possibility No. 1), we say that a Type I error has been made. On the other hand, if we accept a hypothesis when it should be rejected (possibility No. 2), we say that a Type II error has been made. In either case a wrong decision or error in judgment has occurred.


The probability of committing a Type I error is designated as α and is called the level of significance. Therefore, α = Pr[Type I error] = Pr[Rejecting Ho | Ho is true], and its complement is (1 - α) = Pr[Accepting Ho | Ho is true]. This probability (1 - α) corresponds to the concept of a 100(1 - α)% confidence interval. Our effort would obviously be to have a small probability of making a Type I error; hence the objective is to construct the test so as to minimise α. Similarly, the probability of committing a Type II error is designated by β. Thus β = Pr[Type II error] = Pr[Accepting Ho | Ho is false] and

(1 - β) = Pr[Rejecting Ho | Ho is false].

This probability (1 - β) is known as the power of a statistical test. The following table gives the probabilities associated with each of the four cells described above:

The decision is:     Ho is True                   Ho is False
Accept Ho            (1 - α)  Confidence level    β
Reject Ho            α                            (1 - β)  Power of the test
Sum                  1.00                         1.00

Note that the probability of each decision outcome is a conditional probability and the elements in the same column sum to 1.0, since the events with which they are associated are complementary. However, α and β are not independent of each other, nor are they independent of the sample size n. When n is fixed, if α is lowered then β normally rises, and vice versa. If n is increased, it is possible for both α and β to decrease. Since increasing the sample size involves money and time, one should decide how much additional money and time one is willing to spend on increasing the sample size in order to reduce the sizes of α and β. In order for any test of hypothesis or decision rule to be good, it must be designed so as to minimise the errors of decision. However, this is not a simple matter, since, for a given sample size, an attempt to decrease one type of error is accompanied in general by an increase in the other type of error. The probability of making a Type I error is fixed in advance by the choice of the level of significance employed in the test. We can make the Type I error as small as we please by lowering the level of significance. But by doing so, we increase the chance of accepting a false hypothesis, i.e., of making a Type II error. It follows that it is impossible to minimise both errors simultaneously. In the long run, errors of Type I are perhaps more likely to prove serious in research programmes in the social sciences than are errors of Type II. In practice, one type of error may be more serious than the other, and so a compromise should be reached in favour of limiting the more serious error. The only way to reduce both types of error is to increase the sample size, which may or may not be possible.
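The trade-off between α, β and n can be made concrete with a short sketch (all figures hypothetical): a right-tailed test of Ho: µ = 100 at α = 0.05 with σ = 15, when the true mean is actually 103. As n grows, β falls and the power (1 - β) rises.

from math import sqrt
from statistics import NormalDist

sigma, mu0, mu_true, alpha = 15.0, 100.0, 103.0, 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)                        # one-tailed critical value, about 1.645

for n in (25, 100, 400):
    cutoff = mu0 + z_alpha * sigma / sqrt(n)                     # reject Ho when x_bar exceeds this value
    beta = NormalDist(mu_true, sigma / sqrt(n)).cdf(cutoff)      # P(accept Ho | Ho is false)
    print(f"n = {n:>3}: beta = {beta:.3f}, power = {1 - beta:.3f}")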


One-Tailed and Two-Tailed Tests
Basically, there are three kinds of tests of hypothesis: (i) two-tailed tests, (ii) right-tailed tests, and (iii) left-tailed tests. A two-tailed test is one in which the hypothesis about the population mean is rejected for values of the test statistic falling into either tail of the sampling distribution. When the hypothesis about the population mean is rejected only for values falling into one of the tails of the sampling distribution, it is known as a one-tailed test; if the rejection region is the right tail, it is called a right-tailed test (a one-sided alternative to the right), and if it is the left tail, it is a one-sided alternative to the left, called a left-tailed test. For example, Ho: µ = 100 tested against H1: µ > 100 (or against H1: µ < 100) is a one-tailed test, since H1 specifies that µ lies on a particular side of 100. The same null hypothesis tested against H1: µ ≠ 100 is a two-tailed test, since µ can be on either side of 100. The following diagrams would make this clearer:
[Diagrammatic representation of the rejection regions omitted]
The following table gives critical values of Z for both one-tailed and two-tailed tests at various levels of significance. Critical values of Z for other levels of significance are found by use of the table of normal curve areas:

0.10

0.05

0.01

0.005

0.0002

Critical value of z for one-

-1.28

-1.645

-2.33

-2.58

-2.88

tailed tests

or 1.28

or 1.645 or 2.33

or 2.58

or 2.88

Critical value of z for two-

- 1.645

- 1. 96

-2.81

-3.08

and

and

I. and

and

and

1.645

96

2.58

2.81

3.08

tailed tests

- 2.58
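The critical values in this table can be reproduced from the inverse of the standard normal distribution function. A minimal sketch (using scipy purely for illustration):

```python
from scipy.stats import norm

for alpha in (0.10, 0.05, 0.01, 0.005, 0.002):
    one_tailed = norm.ppf(1 - alpha)       # value z such that Pr[Z > z] = alpha
    two_tailed = norm.ppf(1 - alpha / 2)   # value z such that Pr[|Z| > z] = alpha
    print(f"alpha = {alpha:<6}  one-tailed: ±{one_tailed:.3f}  two-tailed: ±{two_tailed:.3f}")
```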


7.5.1 Z TEST: Tests of Hypothesis Concerning Large Samples
Though it is difficult to draw a clear-cut line of demarcation between large and small samples, it is generally agreed that if the size of the sample exceeds 30, it should be regarded as a large sample. The tests of significance used for large samples are different from the ones used for small samples, because the assumptions we make in the case of large samples do not hold for small samples. Tests of hypothesis involving large samples are based on the following assumptions: (1) the sampling distribution of a sample statistic is approximately normal; (2) values given by the sample are sufficiently close to the population values and can be used in their place for calculating the standard error of the estimate. Thus the normal distribution plays a vital role in tests of hypothesis based on large samples (central limit theorem). Suppose θ̂ is an unbiased estimate of θ, the population parameter. On the basis of θ̂, computed from the sample observations, we have to test whether the sample has been drawn from a population whose parameter value is θ, i.e., we have to test the hypothesis Ho: the population parameter equals θ. If the sampling distribution of θ̂ is normal, then the statistic z = (θ̂ - θ)/S.E.(θ̂) follows the standard normal distribution.

Let us test the hypothesis at the 100α% level of significance. From tables of the area under the standard normal curve, for a given α we can find a value zα such that
Pr[|Z| > zα] = α, i.e., Pr[-zα ≤ Z ≤ zα] = 1 - α.
If α = 0.01, then zα = 2.58, and if α = 0.05, then zα = 1.96, and so on. If the difference between θ and θ̂ is more than zα times the standard error of θ̂, the difference is regarded as significant and Ho is rejected at the 100α% level of significance; if the difference between θ and θ̂ is less than or equal to zα times the standard error of θ̂, the difference is insignificant and Ho is accepted at the 100α% level of significance.
7.5.1.2 Testing Hypothesis about the Population Mean:
7.5.1.2a Consider first the two-tailed test concerning the population mean µ:
Ho: µ = µo
Since the best unbiased estimator of µ is the sample mean x̄, we focus our attention on the sampling distribution of x̄. From the Central Limit Theorem, we know that


where x̄ ~ N(µ, σx̄),   z = (x̄ - µ)/σx̄,   and   σx̄ = σ/√n = s/√n   (s is used in place of σ when σ is unknown, which is acceptable for large samples).

If the calculated value of |z| > zα/2, the null hypothesis is rejected.
7.5.1.2b If the hypothesis involves a right-tailed test, for example Ho: µ ≤ µo and H1: µ > µo, the null hypothesis is rejected when the calculated value z > zα.
7.5.1.2c If the hypothesis involves a left-tailed test, for example Ho: µ ≥ µo and H1: µ < µo, the null hypothesis is rejected when the calculated value z < -zα. (A short numerical sketch of this large-sample z test is given at the end of this subsection.)
4. This test is used only for drawing inferences by testing hypotheses. It cannot be used for estimation of a parameter or any other value.
5. It is wholly dependent on the degrees of freedom.
6. The frequencies used in the χ²-test should be absolute and not relative.
7. The observations used in the χ²-test should be collected on a random basis of sampling.
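As noted above, here is a minimal numerical sketch of the large-sample z test for a single mean under the two-tailed decision rule. The sample figures (n = 64, x̄ = 52, s = 8, µo = 50) are hypothetical and serve only to show the mechanics:

```python
from scipy.stats import norm

# Hypothetical large-sample data: test Ho: mu = 50 against H1: mu != 50.
n, x_bar, s, mu0 = 64, 52.0, 8.0, 50.0
alpha = 0.05

se = s / n ** 0.5                 # standard error of the mean (s used in place of sigma)
z = (x_bar - mu0) / se            # test statistic
z_crit = norm.ppf(1 - alpha / 2)  # two-tailed critical value (1.96 at the 5% level)

print(f"z = {z:.2f}, critical value = ±{z_crit:.2f}")
print("Reject Ho" if abs(z) > z_crit else "Accept Ho")
```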

7.5.3.5 Working Rule for the χ²-Test
The chi-square test is widely used to test the independence of attributes. It is applied to test the association between attributes when the sample data is presented in the form of a contingency table with any number of rows or columns.
Step 1. Set up the null hypothesis Ho: no association exists between the attributes, and the alternative hypothesis H1: an association exists between the attributes.
Step 2. Calculate the expected frequency E corresponding to each cell by the formula
E = (row total × column total) / grand total


Step 3. Calculate the χ²-statistic by the formula
χ² = Σ (O - E)² / E,
where O is the observed frequency and E the expected frequency in each cell.

The characteristics of this distribution are completely defined by the number of degrees of freedom v which is given by v = (R - 1) (C - 1),

where R = number of rows and C = number of columns in the contingency table.
Step 4. Find from the table the value of χ² for the given level of significance α and for the degrees of freedom v calculated in Step 3. If no value for α is mentioned, take α = 0.05.
Step 5. Compare the computed value of χ² with the tabulated value found in Step 4:
(a) if the calculated value of χ² < the tabulated value of χ², accept the null hypothesis Ho;
(b) if the calculated value of χ² > the tabulated value of χ², reject the null hypothesis Ho and accept the alternative hypothesis H1.
7.5.3.6 χ² Test for Goodness of Fit
The χ²-test is a measure of the probability of association between attributes. It gives us an idea about the divergence between the observed and expected frequencies; thus the test is also described as the test of "goodness of fit". If the curves of the two distributions, when superimposed, do not coincide or appear to diverge much, we say that the fit is poor. On the other hand, if they do not diverge much, the fit is regarded as good.
Illustration 4. A survey of 320 families with 5 children each revealed the following distribution:
No. of boys:       5    4    3    2    1    0
No. of girls:      0    1    2    3    4    5
No. of families:  14   56  110   88   40   12
Is this result consistent with the hypothesis that male and female births are equally probable?
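For Illustration 4, the expected frequencies under the hypothesis of equally probable male and female births follow the binomial distribution with n = 5 and p = 1/2. The sketch below shows one way the χ² statistic could be computed; it is offered as a worked check (the same data reappear in Q28, where the value 7.16 is quoted) rather than as the text's own solution.

```python
from scipy.stats import binom, chisquare

families = [14, 56, 110, 88, 40, 12]      # observed families with 5, 4, ..., 0 boys
total = sum(families)                      # 320 families in all

# Expected frequencies if boys and girls are equally likely: 320 * C(5, k) / 32.
expected = [total * binom.pmf(k, 5, 0.5) for k in (5, 4, 3, 2, 1, 0)]

chi2_stat, p_value = chisquare(families, f_exp=expected)
print(f"chi-square = {chi2_stat:.2f}, p-value = {p_value:.3f}")  # statistic comes out near 7.16
```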


7.5.3.7 χ² Test as a Test of Independence
The χ²-test can also be applied to test the independence between various attributes when the sample data is presented in the form of a contingency table with any number of rows 'R' and columns 'C'. The null hypothesis and alternative hypothesis are set up as follows:


Null hypothesis Ho: The attributes are independent.
Alternative hypothesis H1: The attributes are not independent.
We then calculate χ². If the calculated value of χ² is less than the tabulated value of χ²α,v at a given level of significance α and degrees of freedom v, the hypothesis is accepted, and vice versa.
Illustration 5. 50 students selected at random from 500 students enrolled in a computer crash programme were classified according to age and grade points, giving the following data:

Grade points                Age (in years)
               20 and under      21-30      Above 30
Up to 5.0            3              5            2
5.1 to 7.5           8              7            5
7.6 to 10.0          4              8            8

Test at 5% level of significance the hypothesis that age and grade point are independent.
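A sketch of how Steps 2 to 5 of the working rule could be carried out for Illustration 5 is given below; scipy's chi2_contingency is used only as an illustrative shortcut, and the computed statistic is then compared with the tabulated χ² at (3 - 1)(3 - 1) = 4 degrees of freedom.

```python
from scipy.stats import chi2, chi2_contingency

# Rows: grade points (up to 5.0, 5.1-7.5, 7.6-10.0); columns: age groups.
table = [
    [3, 5, 2],   # up to 5.0
    [8, 7, 5],   # 5.1 to 7.5
    [4, 8, 8],   # 7.6 to 10.0
]

stat, p_value, dof, expected = chi2_contingency(table)
critical = chi2.ppf(0.95, dof)   # tabulated value at the 5% level, 4 degrees of freedom

print(f"chi-square = {stat:.3f}, critical value = {critical:.3f}, dof = {dof}")
print("Accept Ho: attributes independent" if stat < critical else "Reject Ho: attributes associated")
```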


Chapter VIII Report Writing Contents:

8.1 Significance of Report Writing
8.2 Steps in Writing Report
8.3 Layout of Research Report
8.4 Precautions for Writing a Research Report

8.1 Significance of Report Writing
The task of research remains incomplete until the report has been presented and/or written. The results of research must invariably enter the general store of knowledge. Presenting the results of a research study generally involves a formal written report as well as an oral presentation. The report and presentation are extremely important: as the results of research are often intangible, the written report is usually the only documentation of the project. The written report and the oral presentation are typically the only aspects of the study that the examining committee is exposed to, and consequently the overall evaluation of the research project rests on how well this information is communicated. Every person has a different style of writing. There is not really one right style for a report, but there are some basic principles for writing a research report clearly.

8.2 Steps in Writing Report
Preparing a research report involves other activities besides writing; in fact, writing is actually the last step in the preparation process. Before writing can take place, the results of the research project must be fully understood and thought must be given to what the report will say. Thus, preparing a research report involves three steps: understanding, organizing and writing. The general guidelines that should be followed for any report or research paper are as follows:

8.2.1 Logical analysis of the subject matter: It is the first step. There are two ways to develop a subject: logical and chronological. When we understand the subject, analyze it and associate one thing with another, it is logical development; it often consists of arranging the content from the simplest to the most complex. Chronological development is based on a connection or a sequence in time or occurrence.

8.2.2 Preparation of the final outline: Outlines are the framework upon which long written works are constructed. They are an aid to the logical organization of the material and a reminder of the points to be stressed in the report.

8.2.3 Preparation of the rough draft: This follows the logical analysis of the subject and the preparation of the final outline. In this step the researcher writes down the work performed, the procedure followed and the results obtained in the context of his research study.

8.2.4 Rewriting and polishing the rough draft: This is the most difficult part of all formal writing. It usually requires much more time than the preparation of the rough draft itself. Careful revision helps to identify weaknesses in logical development or presentation. While preparing the final content, one should also check the mechanics of writing: grammar, spelling and usage.

8.2.5 Preparation of the final bibliography: The bibliography, which is generally appended to the research report, is a list of the books, magazines and all other works that the researcher has consulted. The bibliography should be arranged alphabetically and may be divided into two parts: the first part may contain the names of books and pamphlets, and the second part may contain the names of magazine and newspaper articles.

For books and pamphlets, the order may be as under:
1. Name of the author (last name first).
2. Title, underlined to indicate italics.
3. Place, publisher and date of publication.
4. Number of volumes.
For example: Kothari, C.R., Quantitative Techniques, New Delhi, Vikas Publishing House Pvt. Ltd., 1978.
For magazines and newspapers, the order may be as under:
1. Name of the author (last name first).
2. Title of the article, in quotation marks.
3. Name of the periodical, underlined to indicate italics.
4. The volume or volume and number.
5. The date of the issue.
6. The pagination.
For example: Robert V. Roosa, "Coping with Short-term International Money Flows", The Banker, London, September, 1971, p. 995.
8.2.6 Writing the final draft: This constitutes the last step. The final draft should be written in a concise and objective style and in simple language, avoiding vague expressions. While writing the final draft, the researcher must


avoid abstract terminology and technical jargon. A research report must not be dull; rather, it should maintain the interest of the reader and show originality.
8.3 Layout of the Research Report: The research report must necessarily convey enough about the study so that the reader can place it in its general scientific context, judge the adequacy of its methods and thus form an opinion of how seriously the findings are to be taken. For this purpose there is a need for a proper layout of the report. The layout of the report means what the research report should contain. A comprehensive layout of the research report should comprise: (a) preliminary pages; (b) the main text; and (c) the end matter.
(a) Preliminary pages: In these the report should carry a title and date, followed by acknowledgments in the form of a 'preface' or 'foreword'. Then there should be a table of contents followed by a list of tables and illustrations, so that the decision-maker or anybody interested in reading the report can easily locate the required information.
(b) Main text: The main text provides the complete outline of the research report along with all details. The title of the research study is repeated at the top of the first page of the main text, and then the other details follow on pages numbered consecutively, beginning with the second page. Each main section of the report should begin on a new page. The main text of the report should have the following sections:
1. Introduction
2. Statement of findings and recommendations
3. The results
4. The implications drawn from the results
5. The summary

1. Introduction: The purpose of the introduction is to introduce the research project to the readers. It should contain a clear statement of the objectives of the research, i.e., enough background should be given to make clear to the reader why the problem was considered worth investigating. A brief summary of other relevant research may also be stated so that the present study can be seen in that context. The hypotheses of the study, if any, and the definitions of the major concepts employed in the study should be explicitly stated in the introduction of the report.


2. Statement of findings and recommendations: After the introduction, the research report must contain a statement of findings and recommendations in non-technical language so that it can be easily understood by all concerned. If the findings happen to be extensive, they should be presented at this point in summarized form.
3. Results: A detailed presentation of the findings of the study, with supporting data in the form of tables and charts together with a validation of the results, is the next step in writing the main text of the report. This generally comprises the main body of the report. The results section should contain statistical summaries and reductions of the data rather than the raw data. All results should be presented in logical sequence and split into readily identifiable sections. All relevant results must find a place in the report.
4. Implications of the results: Towards the end of the main text the researcher should again set down the results of his research clearly and precisely. He should state the implications that flow from the results of the study for understanding human behaviour. Such implications must have three aspects, as stated below:
a) a statement of the inferences drawn from the present study which may be expected to apply in similar circumstances;
b) the conditions of the present study which may limit the extent of legitimate generalization of the inferences drawn from the study;
c) the relevant questions that still remain unanswered, or the new questions raised by the study, along with suggestions for the kind of research that would provide answers for them.
5. Summary: It has become customary to conclude the research report with a very brief summary, restating in brief the research problem, the methodology, the major findings and the major conclusions drawn from the research results.
(c) End Matter:

At the end of the report, appendices should be given in respect of all technical data such as questionnaires, sample information, mathematical derivations and the like. A bibliography of the sources consulted should also be given. An index (an alphabetical listing of names, places and topics along with the numbers of the pages in the book or report on which they are mentioned or discussed) should invariably be given at the end of the report. The value of the index lies in the fact that it works as a guide to the reader to the contents of the report.


8.4 Precautions for Writing a Research Report The Report must be prepared keeping the following precautions in view: 1. Abstract terminology and jargon should be avoided in a research report. 2. Readers are often interested in acquiring a quick knowledge of the main findings and for this purpose charts, graphs and the statistical tables must be used for various results in the main report in addition to summary of important findings. 3. The report must present logical analysis of the subject matter. 4. A research report should show originality and should necessarily be an attempt to solve some intellectual problem.


1. In which scale is objective evidence missing that the scale measures the concepts for which it has been developed, so that we have to rely on the researcher's insight and competence? c) Arbitrary Scale

2. In which scale are the statements related to one another in such a way that a favourable response to an item implies a favourable response to the previous items? a) Cumulative Scale

3. Likert scales are treated as yielding interval data by a majority of marketing researchers a) True

4. The overall score of which type of scale represents the respondents’ position on the continuum of favourable-unfavourableness towards an issue? b) Likert Scales

5. A descriptive research study may not contain which one of the following attributes? b) Biased but flexible

6. Which study or survey is used to monitor behaviour? d) Longitudinal Studies

1. Among the following, which does not qualify as a quality of good research? c) Good research is non-replicable

2. Main text of the report should have—Introduction, Summary of Findings, Conclusions and ____________________. Fill in the space with an appropriate answer: b)Recommendations

3. Which of the following does not form an objective of research? c) To lose familiarity with a phenomenon


10. Which statement, out of the following, does not pertain to research ethics? d) Work community with moral standards

Q11. The monthly income of 10 employees working in a firm is as follows: 4487, 4493, 4502, 4446, 4475, 4492, 4572, 4516, 4468, 4489. Find the average monthly income. a) 4494

Q12. For the following frequency distribution, calculate the mean:
Weekly rent      No. of persons paying rent
200-400              6
400-600              9
600-800             11
800-1000            14
1000-1200           20
1200-1400           15
1400-1600           10
1600-1800            8
1800-2000            7
c) 1100
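One way to verify the answer to Q12 is to take the mid-point of each rent class, weight it by the number of persons in that class and divide by the total frequency. A minimal sketch:

```python
# Class intervals of weekly rent and the number of persons in each class (from Q12).
classes = [(200, 400, 6), (400, 600, 9), (600, 800, 11), (800, 1000, 14),
           (1000, 1200, 20), (1200, 1400, 15), (1400, 1600, 10),
           (1600, 1800, 8), (1800, 2000, 7)]

total_freq = sum(f for _, _, f in classes)
weighted_sum = sum(((lo + hi) / 2) * f for lo, hi, f in classes)

mean = weighted_sum / total_freq
print(f"mean weekly rent = {mean:.0f}")   # 1100, matching option (c)
```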

Q13. From the following data of wages of 7 workers, compute the median wage: 4600, 4650, 4580, 4690, 4660, 4606, 4640 a) 4640

Q14. When one variable determines values of other variables, ______research design is used. a) Causal

Q15. Calculate the mode from the following information:
Size of shoes:    5   6   7   8   9  10  11
No. of persons:  10  20  25  40  22  15   6
d) 8

Q16. The following figures relate to the preferences with regard to the size of screen of T.V. sets of 30 persons selected at random from a locality. Find the modal size of the T.V. screen.
12 20 12 24 29 20 12 20 29 24
24 20 12 20 24 29 24 24 20 24
24 20 24 24 12 24 20 29 24 24
b) 24

Q17. The _______ level of measurement describes variables that can be ordered or ranked in some order of importance. a) ordinal

Q18. Find the standard deviation from the weekly wages of ten workers working in a factory:
Worker:         A     B     C     D     E     F     G     H     I     J
Weekly wage:  1320  1310  1315  1322  1326  1340  1325  1321  1320  1331
a) 7.89
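Q18 treats the ten wages as the whole group, so the squared deviations are divided by n rather than n - 1. A quick check:

```python
wages = [1320, 1310, 1315, 1322, 1326, 1340, 1325, 1321, 1320, 1331]

n = len(wages)
mean = sum(wages) / n
variance = sum((w - mean) ** 2 for w in wages) / n   # divide by n (population formula)
sd = variance ** 0.5

print(f"mean = {mean:.0f}, standard deviation = {sd:.2f}")   # 1323 and about 7.89
```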

Q19. An analysis of production rejects resulted in the following figures:
No. of rejects per operator    No. of operators
21-25                                5
26-30                               15
31-35                               28
36-40                               42
41-45                               15
46-50                               12
51-55                                3
Calculate the standard deviation.
c) 6.8

Q20. The mean lifetime of 100 light tubes produced by a company is found to be 1,580 hours with a standard deviation of 90 hours. Test the hypothesis that the mean lifetime of the tubes produced by the company is 1,600 hours. (calculate) b) -2.22

Q21. In 600 throws of a six-faced die, odd points appeared 360 times. Would you say that the die is fair at the 5% level of significance? (calculate) d) 4.9
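Q21 is a large-sample test of a proportion: under a fair die the probability of an odd point is 1/2, while the observed proportion is 360/600 = 0.6. A hedged sketch of the usual calculation:

```python
n, successes = 600, 360      # throws and observed odd points (from Q21)
p0 = 0.5                     # proportion of odd points expected from a fair die

p_hat = successes / n
se = (p0 * (1 - p0) / n) ** 0.5          # standard error under the null hypothesis
z = (p_hat - p0) / se

print(f"z = {z:.1f}")   # about 4.9; well beyond 1.96, so the die looks unfair at the 5% level
```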


Q22. A sample of 400 managers is found to have a mean height of 171.38 cms. Can it be regarded as a sample from a large population with mean height 171.17 cms and standard deviation 3.30 cms? (calculate) d) 1.3

Q23. Ten oil tins are taken at random from an automatic filling machine. The mean weight of the tins is 15.8 kg and the standard deviation is 0.50 kg. Does the sample mean differ significantly from the intended weight of 16 kg? (calculate) c) -1.26

Q24. The price of a share of a company on different days in a month was found to be: 60, 65, 69, 70, 69, 71, 70, 63, 64 and 68. Test whether the mean price of the shares in the month is 65. Calculate t. a) 2.81

Q25. A sample of 200 persons with a particular disease was selected. Out of these, 100 were given a drug and the others were not given any drug. The results are as follows:
Number of persons
             Drug   No Drug   Total
Cured          65        55     120
Not cured      35        45      80
Total         100       100     200
Test whether the drug is effective or not. Find the value of chi-square at the 5% level of significance.
a) 2.084

Q26. A certain drug is claimed to be effective in curing cold. In an experiment on 500 persons with cold, half of them were given the drug and half of them were given sugar pills. The patients' reactions to the treatment are recorded in the following table:
              Helped   Harmed   No effect   Total
Drug             150       30          70     250
Sugar pills      130       40          80     250
Total            280       70         150     500
On the basis of this data, can it be concluded that there is a significant difference in the effect of the drug and the sugar pills? Find the value of chi-square at the 5% level of significance.
d) 3.5


Q27. The number of parts demanded for a particular spare part in a factory was found to vary from day to day. In a sample study, the following information was obtained:
Days:                    Mon.   Tues.   Wed.   Thurs.   Fri.   Sat.   Total
No. of parts demanded:   1124   1125    1110   1120     1126   1115   6720
Test, at the 5% level of significance, the hypothesis that the number of parts demanded does not depend on the day of the week.
d) 0.179

Q28. A survey of 320 families with 5 children each revealed the following distribution:
No. of boys:       5    4    3    2    1    0
No. of girls:      0    1    2    3    4    5
No. of families:  14   56  110   88   40   12
Find the value of chi-square at the 5% level of significance.
c) 7.16

Q29. The figures given below are (a) the theoretical frequencies of a distribution and (b) the frequencies of a distribution having the same mean, standard deviation and total frequency as in (a):
(a)  1  12  66  220  495  729  924  792  495  220  66  12  1
(b)  2  15  66  210  484  799  943  799  484  210  66  15  2
Do you think that the normal distribution provides a good fit to the data? Calculate chi-square at the 5% level of significance.
a) 3.839

Q30. Exploratory research is used in Convenience Sampling. a) True

Q31. The sampling in which the selection of additional respondents (after the first small group of respondents is selected) is based upon referrals from the initial set of respondents is called? d)Snowball Sampling

Q32. Sampling on the basis of certain basic parameters, such as age, sex, income and occupation, that describe the nature of a population, so as to make the sample representative of the population:


c)Quota Sampling

Q33. In Simple Random Sampling the probability of selecting a specified unit of population at any given draw is equal to the probability of its being selected at the first draw. a) True

Q34. In Lottery Method selecting a simple random sample is dependent on the properties of population. b) False

Q35. It is impossible to have knowledge about each and every unit of population if population happens to be very large. This encourages the use of simple random sample. b)False

Q36. In which type of sampling is the variance of the sample estimate of the population inversely proportional to the sample size and directly proportional to the variability of the sampling units in the population? c) Stratified Sampling

Q37. Which type of sampling rules out the possibility of any essential group of the population being completely excluded from the sample? It thus provides a more representative cross-section of the population and is frequently regarded as the most efficient system of sampling. d) Stratified Sampling

Q38. In which type of sampling, when a complete and up-to-date list of the sampling units is available, can you employ a common technique of selecting the sample? a) Systematic Sampling

Q39. In which type of sampling do we regard the population as a number of primary units, each of which is further composed of secondary-stage units, and so on, till we ultimately reach a stage where the desired sampling units are obtained? d) Multistage Sampling

Q40. The order for writing the name of books in Bibliography of your research report: a) name of author( last name first), title, publisher, number of volumes
