Impact of the Number of Scale Points on Data Characteristics and Respondents’ Evaluations: An Experimental Design Approach Using 5-Point and 7-Point Likert-type Scales

A great deal of social research is based on data collected with Likert-type scales. The optimal number of response categories in Likert-type scales has been the subject of academic debate for years. This article studies the differences between 5- and 7-point Likert-type scales using the SERVPERF scale, developed by Cronin and Taylor (1992), as the measuring instrument. A pretest-posttest control group experimental design was used to test whether the number of response categories leads to statistical differences in data characteristics, the dimensional structure of the scale, and data fit. The results show no statistically significant differences in normality or reliability, whereas exploratory factor analysis yields different dimensional structures for the 5- and 7-point formats of SERVPERF. ANCOVA results reveal that the number of response categories does not affect participants’ evaluations of SERVPERF. Confirmatory factor analysis shows that the best fit is achieved for the 7-point SERVPERF.

___

Alford, W.K., Malouff, J.M. & Osland, K.S. (2005). Written Emotional Expression as a Coping Method in Child Protective Services Officers. International Journal of Stress Management, 12(2), 182-183.

Babakus, E. & Boller, G.W. (1992). An Empirical Assessment of the SERVQUAL Scale. Journal of Business Research, 24(3), 253-268. doi: 10.1016/0148-2963(92)90022-4

Bearden, W.O., Netemeyer, R.G. & Mobley, M. (1993). Handbook of Marketing Scales: Multi-item Measures for Marketing and Consumer Behavior Research. Newbury Park, CA: Sage.

Bendig, A.W. (1953). The Reliability of Self-ratings as a Function of the Amount of Verbal Anchoring and the Number of Categories on the Scale. The Journal of Applied Psychology, 37, 38-41. doi: 10.1037/h0055647

Bendig, A.W. (1954). Reliability and the Number of Rating Scale Categories. The Journal of Applied Psychology, 38, 38-40.

Brady, M.K., Cronin, J.J.Jr. & Brand, R.R. (2002). Performance-only Measurement of Service Quality: A Replication and Extension. Journal of Business Research, 55(1), 17-31.

Brown, G., Wilding, R.E. & Coulter, R.L. (1991). Customer Evaluation of Retail Salespeople Using the SOCO Scale: A Replication Extension and Application. Journal of the Academy of Marketing Science, 9, 347-351.

Carillat, F.A., Jaramillo, F. & Mulki, J.P. (2007). The Validity of the SERVQUAL and SERVPERF Scales: A Meta-analytical View of 17 Years of Research Across Five Continents. International Journal of Service Industry Management, 18(5), 472-490. doi: 10.1108/09564230710826250

Chang, L. (1994). A Psychometric Evaluation of Four-point and Six-point Likert-type Scales in Relation to Reliability and Validity. Applied Psychological Measurement, 18, 205-215.

Choudhury, S. & Bhattacharjee, D. (2014). Optimal Number of Scale Points in Likert Type Scales for Quantifying Compulsive Buying Behaviour. Asian Journal of Management Research, 4(3), 431-440.

Cicchetti, D.V., Showalter, D. & Tyrer, P.J. (1985). The Effect of Number of Rating Scale Categories on Levels of Inter-rater Reliability: A Monte-Carlo Investigation. Applied Psychological Measurement, 9, 31-36.

Cohen, J. (1988). Statistical Power Analysis for the Behavioural Sciences. Second Edition, New York, NY: Academic Press.

Colman, A.M., Norris, C.E. & Preston, C.C. (1997). Comparing Rating Scales of Different Lengths: Equivalence of Scores from 5-point and 7-point Scales. Psychological Reports, 80, 355-362. doi: 10.2466/pr0.1997.80.2.355

Cortina, J.M. (1993). What is Coefficient Alpha? An Examination of Theory and Applications. Journal of Applied Psychology, 78, 98-104. doi: 10.1037/0021-9010.78.1.98

Cox, E.P. (1980). The Optimal Number of Response Alternatives for a Scale: A Review. Journal of Marketing Research, 17, 407-422.

Crocker, L. & Algina, J. (1986). Introduction to Classical and Modern Test Theory. New York, NY: Holt, Rinehart & Winston.

Cronin, J.J.Jr. & Taylor, A.S. (1992). Measuring Service Quality: A Reexamination and an Extension. Journal of Marketing, 56(3), 243-253.

Dawes, J. (2008). Do Data Characteristics Change According to the Number of Scale Points Used? International Journal of Market Research, 50(1), 61-77. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.417.9488&rep=rep1&type=pdf

Dimitrov, D.M. & Rumrill, Jr., P.D. (2003). Pretest-Posttest Designs and Measurement of Change. Work, 20, 159-165. http://iospress.metapress.com/content/7x9hgpq885t2yttq/

Doğan, V., Özkara, B.Y., Yılmaz, C. & Torlak, Ö. (2014). Katılım Düzeyi Seçenek Sayısının Veri Karakteristiği ve Veri Kalitesi Kapsamında İncelenmesi: Optimal Katılım Düzeyi Seçenek Sayısına İlişkin Bir Çıkarım (An Examination of the Optimal Number of Response Categories in terms of Data Characteristics and Data Quality: An Inference Regarding the Optimal Number of Response Categories). In Proceedings of the 19th Annual Turkish National Marketing Congress, Gaziantep, Turkey.

Field, A. (2012). Discovering Statistics Using IBM SPSS Statistics. Fourth Edition, London: Sage Publications.

Finn, R.H. (1972). Effects of Some Variations in Rating Scale Characteristics on the Means and Reliabilities of Ratings. Educational and Psychological Measurement, 32(7), 255-265.

Garner, W.R. (1960). Rating Scales, Discriminability and Information Transmission. Psychological Review, 67, 343-352.

Green, J.A. & Rao, V.R. (1970). Rating Scales and Information Recovery: How Many Scales and Response Categories to Use? Journal of Marketing, 34, 33-39.

Howell, D.C. (1992). Statistical Methods for Psychology. Boston, MA: Duxbury Press.

Huck, S.W. (2008). Reading Statistics and Research. Fifth Edition, Boston, MA: Pearson Education, Inc.

Jain, S.K. & Gupta, G. (2004). Measuring Service Quality: SERVQUAL vs SERVPERF Scales. The Journal for Decision Makers, 29(2), 25-37. http://www.vikalpa.com/pdf/articles/2004/2004_apr_jun_25_37.pdf

Janssens, W., Wijnen, K., Pelsmacker, P.D. & Van Kenhove, P. (2008). Marketing Research with SPSS. London: Pearson Education Limited.

Jones, R.R. (1968). Differences in Response Consistency and Subjects’ Preferences for Three Personality Inventory Response Formats. In Proceedings of the 76th Annual Convention of the American Psychological Association, 247-248.

Jöreskog, K.G. & Sörbom, D. (1993). Lisrel 8: Structural Equation Modeling with Simplis Command Language. Scientific Software International.

Lai, M., Li, Y. & Liu, Y. (2010). Determining the Optimal Scale Width for a Rating Scale Using an Integrated Discrimination Function. Measurement, 43, 1458-1471. doi: 10.1016/j.measurement.2010.08.012

Leung, S.O. (2011). A Comparison of Psychometric Properties and Normality in 4-, 5-, 6- and 11-Point Likert Scales. Journal of Social Service Research, 37, 412-421. doi: 10.1080/01488376.2011.580697

Loken, B., Pirie, P., Virnig, K.A., Hinkle, R.L. & Salmon, C.T. (1987). The Use of 0-10 Scales in Telephone Surveys. Journal of the Market Research Society, 29(3), 353-362.

Lozano, L.M., Garcia-Cueto, E. & Muniz, J. (2008). Effect of the Number of Response Categories on the Reliability and Validity of Rating Scales. Methodology, 4(2), 73-79. doi: 10.1027/1614-2241.4.2.73

Malhotra, N. K. (2010). Marketing Research: An Applied Orientation. Sixth Edition, Boston, MA: Pearson.

Marlow, L., Inman, D. & Shwery, C. (2005). To What Extent are Literacy Initiatives Being Supported: Important Questions for Administrators. Reading Improvement, 42(3), 179. http://eric.ed.gov/?id=EJ725388

Matell, M.S. & Jacoby, J. (1971). Is There an Optimal Number of Alternatives for Likert Scale Items? Study 1: Reliability and Validity. Educational and Psychological Measurement, 31, 657-674. http://psycnet.apa.org/journals/apl/56/6/506/

Morris, S.B. (2008). Estimating Effect Sizes from Pretest-Posttest-Control Group Designs. Organizational Research Methods, 11(2), 364-386.

Oaster, T.R.F. (1989). Number of Alternatives per Choice Point and Stability of Likert-type Scales. Perceptual and Motor Skills, 68, 549-550. doi: 10.2466/pms.1989.68.2.549

Osteras, N., Gulbrandsen, P., Garratt, A., Benth, J.S., Dahl, F.A., Natvig, B. & Brage, S. (2008). A Randomised Comparison of a Four and a Five-Point Scale Version of the Norwegian Function Assessment Scale. Health and Quality of Life Outcomes, 6(14), 1-9. http://www.hqlo.com/content/6/1/14

Preston, C.C. & Colman, A.M. (2000). Optimal Number of Response Categories in Rating Scales: Reliability, Validity, Discriminating Power and Respondent Preferences. Acta Psychologica, 104, 1-15. doi: 10.1016/S0001-6918(99)00050-5

Qin, H., Prybutok, V.R. & Zhao, Q. (2010). Perceived Service Quality in Fast-food Restaurants: Empirical Evidence from China. International Journal of Quality & Reliability Management, 27(4), 424-437. doi: 10.1108/02656711011035129

Ramsay, J.O. (1973). The Effect of Number of Categories in Rating Scales on Precision of Estimation of Scale Values. Psychometrika, 38, 513-533. doi: 10.1177/014662168500900103

Sallot, L.M. & Lyon, L.J. (2003). Investigating Effects of Tolerance-Intolerance of Ambiguity and the Teaching of Public Relations Writing: A Quasi-Experiment. Journalism & Mass Communication Educator, 58(3), 251-272. doi: 10.1177/107769580305800304

Symonds, P.M. (1924). On the Loss of Reliability in Ratings Due to Coarseness of the Scale. Journal of Experimental Psychology, 7, 456-461. doi: 10.1177/014662168500900103

Viswanathan, M., Sudman, S. & Johnson, M. (2004). Maximum versus Meaningful Discrimination in Scale Response: Implications for Validity of Measurement of Consumer Perceptions About Products. Journal of Business Research, 57, 108-124. doi: 10.1016/S0148-2963(01)00296-X

Weathers, D., Sharma, S. & Niedrich, R.W. (2005). The Impact of the Number of Scale Points, Dispositional Factors and the Status Quo Decision Heuristic on Scale Reliability and Response Accuracy. Journal of Business Research, 58, 1516-1524. doi: 10.1016/j.jbusres.2004.08.002

Weijters, B., Cabooter, E. & Schillewaert, N. (2010). The Effect of Rating Scale Format on Response Styles: The Number of Response Categories and Response Category Labels. International Journal of Research in Marketing, 27, 236-247. doi: 10.1016/j.ijresmar.2010.02.004

Weng, L.J. (2004). Impact of the Number of Response Categories and Anchor Labels on Coefficient Alpha and Test-retest Reliability. Educational and Psychological Measurement, 64(6), 956-972. doi: 10.1177/0013164404268674

Woodruff, D.J. & Feldt, L.S. (1986). Tests for Equality of Several Alpha Coefficients When Their Sample Estimates are Dependent. Psychometrika, 51, 393-413. http://link.springer.com/article/10.1007/BF02294063

Zhou, L. (2004). A Dimension-specific Analysis of Performance-Only Measurement of Service Quality and Satisfaction in China’s Retail Banking. The Journal of Services Marketing, 18(6/7), 534-546. doi: 10.1108/08876040410561866