Emergent Trends and Research Topics in Language Testing and Assessment

This study, which is descriptive in nature, aims to explore the emergent trends and research topics that have attracted increasing attention from language testing and assessment researchers. To this end, 300 articles published within the last seven years (2012-2018) in two leading journals of the field were analyzed using thematic analysis. Overall, the results demonstrated that the assessment of language skills still constitutes the backbone of language testing and assessment research. While the term communicative has become the established norm in language testing and assessment, the field has grown more interested in professionalization, validation, and understanding the dynamics that underlie test performance. Moreover, the results revealed that even though the latest advancements in computer science, cognitive science, and information/communication technologies seem to be making their way into language testing and assessment, more research is needed to make the most of these advancements and keep up with the rapidly changing nature of communication and literacy in the 21st century. The results are discussed and implications are drawn.

___

Appel, R., & Wood, D. (2016). Recurrent word combinations in EAP test-taker writing: Differences between high- and low-proficiency levels. Language Assessment Quarterly, 13(1), 55–71.

Aryadoust, V. (2016). Gender and academic major bias in peer assessment of oral presentations. Language Assessment Quarterly, 13(1), 1–24.

Aryadoust, V., & Zhang, L. (2016). Fitting the mixed Rasch model to a reading comprehension test: Exploring individual difference profiles in L2 reading. Language Testing, 33(4), 529–553.

Attali, Y., Lewis, W., & Steier, M. (2013). Scoring with the computer: Alternative procedures for improving the reliability of holistic essay scoring. Language Testing, 30(1), 125–141.

Babaii, E., Taghaddomi, S., & Pashmforoosh, R. (2016). Speaking self-assessment: Mismatches between learners’ and teachers’ criteria. Language Testing, 33(3), 411–437.

Bachman, L. F. (2000). Modern language testing at the turn of the century: Assuring that what we count counts. Language Testing, 17(1), 1–42.

Baker, B. A. (2012). Individual differences in rater decision-making style: An exploratory mixed-methods study. Language Assessment Quarterly, 9(3), 225–248.

Barkaoui, K. (2014). Examining the impact of L2 proficiency and keyboarding skills on scores on TOEFL iBT writing tasks. Language Testing, 31(2), 241–259.

Bax, S. (2013). The cognitive processing of candidates during reading tests: Evidence from eye tracking. Language Testing, 30(4), 441–465.

Bochner, J. H., Samar, V. J., Hauser, P. C., Garrison, W. M., Searls, J. M., & Sanders, C. A. (2016). Validity of the American Sign Language Discrimination Test. Language Testing, 33(4), 473–495.

Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101.

Bridgeman, B., Cho, Y., & DiPietro, S. (2016). Predicting grades from an English language assessment: The importance of peeling the onion. Language Testing, 33(3), 307–318.

Brooks, L., & Swain, M. (2014). Contextualizing performances: Comparing performances during TOEFL iBT™ and real-life academic speaking activities. Language Assessment Quarterly, 11(4), 353–373.

Butler, Y. G., & Zeng, W. (2014). Young foreign language learners’ interactions during task-based paired assessments. Language Assessment Quarterly, 11(1), 45–75.

Cai, H. (2013). Partial dictation as a measure of EFL listening proficiency: Evidence from confirmatory factor analysis. Language Testing, 30(2), 177–199.

Cai, H. (2015). Weight-based classification of raters and rater cognition in an EFL speaking test. Language Assessment Quarterly, 12(3), 262–282.

Canale, M., & Swain, M. (1980). Theoretical bases of communicative approaches to second language teaching and testing. Applied Linguistics, 1(1), 1–47.

Chalhoub-Deville, M., & Deville, C. (1999). Computer adaptive testing in second language contexts. Annual Review of Applied Linguistics, 19, 273–299.

Chapelle, C. A., Cotos, E., & Lee, J. (2015). Validity arguments for diagnostic assessment using automated writing evaluation. Language Testing, 32(3), 385–405.

Chapman, E. (2003). Alternative approaches to assessing student engagement rates. Practical Assessment, Research & Evaluation, 8(13), 1–7.

Choi, I. (2017). Empirical profiles of academic oral English proficiency from an international teaching assistant screening test. Language Testing, 34(1), 49–82.

Davis, L. (2016). The influence of training and experience on rater performance in scoring spoken language. Language Testing, 33(1), 117–135.

Denies, K., & Janssen, R. (2016). Country and gender differences in the functioning of CEFR-based can-do statements as a tool for self-assessing English proficiency. Language Assessment Quarterly, 13(3), 251–276.

Dulock, H. L. (1993). Research design: Descriptive research. Journal of Pediatric Oncology Nursing, 10(4), 154–157.

Eckes, T. (2012). Operational rater types in writing assessment: Linking rater cognition to rater behavior. Language Assessment Quarterly, 9(3), 270–292.

Eckes, T. (2014). Examining testlet effects in the TestDaF listening section: A testlet response theory modeling approach. Language Testing, 31(1), 39–61.

Eckes, T. (2017). Setting cut scores on an EFL placement test using the prototype group method: A receiver operating characteristic (ROC) analysis. Language Testing, 34(3), 383–411.

Farnsworth, T. L. (2013). An investigation into the validity of the TOEFL iBT speaking test for international teaching assistant certification. Language Assessment Quarterly, 10(3), 274–291.

Fidalgo, A. M., Alavi, S. M., & Amirian, S. M. R. (2014). Strategies for testing statistical and practical significance in detecting DIF with logistic regression models. Language Testing, 31(4), 433–451.

Goodwin, A. P., Huggins, A. C., Carlo, M., Malabonga, V., Kenyon, D., Louguit, M., & August, D. (2012). Development and validation of Extract the Base: An English derivational morphology test for third through fifth grade monolingual students and Spanish-speaking English language learners. Language Testing, 29(2), 265–289.

Granfeldt, J., & Ågren, M. (2014). SLA developmental stages and teachers’ assessment of written French: Exploring Direkt Profil as a diagnostic assessment tool. Language Testing, 31(3), 285–305.

Green, A., & Hawkey, R. (2012). Re-fitting for a different purpose: A case study of item writer practices in adapting source texts for a test of academic reading. Language Testing, 29(1), 109–129.

Han, C. (2016). Investigating score dependability in English/Chinese interpreter certification performance testing: A generalizability theory approach. Language Assessment Quarterly, 13(3), 186–201.

Harding, L. (2014). Communicative language testing: Current issues and future research. Language Assessment Quarterly, 11(2), 186–197.

Harding, L., Alderson, J. C., & Brunfaut, T. (2015). Diagnostic assessment of reading and listening in a second or foreign language: Elaborating on diagnostic principles. Language Testing, 32(3), 317–336.

Haug, T. (2012). Methodological and theoretical issues in the adaptation of sign language tests: An example from the adaptation of a test to German Sign Language. Language Testing, 29(2), 181–201.

Hirai, A., & Koizumi, R. (2013). Validation of empirically derived rating scales for a story retelling speaking test. Language Assessment Quarterly, 10(4), 398–422.

Hoang, G. T. L., & Kunnan, A. J. (2016). Automated essay evaluation for English language learners: A case study of MY Access. Language Assessment Quarterly, 13(4), 359–376.

Hsieh, M. (2013a). An application of Multifaceted Rasch measurement in the Yes/No Angoff standard setting procedure. Language Testing, 30(4), 491–512.

Hsieh, M. (2013b). Comparing Yes/No Angoff and Bookmark standard setting methods in the context of English assessment. Language Assessment Quarterly, 10(3), 331–350.

Hsu, T. H. L. (2016). Removing bias towards World Englishes: The development of a Rater Attitude Instrument using Indian English as a stimulus. Language Testing, 33(3), 367–389.

Huang, B., Alegre, A., & Eisenberg, A. (2016). A cross-linguistic investigation of the effect of raters’ accent familiarity on speaking assessment. Language Assessment Quarterly, 13(1), 25–41.

Huang, F. L., & Konold, T. R. (2014). A latent variable investigation of the Phonological Awareness Literacy Screening-Kindergarten assessment: Construct identification and multigroup comparisons between Spanish-speaking English-language learners (ELLs) and non-ELL students. Language Testing, 31(2), 205–221.

Ilc, G., & Stopar, A. (2015). Validating the Slovenian national alignment to CEFR: The case of the B2 reading comprehension examination in English. Language Testing, 32(4), 443–462.

Jin, T., & Mak, B. (2013). Distinguishing features in scoring L2 Chinese speaking performance: How do they work? Language Testing, 30(1), 23–47.

Jin, T., Mak, B., & Zhou, P. (2012). Confidence scoring of speaking performance: How does fuzziness become exact? Language Testing, 29(1), 43–65.

Kang, O. (2012). Impact of rater characteristics and prosodic features of speaker accentedness on ratings of international teaching assistants’ oral performance. Language Assessment Quarterly, 9(3), 249–269.

Katzenberger, I., & Meilijson, S. (2014). Hebrew language assessment measure for preschool children: A comparison between typically developing children and children with specific language impairment. Language Testing, 31(1), 19–38.

Kim, H. J. (2015). A qualitative analysis of rater behavior on an L2 speaking assessment. Language Assessment Quarterly, 12(3), 239–261.

Knoch, U., & Chapelle, C. A. (2017). Validation of rating processes within an argument-based framework. Language Testing, 34, 1–23.

Kokhan, K. (2013). An argument against using standardized test scores for placement of international undergraduate students in English as a Second Language (ESL) courses. Language Testing, 30(4), 467–489.

Koo, J., Becker, B. J., & Kim, Y. S. (2014). Examining differential item functioning trends for English language learners in a reading test: A meta-analytical approach. Language Testing, 31(1), 89–109.

Kuiken, F., & Vedder, I. (2017). Functional adequacy in L2 writing: Towards a new rating scale. Language Testing, 34(3), 321–336.

Kyle, K., Crossley, S. A., & McNamara, D. S. (2016). Construct validity in TOEFL iBT speaking tasks: Insights from natural language processing. Language Testing, 33(3), 319–340.

Lado, R. (1961). Language testing: The construction and use of foreign language tests. New York: McGraw-Hill.

Lam, R. (2015). Language assessment training in Hong Kong: Implications for language assessment literacy. Language Testing, 32(2), 169–197.

Lee, H., & Winke, P. (2013). The differences among three-, four-, and five-option-item formats in the context of a high-stakes English-language listening test. Language Testing, 30(1), 99–123.

Lee, S., & Winke, P. (2018). Young learners’ response processes when taking computerized tasks for speaking assessment. Language Testing, 35(2), 239–269.

Li, H., & Suen, H. K. (2013). Detecting native language group differences at the subskills level of reading: A differential skill functioning approach. Language Testing, 30(2), 273–298.

Li, H., Hunter, C. V., & Lei, P. W. (2016). The selection of cognitive diagnostic models for a reading comprehension test. Language Testing, 33(3), 391–409.

Lin, C. K., & Zhang, J. (2014). Investigating correspondence between language proficiency standards and academic content standards: A generalizability theory study. Language Testing, 31(4), 413–431.

Ling, G. (2017). Is writing performance related to keyboard type? An investigation from examinees’ perspectives on the TOEFL iBT. Language Assessment Quarterly, 14(1), 36–53.

Mann, W., Roy, P., & Morgan, G. (2016). Adaptation of a vocabulary test from British Sign Language to American Sign Language. Language Testing, 33(1), 3–22.

Murray, J. C., Riazi, A. M., & Cross, J. L. (2012). Test candidates’ attitudes and their relationship to demographic and experiential variables: The case of overseas trained teachers in NSW, Australia. Language Testing, 29(4), 577–595.

Nakatsuhara, F., Inoue, C., Berry, V., & Galaczi, E. (2017). Exploring the use of video-conferencing technology in the assessment of spoken language: A mixed-methods study. Language Assessment Quarterly, 14(1), 1–18.

Pan, M., & Qian, D. D. (2017). Embedding corpora into the content validation of the grammar test of the National Matriculation English Test (NMET) in China. Language Assessment Quarterly, 14(2), 120–139.

Papageorgiou, S., & Cho, Y. (2014). An investigation of the use of TOEFL® Junior™ Standard scores for ESL placement decisions in secondary education. Language Testing, 31(2), 223–239.

Pill, J., & McNamara, T. (2016). How much is enough? Involving occupational experts in setting standards on a specific-purpose language test for health professionals. Language Testing, 33(2), 217–234.

Saida, C. (2017). Creating a common scale by post-hoc IRT equating to investigate the effects of the new national educational policy in Japan. Language Assessment Quarterly, 14(3), 257–273.

Sanchez, S. V., Rodriguez, B. J., Soto-Huerta, M. E., Villarreal, F. C., Guerra, N. S., & Flores, B. B. (2013). A case for multidimensional bilingual assessment. Language Assessment Quarterly, 10(2), 160–177.

Sato, T. (2012). The contribution of test-takers’ speech content to scores on an English oral proficiency test. Language Testing, 29(2), 223–241.

Savignon, S. J. (1972). Communicative competence: An experiment in foreign language teaching. Philadelphia: Center for Curriculum Development.

Shaw, S., & Imam, H. (2013). Assessment of international students through the medium of English: Ensuring validity and fairness in content-based examinations. Language Assessment Quarterly, 10(4), 452–475.

Shohamy, E., Gordon, C. M., & Kraemer, R. (1992). The effect of raters’ background and training on the reliability of direct writing tests. The Modern Language Journal, 76(1), 27–33.

Spolsky, B. (2008). Language assessment in historical and future perspective. In E. Shohamy & N. Hornberger (Eds.), Encyclopedia of language and education (2nd ed., Vol. 7: Language testing and assessment, pp. 445–454). New York: Springer Science.

Stansfield, C. W. (2008). Lecture: Where we have been and where we should go. Language Testing, 25(3), 311–326.

Suvorov, R. (2015). The use of eye tracking in research on video-based second language (L2) listening assessment: A comparison of context videos and content videos. Language Testing, 32(4), 463–483.

Suzuki, Y. (2015). Self-assessment of Japanese as a second language: The role of experiences in the naturalistic acquisition. Language Testing, 32(1), 63–81.

Tengberg, M. (2017). National reading tests in Denmark, Norway, and Sweden: A comparison of construct definitions, cognitive targets, and response formats. Language Testing, 34(1), 83–100.

Timpe-Laughlin, V., & Choi, I. (2017). Exploring the validity of a second language intercultural pragmatics assessment tool. Language Assessment Quarterly, 14(1), 19–35.

Vogt, K., & Tsagari, D. (2014). Assessment literacy of foreign language teachers: Findings of a European study. Language Assessment Quarterly, 11(4), 374–402.

Wagner, E. (2013). An investigation of how the channel of input and access to test questions affect L2 listening test performance. Language Assessment Quarterly, 10(2), 178–195.

Wei, J., & Llosa, L. (2015). Investigating differences between American and Indian raters in assessing TOEFL iBT speaking tasks. Language Assessment Quarterly, 12(3), 283–304.

Widdowson, H. G. (1983). Learning purpose and language use. Oxford: Oxford University Press.

Winke, P., Gass, S., & Myford, C. (2013). Raters’ L2 background as a potential source of bias in rating oral performance. Language Testing, 30(2), 231–252.

Xi, X., Higgins, D., Zechner, K., & Williamson, D. (2012). A comparison of two scoring methods for an automated speech scoring system. Language Testing, 29(3), 371–394.

Zhang, L., Goh, C. C., & Kunnan, A. J. (2014). Analysis of test takers’ metacognitive and cognitive strategy use and EFL reading test performance: A multi-sample SEM approach. Language Assessment Quarterly, 11(1), 76–102.