International Association of Educators   |  ISSN: 1309-0682

Original Research Article | Akdeniz Eğitim Araştırmaları Dergisi 2019, Vol. 13(28), 66-81

Emergent Trends and Research Topics in Language Testing and Assessment

Tuğba Elif Toprak Yıldız

pp. 66-81   |  DOI: 10.29329/mjer.2019.202.4  |  Article No: MANU-1901-21-0001

Publication date: June 30, 2019  |   Views: 1620  |  Downloads: 1067


This descriptive study aims to explore the emergent trends and research topics that have attracted increasing attention from language testing and assessment researchers. To this end, 300 articles published over the last seven years (2012-2018) in two leading journals of the field were analyzed using thematic analysis. Overall, the results demonstrated that the assessment of language skills still constitutes the backbone of language testing and assessment research. While the term communicative has become the established norm in language testing and assessment, the field has grown more interested in professionalization, in understanding the dynamics that underlie test performance, and in validation. Moreover, the results revealed that even though the latest advancements in computer science, cognitive science, and information/communication technologies seem to be making their way into language testing and assessment, more research is needed to make the most of these advancements and to keep up with the rapidly changing nature of communication and literacy in the 21st century. The results are discussed and their implications are presented.

Keywords: Language testing, language assessment, emergent trends in language testing, educational assessment, educational testing
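The thematic analysis summarized in the abstract amounts to coding each article with one or more theme labels and then tallying how often each theme recurs across the corpus. A minimal sketch of that tallying step, using hypothetical coded data (the article IDs and theme labels below are illustrative, not the study's actual coding scheme):

```python
from collections import Counter

# Hypothetical coding results: each analyzed article is tagged with
# one or more theme labels assigned during thematic analysis.
coded_articles = [
    {"id": 1, "themes": ["speaking assessment", "rater behavior"]},
    {"id": 2, "themes": ["validation"]},
    {"id": 3, "themes": ["speaking assessment", "technology"]},
]

# Count how often each theme occurs across all coded articles.
theme_counts = Counter(
    theme for article in coded_articles for theme in article["themes"]
)

# List themes from most to least frequent.
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count}")
```

In a study of this kind the frequency table produced this way is what supports claims such as "the assessment of language skills still constitutes the backbone of the field."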

How to cite this article

APA 6th edition
Yildiz, T. E. T. (2019). Emergent trends and research topics in language testing and assessment. Akdeniz Eğitim Araştırmaları Dergisi, 13(28), 66-81. doi: 10.29329/mjer.2019.202.4

Yildiz, T. (2019). Emergent trends and research topics in language testing and assessment. Akdeniz Eğitim Araştırmaları Dergisi, 13(28), pp. 66-81.

Chicago 16th edition
Yildiz, Tugba Elif Toprak (2019). "Emergent Trends and Research Topics in Language Testing and Assessment". Akdeniz Eğitim Araştırmaları Dergisi 13 (28): 66-81. doi:10.29329/mjer.2019.202.4.
