Building a Singapore Learner Corpus of English Writing for Pedagogy

This paper documents the development of a Singapore learner corpus of English writing for pedagogy, constructed at Nanyang Technological University, Singapore. The corpus comprises samples of English writing produced by students at three levels: Primary 6 (Year 6), Secondary 4 (Year 10) and Junior College 2 (Year 12). It is built to capture and compare learners’ developmental features in vocabulary, grammar and discourse devices at different learning stages, and thereby to theorize about the nature of English writing development among learners in Singapore. The texts are tagged with metadata on each learner’s school level, gender, ethnic group and grade. Issues of corpus design, such as representativeness in sampling, are also addressed. Finally, pedagogical implications and potential applications of the project are presented.
