Özel Amaçlı Derlemi Çevriyazmak: Bir Çevriyazı Modeli

İnsanlar arasındaki etkileşim pek çok disiplinde çalışılan bir konudur. Bu etkileşimin sözlü ve yazılı dili içeren metinlerinin toplamı derlem (İng. corpus) olarak adlandırılır. Sözlü derlemlerin büyük çoğunluğu veriyi aynı fiziksel ortamda sundukları için tek-ortamlı (İng. mono-modal) ve metin temelli kayıtlardır. Böylesi tek-ortamlı derlemler dili ve etkileşimi metnin ötesinde yansıtamadıkları için kapsam bakımından sınırlıdırlar. Anlam, metin, jest ve bürün öğeleri gibi sözel ve sözel-olmayan özelliklerin birleşimi ile kurulduğu için etkileşimin, aslında, çok-ortamlı (İng. multi-modal) olduğu göz önüne alındığında, bu problem yaratmaktadır. Çok-ortamlı derlemler doğal konuşma kayıtlarından elde edilen etkileşimin video, ses ve metin kayıtlarından oluşmaktadır. Kayıt, etkileşimin sistematik bir incelemesi için tek başına yeterli değildir. Bu nedenle, çevriyazı sözlü dilin gösterimi ve incelenmesinde önemlidir. Sözlü etkileşimi incelemek için işitsel ya da görsel veri çevriyazı yoluyla yazılı metin biçimine getirilir. Çevriyazılar, verinin çalışmanın amacından bağımsız olarak oluşturulan yansız gösterimleri değildirler ve farklı aşamalarda farklı yollarla araştırmaya hizmet ederler. Bu çalışmada, derlem dilbilimine çok-ortamlı bir yaklaşım ile özel amaçlı bir derlemin gülmece ve gülme incelemesi için nasıl çevriyazılabileceği konusunda bir model önerilmiştir.

Transcribing Special Purpose Corpus: A Transcription Model

Interaction between people is studied in many disciplines. A compilation of the texts of this interaction, covering both written and spoken language, is called a corpus. The great majority of spoken corpora are mono-modal, text-based records, since they present the data in a single physical medium. Such mono-modal corpora are limited in scope because they cannot represent language and interaction beyond the text. This is problematic given that interaction is in fact multi-modal: meaning is constructed through the combination of verbal and non-verbal features such as text, gesture and prosody. Multi-modal corpora are composed of audio, video and text records of interaction gathered from naturally occurring speech. Recordings alone are not sufficient for the systematic analysis of interaction. For this reason, transcription is important in the representation and analysis of spoken language. To analyze spoken interaction, audio or video data are transcribed into written form. Transcriptions are not neutral representations of the data formed independently of the purpose of the study; they serve the analysis in different ways at different stages. Taking a multi-modal approach to corpus linguistics, this paper proposes a model of how a special-purpose corpus can be transcribed for the analysis of humor and laughter.
