This NI_Readme.txt file was generated on 2021-12-01 by Esther Álvarez de la Fuente and Raquel Fernández Fuertes

INDEX OF THE NATURAL INTERPRETING (NI) DATASET

1. GENERAL INFORMATION
   1.1. Title of dataset
   1.2. Author information
      1.2.1. PI and co-PI
      1.2.2. Lab
   1.3. Objectives
   1.4. Funding sources
   1.5. Citing information
2. ACCESS INFORMATION
   2.1. Licenses or restrictions
   2.2. Publications
3. METHODOLOGICAL INFORMATION
   3.1. Data elicitation procedure
      3.1.1. CHILDES corpora
      3.1.2. Annotations from other compilation forms
   3.2. Data extraction procedure
   3.3. Data classification: variables
4. DATA
   4.1. Raw data
   4.2. Database
   4.3. Last update
5. RELATED DATASETS


1. GENERAL INFORMATION

1.1. Title of dataset: NI_Dataset

1.2. Author information

1.2.1. PI and co-PI:
Name: Esther Álvarez de la Fuente
Institution: University of Valladolid
Address: Facultad de Filosofía y Letras, Paseo del Cauce s/n 47011, Valladolid (Spain)
Email: esther.alvarez@uva.es

Name: Raquel Fernández Fuertes
Institution: University of Valladolid
Address: Facultad de Filosofía y Letras, Paseo del Cauce s/n 47011, Valladolid (Spain)
Email: raquelff@uva.es

1.2.2. Lab:
Name of the lab: UVALAL (University of Valladolid Language Acquisition Lab)
Institution: University of Valladolid
Address: https://uvalal.uva.es
Email: gir.uvalal@uva.es

1.3. Objectives
This investigation focuses on determining how bilingual children translate from one of their first languages to the other when they need to communicate.
We examine the oral production of bilingual children with different language pairs as available in the CHILDES (Child Language Data Exchange System) (https://childes.talkbank.org/) project (MacWhinney 2000) (i.e., the FerFuLice, Ticio, Deuchar, Vila, GNP and Pérez-Bazán corpora) as well as in other compilation forms (i.e., Ronjat 1913; Leopold 1939-1949; Swain 1972; Lanza 1988, 1997, 2001; Cossato 2008), covering two types of data: spontaneous, where the children interact with adults in a natural context (e.g., at home); and experimental, where the children act as interpreters between two monolingual researchers. The analysis of how these bilingual children interpret between their two first languages provides valuable information about the linguistic resources and translation strategies that these children use to communicate in a bilingual context through their interpreting performance.

1.4. Funding sources
- 2017-2019: Regional Government of Castile and León (Spain) and ERDF (European Regional Development Fund) [VA009P17], Aspectos de la dimensión internacional del contacto de lenguas: diagnósticos de la competencia lingüística bilingüe inglés-español, PRINCIPAL INVESTIGATOR: R. Fernández Fuertes (University of Valladolid, Spain)
- 2007-2010: Spanish Ministry of Science and Technology and ERDF [HUM2007-62213], Elaboración y análisis de un corpus de datos de adquisición del inglés y del español como L1 y L2 de niños y adultos: aprendizaje formal, naturaleza del input y factor edad, PRINCIPAL INVESTIGATOR: R. Fernández Fuertes (University of Valladolid, Spain)
- 2006-2008: Regional Government of Castile and León (Spain) [VA046A06], Lenguas en contacto [inglés/español] en el contexto de Castilla y León: adquisición de L1 y L2, PRINCIPAL INVESTIGATOR: R.
Fernández Fuertes (University of Valladolid, Spain)
- 2002-2005: Spanish Ministry of Science and Technology and ERDF [BFF2002-00442], La teoría lingüística y el análisis de los sistemas bilingües simultáneos del inglés y del español, PRINCIPAL INVESTIGATOR: R. Fernández Fuertes (University of Valladolid, Spain)
- 2002-2003: Regional Government of Castile and León (Spain) [UV 30/02], Estrategias para la enseñanza de lenguas y la formación del profesorado: estudio teórico y práctico de la producción lingüística de gemelos bilingües inglés/español, PRINCIPAL INVESTIGATOR: R. Fernández Fuertes (University of Valladolid, Spain)

1.5. Citing information
Publications using this dataset (or any part of it) should cite this dataset as follows:
Álvarez de la Fuente, E., R. Fernández Fuertes, and Ó. Arratia García (2019) Bilingual children as interpreters in everyday life: how natural interpreting reinforces minority languages. Journal of Multilingual and Multicultural Development 40 (4): 338-355. https://doi.org/10.1080/01434632.2018.1518985.


2. ACCESS INFORMATION

2.1. Licenses or restrictions:
There are no licenses or restrictions placed on the data from the corpora in CHILDES, as they are freely available at the CHILDES project (https://childes.talkbank.org/) (MacWhinney 2000). However, running the CLAN (Computerized Language ANalysis) programs to perform automatic searches and calculations on the data from the FerFuLice corpus requires downloading and installing the CLAN software. The CLAN software is freely available in CHILDES, and there are Windows, Mac and Unix versions (https://dali.talkbank.org/clan/).

2.2. Publications:
Partial or full access to the information contained in the database is available at the UVALAL webpage (publications section, http://uvalal.uva.es/index.php/results/publications-2/).


3. METHODOLOGICAL INFORMATION

3.1. Data elicitation procedure

3.1.1.
CHILDES corpora (https://childes.talkbank.org/) (MacWhinney 2000): name of the corpus; age range of children; language pair; region/state (country); date (if available)
- FerFuLice (https://childes.talkbank.org/access/Biling/FerFuLice.html): 1;01-6;11; English-Spanish; Salamanca (Spain); 1998-2004
- Ticio (https://childes.talkbank.org/access/Biling/Ticio.html): 1;06-2;04; Spanish-English; Texas (USA); 2007-2008
- Deuchar (https://childes.talkbank.org/access/Biling/Deuchar.html): 1;03-2;06; Spanish-English; Brighton (UK); 1986-1987
- Vila (https://childes.talkbank.org/access/Biling/Vila.html): 4;05-5;04; Spanish-Catalan; Barcelona (Spain); 1981-1984
- GNP (Genesee-Nicoladis-Paradis, https://childes.talkbank.org/access/Biling/GNP.html): 1;10-4;00; English-French; Montreal (Canada)
- Pérez-Bazán (https://childes.talkbank.org/access/Biling/Perez.html): 1;08-3;03; Spanish-English; Michigan (USA)

3.1.2. Annotations from other compilation forms
- Ronjat, J. 1913. Le développement du langage observé chez un enfant bilingue. Paris: Librairie Ancienne H. Champion.
- Leopold, W.F. 1939-1949. Speech development of a bilingual child. A linguist's record. Evanston, IL: Northwestern University Press.
- Swain, M.K. 1972. Bilingualism as a first language. PhD dissertation, University of California, Irvine.
- Lanza, E. 1988. Language strategies in the home: linguistic input and infant bilingualism. Holmen, A., E. Hansen, J. Gimbel and J.N. Jørgensen (eds.) Bilingualism and the individual. Clevedon, U.K.: Multilingual Matters. 69-84.
- Lanza, E. 1997. Language mixing in infant bilingualism: a sociolinguistic perspective. Oxford: Oxford University Press.
- Lanza, E. 2001. Bilingual first language acquisition. A discourse perspective on language contact in parent-child interaction. Cenoz, J. and F. Genesee (eds.) Trends in bilingual acquisition. Amsterdam: John Benjamins. 201-229.
- Cossato, D. 2008. La mediazione linguistica in contesti bilingui: la parola ai bambini.
MA dissertation, University of Trieste.

3.2. Data extraction procedure:
All the NI cases were manually extracted from each data source, and they are compiled in the following documents:
- NI_Annotations_Ronjat_Leopold_Swain_Lanza.txt
- NI_Cossato.txt
- NI_Deuchar.txt
- NI_FerFuLice.txt
- NI_Genesee (GNP).txt
- NI_Perez-Bazan.txt
- NI_Ticio.txt
- NI_Vila.txt

3.3. Data classification: variables
- Identifying variables: session of recording; file name; Natural Interpreting (NI) case; NI number
- Demographic variables: age of the child (years; months; days); MLUw (per session); participant's name
- Linguistic variables [6 = (1)-(6)]*:
   (1) ACTIVITY:
      COMP = complete NI
      INCOMP = incomplete NI
      NULL = null NI
   (2) OT (Original Text) length in null NI cases: 1-4 words; 4-7 words; 14 (14 or more words)
   (3) OT (Original Text) complexity in null NI cases: SP (Simple Phrase); SS (Simple Sentence); CS (Complex Sentence)
   (4) DIRECTION: from language X to language Y or from language Y to language X, where X and Y are one of: FR = French; GER = German; EN = English; SP = Spanish; NOR = Norwegian; CAT = Catalan; SWE = Swedish; HUN = Hungarian; IT = Italian
   (5) TYPES AND (6) SUBTYPES:
      equivalent (i.e., pairing, pairing CN = pairing with a communicative necessity)
      non-equivalent (i.e., ECO = economic, EXP = expansive)
- Contextual variables [7 = (7)-(13)]:
   (7) TYPES OF STIMULUS:
      IN = induced
      OI = own initiative
   (8) INDUCER 1:
      PA = one of the parents
      ADULT = other adult different from parents
      INV = investigator
   (9) INDUCER 2:
      Min L of FA / MO / INV / ADULT = minority language of the FA_father/MO_mother/INV_investigator
      Com L of FA / MO / INV / ADULT = community language of the FA_father/MO_mother/INV_investigator
   (10) OT (Original Text) origin:
      AUTOTRANS = auto-translation
      OTHER = translation of other interlocutor's utterance(s)
      SITUA = translation of a situation
   (11) TYPE OF DATA:
      SPONTANEOUS = from naturalistic data
      EXPERIMENTAL = from NI experimental tests
   (12) DIRECTION/SETTING:
      1 = to the community language in a
monolingual setting
      2 = to the minority language in a monolingual setting
      3 = to one of the languages in a bilingual setting
      4 = to the other language in a bilingual setting
   (13) COMMUNICATIVE STRATEGY:
      OPOL(M) = One Parent, One Language (in a Monolingual community)
      BMI = Bilingual-Monolingual Interaction
      MLH = Minority Language at Home
      OPOL(B) = One Parent, One Language (in a Bilingual community)
[*Numbering coincides with that in the database]


4. DATA

4.1. Raw data
- NI_Annotations_Ronjat_Leopold_Swain_Lanza.txt: it contains the NI cases produced by the bilingual children in Ronjat's (1913), Leopold's (1939-1949), Swain's (1972) and Lanza's (1988, 1997, 2001) studies.
- NI_Cossato.txt: it contains the NI cases produced by the bilingual children in Cossato's (2008) MA dissertation.
- NI_Deuchar.txt, NI_FerFuLice.txt, NI_Genesee (GNP).txt, NI_Perez-Bazan.txt, NI_Ticio.txt, and NI_Vila.txt: each of these files contains the NI cases produced by the bilingual child(ren) of each of the corpora available in CHILDES [see section 3.1.1].

4.2. Database
NI_Database.csv: it contains the raw data with all the information related to the dataset, organized according to 4 different types of variables: identifying, demographic, linguistic and contextual [see section 3.3]; number of variables = 16; number of rows = 756. The NI cases compiled were analyzed using Microsoft Excel (updated as Microsoft Excel for Mac, version 16.49 (21050901)).

4.3. Last update: 2019


5. RELATED DATASETS
- Bilingual acquisition data: longitudinal corpus_FerFuLice dataset (https://uvadoc.uva.es/handle/10324/50964)
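
NOTE ON AGE NOTATION AND VARIABLE CODES
The age ranges in section 3.1.1 follow the CHILDES convention years;months (optionally years;months.days), and several variables in section 3.3 are stored as short codes. The Python sketch below is only an illustration of how these could be handled when working with the data: it converts an age string into whole months and expands a few of the codes listed in section 3.3. The function and dictionary names are hypothetical and are not part of the dataset, and the sketch does not assume any particular column names in NI_Database.csv.

```python
import re
from typing import Tuple

# Code tables copied from section 3.3 of this README.
# The dictionary names are illustrative, not part of the dataset.
ACTIVITY = {"COMP": "complete NI", "INCOMP": "incomplete NI", "NULL": "null NI"}
STIMULUS = {"IN": "induced", "OI": "own initiative"}
LANGUAGES = {
    "FR": "French", "GER": "German", "EN": "English", "SP": "Spanish",
    "NOR": "Norwegian", "CAT": "Catalan", "SWE": "Swedish",
    "HUN": "Hungarian", "IT": "Italian",
}

# CHILDES ages look like "1;06" or "2;06.15" (years;months.days).
_AGE = re.compile(r"^(\d+);(\d{1,2})(?:\.(\d{1,2}))?$")


def age_in_months(age: str) -> int:
    """Convert a CHILDES age string to whole months (days are ignored)."""
    match = _AGE.match(age.strip())
    if match is None:
        raise ValueError(f"not a CHILDES age: {age!r}")
    years, months = int(match.group(1)), int(match.group(2))
    if months > 11:
        raise ValueError(f"months out of range in {age!r}")
    return years * 12 + months


def age_range_in_months(span: str) -> Tuple[int, int]:
    """Convert a range such as '1;01-6;11' to (start, end) in months."""
    start, end = span.split("-")
    return age_in_months(start), age_in_months(end)


if __name__ == "__main__":
    # FerFuLice age range as listed in section 3.1.1.
    print(age_range_in_months("1;01-6;11"))
    print(ACTIVITY["INCOMP"], "|", LANGUAGES["CAT"])
```

Working in months makes it straightforward to compare children across corpora, since the age ranges in section 3.1.1 differ considerably from one corpus to another.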