The transcription of COSER materials reflects three types of information, each with its own conventions.

Orthographic transcription

COSER is a corpus oriented towards the study of morphosyntactic variation, so the materials are generally transcribed following the conventions of standard orthography. Although a few concessions are made to dialectal pronunciation, in no case can the COSER transcriptions be considered a reflection of the phonetic or phonological variation of the varieties of peninsular Spanish. To draw conclusions or comparisons of this kind, it is necessary to consult the sound files directly; these are available to users on the web together with the corresponding transcriptions.

1. Punctuation

Punctuation marks (. / , / ; / : /…/ ¿? / ¡!) are used in accordance with the general rules of Spanish punctuation.
The ellipsis (… or ...) is mainly used when a sentence or phrase remains unfinished or when there is a short pause within a sentence, sometimes with repetition of the same word. If the word is repeated, a comma (,) is added after the ellipsis. For example:

I1: No, mujer es que como tengo aquí el… Se hace una masa de, de, de todos esos ingredientes y después se meten en las tripas del cerdo, y se cuecen una hora o hora y media, y luego se fríen. Y nada más… y luego, pues al otro día se estaca al marrano, pues despedazarle todo y se aparta por un lado el…, el magro y por otro lado los jamones, el lomo, con el magro se hacen los chorizos, y nada más.
E1: Y el magro, ¿se pica o…?
I1: Se pica, claro, para hacer el…
E1: ¿Cómo, cómo es eso?

If there are noticeable pauses, these are usually indicated using the list of conversation tags ("Pauses and silence").
All interventions begin with a capital letter. However, when there is an overlap at the end of a question, the informant's or interviewer's intervention should begin with a lower-case letter. For example:

E1: Y ¿usó algún banco o una mesa [HS:I1 Bueno, sí,]?
I1: una mesa para echar el cerdo y luego pues tres o cuatro hombres, pues se le tenía a fuerza, no había que atarle ni nada, era solo pues la fuerza de los hombres.

Simultaneous speech interventions have their own punctuation. For example:

I1: Pues la matanza llega el mes de… diciembre [HS:E1 Por diciembre.] y normalmente pues entonces…
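The overlap notation can be separated from the main turn mechanically. The following is a minimal sketch, assuming the tag always has the shape [HS:speaker speech] seen in the examples above; the function name split_overlaps and the speaker pattern (one capital letter plus digits) are illustrative assumptions, not part of any COSER tool.

```python
import re

# Assumed tag shape, inferred from the examples: [HS:<speaker> <overlapping speech>]
OVERLAP_TAG = re.compile(r"\[HS:(?P<speaker>[A-Z]\d+)\s+(?P<speech>[^\]]*)\]")

def split_overlaps(turn):
    """Return the turn without overlap tags, plus a list of (speaker, speech) pairs."""
    overlaps = [
        (m.group("speaker"), m.group("speech").strip())
        for m in OVERLAP_TAG.finditer(turn)
    ]
    main = OVERLAP_TAG.sub("", turn)
    # Removing a tag leaves a double space behind; collapse it.
    main = re.sub(r"\s+", " ", main).strip()
    return main, overlaps

example = ("Pues la matanza llega el mes de… diciembre "
           "[HS:E1 Por diciembre.] y normalmente pues entonces…")
main_text, overlaps = split_overlaps(example)
```

Because the overlapping speech carries its own punctuation, it is returned unchanged rather than merged back into the main turn.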

Words that the informant leaves unfinished are marked with a hyphen (-) at the cut-off point. If a pause is perceived, an ellipsis may follow. When the informant abandons a sentence midway and changes the subject, a pipe symbol or vertical bar ( | ) is used. Most frequently, the informant rephrases or self-corrects within the same utterance; in these cases, a comma (,) is used. Examples:

Fui–, fuimos allí.
Fui–, marchamos allí.
y volver a echar agua con jabón otra vez, eh..., que esté fre–..., eh..., húmedo la ropa, y se le va la mancha, mejor que con lejía y con todo.
Recogíamos pri–..., en los caseríos se recogían.
en la casa, la tienda había levadura.
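For corpus queries it can be useful to locate these marks automatically. The sketch below is an illustration under stated assumptions: it accepts both the hyphen (-) and the en dash (–) as cut-off marks, since the examples above are printed with the latter, and it counts a vertical bar only when it stands between spaces. The function name disfluencies is hypothetical.

```python
import re

# A cut-off word: letters followed by a hyphen or en dash that is NOT
# followed by another letter (so compounds such as "franco-español" are skipped).
TRUNCATED = re.compile(r"(\w+)[-–](?!\w)")
# An abandoned sentence: a vertical bar standing between spaces.
ABANDONED = re.compile(r"\s\|\s")

def disfluencies(text):
    """Return the cut-off word fragments and the number of abandoned sentences."""
    return {
        "truncated": TRUNCATED.findall(text),
        "abandoned": len(ABANDONED.findall(text)),
    }

report = disfluencies("Recogíamos pri–..., en los caseríos se recogían.")
```

Self-corrections marked only with a comma cannot be told apart from ordinary commas by pattern matching, so this sketch does not attempt to detect them.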

The repetition of a word or sequence is always separated by a comma. If there is a short pause, an ellipsis followed by a comma is used. For example:

Estábamos en, en la casa.
Empecé con…, empecé con la tienda.
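The comma rule for repetitions lends itself to a simple consistency check. The following sketch, which is not part of the COSER workflow, flags a word that is immediately repeated without a comma in between. It only handles single-word repetitions, and the function name unpunctuated_repetitions is an assumption made for illustration.

```python
import re

# A word followed (in a lookahead) by a non-word separator and the same word again.
# The lookahead lets every word be compared with its successor, so "de, de, de"
# is checked pair by pair.
REPEAT = re.compile(r"\b(\w+)\b(?=(\W+)\1\b)", re.IGNORECASE)

def unpunctuated_repetitions(text):
    """Return the repeated words whose separator contains no comma."""
    return [m.group(1) for m in REPEAT.finditer(text) if "," not in m.group(2)]

ok = unpunctuated_repetitions("Estábamos en, en la casa.")
bad = unpunctuated_repetitions("Estábamos en en la casa.")
```

Any hit should be reviewed by hand, since a pattern of this kind cannot distinguish a disfluent repetition from a legitimate sequence of identical words.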

Direct speech quoted by the informant is transcribed in inverted commas. If the speech verb precedes the direct speech, it is followed by a colon, and the quotation opens with inverted commas and a capital letter. If the verb follows the direct speech, it is separated from it by a comma. For example:

I1: El cura me dijo: "Mañana os caso".
I1: "Mañana vendrá", dice.

When direct speech reflects the interventions of different speakers, without using speech verbs, these interventions are separated by a full stop and begin with a capital letter. Example:

I1: "Aintzane". "¿Qué?". "Tú qué | cómo… Tú, ¿quién crees que eres?". Me dijo: "¿Pues?". "Ninguna pasea y tú pa pasear un domingo por la tarde con el novio, tú, ¿quién crees que eres?". "Pues yo, Aintzane". Le dije: "Porque por las noches nos tienes controlaos y de día no podemos, a ver qué coño vamos a hacer", le dije.

2. Transcription of phonetic, phonological and morphophonological variants

The general rule is the use of conventional spelling, although certain concessions are made to dialectal phonetics, especially in reflecting the deletion, addition and metathesis of sounds. The substitution of sounds for other sounds is more problematic, since without a rigorous phonetic analysis it is not easy to determine which segment is used as a substitute, except in an impressionistic way. On the other hand, the graphic mimesis of seseo, ceceo, yeísmo or the glottalisation of consonants in syllabic coda would require an actual graphic metamorphosis of the transcribed texts, which is not the best option for the subsequent morphosyntactic annotation. For this reason, we have dispensed with transcribing those phonetic evolutions which imply a change of timbre in the vowels or processes of assimilation or relaxation (such as fricatisation, rhotacism or glottalisation) in the consonants.


If a vowel, consonant or syllable segment is deleted, either within a word or because of syntactic phonetics, it is not transcribed. When the deletion results from a vowel merger between adjacent words, it is marked with an apostrophe ('). For example:

comprao corresponds to "comprado"
comío corresponds to "comido"
comelo corresponds to "comerlo"
s’ha caído corresponds to "se ha caído"
corresponds to "está"
pa corresponds to "para"
to’l corresponds to "todo el"
to corresponds to "todo"
na corresponds to "nada"
sabís corresponds to "sabéis"
ventitrés corresponds to "veintitrés"
mu corresponds to "muy"
pa’l corresponds to "para el"
d’allí corresponds to "de allí"
onde corresponds to "donde"
ande corresponds to "adonde"
buenismo corresponds to "buenísimo"
ará corresponds to "arar" or "arada"
cazaó corresponds to "cazador"

Exceptionally, we will also transcribe pos ("pues") in this category.

When such a merger is of equal vowels and usually occurs in Spanish, the apostrophe is not used:

entre el
le echamos
de este
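Since reduced forms are transcribed as pronounced, a search for a standard form will miss them unless the query is expanded. The sketch below shows one way to map the reduced forms listed in this section back to their standard spellings. It is an illustration under stated assumptions, not a COSER resource: the names REDUCED_FORMS and expand_reduced_forms are hypothetical, the table holds only forms given above, and ande and ará are deliberately left out because the first coincides with a form of the verb andar and the second has two possible readings.

```python
import re

# Reduced form -> standard spelling, taken from the examples in this section.
# The lookup ignores context, so ambiguous forms (ande, ará) are omitted.
REDUCED_FORMS = {
    "comprao": "comprado",
    "comío": "comido",
    "comelo": "comerlo",
    "s’ha": "se ha",
    "pa": "para",
    "to’l": "todo el",
    "to": "todo",
    "na": "nada",
    "sabís": "sabéis",
    "ventitrés": "veintitrés",
    "mu": "muy",
    "pa’l": "para el",
    "d’allí": "de allí",
    "onde": "donde",
    "buenismo": "buenísimo",
    "cazaó": "cazador",
    "pos": "pues",
}

# A token is a run of letters, possibly containing the apostrophe used for mergers.
TOKEN = re.compile(r"[\w’]+")

def expand_reduced_forms(text):
    """Replace each listed reduced form with its standard spelling."""
    def replace(match):
        token = match.group(0)
        standard = REDUCED_FORMS.get(token.lower())
        if standard is None:
            return token
        # Keep a sentence-initial capital letter.
        if token[0].isupper():
            return standard[0].upper() + standard[1:]
        return standard
    return TOKEN.sub(replace, text)

expanded = expand_reduced_forms("Pa comelo to’l día")
```

A lookup of this kind is best used to widen searches rather than to rewrite transcriptions, because the reduced forms themselves are data for the study of variation.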

Segment replacement

In general, phonetic or phonological variations involving segment substitution are not reflected. This means that alterations in the habitual pronunciation of an unstressed vowel ([e, i], [o, u]) or of a consonant are not transcribed. The only exception is the alternation among unstressed [a, e, o]. A problematic point is the opening of vowels conditioned by the presence of a glottal consonant. According to the above criteria, this opening is not reflected, nor is the possible consequent loss of the glottal, because without a rigorous phonetic analysis it is difficult to affirm its absence or presence in the sequence.

['ehta] is transcribed esta.
['deɾ̼de] is transcribed desde.
['tɾ̼aθtor] is transcribed tractor.
['bɾ̼anka] is transcribed blanca.
[kaθa'ol] is transcribed cazaor.
['ɣweno] is transcribed bueno.
[a'ßuxa] is transcribed aguja.
['bɛjle] is transcribed baile.
[ko'mel] is transcribed comer.
[ßu'sotros] is transcribed vosotros.
[sus] is transcribed sos.
[kan'sau] is transcribed cansao.
[pi'seβ̞ɾ̼re] is transcribed pesebre.
['parako] is transcribed párraco 'párroco'.
[alper'ɣata] is transcribed alpergata 'alpargata'.
['poʝo] is transcribed pollo.
['kasa] is transcribed caza.
[θo'koro] is transcribed socorro.
[mu'ʃaʃo] or [mu'ʝaʝo] is transcribed muchacho.
[ohpi'tɛ] is transcribed hospital.
[lah 'kasɛ] is transcribed las casas.

Adding or changing the order of segments

When phonemes are added or their order is changed, the change is transcribed using conventional spelling.


When this addition involves a velar increment before the initial diphthong [we] or a palatal increment before the diphthong [je], the conventional spelling is used:

['ɣwerto] is transcribed huerto.
['ʝelo] is transcribed hielo.

Accent changes

A written accent mark is used to indicate certain dialectal stress patterns. Stressed possessives, as well as other words that require it, can be given a written accent, contrary to the conventional orthographic rules, to make the shift of stress explicit.

['aj] áhi
['maiθ] máiz
[sa'ßana] sabána
[pa'xaros] pajáros
[sen'tajka] sentáica "sentadica"

Morphological variations

The dialectal morphology is always reflected as it appears:

un quesu vs. unos quesos
bemos (for habemos)
béis (for habéis)

3. Other graphic conventions

Numerals are transcribed in letters, except in the case of years. For example:

I1: Tuve doce hijos.
I1: Me casé en el 48.

Words from other languages (Basque, Latin, English, etc.) are transcribed in italics.