Third Review: Text Corpora

INTRODUCTION:

The Corpus diacrónico del español (CORDE) is a textual corpus covering all the periods and places in which Spanish has been spoken, from the beginnings of the language up to 1974, the point at which the Corpus de referencia del español actual (CREA) takes over. The CORDE is designed for extracting information with which to study words and their meanings, as well as grammar and its use over time.

The CORDE dates back to 1994, when the Academy raised the possibility of applying new information technologies to create a data bank that would improve the quality of its working materials and make data access easier. It currently holds about 250 million records, which makes it the largest set of lexical records for the history of the Spanish language.

The corpus collects written texts of very different kinds. These are divided into prose and verse and, within each mode, into narrative, lyrical, dramatic, scientific-technical, historical, legal, religious, journalistic texts and so on. The aim is to cover all geographical, historical and genre varieties so that the whole is sufficiently representative.

Today, the CORDE is an essential tool for any diachronic study of the Spanish language. The Academy uses it systematically to document words, to classify some of them as old-fashioned or obsolete, and to establish the origin of terms, their tradition in the language and their first attested appearance.

But one of the most important objectives of the diachronic corpus is to serve as a basic material for the production of the Nuevo diccionario histórico.

TEXT ACQUISITION:

The texts that reach the CORDE come from diverse sources:

– Books scanned with an optical character recognition program.
– Other books obtained in electronic format.
– Texts typed in by hand, because no modern edition existed of certain works that were chosen for inclusion on account of the peculiarity of their language.

SIZE AND SELECTION CRITERIA:

http://www.rae.es/rae/gestores/gespub000019.nsf/(voAnexos)/arch475E744872738671C125716500381CF8/$FILE/TamanoycriteriosCORDE.htm

ENCODING:

A series of textual mark-ups has been added to all the materials processed in the CORDE. They follow the international SGML standard (Standard Generalized Markup Language) and the recommendations of the TEI (Text Encoding Initiative), which opens up many possibilities for information retrieval and makes it possible to exchange texts with other corpora.

The diachronic corpus includes texts in verse; for these, a set of tags has been selected that captures their basic features.

Textual features such as preliminary pieces, tasas (official price certificates), censorship notes, approvals, licenses and the involvement of several authors have been marked with specific tags, which make it possible to distinguish the main author from the other contributors.

MAINTENANCE AND CURRENT STATE:

The new version of the CORDE contains 250 million word forms from texts of every period in the history of the Spanish language up to 1974. It increases the volume of texts that can be consulted: new works have been included and others have been completed.

However, this influx of new works entails a great deal of revision, including the replacement of previously included editions with more up-to-date ones. Detected errors must also be corrected, which requires constant work.

The query system has three main windows. The first deals with building the query profile: it has a field for typing the word we are looking for and a set of selection criteria that make it easier to pick out a documentary subset of the corpus dynamically.

EXAMPLE WITH THE WORD “NACIÓN”:

The results offer statistical information about the query and make it possible to set filters that reduce the number of documents and examples, in case the number of documents exceeds the limits or is excessive for the user's purposes. As an example, I have looked up the word “nación”. The first thing the system reports is “13097 casos en 1867 documentos”.

If you click on “Ver Estadística”, some basic statistical data about the query appear in a general view that is very useful for identifying the scope, thematic distribution and chronological distribution of the examples. Charts show the number of cases and their percentages, classified by subject, date or geography.

As we can see, the term “nación” appears most in documents of “historical prose”. Most documents containing the word “nación” are from the year 1820 (9502 cases) and most of the texts are from Spain.
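To put these figures in proportion, here is a small sketch in Python that works out what share of all the cases falls in 1820 and how many cases there are per document on average. It only uses the three numbers quoted above; the variable and function names are my own and have nothing to do with the CORDE interface.

```python
# Figures quoted in the text for the query "nación" in the CORDE.
total_cases = 13097       # "13097 casos en 1867 documentos"
total_documents = 1867
cases_in_1820 = 9502      # cases dated 1820


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


pct_1820 = share(cases_in_1820, total_cases)
cases_per_document = total_cases / total_documents

print(f"Share of cases dated 1820: {pct_1820:.1f}%")
print(f"Average cases per document: {cases_per_document:.1f}")
```

Roughly seven out of every ten cases come from a single year, which is why the chronological chart is so lopsided.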

This makes a lot of sense, mainly for these reasons:

  1. The work from which most of the examples come is “Satiras y panfletos del Trienio Constitucional (1820-1823)”.
  2. The “Trienio Liberal” or “Trienio Constitucional” took place in precisely those three years.
  3. It was the reign of Fernando VII, “El Deseado”.
  4. On 1 January 1820, the “pronunciamiento” of Colonel Rafael del Riego took place in the Sevillian town of Las Cabezas de San Juan.
  5. Although he had little success at first, Riego immediately proclaimed the restoration of the Cádiz Constitution (1812, “La Pepa”) and the re-establishment of the constitutional authorities.
  6. Support for the military coup grew stronger and kept the uprising going until March 10.
  7. On that date, Fernando VII published a manifesto accepting the Cádiz Constitution, which established a parliamentary monarchy.

This was therefore a date of great importance, and it is no wonder that the word appears so often in documents of that time.

As mentioned before, the documents can be seen in full (normal) or in a summarized version (resumido), depending on the objectives of the researcher. If you want more precise results, you can always fill in “Agrupación” and “Marcas”.

Examples can be retrieved under several classifications: we can sort the word or expression by cases, authors, year, country, subject or title.

If we click on “Recuperar”, in the “Obtención de Ejemplos” section, we see the first page of results for documents containing the word “nación”. As indicated above the chart, however, this is only the first of 38 pages of results. The first document is anonymous, from the year 1910: the Spanish work “Solidaridad Obrera. Periódico sindicalista, 4 de noviembre de 1910”.

If we select some of the results above and press the option “Concordancias”, some examples of the uses of the word “nación” will show up with the reference of the work that the fragments belong to and the year:

If we take another corpus, for example the British National Corpus, we will see that it differs from the CORDE in some respects. I find more disadvantages in the BNC than in the CORDE.

ABOUT THE BNC:

Firstly, it shows no statistical charts, which are very useful for viewing the search term as a whole. Secondly, the BNC shows the information at random and in no particular order, which makes research more complicated and less accurate.

The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English from the later part of the 20th century, both spoken and written.

The written part of the BNC (90%) includes, for example, extracts from regional and national newspapers, specialist periodicals and journals for all ages and interests, academic books and popular fiction, published and unpublished letters and memoranda, school and university essays, among many other kinds of text.

The spoken part (10%) consists of orthographic transcriptions of unscripted informal conversations (recorded by volunteers selected from different age, region and social classes in a demographically balanced way) and spoken language collected in different contexts, ranging from formal business or government meetings to radio shows and phone-ins.

PURPOSES OF THE BNC

The purpose of a language corpus is to provide language workers with evidence of how language is really used, evidence that can then be used to inform and substantiate individual theories about what words might or should mean. Traditional grammars and dictionaries tell us what a word ought to mean, but only experience can tell us what a word is used to mean. This is why dictionary publishers, grammar writers, language teachers, and developers of natural language processing software alike have been turning to corpus evidence as a means of extending and organizing that experience.


SELECTION CRITERIA

Domain: The domain of a text indicates the kind of writing it contains.

• 75% of the written texts were to be chosen from informative writing, in roughly equal quantities from the fields of applied sciences, arts, belief & thought, commerce & finance, leisure, natural & pure science, social science and world affairs.

• 25% of the written texts were to be imaginative, that is, literary and creative works.

Medium: The medium of a text indicates the kind of publication in which it occurs. The classification used is quite broad.

• 60% of written texts were to be books.

• 25% were to be periodicals (newspapers, etc.).

• Between 5 and 10% should come from other kinds of miscellaneous published material (brochures, advertising leaflets, etc.).

• Between 5 and 10% should come from unpublished written material such as personal letters and diaries, essays and memoranda, etc.

• A small amount (less than 5%) should come from material written to be spoken (for example, political speeches, play texts, broadcast scripts, etc.).
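To get a feeling for what these percentages mean in practice, the following Python sketch turns them into approximate word counts. It rests on an assumption of mine: the criteria above are stated as shares of texts, and I apply them to the 100 million words as if they were shares of words, so the results are only rough orders of magnitude. I also leave out the categories given as ranges (5 to 10%, less than 5%). All names in the code are my own.

```python
TOTAL_WORDS = 100_000_000   # overall size of the BNC quoted above
WRITTEN_SHARE = 0.90        # the written part; the remaining 10% is spoken

# Design targets for the written part, taken from the lists above.
# Assumption: the shares of texts are treated as shares of words.
DOMAIN_TARGETS = {"informative": 0.75, "imaginative": 0.25}
MEDIUM_TARGETS = {"books": 0.60, "periodicals": 0.25}


def word_targets(total: int, shares: dict[str, float]) -> dict[str, int]:
    """Convert percentage targets into approximate word counts."""
    return {name: round(total * fraction) for name, fraction in shares.items()}


written_words = round(TOTAL_WORDS * WRITTEN_SHARE)
spoken_words = TOTAL_WORDS - written_words
by_domain = word_targets(written_words, DOMAIN_TARGETS)
by_medium = word_targets(written_words, MEDIUM_TARGETS)

print(f"Written: {written_words:,} words, spoken: {spoken_words:,} words")
for name, words in {**by_domain, **by_medium}.items():
    print(f"{name}: about {words:,} words")
```

Under that assumption, books alone would account for more than half of the whole corpus.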

LOOKING FOR EXAMPLES IN THE BNC

The corpus gives a random selection of 50 solutions from among all the results for “nation”. Unlike the CORDE, it does not show any statistical charts and it does not give the option to specify authors or dates. You just enter a word or phrase.

Searching the corpus

CONCLUSION

I did not find any relevant information about the term “nation” in the BNC, because the results are shown at random and are not organized chronologically. The first result was from the book “The Tragedy of Belief”, by John Fulton, about which I found nothing relevant apart from the fact that it is a text about Irish politics from the year 1991. The CORDE, by contrast, allowed me to carry out fairly thorough research on the term “nación” and showed me why results for the term are so abundant in the year 1820.


Second Review: Urban Dictionary

URBAN DICTIONARY

Some days ago, I was asked to do a translation from English to Spanish. The text I was given was The Race at Left Bower, by Ambrose Bierce. Since Bierce is an author that we have been studying recently in our American Literature course, I thought I wouldn’t find it particularly tough to translate into Spanish. However, as I began to read the first paragraph, I totally changed my mind:

“It’s all very well fer (for) you Britishers to go assin’ (assing) about the country tryin’ to strike the trail o’ (of) the mines you’ve salted down yer (your) loose carpital (capital) in,” said Colonel Jackhigh, setting his empty glass on the counter and wiping his lips with his coat sleeve; “but w’en it comes to hoss (horse) racin’, w’y I’ve got a cayuse ken lay over all the thurrerbreds (thoroughbreds) yer little mantel-ornyment (ornament) of a island ever panned out—bet yer britches (breeches) I have! Talk about yer Durby (derby) winners—w’y this pisen little beast o’ mine’ll take the bit in her teeth and show ’em (them) the way to the horizon like she was takin’ her mornin’ stroll and they was tryin’ to keep an eye on her to see she didn’t do herself an injury—that’s w’at she would! And she haint (hasn’t) never run a race with anything spryer’n (spryer than) an Injun in all her life; she’s a green amatoor, she is!”

Indeed, I found the text absolutely undecipherable at first sight and I did not know how to start, as I could not even fully understand the meaning of the first sentence. Nevertheless, by googling the words I didn’t know, I came across a very curious and useful online dictionary: the Urban Dictionary.

Example:

1. Hoss
buy hoss mugs, tshirts and magnets 914 up, 253 down

one who is a beast that can basically do anything he wants. He is usually loved by all and a ladies man. He could break anyone or anything in half. Hoss is a compliment.

Man, that Stefan is a hoss. He could kill a freshman with one stare down.

beast awesome hosse huge monster
by J Hunter Jan 9, 2006 share this

2. Hoss
buy hoss mugs, tshirts and magnets 951 up, 320 down

August 3, 2006 Urban Word of the Day

A southern colloquial nickname for partner, a term of friendship.

You betta’ get that grass mowed, hoss.

by halide Apr 18, 2003 share this

3. Hoss
buy hoss mugs, tshirts and magnets 522 up, 218 down

Pretty much, a badass.

You’re such a hoss for doing that.

badass gangsta hardass nukkah g
by Amanda&Arlan May 24, 2006 share this

  • As we can see, the first thing that appears under the word we were looking for is already an ad, followed by the rating that users have given to the quality of the definition. Then we are given the (not very elaborate) definition, followed by an example of a context where we could use the word. Finally, we can see the tags, author and date of the definition.
  • Sometimes, there are also images of the words we are looking for. Some of them are rather curious. For instance, if we look up the word “stupid”, the following image will appear under this title: “I’m stupid, I accidentally super glued my hand to the juicyfruit”.

Features of the dictionary:

  • Launched in 1999, Urban Dictionary currently has about four million definitions, and the number keeps growing.
  • It is an open tool, free, and anyone can participate and include the terms he/she wants. This is clearly an advantage, because we can be almost 100% sure that it will be the actual language of the streets.
  • The web page also offers a chat to meet other users, as well as a blog dedicated to public opinion.
  • It also includes a popular encyclopedia with biographies of celebrities, among other things.

How does it work?

  • The main page has got different tabs to browse: word of the day, dictionary, store, text me, add, edit, chat and blog.
  • Its method is very simple. If we wish to get an overview of the page, we can find the full alphabet at the top, which will help us look up the word we want. There is also an option to look up words at random.
  • If, on the contrary, we wish to find a certain word or term, we just have to type the word we are looking for and press the “search” button, and we will be shown all the existing definitions of the word.
  • Urban Dictionary is also available for iPhone.

Advantages:

  • It is a very useful tool to have fun and it serves as a complement to normal dictionaries if you come across a tricky expression like the above mentioned.
  • It offers information about colloquial expressions, idioms and even insults which do not appear in any common dictionary.
  • Quality is democratically regulated at two levels. Firstly, registered users vote to accept or reject recently submitted definitions, which appear in the dictionary only after receiving an acceptable share of votes. Secondly, any visitor can vote for the best definitions, and the definitions are displayed in descending order according to those votes.
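The second level of voting can be sketched in a few lines of Python using the three “hoss” entries quoted earlier. I do not know the formula the site really uses, so the score below (votes up minus votes down) is purely my assumption, and the class and field names are hypothetical. It does, however, reproduce the order in which the three definitions appeared on the page.

```python
from dataclasses import dataclass


@dataclass
class Definition:
    author: str
    text: str
    up: int
    down: int

    @property
    def net_score(self) -> int:
        # Assumed ranking criterion: up votes minus down votes.
        return self.up - self.down


# The three "hoss" entries quoted above, deliberately listed out of order.
definitions = [
    Definition("Amanda&Arlan", "Pretty much, a badass.", 522, 218),
    Definition("halide", "A southern colloquial nickname for partner, a term of friendship.", 951, 320),
    Definition("J Hunter", "one who is a beast that can basically do anything he wants.", 914, 253),
]

# Highest net score first, as on the results page.
ranked = sorted(definitions, key=lambda d: d.net_score, reverse=True)

for position, definition in enumerate(ranked, start=1):
    print(position, definition.author, definition.net_score)
```

Note that sorting by up votes alone would put the second definition (951 up) first, so the displayed order fits a net score better than a simple count of positive votes.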

Disadvantages:

  • The site is full of advertisements which may be annoying at times.
  • As anyone can participate and write entries to the dictionary, sometimes we have doubts about whether the information will be trustworthy.

To conclude, it is worth saying that, of all the resources available on the net, Urban Dictionary is one of the most ingenious dictionaries I have ever used. In fact, it’s no use learning a language in an academic way if we are not able to understand the jargon and slang that native speakers are familiar with. In this sense, Urban Dictionary can be of great interest and utility.

First Review: Machine Translation

FIRST TEXT (From English to Spanish) A NEWSPAPER ARTICLE

ORIGINAL VERSION:

Michael Joseph Jackson’s story was a quintessentially American tale of celebrity and excess that took him from musical boy wonder to global pop superstar to sad figure haunted by lawsuits, paparazzi and failed plastic surgery.

At the height of his career, he was indisputably the biggest star in the world; he has sold more than 750 million albums. He spent a lifetime surprising people, in recent years largely because of a surreal personal life, lurid legal scandals, serial plastic surgeries and erratic public behavior that have turned him — on his very best days — into the butt of late-night talk-show jokes and tabloid headlines. He died at age 50 in Los Angeles on June 25, 2009. Mr. Jackson’s death itself became an enormous spectacle. On television and on the Internet, tens of millions of people worldwide watched a memorial service at the Staples Center in Los Angeles.

TRANSLATION BY REVERSO:

La historia de Jackson de Michael Joseph era un cuento fundamentalmente americano de celebridad y exceso que lo tomó de la maravilla de muchacho musical a la superestrella de pequeña explosión global a la figura(al número) triste atormentada(frecuentada) por pleitos, paparazzi y falló la cirugía plástica.

A la altura de su carrera, él era irrefutablemente la estrella más grande en el mundo; él ha vendido más de 750 millones de álbumes. Él gastó(pasó) una vida que sorprende a la gente, en los últimos años en gran parte debido a una vida privada surrealista, escándalos espeluznantes legales, cirugías sucesivas plásticas y el comportamiento errático público que lo ha girado – durante sus días muy mejores – en el extremo de bromas de programa de entrevistas nocturnas y titulares () tabloides. Él murió a la edad de 50 años en Los Angeles el 25 de junio de 2009. La muerte de Jackson de Sr. sí mismo se hizo un enorme espectáculo. Por televisión y sobre la Internet, decenas de los millones de personas por todo el mundo miraron un funeral en el Centro de Grapas en Los Angeles.

MACHINE TRANSLATION ERRORS:

  1. Word Order
  • Michael Joseph Jackson’s story/La historia de Jackson de Michael Joseph. It seems that the translator has had a problem with the artist’s middle name “Joseph” and with the Saxon genitive and has placed the words in the wrong order.
  • Mr. Jackson’s death/La muerte de Jackson de Sr. We can see the same kind of error here, where the Saxon genitive refers to Mr. Jackson, but the machine does not know where to place the Mr when translating the text to Spanish.
  • Lurid legal scandals/Escándalos espeluznantes legales. Here, the translator has placed the adjective in a place that does not sound natural to a Spanish tongue. It should have written “espeluznantes escándalos legales” or even “escándalos legales espeluznantes” instead.
  • Serial plastic surgeries/Cirugías sucesivas plásticas. We find the exact same word order problem here, where the adjective “sucesivas” should go before or after “cirugías plásticas”, but never in the middle.

2. Verbal mistakes

  • Was/era. The machine translates the verb “was” into Spanish as “era”, that is, it puts it in the “pretérito imperfecto” when it should be in the “pretérito perfecto simple”: English does not distinguish between these two tenses, but Spanish does.
  • Took/tomó. In the context of the text we are analysing, the verb “take” should not be translated as “tomar”, but as “llevar” or “transformar”. This problem is very common regarding machine translation, because polysemic words have more than one meaning and the machine is unable to identify the meaning and, thus, the required word.
  • Failed plastic surgery/falló la cirugía plástica. In this case, the computer has translated the adjective “failed” into a verb, as if it were the past tense of “fail”, without noticing it is really an adjective in this sense.
  • Became/se hizo. It would sound better if the verb “become” was translated as “convertir”.
  • Watched/miraron. “Mirar” and “ver” are verbs with similar meanings, but not exactly the same. Each of them is used in a particular context, and here it should be “ver”. No miras un funeral, lo ves.

3. Vocabulary problems

  • Pop/explosión. Instead of the musical genre, curiously, the word “pop” has been translated as an explosion.
  • Lawsuits/pleitos. This translation is not completely wrong, but it would be more natural to put it as “demanda judicial” than as “pleito” or “juicio”.
  • At the height of his career/a la altura de su carrera. “At the height” is an expression that cannot be translated word by word or literally. It means at the peak of something, at the highest point of something.
  • Erratic/errático. “Erratic” in the text refers to Jackson’s temperament, so the best way to put it would probably be “imprevisible” or “voluble”, not as “errático”, which sounds much more formal and not everybody would get the meaning.
  • Into the butt of/en el extremo de: “Into the butt of something” in English is an idiomatic expression meaning “el blanco de algo”, but here it translates “butt” as extremo, just as it could have put any other meaning of the word: tonel, colilla, trasero…
  • Plastic surgeries/cirugías plásticas: In Spanish, people do not pluralize the word “cirugía”. To pluralize it, we would use “operaciones quirúrgicas”, not “cirugías”.

4. Article Placement

  • From musical boy wonder to global pop superstar/de la maravilla de muchacho musical a la superestrella. Here, the translator has included articles in a sentence that does not need them, as it is a set expression with no need of articles. It should be “de maravilla de muchacho a superestrella”.
  • He died at the age of 50/Él murió a la edad de 50 años. While in English the subject is necessary, in Spanish we can omit it, because repeating the subject pronoun again and again would sound quite repetitive. Thus, we would simply say: “Murió a la edad de 50 años”.
  • The Internet/la Internet. The machine has translated the article “the” as feminine, which is absolutely wrong in this case, because the word “Internet” in Spanish is used with no article preceding it.
  • Tens of millions of people/Decenas de los millones de personas. Here, as in English, “millions of people” should go without an article, as it is indefinite. Therefore, the correct form would be “decenas de millones de personas”.
  • Watched a memorial service/miraron un funeral. Although the English article “a” would, theoretically at least, be translated into Spanish as “un”, it does not sound right to a native Spanish speaker, for we know whose memorial service it is, and it should therefore be “el funeral”.

5. Gender Agreement

  • Mr. Jackson’s death itself/la muerte de Jackson de Sr. sí mismo. The machine has not found the reference connection between “death” and “itself” and has translated “sí mismo” as masculine. The pronoun “itself”, however, refers to “death”, so in Spanish it should go in feminine, as “muerte” is feminine. So the correct way of writing this would be “la muerte de Jackson en sí misma”.

6. Prepositions

  • On television and on the Internet/Por televisión y sobre la Internet. In Spanish, we say “en Internet”, not “sobre”. Probably we would translate the preposition “on” as “sobre” in other contexts and other meanings of the word, but not in this one.

7. Proper Names

Proper names should never be translated to Spanish word by word, for they would lose their sense.

  • The Staples Center is, thus, to be translated as El Staples Center, or maybe El Centro Staples, but that is all. By no means should we write El Centro de Grapas, logically.

MY TRANSLATION:

La historia de Michael Joseph Jackson fue un cuento esencialmente americano de celebridad y exceso que le llevó de niño prodigio musical a superestrella del pop, y de superestrella a una triste figura atormentada por demandas judiciales, paparazzi y cirugía plástica fallida.

En la cima de su carrera, era indiscutiblemente la mayor estrella del mundo; ha vendido más de 750 millones de álbumes. Se pasó la vida sorprendiendo a la gente, en los últimos años en gran parte debido a una vida privada surrealista, a escándalos legales espeluznantes, constantes operaciones de cirugía y a una conducta pública imprevisible que le convirtieron -en sus mejores días- en el blanco de chistes en programas de entrevistas nocturnos y titulares de tabloides. Murió a la edad de 50 en Los Ángeles el 25 de junio de 2009. La muerte de Jackson en sí se convirtió en un enorme espectáculo. Por la tele y en Internet, decenas de millones de personas en todo el mundo vieron el funeral en el Staples Center de Los Ángeles.

SECOND TEXT (From English to Spanish)  A RECIPE

ORIGINAL VERSION:

BASIC APPLE PIE RECIPE

8 servings

This is my mom’s recipe for apple pie (I’ve even successfully made it a few times!) You can make the crust or you can use a premade one.

CRUST (recipe makes one double crust):

2 1/2 cups white flour
2 tbsp. sugar
1/4 tsp. salt
1/2 cup cold butter, broken into small pieces
5 tbsp. cold vegetable shortening
8 tbsp. ice water

  • Measure the flour, sugar and salt together. Stir to combine.
  • Add the chilled butter pieces and shortening to the bowl. Cut them in with a pastry cutter or knife. Don’t over mix them.
  • Add the ice water. Mix until the dough holds together (add a bit more water, if necessary).
  • Turn the dough onto a lightly floured surface, knead it together, then divide in half.
  • Flatten each half into a disk, wrap in saran wrap and chill for at least half an hour.
  • Roll out one of the disks on a lightly floured surface until you have a circle that’s about 12 inches in diameter.
  • Put the circle in a 9″ pie plate, trimming any extra dough from the edges with a sharp knife (parents only). Return it to the refrigerator until you are ready to make the pie.
  • Add filling (see below)
  • Roll out the second ball of dough and cover top. Use a fork or your fingers to pinch the edges together. Cut a couple slits in the top.

TRANSLATION BY REVERSO:

RECETA DE PASTEL DE MANZANAS de lenguaje BASIC

8 porciones

¡Esto es la receta de mi mamá para el pastel de manzanas (aún satisfactoriamente lo he hecho unas veces!) Usted puede hacer la corteza o usted puede usar uno prehecho.

La CORTEZA (la receta hace una doble corteza):

2 tazas 1/2 harina blanca
2 azúcar tbsp.
1/4 tsp. sal
1/2 mantequilla de frío de taza, rota en pequeños pedazos
5 mantequilla tbsp. fría de verduras
8 agua tbsp. de hielo

  • Medir la harina, el azúcar y la sal juntos. Movimiento para combinarse.
  • Añadir los pedazos de mantequilla enfriados y acortando al tazón. Córtelos en con un cortador de pastel o el cuchillo. No haga sobre los mezclan.
  • Añadir el agua de hielo. La mezcla hasta () la masa sostiene juntos (añada un poco más () agua, si fuera necesario).
  • Girar la masa en un ligeramente floured la superficie, amáselo juntos, luego parta() por la mitad.
  • Aplanar cada mitad en un disco, el abrigo en el abrigo de saran y el enfriamiento durante al menos media hora.
  • Estirar uno de los discos sobre un ligeramente floured la superficie hasta que usted tenga un círculo esto es aproximadamente 12 pulgadas en el diámetro.
  • Poner el círculo en los 9 ” el plato(la placa) de tarta, ajustando cualquier masa suplementaria de los bordes con un cuchillo agudo (padres sólo). Devuélvalo al refrigerador hasta que usted esté listo a hacer la tarta.
  • Añadir el relleno (mirar debajo).
  • Estirar la segunda pelota de masa y cubrir la cima. Use un tenedor o sus dedos para pellizcar los bordes juntos. Corte una pareja corta en la cima.

MACHINE TRANSLATION ERRORS:

1. Word Order

  • I’ve even successfully made it a few times! / aún satisfactoriamente lo he hecho unas veces! I think that the problem here lies in the original sentence. It should be: Even I have successfully made it a few times! That is why the placement of the adverb sounds a little bit funny when translated.

2.  Verbal Mistakes

  • Broken into small pieces / rota en pequeños pedazos. Referring to butter pieces, we would never say that butter is “broken” in Spanish, so we would rather use “cortada” or “partida” or “troceada”.
  • Stir to combine / Movimiento para combinarse. Here, the verb “stir” has been translated into Spanish as a noun, when in this case it is an instruction in the imperative. So it should be “mover para combinar”. And, rather than “combinar”, “mezclar” would be better.
  • Shortening to the bowl / acortando al tazón. Here, instead of translating “shortening” as “manteca”, the translator has put the verb “shorten”, that is, “acortar”,  in gerund.
  • The dough holds together / la masa sostiene juntos. The verb “holds” in Spanish should not go in present, but in subjunctive, which in English is the same but in Spanish should be “sostenga”. Moreover, “hold together” is a phrasal verb which I would put in Spanish as “solidificar”.
  • A circle that’s about 12 inches / un círculo esto es aproximadamente 12 pulgadas. In this case, instead of putting “that” as a relative clause, it has translated it as the connector “esto es”.
  • Chill for at least half an hour / El enfriamiento durante al menos media hora. Once again, “chill” has been translated as a noun instead of as an imperative verb.
  • Return it to the refrigerator / Devuélvalo al refrigerador. “Return” in this sense would be better translated as “volver a meter”.
  • Pinch the edges together / pellizcar los bordes juntos. The verb “pinch” alone means “pellizco”, but when it comes to a recipe, it means “juntar”.

3. Vocabulary Problems

  • Ice water / agua de hielo. I would translate it as agua fría.
  • Wrap in saran wrap / el abrigo en el abrigo. The machine translator doesn’t know that “saran wrap” is a trademark for a plastic film used to wrap food. And, again, it has not noticed that the first “wrap” is a verb in this context, not a noun. Moreover, wrap in this sense is related to cooking, not to clothing.
  • Sharp knife / cuchillo agudo. “Sharp” in this sense should be translated as “afilado”.
  • Top / cima. Again, in this sense “top” doesn’t refer to the peak of a mountain, but to a pie, so I would put it as “superficie”.
  • Couple slits / pareja corta. “Couple” here refers to “a pair of something” and slits are cuts. So the correct form would be “un par de cortes”.

4.  Article Placement

  • Cup cold butter / mantequilla de frío de taza. Here, the lack of prepositions in the list of ingredients leads the machine to confuse the word order and the prepositions. It should be “una taza de mantequilla fría”.
  • Don’t overmix them / No haga sobre los mezclan. Here, the machine has taken “over” as a preposition apart, without knowing that it accompanies the verb, meaning “no los mezcle demasiado”.
  • Until the dough holds together / hasta () la masa sostiene. I have put those parentheses to indicate there’s a lack of something after the preposition “hasta”; a “que” is missing so that the sentence makes sense, as it is in subjunctive.
  • Until you are ready to / hasta que usted esté listo a. Curiously enough, the personal pronoun has been very well translated into the courtesy pronoun “usted”; however, in the same way that “ready” is followed by “to”, the word “listo” in Spanish is followed by “para”, not by “a”.

5.  Gender Agreement

  • Knead it together / amáselo junto. “It” here refers to the dough (“la masa”), which is feminine in Spanish. Therefore, it should be “amásela junta”.
  • Divide it in half / pártalo por la mitad. In the same way, here it should be “pártala”, in the feminine, referring to the dough once again.

6. Untranslated Items

  • Basic Apple Pie Recipe / receta de pastel de manzanas de lenguaje BASIC. The machine has apparently taken “Basic” for the name of the programming language BASIC instead of translating it as the adjective “básica”, which is rather strange.
  • The items tbsp and tsp have not been translated by the machine, probably because they are abbreviations, meaning “tablespoonful” and “teaspoonful”.
  • The machine didn’t translate “floured” either, probably because it is a form of a verb derived from the noun “flour”, which is far less frequent than the noun itself.

MY TRANSLATION:

RECETA BÁSICA PARA PASTEL DE MANZANA

8 raciones

Esta es la receta de mi madre para hacer pastel de manzana (¡yo misma he logrado hacerla con éxito un par de veces!). Puedes hacer la pasta tú mism@ o puedes usar una prehecha.

LA PASTA (La receta da indicaciones para hacer una pasta doble):

2,5 tazas de harina blanca
2 cucharadas grandes de azúcar
1/4 de cucharadita de sal
1/2 taza de mantequilla fría, troceada en pequeños pedazos
5 cucharadas grandes de manteca vegetal fría
8 cucharadas grandes de agua fría

  • Medir la harina, el azúcar y la sal. Remover para mezclar.
  • Añadir los trozos de mantequilla fría y la manteca en la fuente. Cortarlos con un cortador de masa o un cuchillo. No mezclar demasiado.
  • Añadir el agua fría. Mezclar hasta que la masa esté sólida (añadir un poquito más de agua si es necesario).
  • Hacer que la masa se vuelva una superficie ligeramente harinosa, amasarla y cortarla por la mitad.
  • Aplanar cada mitad en forma de disco, envolver en filme adherente y dejar enfriar durante media hora.
  • Extender con rodillo uno de los discos en una superficie ligeramente cubierta de harina hasta obtener un círculo de dos centímetros y medio de diámetro.
  • Poner el círculo en un plato de tarta de unos 23 centímetros, cortando la masa que sobra por los lados con un cuchillo afilado (sólo padres). Volver a meter en el frigorífico hasta que se esté preparado para hacer el pastel.
  • Añadir relleno (ver abajo).
  • Extender la segunda bola de masa y cubrir la parte de arriba. Usar un tenedor o los dedos para juntar los bordes. Hacer un par de cortes en la superficie.

THIRD TEXT (From Spanish to English) A LITERARY TEXT

ORIGINAL VERSION:

Hubo una vez una joven muy bella que no tenía padres, sino madrastra, una viuda impertinente con dos hijas a cual más fea. Era ella quien hacía los trabajos más duros de la casa y como sus vestidos estaban siempre tan manchados de ceniza, todos la llamaban Cenicienta.

Un día el Rey de aquel país anunció que iba a dar una gran fiesta a la que invitaba a todas las jóvenes casaderas del reino.

– Tú Cenicienta, no irás -dijo la madrastra-. Te quedarás en casa fregando el suelo y preparando la cena para cuando volvamos.

Llegó el día del baile y Cenicienta apesadumbrada vio partir a sus hermanastras hacia el Palacio Real. Cuando se encontró sola en la cocina no pudo reprimir sus sollozos.

– ¿Por qué seré tan desgraciada? -exclamó-. De pronto se le apareció su Hada Madrina.

– No te preocupes -exclamó el Hada-. Tu también podrás ir al baile, pero con una condición, que cuando el reloj de Palacio dé las doce campanadas tendrás que regresar sin falta. Y tocándola con su varita mágica la transformó en una maravillosa joven.

TRANSLATION BY TRADUKKA:

There was once a beautiful young woman who had no parents, but () stepmother, a widow with two daughters impertinent each more ugly. It was she who was the toughest jobs of the house as his clothes were always so stained () ash, everybody called her Cinderella.
One day the King of that country announced it would give a great feast, inviting all the young maidens of the kingdom.
– You Cinderella will not go, “said the stepmother. You’ll be at home scrubbing the floor and preparing dinner for when we return.
The day of dancing and was sorry Cinderella from her stepsisters to the Royal Palace. When she was alone in the kitchen could not suppress her sobs.
– Why should I be so unhappy? “He cried. Suddenly it appeared her Fairy Godmother.
– Do not worry, “said the Fairy. You too can go to the ball, but with one proviso, that when the clock struck twelve Palace will have to return without fail. And by touching it with his magic wand transformed () into a wonderful young man.

MACHINE TRANSLATION ERRORS:

1. Word Order

  • Una viuda impertinente con dos hijas a cual más fea / with two daughters impertinent each more ugly. Here, the adjective “impertinent” should describe the stepmother, and not the daughters.
  • De pronto, se le apareció su hada madrina / suddenly, it appeared her Fairy Godmother: The word order is clearly wrong, and the subject appears twice because of that “it”; the sentence should read “suddenly, her fairy godmother appeared”, without capital letters, by the way.

2. Verbal Mistakes

  • Era ella quien hacía los trabajos más duros / it was she who was the toughest jobs: The verb “hacer” has been translated as the verb “to be”, which makes no sense.
  • Fregando / scrubbing. “Fregar” should be translated as mop or wash, but “scrub” rather refers to washing with a brush.
  • Reprimir / suppress. For desires and impulses, the word here should be “repress”, rather than “suppress”, which refers to a rebellion or to a yawn.

3. Vocabulary Problems

  • Viuda / widow. Here “widow” is in fact the right word, since the stepmother is a woman; “widower” would refer to a man (“viudo”).
  • Condición / proviso. “Proviso” is a rather rare word in English, which could be replaced by “condition”, which sounds more natural.
  • Baile / dancing. In English, “baile” is translated as “ball” in literary contexts of that time.

4. Prepositions

  • Manchados de / stained. “Stained” here should be followed by the preposition “with” or “in”; the translator has probably taken the word as an adjective rather than as a verb.

5. Gender

  • Era ella quien / it was she who. Instead of “she”, we should put “her”, in the accusative case.
  • Ella / He. The machine sometimes confuses the masculine and feminine gender.
  • Seemingly, because the Spanish word “joven” does not specify whether it refers to a man or a woman, the machine translates it as “man”.

6. Articles

  • Sino madrastra / but () stepmother. In Spanish, it is possible to write “madrastra” without an article when the noun is indefinite, but in English it isn’t. We have to put the article “a” before the noun “stepmother”.

MY TRANSLATION:

There once was a very beautiful woman who had no parents, but a stepmother, an impertinent widow who had two daughters, each one uglier than the other. It was her who did the hardest housework and, as her dresses were always so stained with ash, everybody called her Cinderella.

One day, the king of that country announced that he was going to give a party to which he would invite all the maiden ladies of the kingdom.

‘You, Cinderella, will not go’, said the stepmother. ‘You will stay at home mopping the floor and preparing the dinner for us when we come back’.

The day of the ball arrived and Cinderella, sad, watched her stepsisters set off to the Royal Palace. Once she was left alone in the kitchen, she could not help sobbing.

‘Why should I be so unfortunate?’, she cried. Suddenly, her fairy godmother appeared.

‘Don’t worry’, said the fairy. ‘You can go to the ball, too, but on one condition; when the palace clock strikes twelve, you will have to come back without fail’. And, touching her with her magic wand, she turned her into a wonderful young girl.

CONCLUSION: As a brief conclusion, we should note that there are noticeable differences when translating texts from very different genres. We have not found the same kinds of mistakes in the translation of the newspaper article as in the recipe or the literary text.

SOURCES:

*Michael Jackson. The New York Times.

Retrieved 13:28, April 19, 2010, from:

http://topics.nytimes.com/top/reference/timestopics/people/j/michael_jackson/index.htm

*Basic Apple Pie Recipe. DLTK’S Growing Together.

Retrieved 17:10, April 19, 2010, from:

http://www.dltk-teach.com/alphabuddies/recipe/apple_pie_recipe.htm

*La Cenicienta.

Retrieved 17:12, April 19, 2010, from:

http://adigital.pntic.mec.es/~sanagust/cenicienta.htm

*Reverso Online Translator

Retrieved 17:10, April 19, 2010, from:

http://www.reverso.net/text_translation.asp?lang=EN

*Tradukka Online Translator

Retrieved 17:10, April 19, 2010, from:

http://tradukka.com/