Foreign (non English speaking) Members

Chris Soucy
December 31st, 2009, 11:46 PM
Or possible members.

As an idle, New Years Day thought (not that I have been, another hard day at the office, er, working in the garden, and please say if you've got here before me) it occurred to me just how many non English (as a first language) speaking members there are on DVinfo.

From there I got to wondering just how many members DVinfo actually does have in what countries across the planet?

I know you're up to about 26,000 odd (some of them very odd indeed. What, who, me?) but where are they?

From there it was but a small skip and jump to, hmm, there seems to be a big discrepancy in observed posters between the English speaking and non English speaking parts of the planet (You don't say!).

Now, considering the roaring success of the Google Search function added a while ago (brilliant idea, why didn't I think of it myself?) and as an aid to DVinfo's march to take over the known Universe on the topic of, er, DV, I thought, can DVinfo be married to Google's translation system and become almost universal?

You'all must be familiar with this:

Language Tools (http://www.google.co.nz/language_tools?q=studio+camera+controls&hl=en&rlz=1W1ADBS_en)

The question is, can it be married to IP addresses to automatically translate posts/searches for the destination country if required, or be switched off for a comfortable ESL* member?

I don't think you'll need to worry too much about the Elmer Fudd (that could grow on you tho'), Pirates, Bork etc, Hacker or Klingon translations, especially the latter, tho' you never know, there's some strange creatures lurking on this site.

The remainder (quite a few, actually) may well open a new door for DVinfo.

Nope, I have no idea how successful the system is, as the only languages I speak are Strine, Kiwi, Canuck and Pom, all of which seem to be mysteriously absent from the list - hmm, there's a conspiracy theory buried in there somewhere.

And no, this isn't a back door dig about regional codes (interesting subject that it is), just a complete "think out of the box" moment (and the first one to come up with "New Years Day = Out of Tree, not Box" wouldn't be too far off... er, ahem!).

Have a good one, Chris H. and all you DVinfo'ers.


CS


PS: This is not a drill.

PPS: *ESL = English as a Second Language

Paul Inglis
January 2nd, 2010, 05:53 AM
Well, in my blog stats I am seeing more and more different translators as well as Google. Chinese seems to be a big one.

I do believe that the European members (where English isn’t their primary language) on this site learnt to speak and write better than some English folks and don’t actually need a translator.

Happy New Year!

Chris Soucy
January 2nd, 2010, 04:06 PM
Funny you should mention about the standard of English (here), it did subsequently occur to me that this would be where the whole thing could fall at the first hurdle.

If I can't wade through the appalling spelling, grammar, txt-type shortcuts/truncations and "I know what I mean, but I'll let you guess" efforts and make some semblance of sense out of a post (which is about a 1 in 10 occurrence), what on earth would any non-human translator make of them?

Any bilingual members out there who'd like to try the experiment with a random sampling of posts and measure the hit rate?

By which I mean, how often a translated post actually makes sense in both languages. Use any translator that takes your fancy.

I'd ask for some input from non-English speakers, but if they don't...


CS

PS:

Now, here's a thought!

What say Chris Hurd write a banner post in his usual perfect English and have it translated into the top 5 non-English languages represented by the membership numbers (loosely). Something along the lines of "would you be interested in a (insert language here) version of "X" DVinfo content etc", and either let non-members post to that particular thread (hmm, maybe not) or send him an e-mail via the Contact Us button?

At least it would be testing the waters.

PPS:

Chinese has got to be worth a try. They're the most avid shutter bugs on the planet and what with the economic revolution there and all..........

PPPS:

Coo, look what I just found:

http://translate.google.co.nz/translate_tools?hl=en&layout=1&eotf=1

Don't half go out of their way to make it easy.

Good, but don't think it's going to hit the spot: going to get exceedingly tedious clicking the button for every page (if you're speaking foreign, that is), gives English readers a bit of a problem if same person actually posts in foreign, then there's the small problem of those 1,169,606 odd posts going back to the beginning of time to take into account.

Nope, think there's going to need to be entire "parallel universes" of DVinfo, with each new post in one being mapped to the others but translated "on the fly" and the historical stuff deciphered and stored in foreign for each one.

Think you might need to put in a bulk order for servers there Chris.

PPPPS:

While you're at it, as an aid to getting clearer posts all round, you could always park a Spell Checker/Grammar Corrector as options in the "make a post" page. Should be a doddle.

Bob Hart
January 2nd, 2010, 09:55 PM
About the quality of our spoken English as our first language and our slackitude in not learning a second: I got a wake-up call in 1994 when I went to Daruba, Morotai, in the backblocks of the Republic of Indonesia and there met a local middle school teacher who knew more about my language and spoke it more correctly than I did.

Chris Soucy
January 4th, 2010, 12:10 AM
Yeah, the English linguistic laziness is and has been a constant over the years, has to be something to do with the culture, I guess.

In the absence of any further movement or comment, I've been conducting some tests, this is my opening post in Swedish, courtesy of Google:

Utländska (icke engelskspråkiga) ledamöter samt eventuell members.As en sysslolös, tänkte Nyårsdagen (inte att jag har varit, en hård dag på jobbet, er, jobbar i trädgården, och säg om du har här förut mig) slog det mig hur många icke engelska (som första språk) talar ledamöterna finns på DVinfo.From det måste jag undrar bara hur många medlemmar DVinfo faktiskt har i vilka länder i hela världen? Jag vet att du är upp till ca 26.000 udda (vissa av dem mycket underligt. Vad, vem, jag?) Men var är de? Därifrån är det bara var en liten hoppa och hoppa till, hmm, det verkar vara en stor diskrepans i observerade affischer mellan engelsktalande och icke engelskspråkiga delar av planeten (Du behöver inte säga!). nu ser till den brusande framgång med Google Search funktion lagt ett tag sedan (lysande idé, varför jag inte tror av det själv?) och som ett stöd för DVinfo i mars att ta över universum på temat, er, DV, tänkte jag, kan DVinfo vara gift med Googles översättning systemet och blir nästan universell? You'all måste känna till detta: Språk ToolsThe fråga är, kan det vara gift med IP-adresser för att automatiskt översätta tjänster / söker efter destinationsländer om det behövs eller stängs av för en bekväm ESL * medlem? Jag tror inte du behöver oroa sig alltför mycket om den Elmer Mudd (som kan växa på dig Tho '), Pirates, Bork etc, Hacker eller Klingon översättningar, särskilt den sistnämnda, tho' man vet aldrig, det finns några underliga varelser som lurar på denna site.The återstoden (ganska många, faktiskt ) kan väl öppna en ny dörr för DVinfo.Nope, har jag ingen aning om hur bra systemet är, som enda språk jag talar är Strine, Kiwi, Canuck och Pom, som alla verkar vara mystiskt frånvarande från listan - Hmm , det finns en konspirationsteori begravd där inne någonstans. 
Och nej, detta är inte en bakdörr dig om regionala koder (intressant fråga som det är) bara är en komplett "tänka utanför boxen" ögonblick (och den första som kom upp med "New Years Day = Out of Tree, inte Box, skulle inte vara alltför långt borta .... er, ahem!). Have a good one, Chris H. och allt du DVinfo'ers. landstrategidokumenten: Detta är inte en drill.PPS: * ESL = engelska som andraspråk Senast redigerad av Chris Soucy, 1 januari, 2010 kl 06:58. Orsak: Whoops.

Russian:

Иностранные (Non Английский говоря) Члены или возможных members.As простоя, Новый год День мысли (не то чтобы я был, другой тяжелый день в офисе, ER, работая в саду, и, пожалуйста, говорить, если у вас есть здесь раньше меня) мне пришло в голову, сколько именно, не английский (в качестве первого языка), выступая Члены есть на DVinfo.From там я получил на интересно только, сколько членов DVinfo действительно имеет в каких странах по всей планете? Я знаю, ты до примерно 26000 нечетные (некоторые из них действительно очень странный. Что, Кто, я?), Но где они? Оттуда оно было небольшим, но пропустить и перейти к, гм, как представляется, будет большая расхождения наблюдаются в плакатах между английской речи и без Английский выступивший частях планеты (Вы не говорите!). Теперь, учитывая оглушительный успех в функцию поиска Google добавил некоторое время назад (блестящая идея, почему я не думаю, его себе?), а в качестве помощи по март DVinfo завладеть известной Вселенной на тему, ER, DV, я подумал, может быть DVinfo браке с системой перевода Googles и стал почти универсальным? You'all должны быть знакомы с это: Язык ToolsThe вопрос, можно ли в браке с IP адресов для автоматического перевода сообщений / поиска для стран назначения, если необходимо или выключены для комфортного ESL * Член? Я не думаю, что вам нужно слишком беспокоиться о Элмер фадд (что может вырасти на вас Tho '), "Пираты, BORK и т.д., Hacker или клингонский переводов, особенно последний, Tho' Вы никогда не знаете, есть какие-то странные существа скрываются на эту site.The остальная часть (довольно много, на самом деле ) вполне может открыть новую дверь для DVinfo.Nope, я понятия не имею, насколько успешно система является, как только я говорю Языки являются Strine, киви, канадец и ПОМ, все из которых, как представляется, таинственным отсутствуют в списке - Хмм , есть теории заговора похоронены где-то там. 
И нет, это не двери назад DIG о региональных кодов (интересно тем, что он является) просто полный "думать" из коробки "момент (и первый, кто пришел с" New Years Day Out = деревьев, Не сейф, будет не слишком далеко .... ER, гм!). имеющими хорошую, Крис Г. и все, что вам DVinfo'ers. КСО: Это не drill.PPS: * = ESL Английский как Второй язык Последняя редакция Крис Soucy; 1 января 2010 в 06:58 PM. Причина: Whoops.

French:

Étrangers (non anglophone) les membres ou members.As possible, un ralenti, New Years Day cru (non pas que j'ai eu, un autre dure journée au bureau, ER, travaillant dans le jardin, et dire s'il vous plaît si vous avez ici devant moi), il m'est apparu à quel point de nombreux non-anglophones (langue première) membres de langue il ya sur DVinfo.From là, je dois me demandais juste combien de membres DVinfo n'a réellement dans ce pays à travers la planète? Je sais que vous êtes à environ 26.000 impaires (certains d'entre eux très étrange. Quoi, Qui, moi?) Mais où sont-ils? De là, elle n'était qu'une petite sauter et sauter, hmm, il semble y avoir un gros un écart dans affiches observées entre les anglophones et non partie anglophone de la planète (Tu ne dis pas!). Or, étant donné le succès grondement de la fonction de recherche Google a ajouté il ya un certain temps (idée géniale, pourquoi n'ai-je pas penser de moi-même?) et comme aide à mars DVinfo de prendre en charge l'Univers connu sur le thème de, euh, DV, j'ai pensé, peut être marié à DVinfo Googles système de traduction et deviennent presque universel? you'all doivent être familiers avec celle-ci: ToolsThe est question de la langue, peut-il se marier à des adresses IP de traduire automatiquement des messages / recherches pour le pays de destination si nécessaire, ou désactivée pour un confort ESL * Membre? Je ne pense pas que vous aurez besoin de trop s'inquiéter la Fudd Elmer (qui pourrait croître sur vous Tho '), Pirates, Bork etc, Hacker ou traductions Klingon, surtout les derniers, quoique on ne sait jamais, il ya des créatures étranges, se cache sur ce reste site.The (pas mal, effectivement ) mai ainsi ouvrir une nouvelle porte pour DVinfo.Nope, je n'ai aucune idée de comment le système est réussie, comme les langues que je parle sont Strine, Kiwi, Canuck et POM, qui tous semblent être mystérieusement absent de la liste - hmm , il ya une théorie de la conspiration enterré quelque part. 
Et non, ce n'est pas une fouille porte d'en arrière sur les codes régionaux (intéressant sujet que ce soit) juste une complète "think out of the box" moment (et la première personne à arriver à "New Years Day = dehors de l'arbre, pas fort, ne serait pas trop loin .... euh, hum!). Have a good one, Chris H. et tout ce que vous DVinfo'ers. EFPC: Ce n'est pas une drill.PPS: * ESL = anglais comme a Second Language Last edited by Chris Soucy; Janvier 1st mai 2010 à 06:58 PM. Motif: Oups.

German:

Ausländische (nicht Englisch sprechenden) Mitglieder oder mögliche members.As eine müßige, dachte New Years Day (nicht, dass ich gewesen bin, einen anderen harten Tag im Büro, er, die Arbeit im Garten, und bitte sagen, wenn Sie hier mußt vor me) fiel mir ein, wie viele nicht Englisch (als erste Sprache), spricht Mitgliedern gibt es auf DVinfo.From Dort lernte ich fragen, wie viele Mitglieder dvinfo tatsächlich in welchen Ländern auf dem ganzen Planeten haben? Ich weiß, Sie sind bis zu etwa 26.000 ungerade (einige von ihnen sehr merkwürdig. Was, Wer, ich?), Aber wo sind sie? Von dort sind es aber eine kleine überspringen und direkt zum, hmm, es scheint eine große Diskrepanz zwischen den beobachteten Plakate Englisch sprechen und nicht englisch sprechenden Teile des Planeten (Was du nicht sagst!). Nun, angesichts der durchschlagender Erfolg der Google Search-Funktion hat vor einiger Zeit (geniale Idee, warum ich nicht glaube, es selbst?) und als Hilfsmittel zu marschieren dvinfo's über das bekannte Universum zu dem Thema, äh, DV nehmen, dachte ich, kann dvinfo zu Googles Übersetzungssystem verheiratet zu sein und werden fast universell? You'all müssen vertraut sein mit dazu: Sprache ToolsThe Frage ist, kann es zu IP-Adressen verheiratet werden, um automatisch Beiträge translate / Sucheinträge für die Länder Bestimmung, wenn erforderlich oder für einen komfortablen ESL * Mitglied ausgeschaltet? 
Ich glaube nicht, müssen Sie sich zu viel Sorgen machen die Elmer Fudd (das sich auf Sie tho 'wachsen zu können), Pirates, Bork etc, Hacker oder Klingonisch Übersetzungen, besonders die letztere, wiewohl man weiß ja nie, es gibt einige seltsame Kreaturen lauern auf dieser flagranter Rest (nicht wenige, tatsächlich ) kann auch eine neue Tür geöffnet für DVinfo.Nope, ich habe keine Ahnung, wie erfolgreich das System ist, wie die einzigen Sprachen Ich spreche Strine, Kiwi, Canuck und Pom, die alle scheinen auf geheimnisvolle Weise fehlen aus der Liste - hmm sind gibt es eine Verschwörungstheorie dort irgendwo begraben. Und nein, das ist keine Hintertür dig über regionale Codes (interessantes Thema, dass es sich handelt) nur ein komplettes "Think out of the box"-Moment (und der erste, der kommt mit "New Years Day = Out of Tree, Box nicht, wäre nicht zu weit weg .... äh, hm!). Have a good one, Chris H. und alles, was Sie DVinfo'ers. CSPS: Dies ist kein drill.PPS: * ESL = Englisch als Zweitsprache Zuletzt bearbeitet von Chris Soucy; 1. Januar 2010 um 06:58 Uhr. Begründung: Whoops.

Hindi (India):

(गैर अंग्रेजी भाषी) के सदस्यों या संभव members.As एक बेकार, नई सालों दिवस (मैं नहीं गया है कि एक और कड़ी दिन दफ्तर में, एर, बगीचे में काम कर रहे, विचार और कृपया विदेश कहना है अगर तुम यहाँ है पहले मुझे) यह कितने गैर अंग्रेजी मुझे हुआ (एक पहली भाषा के रूप में) बोल वहाँ सदस्यों DVinfo.From पर हैं, मैंने बस कितने सदस्य DVinfo वास्तव में ग्रह भर के देशों में क्या होता है सोच के लिए है? मैं जानता हूँ कि आप के बारे में 26,000 (उनमें से कुछ अजीब वास्तव में बहुत करीब तक रहे हैं. क्या, जो? मुझे) लेकिन कहाँ हैं? वे वहाँ से था, लेकिन एक छोटे से छोड़ और कूद, हम्म, लगता है वहाँ पर बहुत बड़ा अंग्रेजी भाषी तथा गैर अंग्रेजी भाषी ग्रह के भागों के बीच मनाया पोस्टर में विसंगति (आप कहो! नहीं है). अब, गूगल खोज समारोह का गरजना सफलता पर विचार कर थोड़ी देर (प्रतिभाशाली विचार, पहले कहा कि मैं क्यों नहीं किया था लगता है इसके बारे में? अपने आप को) और DVinfo मार्च के लिए एक सहायता के रूप में के विषय पर ज्ञात ब्रह्मांड, एर, डीवी पर लेने के लिए, मैंने सोचा, DVinfo Googles अनुवाद प्रणाली के लिए कर सकते हैं शादी हो गई और लगभग सार्वभौमिक? You'all से परिचित होना चाहिए इस: भाषा ToolsThe सवाल यह है आईपी पते से शादी कर सकते हैं स्वतः पदों अनुवाद / देशों गंतव्य यदि जरूरी हुआ तो या एक आरामदायक ESL सदस्य * के लिए बंद? मुझे नहीं लगता कि तुम बहुत अधिक के बारे में चिंता करने की आवश्यकता होगी के लिए खोजें Elmer फ़ड (कि तुम अगरचे पर जाना), आदि हैकर या क्लिंगन अनुवाद, विशेष रूप से बाद, यद्यपि पता है 'कभी तुम बोर्क समुद्री डाकू, वहाँ सकता है कुछ अजीब इस site.The शेष पर गुप्त जीव (बहुत कुछ, दरअसल ) अच्छी तरह से DVinfo.Nope के लिए एक नया द्वार खोल सकता है, मैं कोई कैसे सफल प्रणाली है पता नहीं है, के रूप में ही भाषाओं में मैं बात Strine, किवी, Canuck और पोम, जो सभी के लिए सूची से रहस्यमय तरीके से अनुपस्थित लग रहे हो - हम्म हैं , वहाँ एक साजिश कहीं वहाँ में दफन सिद्धांत है. और नहीं, यह क्षेत्रीय कोड के बारे में पीछे के दरवाजे खुदाई (दिलचस्प विषय है कि यह है) एक पूरा "बॉक्स पल" से बाहर लगता है (और पहली बार एक अप 'के साथ नए वर्ष = दिवस ट्री में से नहीं आ रहा है, बॉक्स नहीं, नहीं होगा भी दूर .... एर, अहम!). 
एक अच्छा, क्रिस एच. और सभी DVinfo'ers तुम ले लो. CSPS:: यह एक drill.PPS नहीं है * ESL = के रूप में अंग्रेजी एक दूसरी भाषा पिछले क्रिस Soucy द्वारा संपादित; 1 जनवरी 2010 06:58 पर. वजहः वूप्स.


Anyone want to comment on the accuracy?

This is getting a lot of hits, but very little input, anyone want to chime in?


CS

Noa Put
January 4th, 2010, 07:38 AM
Hi Chris,
I just translated your opening post to Dutch, and if I had read only the translated part I wouldn't have had a clue what you were talking about. :)
The Google translator puts many sentences in the wrong context, and the result is actually quite funny, but it hardly makes any sense to me. It often translates words very literally, and in my language the sentence then gets a totally different and weird meaning.
Google's translator is certainly not up to the task of making a good translation.

Chris Soucy
January 4th, 2010, 12:45 PM
So now we know.

Hmm, anyone got any better suggestions?

Or am I asking the impossible here?

If it can go that badly wrong with one of my less impenetrable posts, what hope for the average?

Back to the drawing board, I think.

Onwards and upwards.


CS

Colin McDonald
January 4th, 2010, 05:30 PM
I would anticipate that the automatic translations would fall apart on the more technical posts. Translating technical documents is an expert job.

I actually did go as far as to download the pdf manuals for my Canon XH-A1 in a number of languages with a view to using the translating tools to translate some of the more arcane sections into English, then see if they made any sense and compare them with the official Canon English version. But unfortunately the pdf manuals seemed to be locked and wouldn't let me copy bits of text to paste into the translator. I didn't have the time to type them in myself as I find typing unfamiliar languages accurately is a slow job.

Perhaps somebody else could have a shot at this as it might be an interesting experiment.

Chris Soucy
January 4th, 2010, 07:03 PM
You mentioned other translators.

Any of them have names? I'll do a Google on the subject but if you've got names it would sure make things easier.


CS

Aha, now this looks more like it:

http://www.systransoft.com/translation-products/server/server-editions-comparison

http://www.translationsoftware4u.com/systran-enterprise.php

Of course, had to check this out:

http://en.wikipedia.org/wiki/Machine_translation

Ervin Farkas
January 4th, 2010, 08:00 PM
Google translator is totally unusable! Here is a Hungarian translation of the first 4-5 paragraphs of the first post. Please note, it could not even translate such a simple thing as New Year.

"Mint lusta, New Years Day gondolta (nem, hogy én már egy újabb kemény nap az irodában, ööö, a munka a kertben, és kérem, ha már itt van előttem) eszembe jutott, milyen sok nem angol nyelvű (mint az első nyelv) beszélő tagjai vannak a DVinfo.

Onnan kaptam, hogy csak kíváncsi, hogy hány tagja van DVinfo valójában milyen ország az egész bolygón?

Tudom, hogy ki kb 26.000 páratlan (némelyikük valóban nagyon furcsa. Mi, akik engem?), De hol vannak?

Innen már csak egy kis ugrás és ugrás, hmm, úgy tűnik, hogy egy nagy discrepency megfigyelt plakátok között, az angol anyanyelvű és nem angol anyanyelvű részének a bolygó (Ne mondd!).

Nos, figyelembe véve a tomboló siker a Google Search funkció egészíti ki, míg ezelőtt (zseniális ötlete, miért nem arra gondolok, hogy magam?), Valamint a támogatás DVinfo menetelés átvenni az ismert univerzum a témához, ööö, DV, gondoltam, lehet DVinfo férjhez Googles fordítási rendszer, és már majdnem univerzális?"

Just to have fun, I translated it back with Google - this is what came out:

"As a lazy, thought New Years Day (not that I had another hard day at the office, er, working in the garden, and please, if you are already here in front of me) I thought, how many non-English language (as a first language) speaking with members of the DVinfo.

From there I was, just curious, how many members are in fact what DVinfo country around the planet?

I know that up to about 26,000 odd (some very strange indeed. We, who me?), But where are they?

From there it is only a small jump and go, hmm, it seems that a large discrepency between the observed posters, the English-speaking and English-speaking part of the planet (do not tell!).

Well, given the raging success of the Google Search function is complemented by a while ago (brilliant idea, why did not I think of that myself?), And the support DVinfo march to take over the known universe to the topic, uh, DV, I thought, maybe DVinfo married Google translation system, and is now almost universal?"

Note that it lost something as basic as the difference between English speaking and non English speaking (it dropped the word NON). How can you trust such a translation???

And this is all simple, general talk - I won't even try anything remotely technical...

Chris Soucy
January 4th, 2010, 08:38 PM
This is not looking promising at all.

Hmm, there's Dutch and Hungarian shot down in flames.

The really scary thing about it is that according to the Wiki article linked to in my previous post, Google used to use SYSTRAN (also linked) but has since developed its own system, which is supposed to be better.

Thanks Ervin.


CS

Hm, on reflection, and having re-read it with translation in mind, I think I/we may have set too severe a challenge for any system with that text.

From memory, the difficulties in translation compound severely with increasing sentence length, and a couple of the ones in that test are nearly two and over two lines long - that's too long and doesn't, I am certain, correspond anywhere near to the average post sentence length here on DVinfo.

Some stats would be really useful about now - what is the "average" post sentence length used here on DVinfo? Is there any way to find out? Automatically? Without getting out the pencil and paper? The average total post length would be useful as well, as I believe that has a bearing on translatability (any chance Chris/Jeff?).
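
For what it's worth, the sums themselves are simple once the post text is in hand. Here is a minimal Python sketch, assuming the posts are already available as plain strings (how they would be pulled out of the forum database is not covered here) and using a deliberately naive rule that a sentence ends at '.', '!' or '?' followed by whitespace. The function names and the two sample posts are made up for illustration:

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def average_sentence_length(posts):
    """Mean number of words per sentence across all posts."""
    lengths = [len(s.split()) for post in posts for s in split_sentences(post)]
    return sum(lengths) / len(lengths) if lengths else 0.0


def average_post_length(posts):
    """Mean number of words per post."""
    posts = list(posts)
    return sum(len(p.split()) for p in posts) / len(posts) if posts else 0.0


# Hypothetical sample data; a real run would use the forum's own posts.
sample_posts = [
    "So now we know. Back to the drawing board, I think.",
    "Thoughts?",
]

print(average_sentence_length(sample_posts))  # 4.0
print(average_post_length(sample_posts))      # 6.0
```

A real run would want a better sentence splitter, since abbreviations and runs of dots will fool this one.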

Length aside, I really don't think many posters construct such grammatically tortured missives, either.

I know from personal experience that if I'm posting to a member who has ESL, I deliberately keep the sentence length short and keep the grammar as simple as possible.

I don't think it would take too long to train oneself to keep it that way when writing posts (with a bit of reminding from the system perhaps, a red flag once a certain sentence length was exceeded maybe).
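
The "red flag" reminder could be as simple as the Python sketch below. It is only an illustration: the 15-word limit is an assumed threshold, the function name is hypothetical, and sentences are split naively on '.', '!' or '?' followed by whitespace.

```python
import re

MAX_WORDS = 15  # assumed limit; the right value would need testing


def flag_long_sentences(draft, max_words=MAX_WORDS):
    """Return (word_count, sentence) for every sentence over the limit."""
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    flagged = []
    for sentence in sentences:
        count = len(sentence.split())
        if count > max_words:
            flagged.append((count, sentence))
    return flagged


# Hypothetical draft post: one short sentence, one rambling one.
draft = (
    "Keep it short. This sentence, on the other hand, just keeps rambling "
    "on and on well past the point where anyone would still be following it."
)

for count, sentence in flag_long_sentences(draft):
    print(f"{count} words: {sentence}")
```

Only the second sentence of the draft is flagged (23 words against a limit of 15); the first passes untouched.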

No, disastrous as the tests to date have been I don't think that's the end of the story, by any means.

Of course, all to no avail if Chris doesn't want to go there, heck, it's not like I'm suggesting changing the door mat, this is serious stuff to do properly, and there could be a myriad of reasons it doesn't fit in with his business model.

Heck, who wants to take over the Universe anyway!

Onwards and upwards.

Mugurel Dragusin
January 4th, 2010, 09:21 PM
"Ca un idle, Anul Nou Ziua crezut (nu că am fost, o altă zi grea la birou, er, care lucrează în grădină, şi vă rugăm să spun că dacă aţi ajuns aici înaintea mea), aceasta a avut loc la mine nu doar câte English (ca o prima limbă) de membri vorbind nu sunt pe DVinfo.

De acolo am ajuns la întrebam doar membri câţi DVinfo de fapt, are in ceea ce de ţări de pe planeta?

Ştiu că sunteţi de până la aproximativ 26,000 impar (unele dintre ele într-adevăr foarte ciudat. Ce, Cine, eu?), Dar în cazul în care sunt acestea?

De acolo a fost, dar un mic skip şi sări la, hmm, se pare că există diferente mari în postere observat între vorbeşte limba engleză şi care nu vorbesc limba engleză părţi ale planetei (Nu spun!).

Acum, având în vedere succesul hohotitor a funcţiei de căutare Google a adăugat un timp în urmă (idee stralucita, de ce nu cred că nu de-o eu?) Şi ca un ajutor pentru martie DVinfo de a prelua Universul cunoscut pe tema, er, DV, m-am gândit, poate DVinfo fi căsătorit cu Googles sistem de traducere şi a devenit aproape universal?"


That is in Romanian and it's a mess :) While one could grasp the basics of your post, it would take more than your average Joe to do so. It is mostly translated ad litteram, and so it becomes a funny text.

However I believe these translation tools were meant to be an aid rather than provide a perfect translation, given that true AI is not here yet.

Ervin Farkas
January 4th, 2010, 09:47 PM
Chris, this is going to end up the way good old missionary stories used to end: they found that it's easier to teach the isolated tribes English than to translate the Bible into their language.

Here's a suggestion: there are quite a few of us who speak both English and another language at mother-tongue level. The forum might want to post a sticky with a list of members willing to translate for those who need help in some other language.

I myself speak Hungarian and Romanian, and I'm willing to help.

But honestly, my experience is that those shooting for excellence in videography usually have no problem understanding and expressing themselves in English.

Chris Soucy
January 4th, 2010, 10:21 PM
Check out the update to my post #11, I think we may need to come at this from another angle, but point taken Ervin.

Thanks for the try, Mugurel, as I said, I think any system would fail with the text as written.

I'm not hoping for any better results with any other system/ language till I've investigated what the system would REALLY have to work on in a "real world" situation - there aren't too many gasbags like me on DVinfo, so it isn't a fair test at all.

If I have to resort to taking random (average) posts and running them through the translator, I might need to dragoon you guys into checking the results of the Hungarian and Romanian (& Filipino?) tests.


CS

Shaun Roemich
January 4th, 2010, 10:24 PM
"I myself speak Hungarian and Romanian, and I'm willing to help."

At first glance I THOUGHT that said "Romulan"...

A very interesting thread and some very sincere and thoughtful offers to assist here. I love this forum.

Chris Soucy
January 5th, 2010, 12:34 AM
We may yet get a live one (Romulan), in which case we can leave the Klingon translator in and watch the continuing series of Star Trek as they have at it.

I thought this was going to take a swerve into the exotic when Ervin mentioned missionary, but alas, it did not transpire.

The offers of help were a great encouragement however, thanks guys.

On a more serious note, the body, er, view count is racking up nicely, but there's not nearly the input coming in that I was expecting for such a mind-boggling possible expansion of DVinfo's capabilities.

To quote from a very well known Australian tourist ad:

"Where the bloody hell are you"

And as for setting out on this journey, well, I'll quote another well known Oz tourist ad:

"You'll never never know, if you never never go"

Come on guys and gals, give us some feedback/ input.


CS

Bob Hart
January 5th, 2010, 01:32 AM
Ngai brubi learnaman wadji fillummaka. Ngai jaja fillum nyarmbali yaala.

I'll leave you guessing on this one. In the second Star Wars, there were two short phrases in this language which made sense contextually, so I assume the writer dug deep and wide through uncommon languages to create Huttese.

Ervin Farkas
January 5th, 2010, 06:22 AM
Kion pri Esperanto? Estus bela mezo tero...

[How about Esperanto? Would be a nice middle ground...].

Gareth Watkins
January 5th, 2010, 08:27 AM
The Dutch isn't the only totally gobbledygook translation... the French is just as bad... good job I had the English one to look at to see what it was all about...

Guess the perfect automatic translator has yet to be invented.

Amitiés
Gareth

Chris Soucy
January 5th, 2010, 03:35 PM
amongst us.

Further reflection on just why this translation stuff is failing so badly has led me to the inescapable conclusion that a major part of the problem is the sheer plasticity of English itself.

[There, case in point. I've probably broken a half dozen "rules of English" in that sentence alone, no wonder the translators are falling off their perch. I doubt that any of the English as a first language readers even batted an eyelid].

Which led me into a bit of a minefield wondering whether other languages are more "structured" and thus less likely to be mangled by the people using them.

So, thought I, what happens if you make up a sentence, no longer than, say, 15 words maximum, which is totally incapable of being misconstrued (in English), obeys all the rules, has no spelling mistakes, contains no ambiguity whatsoever, has minimal punctuation and delivers the message loud and clear.

Now, my logic says that if that sentence is run through the translator, the result should be just as unequivocal as the original (ah, but is it?).

Let us assume the previous assumption is correct.

If it is, then running the translated sentence back through the translator should return you to the original English version in perfect health.

Take this sentence:

The quick brown fox jumped over the lazy dog.

All of you that speak foreign, type it into the translator (go to Google, it's on the tool bar, top left) and translate it into your favourite "other language".

How does it read?

Now, copy the result, delete the original and paste the result into the text box. Reverse the language from/ to selection.

Hey presto, in all the languages I've tried, I get the original back.

Now try this one:

Mary had a little lamb, its fleece was white as snow.

I won't beat about the bush, it fails.

Why?

Because of that comma.

It should read:

Mary had a little lamb. Its fleece was white as snow.

Most of us that use English (as a first language) have forgotten so many basic rules we don't even stop to think about how badly we're mangling it, because we can navigate quite happily without them.

So, where am I going with this?

I'd like anyone who has some spare time and a bit of curiosity to try out their own sentences (any and every language you like) and test them on the translate/re-translate torture rack and see if my premise holds true if the rules are obeyed.

So, what's the bottom line here?

Quite simply, if my theory is correct, running the translate/re-translate routine as you're typing, so that each sentence is displayed back to you in the source language, would very quickly flag where the rules were being broken, effectively re-teaching you, er, me, how to write according to the rules.
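
The translate/re-translate check described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the two translator functions are passed in as plain callables, and the pair used in the demo is a made-up toy (it reverses each word and silently drops the word "non"), standing in for whatever real translation service would be used. Nothing here calls Google or any other real API.

```python
import re


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def round_trip_failures(text, to_foreign, to_english):
    """Return (original, returned) pairs for sentences that change on a round trip."""
    failures = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        returned = to_english(to_foreign(sentence))
        if normalise(returned) != normalise(sentence):
            failures.append((sentence, returned))
    return failures


# Toy stand-ins for a real translator (hypothetical, for demonstration only).
def toy_to_foreign(sentence):
    # Reverses each word and loses the word "non" along the way.
    return " ".join(w[::-1] for w in sentence.split() if w.lower() != "non")


def toy_to_english(sentence):
    return " ".join(w[::-1] for w in sentence.split())


post = (
    "The fox jumped over the dog. "
    "Members are English speaking and non English speaking."
)

for original, returned in round_trip_failures(post, toy_to_foreign, toy_to_english):
    print("Sent:    ", original)
    print("Returned:", returned)
```

With the toy translator the first sentence survives the round trip and the second is flagged, because "non" never comes back. Note that a sentence surviving the round trip only shows the two directions are consistent with each other; it does not prove the foreign version reads well.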

[See, there, I've done it again - 3 1/3 lines (in the post box, maybe 2 on the page) and not a full stop to be seen and probably utter gibberish on its return from the translator, but, then, maybe not.]

To put it another way, as it looks like we'll never teach a machine based translator to accurately decode a moveable feast such as colloquial English, maybe the translator can teach us to write within the rules and give it a sporting chance.
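A rough sketch of the sentence-by-sentence feedback idea, in Python. The sentence splitter is deliberately naive, and `toy_round_trip` is a made-up stand-in for a real translate/ re - translate call: it simply pretends that anything over 15 words comes back mangled. Both are assumptions for illustration only.

```python
import re
from typing import Callable, List, Tuple


def split_sentences(text: str) -> List[str]:
    # Naive split after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def flag_sentences(text: str, round_trip: Callable[[str], str]) -> List[Tuple[str, str, bool]]:
    """Return (sentence, what came back, survived?) for every sentence."""
    report = []
    for sentence in split_sentences(text):
        back = round_trip(sentence)
        report.append((sentence, back, back == sentence))
    return report


def toy_round_trip(sentence: str) -> str:
    # Hypothetical behaviour: short sentences survive, long ones come back cut off.
    words = sentence.split()
    if len(words) <= 15:
        return sentence
    return " ".join(words[:15]) + " ..."
```

Each entry in the report with `False` in the last position is a sentence the author would be nudged to rewrite before posting.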

Thoughts?


CS

Trond Saetre
January 5th, 2010, 05:13 PM
I am just glad I do not have to read a Norwegian translation of this site.
I have read, or tried to read, some Norwegian technical books about both computers, electronics, video...
And honestly, it is much easier to understand the English tech words than translated words.

And as has been said already, I have yet to find an online translator which can do a fairly good job with more than only basic sentences.

Chris Soucy
January 5th, 2010, 05:42 PM
Put my money where my mouth is and gave it a try.

http://www.dvinfo.net/forum/open-dv-discussion/470376-dcr-sr33-video-problem.html#post1468380

Check out post #11.

If ever there was a time to get this thing working, this thread is it.

Took the basis from a previous post further up the page, pasted it into Word, tweaked it a tad then stuffed it into Google and translated it to Swedish.

Translated it back to English where, much to my amazement, it had actually improved the wording of the first line significantly.

Changed the first line to the suggested, added a bottom line and did it again through Google.

It came back out of the wash looking pretty good so decided to fire it off to Sweden.

What the reaction there will be is anybody's guess.

Not a fandango you'd want to do too often, but if a system could be arranged whereby a spell check, grammar check and the translate/ re - translate worked concurrently as you typed, it could be very effective indeed.

I have yet to figure out just why the differences I see between the original and the subsequent re - translate back to English are as they are, that makes no sense as far as a rule base goes, but something that could easily be adapted to in a short space of time.

The down side of it all is that with all three running concurrent with one's typing, it'd be a bit like having an entire car full of back seat drivers.

The up side is it'd be so painful that in pretty short order one would learn to obey the rules and they'd shut up.

In case anyone is having a problem with this concurrent translate/ re - translate concept, imagine the white box you see when typing a post.

Now imagine another separate white box under/ over it which is being filled with English text from the translate/ re - translate routine for each completed sentence or usable part thereof.

Instant feedback of translator bloopers.

The theory being that if your text can be translated from English to (pick a language) and back to English with some semblance of sanity, it's fit to go out to all the other possible languages that might be on the system.

Hey presto, a multi lingual DVinfo that teaches you how to work it.

Think it might be good to get some feedback from Chris Hurd before going too much further down the track. Anyone know if he's gone/ going to CES?


CS

John McCully
January 6th, 2010, 02:33 AM
Chris Soucy, I can not resist joining, momentarily, this conversation. Not because I can offer any constructive advice regarding your goal with this thread but because I found your wanderings into Literary Theory interesting. You got my attention; pondering the notion of translation and translatability all within the context of cultural and phonetic differences, fairy stories, a little lamb (glad you added the lamb; very New Zealand) not to mention the miss-connects that occur between like peoples because of traditional beliefs, dogmas, and other non-touchable ways of interpreting. And what a difference a full stop makes, not to mention that dastardly comma, that often inserts a pause where one might rather have moved right along.

I also detected in your posts in this thread a touch of the same predisposition towards critical inquiry that leads you down the path of tripod deconstruction.

Good for you.

I discovered a downloadable series of lectures at Yale University. Digression: It is totally wonderful that the USA in the form of Yale give away lectures on many subjects. It doesn’t come any better. And speaking as a Kiwi Canadian I say to all those who sneer at the States, seems there’s no letup, to them I say go online to Yale University and check what that American institution is doing, at no cost to the downloader. I suggest it is incredibly generous, thoughtful, living breathing Human Rights on a grand scale. The US takes a lot of criticism and sometimes I like to point out the other side of the coin, especially when it benefits me, as in this instance. I’m doing ‘An Introduction to Literary Theory – Paul H Fry’. I’ve already downloaded 4 lectures and the text book for the series will be here in about a week. So after I get through that, 26 lectures and some heavy reading, I might have more for you on the business of translation. In the meantime I might take your words ‘have forgotten so many basic rules we don't even stop to think about how badly we're mangling it’, (the English language you are talking about) and ask Professor Fry how he feels about that notion. After I do this course I’ll report back with perhaps a sensible suggestion or two about the task of the translator, sliding signification and the mangling business.

I’d better quit this line of reply before Chris Hurd says ‘look boys, if you want to have that kind of conversation could you go somewhere else, please’, whereupon I’d quickly reach for my EX1 and let him have it, handheld, 24p, (the film look). If that didn’t work you could come to my rescue and with one fluid pan of your Vinten carbon-fibre quick-lock legs you would take the wind right out of his sails, with a smile. No worries.

Cheers

John

Chris Soucy
January 6th, 2010, 03:02 AM
What you smoking up there? (ROFLH).

Can you send a small parcel down here? You've got the address.

I'll most definitely have what he's having!

I think I'm gonna need all the help I can get with this one.

Thanks for one of the most erudite responses to a thread I've ever read, good on you mate (us Kiwi Canadians have to stick together).

Don't think you have to worry too much about CH, he's most probably watching, waiting to see where it goes, if anywhere, and then he'll chime in, no hurry.

Hey, he's threatened to visit us in NZ on hols one year, he won't blow that on a whim!

Enjoy your Yale lectures.

Regards,


CS

Trond Saetre
January 6th, 2010, 03:58 AM
Now try this one:

Mary had a little lamb, it's fleece was white as snow.

I won't beat about the bush, it fails.

Why?

Because of that comma.

It should read:

Mary had a little lamb. It's fleece was white as snow.



When translating these to/from Norwegian, the words "It's fleece" were not translated from English, using Google translator.

Colin McDonald
January 6th, 2010, 12:48 PM
I try to switch off from teacher mode out of hours but I may be able to clarify this one:

Mary had a little lamb. It's fleece was white as snow.
When translating these to/from Norwegian, the words "It's fleece" were not translated from English, using Google translator.

"It's" is an abbreviation for "it is" NOT a possessive.

So the second line of the nursery rhyme should be "Its fleece was white as snow."

/pedant mode

Ervin Farkas
January 6th, 2010, 01:22 PM
Correct. I tried going English to German and back with the corrected sentence, and "Mary hatte ein kleines Lamm. Sein Fell war weiß wie Schnee" was translated back into the exact same English phrase.

Bingo,

Chris Soucy
January 6th, 2010, 03:43 PM
Well spotted.

You know, I looked at that about a dozen times whilst I was playing around and KNEW something wasn't right, even though it was staring me in the face.

Hmm, spelling, grammar and punctuation - perfection required.

Somebody please tell me that other languages aren't this easy to mangle!

On another note, it has occurred to me that this process can be made slightly easier if the translator itself is "trained" to recognise slightly more diverse text than it currently does.

If anybody read the article on Machine Translation I linked to, they will remember that Google got their system much improved by feeding it 200 BILLION words from UN documents.

All very well, but I doubt we'll see writing like that here any time soon.

However, how many words are there in the DVinfo post backlog?

Those 1 million plus posts must contain a heck of a lot of colloquial English.

Maybe...............

But, then again, maybe not!

The problem, of course, is that the 200 billion words would have been taken from documents translated by many highly qualified human beings at a cost I don't even want to think about.

The DVinfo archive is English and only English.

Back to the drawing board.


CS

Chris Soucy
January 7th, 2010, 06:07 PM
Having reached something of a hiatus on the subject, a little bit of conjecture to be going on with.

Disambiguation.

Great word, and probably one of the toughest nuts of all to crack, especially if the translator has no access to the author, as is the case with standard document based translations. Anyone who read this article: Machine translation - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Machine_translation) in one of my earlier posts may remember the translation pyramid it contained, which shows source text being analysed into an interlingua before being translated into the target language.

Now, for our purposes, the authors (you and I) are not only available but actually in the process of creating the text to be translated, it is thus capable of being altered in its entirety.

Now, assuming we have a spell checker, grammar corrector and punctuation policeman from hell, all we’re left with is the dreaded disambiguation.

As was demonstrated previously, if a sentence can be produced with perfect spelling etc and contains no ambiguities, the statistical translator in Google is pretty well spot on as shown by translating the sentence to another language and back again to receive the same text as transmitted.

So, how do we solve the ambiguities conundrum? If you take a sentence and pass it through the analysis engine towards the interlingua stage, an ambiguity must create 2 or more possible results. Take the word pair “going out” for example. Is this used in the context of “the fire is going out” OR “We’re going out tonight”, describing two completely different things (fire and socialising), which to a machine make no sense at all?

However, let us assume a sentence is designed which follows all the rules and contains just one ambiguity. Now, if the text is passed through the translator to the interlingua stage and all of the possible generated options are fired back into English and compared to the original, one should be correct and the others should not, easily measured.

So, if, as we are typing, the background software continually took the sentence, passed it through the analysis phase, passed it back into English and compared all the possible results with the original text, the results should be capable of generating a statistical ranking based on the original text, 100% being the goal but I’d settle for 95%. As the author is right there, a major blooper caused by nested ambiguities, say, which gave a stats figure of less than the chosen 95%, can be corrected on the fly, tested and either changed again or accepted.
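The scoring step could look something like the Python sketch below. It assumes the analysis engine has already produced a list of English "read-back" candidates (that part is not shown, and nothing here is taken from any real translator). The similarity measure is the character-level ratio from Python's standard `difflib`, which is only a crude proxy for whether the meaning survived; the 0.95 threshold is the 95% figure from the paragraph above.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Character-level similarity between 0.0 and 1.0 (1.0 means identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(original: str, candidates: List[str]) -> List[Tuple[float, str]]:
    """Score every read-back candidate against the original, best first."""
    scored = [(similarity(original, c), c) for c in candidates]
    return sorted(scored, reverse=True)


def needs_rewrite(original: str, candidates: List[str], threshold: float = 0.95) -> bool:
    """True if no candidate reads back closely enough to the original."""
    if not candidates:
        return True
    best_score, _ = rank_candidates(original, candidates)[0]
    return best_score < threshold
```

For "The fire is going out." a candidate list containing the identical sentence passes, while a list holding only "The fire is leaving the house." falls below the threshold and would prompt the author to rephrase.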

This sounds all very long winded and is certainly not something you’d do manually, but heck, you’ve got that quad core and 8 Gig connected by megabit broadband to Chris’s server farm and it’s doing nothing but waiting for you to hit a key twice a second at best.

Note that I have made no mention of Google. I envisage this software app to be on Chris’s servers and freely downloadable. The actual translation to other languages happens after your post is done and dusted on said server(s).

Now, you may have noticed I didn’t suggest passing the text through the translator to another language and back? The reason is simple, I’m certain there are many, many English words that simply cannot map to a word or phrase in other languages. If it fails there it must fail hopelessly going back again, so it’s actually introduced an error needlessly. Quite how the Google translator handles this situation I’d be interested to know.

There is very likely a good reason why the system doesn’t or couldn’t work this way, but it does seem like a logical presumption to make given the above.

Just a little something to keep you thinking.

CS

Marty Welk
January 7th, 2010, 08:02 PM
exactly ^

If you write simply, it can be translated.
If the translator was phrase based, the phrase would still have to be in the database.

With a constantly operating live cross translation feeding back to the original language, a writer could create sentences that conformed to the ability to translate, by their choice of words and phrases.
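To make the "phrase would still have to be in the database" point concrete, here is a toy phrase-based lookup in Python. The phrase table is invented for the example and the greedy longest-match strategy is just one simple way to do it, not a description of any real translator. Words with no entry are marked rather than guessed at, which is exactly the feedback a writer would want.

```python
from typing import Dict, List, Tuple

# Hypothetical English -> German phrase table, keyed by lower-cased word tuples.
TABLE: Dict[Tuple[str, ...], str] = {
    ("the", "snow"): "der Schnee",
    ("is",): "ist",
    ("cold",): "kalt",
}


def phrase_translate(words: List[str], table: Dict[Tuple[str, ...], str], max_len: int = 3) -> List[str]:
    """Greedy longest-match lookup; unknown words come back as <word?>."""
    out: List[str] = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(w.lower() for w in words[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            # Nothing in the table starts here: flag the word and move on.
            out.append("<" + words[i] + "?>")
            i += 1
    return out
```

"The snow is cold" goes through cleanly, whereas "The fleece is cold" comes back with "The" and "fleece" flagged, because neither is in the table on its own.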

What would it be like to write about video, so it reads like a children's book?
The fur on the sheep was white. Snow is white. The fur and the snow were similar.

I have a metal object. The object has 3 legs. There is an item on the top of the object. The item on top will move up and down. The item on top will move left and right. The item is a Head. The item is not your head. The item is not my head. I want to make movements that are smooth or soft. . .

Matt Davis
January 8th, 2010, 08:19 AM
If you write simply, it can be translated.

Once, a judge of Court of Session of Scotland sent the Editor his candidate which reads: "In the Nuts (unground), (other than ground nuts) Order, the expression nuts shall have reference to such nuts, other than ground nuts, as would but for this amending Order not qualify as nuts (unground) (other than ground nuts) by reason of their being nuts (unground)".

I also refer the honourable gent, and all herein, to the case of the most unsuccessful phrasebook, now known as 'English as she is spoke':

English As She Is Spoke - Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/English_As_She_Is_Spoke)

Alas, the Wikipedia entry does not mention the howlers, which you can find easily on the web. But thanks to the idiom that says 'to err is human, to really foul up requires a computer' (and to get us back OT) here's BabelFish v. English As She Is Spoke:

English As She Is Spoke vs. Babelfish! (http://www.zompist.com/spoke.html)

Paul Mailath
January 8th, 2010, 03:28 PM
And the Australian Version of that wonderful book:

Let Stalk Strine by Afferbeck Lauder

http://en.wikipedia.org/wiki/Afferbeck_Lauder

You'll be speaking like a drover in no time

Andris Krastins
January 8th, 2010, 05:09 PM
I can attest that the Google translator from English to Latvian is totally unusable.
But I hear it's a lot better with major languages. Any way, I know English well enough. :)

Hans Ledel
January 10th, 2010, 01:22 AM
I just read the English to Swedish translation and I'm sorry to say it does not work at all

Cheers

Hans

Peter Berggren
January 12th, 2010, 03:02 AM
I think most Swedish people between the ages of 15 and 65 understand the English text more easily than the machine-translated “Swedish”. The “Swedish” version didn’t make any sense.

Best regards, Peter

Alex Khachatryan
January 21st, 2010, 03:03 PM
Russian:

Иностранные (Non Английский говоря) Члены или возможных members.As простоя, Новый год День мысли (не то чтобы я был, другой тяжелый день в офисе, ER, работая в саду, и, пожалуйста, говорить, если у вас есть здесь раньше меня) мне пришло в голову, сколько именно, не английский (в качестве первого языка), выступая Члены есть на DVinfo.From там я получил на интересно только, сколько членов DVinfo действительно имеет в каких странах по всей планете? Я знаю, ты до примерно 26000 нечетные (некоторые из них действительно очень странный. Что, Кто, я?), Но где они? Оттуда оно было небольшим, но пропустить и перейти к, гм, как представляется, будет большая расхождения наблюдаются в плакатах между английской речи и без Английский выступивший частях планеты (Вы не говорите!). Теперь, учитывая оглушительный успех в функцию поиска Google добавил некоторое время назад (блестящая идея, почему я не думаю, его себе?), а в качестве помощи по март DVinfo завладеть известной Вселенной на тему, ER, DV, я подумал, может быть DVinfo браке с системой перевода Googles и стал почти универсальным? You'all должны быть знакомы с это: Язык ToolsThe вопрос, можно ли в браке с IP адресов для автоматического перевода сообщений / поиска для стран назначения, если необходимо или выключены для комфортного ESL * Член? Я не думаю, что вам нужно слишком беспокоиться о Элмер фадд (что может вырасти на вас Tho '), "Пираты, BORK и т.д., Hacker или клингонский переводов, особенно последний, Tho' Вы никогда не знаете, есть какие-то странные существа скрываются на эту site.The остальная часть (довольно много, на самом деле ) вполне может открыть новую дверь для DVinfo.Nope, я понятия не имею, насколько успешно система является, как только я говорю Языки являются Strine, киви, канадец и ПОМ, все из которых, как представляется, таинственным отсутствуют в списке - Хмм , есть теории заговора похоронены где-то там. 
И нет, это не двери назад DIG о региональных кодов (интересно тем, что он является) просто полный "думать" из коробки "момент (и первый, кто пришел с" New Years Day Out = деревьев, Не сейф, будет не слишком далеко .... ER, гм!). имеющими хорошую, Крис Г. и все, что вам DVinfo'ers. КСО: Это не drill.PPS: * = ESL Английский как Второй язык Последняя редакция Крис Soucy; 1 января 2010 в 06:58 PM. Причина: Whoops.

...

Anyone want to comment on the accuracy?

This is getting a lot of hits, but very little input, anyone want to chime in?


CS

There is just a bunch of words but no meaning.
Note: I am a native Russian speaker.

Chris Soucy
January 21st, 2010, 03:47 PM
Wow, 1,644 hits as of this post, I would never have thought such an esoteric subject could or would generate such a level of interest.

I am pretty well convinced the "system" as it exists is entirely unworkable, thanks for the further confirmation.

Quite how the various messaging networks think they can use it or something similar has me completely baffled.

Still waiting to see if there's any feedback from "up there" on the subject.


CS

Alex Khachatryan
January 21st, 2010, 09:55 PM
Chris, I do not want to act too critical, there are some parts when I read, I catch a slight meaning (3-4 words getting connected) but then, boom, it takes me nowhere again.
Overall, there is an improvement in results compared with the ones I got 5 years ago, trying out automatic translation services.

Chris Soucy
February 10th, 2010, 11:07 PM
Still got interest, but no feedback from the hierarchy, guess it's not an issue they wish to pursue.

Just to keep the ball rolling tho', cop this titbit of news, well, speculation, if their current offerings are anything to go by........

A phone that translates 6,000 languages in real time, really? - News, Gadgets & Tech - The Independent (http://www.independent.co.uk/life-style/gadgets-and-tech/news/a-phone-that-translates-6000-languages-in-real-time-really-1893527.html)

Can't see it myself, but there you go.


CS

Ervin Farkas
February 11th, 2010, 06:27 AM
English is not going to be replaced any time soon by any other language, especially when it comes to technology. If anything, the position of English is going to get stronger and stronger as more and more people are involved with technology.

I come from a bi-lingual background - born Hungarian in Romania, I speak both languages at mother-tongue level... plus try my best in English. Recently I helped a relative in Budapest put her computer in order software-wise via remote access. I can't tell you how frustrating it was to try guessing what the different Hungarian terms might mean in English.

So this is what I say: we better help everyone learn English instead of translating.

Matt Davis
February 11th, 2010, 07:03 AM
we better help everyone learn English instead of translating.

English is such a silly, stupid, messy chopped up language full of arcane rules and spellings it makes Japanese look easy. Okay, so it has borrowed words from loads of languages, but there are so many subtleties and nuances that can trip people up. And have you ever experienced the tirade of abuse one can receive with a misplaced (aka 'Grocer's') apostrophe? I'm guilty of it myself.

If only it were Dutch that made it as 'the' lingua franca. Or something simple. I'd have been in favour of Esperanto, but it always sounded a little odd to my ears...

But I digress. Have been filming and editing a few movies on the topic of networking, and Cisco recently held their big Euro event for networking engineers. The biggest topic was video. Cisco want to 'pwn' video plumbing.

They are almost ready with server-side smarts that analyse incoming video, do machine transcription for later searching, and then provide machine translation for subtitles/closed captions. This is not an 'idea' or a 'prototype', it's what Cisco needs to make happen in order to make sense of this strange datatype which has, in the past, been treated like a month old skunk corpse by networking engineers.

It feels like there's other technology being sat on by the likes of these large organisations that do improve on current examples of transcription and translation.

Ervin Farkas
February 11th, 2010, 07:51 AM
Not surprisingly, the offer to translate comes from English speaking people (talking about this particular thread here), meaning, from what I've read here, from people who don't speak another language. You guys need to understand that there is English, and then there is specialty English. You can bring in the best human translator in the world, if he does not know "multimedia English", he will not be able to accurately translate. Factor in the hundreds of specialty terms used with cameras, NLEs, etc, etc...

At this point, machine translation barely makes it with general, every day language - let alone any specialties.

Translation is a FORM OF ART, not a skill one can easily learn, and pretty much impossible for machines at this point. Over the years I had the chance to listen to translators who were speaking both languages perfectly, yet their translation was horrible. A good translator is born, not educated (of course, skills may and need to be refined).

Machines will never "get there". Here's a poem by the greatest Hungarian poet Petőfi Sándor, very simple words, nothing hard - translated to English by Google.

FÜSTBEMENT PLAN

All the way - home --
I'm thinking:
How do I call
Had not seen my mother?

What will I say first of all
Nice, nice to him?
When that rocking cradles,
The arm extends.

And I thought many
Szebbnél-better idea
While the time seems to stand,
Although the truck was running.

And the little room toppanék ...
Flies up to me with my mother ...
And I csüggtem lips ... silence ...
As the fruit of the tree.

Sounds horrible in English... doesn't it? Now think about what it would look like if you try to translate say embedding metadata or bitrate, color grading, white balance, or AVCHD...

Do you see where I'm going with this? Let's revisit this topic 50 years from today, we may have better options in terms of technology... until then, as I already said, let's just keep it good ole' English.

Chris Soucy
March 11th, 2010, 04:38 PM
Franz Josef Och, Google's translation uber-scientist, talks about Google Translate | Technology | Los Angeles Times (http://latimesblogs.latimes.com/technology/2010/03/the-web-site-translategooglecom-was-done-in-2001-we-were-just--licensing-3rd-party-machine-translation-technologies-tha.html)

and

http://www.nytimes.com/2010/03/09/technology/09translate.html


CS

J. Stephen McDonald
March 11th, 2010, 07:43 PM
Here's a free translator that seems good to me-----but don't always trust the gender and singular/plural designations to be correct. These online translators work best if you have enough basic knowledge of a language, to notice the mistakes and be able to correct them. Many of them provide more complicated and often outdated phrasing, than would be used in contemporary conversations. Familiar forms are more commonly used these days with many languages, but the formal versions of verbs and pronouns are most often provided by the translators. When it comes to technical words or phrases, the English versions are often the ones learned and used by speakers of many other languages. In some cases, it might be better to use them in the translations, rather than a word or phrase from the other language, that may have been coined long before camcorders and computers were invented. This doesn't apply to French, of course. There's a government agency to keep that language free of anglicisms. Free Online Translator (http://www.worldlingo.com/en/products_services/worldlingo_translator.html)

Jim Andrada
March 21st, 2010, 05:16 PM
Couldn't resist a couple of longish comments

1) A hundred years ago I got tangentially involved in machine translation while I was an undergraduate at Harvard in the Sputnik era (I also worked for the Air Force tracking Sputnik 2, but that's a separate story). As you can imagine (or remember if you're old enough) there was a great and sudden interest in translating Russian technology texts to English and the academic community wasn't shy about getting government research funding for the purpose and teaching courses in it. And I wasn't shy about taking the courses to satisfy some long forgotten course requirement.

The results at the time with the limited computer power we had in the 1950's were pitiful (along the lines of "The spirit is willing but the flesh is weak" coming out as the Russian equivalent of "wine good, meat rotten")

Disclosure - I don't speak Russian but then again most people working on the project didn't speak it either.

2) Fast forward to 1965 or 1966. I was working in an ad-tech group at IBM and we were trying to build a natural language query system. The impetus for the project, aside from looking for a way to sell more computers was the belief that people would be more capable of extracting information from data bases if they could query in natural language. I was working on the front end, ie the syntactic analyzer and intermediate code generator, which would have been followed by the query engine itself. I also got to run around the country presenting papers on our project at groups like the Association for Computational Linguistics.

It was all great fun and we were sort of able to satisfy a simple query in English, German, or Persian along the lines of "are there any books in our library about mathematics". Interestingly enough, the correct answer would have been "yes" or "no" but we took as an assumption that every query should be answered with some kind of list. I don't remember if we distinguished between books and papers and magazines, or not, but I have the funny feeling that we just lumped them all together.

While it might seem that we might have just been parsing the input for keywords (a la a Google search etc) we were actually doing a fairly sophisticated syntactical analysis of the input stream - which is why we kept getting invited to present papers I think. We also had a bunch of MIT and Harvard grad students in linguistics working with us as part timers.

Anyhow, our conclusion was that what people really wanted wasn't as much the ability to query in natural language as the ability for the computer to read their minds and figure out what they were looking for - they didn't want to take care to disambiguate or precisely state what it was that they wanted in the first place, and if they had to do it anyhow, they were just as happy with a structured way of representing the query.

Another conclusion was that real machine translation was probably an unattainable goal, largely due to (as someone said earlier) the need for disambiguation. And not just disambiguation in the syntactical sense, but also in the much more difficult semantic sense - ie understanding not just the syntactic structure of the input stream, but also understanding what a word meant by itself and what it might also mean in the context of the overall discourse, slang, local dialect, social environment, etc etc etc. For example, how should we know that a nice kettle of fish would be one thing if you were in someone's kitchen and a totally different thing if the conversation were about someone who had gotten into a lot of trouble - ie was in a real jam, and not the kind of real jam that you spread on your bread or was maybe participating in a jam session. Or that the meaning of "fine point" would be different if you were discussing a legal issue or knife sharpening.
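A deliberately tiny illustration of the "jam" problem, in Python: pick the sense whose clue words overlap most with the surrounding sentence. The sense inventory and clue words are made up for this example, and the approach is only a crude word-overlap heuristic. It shows why context matters and is no solution to the semantic problem described above.

```python
from typing import Dict, Set

# Hypothetical sense inventory: each sense lists words that hint at it.
SENSES: Dict[str, Dict[str, Set[str]]] = {
    "jam": {
        "fruit preserve": {"bread", "toast", "spread", "kitchen", "fruit"},
        "difficult situation": {"trouble", "stuck", "mess", "problem"},
        "music session": {"band", "guitar", "session", "music"},
    }
}


def guess_sense(word: str, context: str) -> str:
    """Return the sense with the most clue words present in the context."""
    cleaned = context.lower().replace(".", " ").replace(",", " ")
    context_words = set(cleaned.split())
    senses = SENSES[word]
    # With no overlap at all, max() simply returns the first sense listed.
    return max(senses, key=lambda s: len(senses[s] & context_words))
```

It copes with "He spread the jam on his bread." and "The band had a jam session.", but the moment the context contains none of the listed clue words it is guessing, which is rather the point being made.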

And this is all in the context of a single language - now exponentiate the difficulty by trying translation - how should the computer know that the appropriate translation of "Thank you" from English to Japanese is in some cases the equivalent of "Excuse me" or maybe silence?

(When someone holds a door or goes out of their way to do something for you it's considered socially appropriate to excuse yourself for causing them trouble - hence the response should be "sumimasen" and not "arigatoh" - unless of course the nice person was in a position where it was their job to open the door for you in which case the right response would be to say nothing at all)

Google Translator as I understand it works on the basis of being a statistical translator - ie it searches for an example of the same or similar sentence or phrase in some already translated material - something like "I love you" should be duck soup as it's been translated from and to English and almost every language on Earth and the line appears in almost every romance that has been translated.
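The "look for a similar, already translated sentence" idea can be mimicked with a toy translation memory in Python. To be clear, this is not how Google Translate is actually implemented; it is a small illustration of the general lookup described above, using a made-up three-entry memory and the fuzzy matcher from Python's standard `difflib`.

```python
from difflib import get_close_matches
from typing import Dict, Optional

# Hypothetical memory of previously translated English -> German sentences.
MEMORY: Dict[str, str] = {
    "i love you": "ich liebe dich",
    "thank you": "danke",
    "good morning": "guten Morgen",
}


def lookup(sentence: str, memory: Dict[str, str] = MEMORY, cutoff: float = 0.8) -> Optional[str]:
    """Return the stored translation of the closest known sentence, if any."""
    key = sentence.lower().strip(" .!?")
    matches = get_close_matches(key, list(memory), n=1, cutoff=cutoff)
    return memory[matches[0]] if matches else None
```

"I love you." is found straight away, and something with no close precedent comes back as `None`. Note also that "I love you too" is close enough to match "i love you", so it gets the stored translation of a slightly different sentence, which shows how a near match can quietly change the meaning.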

But what on earth would it make of "Crush the blacks"? Is this some Apartheid era racial epithet or maybe an admonition to a NZ football team or maybe, just maybe, some obscure techno-speak from a videographer? I ran a test from English to Japanese and the system did a great job of avoiding the issue by just replacing the English "crush" with the phonetic representation in Japanese Katakana and "Blacks" with the characters for "black person"

It also translates "Warm blacks" as meaning nice toasty black people. Haven't tried cool whites yet. By the way it gives differing translations for "Crush" or "crush"

This is very dangerous stuff, particularly if you don't know both source and target languages quite well - in which case why would you bother!

Disclosure: I'm a native English speaker and speak Japanese reasonably well, but not native speaker level, in spite of which it's been my home language for 20 years and a language I use for business nearly every day. Considering that I was almost 50 when I started studying it, I don't feel too bad about not being perfect.

Chris Soucy
March 21st, 2010, 11:27 PM
An exceedingly well thought out and beautifully written piece (wish I could be so eloquent).

Wow, you've certainly "got about" in your life.

You make some excellent points and the more of them I'm reading, the less I can reconcile what Google are saying, with the reality of the situation.

The thing that I find staggering is the sheer number of hits this post is getting, it must be hitting a nerve somewhere, somehow, but who is it attracting and why?

Any of those "hitters" want to 'fess up?, or are you lurkers and unable to?

Good reason to sign on boys and girls, and have your own say on the subject, heck, it's free and the only down side is getting razzed by me when I'm in a grump.

Not really a problem, I'm harmless (don't ask the missus as she'll say "no such thing" but let's not quibble).

Thanks again Jim, that really was just about enough to get your Masters in "Verbal Translation Communications in the 21st Century".

Impressed? Oh yes.


CS

Jim Andrada
March 22nd, 2010, 12:26 AM
Thanks for the kind words - I spent about 5 years in total chasing this particular bluebird.

By the way, my (Japanese) wife and I also speak Italian for some strange reason or other. There's an interesting expression "In bocca al lupo!" which is literally "in the mouth of the wolf", but its more common meaning is something like "Good Luck!" and kids use it to wish their peers well in school exams etc. Google does give the correct literal translation, but I think that's seldom what is meant. Similarly the (US at least) expression "Break a leg" seldom literally means what it says, rather it's also an expression for "Good Luck". So maybe a good translation of "Break a leg" would be "In bocca al lupo" and vice versa.

I believe that at some point the feds decided to not fund any more work on machine translation. Not sure about the last 20 or so years as I haven't kept up with it.

Anyhow, our team used to have a weekly contest to see who could come up with the most ambiguous expressions. For example, have you ever seen "Half roasted chicken" on a menu? Does it mean 1/2 of a roasted chicken or a chicken that's only half cooked?

There's a little store here that advertises "Watch batteries while you wait" - but somehow I'd rather watch girls while waiting. Batteries just don't turn me on. Anyone for "assault and battery" by the way - to say nothing of a bunch of big guns also known as a battery. (New York, New York, it's a wonderful town. The Bronx is up and the Battery's down - as in The Battery - used to be the fort where they had the cannons and it's at the southern end of Manhattan)

Once upon a time (Tried running that through Google as well - total disaster) a researcher named (IIRC) Jane Robinson tested the then prevalent assumption that the scientific literature would have many fewer such ambiguities, so she analyzed several articles (again IIRC) in a journal of chemistry. She identified one rather longish sentence that could be interpreted 106 ways if you didn't really understand the subject.

By the way, it isn't just machines that have trouble. When I was working in Japan for a US computer company we had a "small" glitch that took the biggest bank in Japan offline for over a day - not nice! After I endured a half dozen meetings that seemed more like ritual beheadings, I was able to get a VP from the US to come over and explain all the things we were going to do to make the customer happy. It was a BFD (not sure if you can Google that one or not!) so we hired a simultaneous translator for the day of the meeting with the top brass (not sure about that one either). She was fantastic - it was like hearing an echo. Japanese in, English out and vice versa. Very impressive indeed.

Everything was going well until the US VP promised that systems engineers from our company would at no charge rewrite the offending software package. At which point I thought WW III was about to break out.

It's quite common in the US for a company (company A) talking to a customer (company B) to refer to the company A sales and support team as "your team" - meaning of course OUR team that is there to support YOU.

Guess how the young lady translated the promise that "YOUR team will work on this at no charge". Yup - she told the bank that their staff would have to fix the problem at no cost to us. Whoops! Sure glad I caught that one before they shot us. As good as she was (which was very, very good indeed - I think she had once translated at the UN), she just happened to not know that particular usage of "your".

Damn - you've gotten me started! It's all coming back! I'll be seeing transformational grammars in my dreams!

Jim Andrada
March 22nd, 2010, 12:05 PM
Try this one (from the NY Times) with your languages of choice.

"Despite the limited rights and the spotty record track record for the largest record deals, Sony is confident it will come out ahead with its Jackson contract."

Chris Soucy
April 4th, 2010, 12:14 AM
Cop this, from this thread:

http://www.dvinfo.net/forum/dv-info-net-announcements/110328-region-specific-flags-somehow-2.html#post1509518

I think we may be going International, like it or not.

And at 3.7 k plus hits, this is the second most hit thread on this page, how about that, huh? Amazing. Staggered me.


CS

Hameed Aabid
April 17th, 2010, 02:28 AM
Hi Chris,
Just translated your opening post to Farsi (Persian) and it didn't make much sense. The word-for-word translation is accurate to a degree, however it fails in the context.

For instance "As an idle, New Years Day thought" was translated to Farsi to mean
به عنوان غیر فعال ، در سال جدید روز فکر

Which means:
As an idle, in the new year's day thought.... this absolutely doesn't make sense.

What is more appalling is the fact that the words are in incorrect order grammatically.

.... and now I translated the above sentence and even that was wrong... not just contextually, but also grammatically.

Google doesn't have Urdu and Pushto translation... so I can only check Farsi.