AGW Observer

Observations of anthropogenic global warming

Non-English climate science

Posted by Ari Jokimäki on January 23, 2013

Today we are used to receiving new climate research written in English. That has not always been the case. There was even a time when English was a very minor language in science. Some time ago I started thinking that by concentrating on research written in English we might be missing a lot of climate science, especially historically. I decided to take a look at the situation.

I used Google Scholar and Google Translate to search for papers containing the word “climate” in every language supported by Google Translate, and recorded the number of hits for each language. The results are shown in the table below. Note that this analysis is very rough: the individual numbers should be taken only as indicative, and the big picture in the table is more meaningful. The numbers carry a lot of uncertainty, some of which I explain below. Here’s the result table:

Country/language Word Results
English/Latin climate 2550000
Spanish/Italian/Portuguese clima 954000
China-simple 气候 614000
Germany/Norway/Denmark klima 350000
France/Romania climat 318000
Russia/Serbia климат 93800
Japan 気候 49400
Turkey iklim 43600
Sweden/Poland klimat 34100
Korea 기후 33900
China-traditional 氣候 31100
Netherlands/Afrikaans klimaat 24100
Ukraine/Belarus клімат 23500
Albania klimë 7600
Arabic مناخ 6610
Lithuania klimatas 6270
Finland ilmasto 3980
Persia اقلیم 3850
Greece κλίμα 3500
Esperanto klimato 3480
Czech podnebí 2830
Vietnam khí hậu 1390
Azerbaijan iqlim 883
Hindi जलवायु 821
Estonia kliima 584
Slovenia podnebne 575
Slovakia podnebie 346
Thailand ภูมิอากาศ 468
Latvia klimats 255
Hebrew האקלים 244
Iceland loftslag 179
Swahili hali ya hewa 113
Yiddish קלימאַט 83
Welsh yn yr hinsawdd 28
Armenia կլիմա 18
Irish aeráide 12
Urdu آب و ہوا 3
Gujarati આબોહવા 1

There are 2,550,000 hits for English/Latin. The non-English languages (excluding Latin, of course) have 2,235,364 hits. So it seems that there are almost as many climate papers in non-English languages as in English. Some languages are missing from the table because they didn’t produce any hits (and of course a lot of others are missing because they are not supported by Google Scholar).

As I mentioned above, the numbers have a lot of uncertainties. Google Scholar returns a lot more than just peer-reviewed papers: there are books, reports, and even some blog posts. This distorts the number of hits, and it seems to be a substantial problem in, for example, the search results for my native language, Finnish.

Another source of error is that Google Scholar also returns search results for matches in author names and journal names. This is a big issue in, for example, the German results: there seem to be a lot of papers by authors with the last name “Klima”. The 350,000 hits for German therefore seem to be off by quite a lot. A search for “Klimawandel” (climate change) resulted in 21,900 hits. English “climate change” gives 1,570,000 hits, so the ratio of “climate” to “climate change” hits is about 1.62 for English (2,550,000 / 1,570,000). Assuming the same ratio for German gives about 35,600 hits for “klima” (21,900 × 1.624 ≈ 35,600). However, this feels somewhat too low, considering that German is a common language in science and that other comparable languages have many more hits (for example, French has over 318,000 hits, although see below for the need to correct the French results). Also, most of Hungary’s results seem to come from authors’ names.
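The German estimate above is plain proportional scaling. Here is a minimal Python sketch of that arithmetic, using only the hit counts quoted in this post; the variable names are illustrative, and the assumption that German shares the English ratio is the same rough assumption made in the text.

```python
# Hit counts quoted in the post (Google Scholar, at the time of writing).
climate_en = 2_550_000         # English "climate"
climate_change_en = 1_570_000  # English "climate change"
klimawandel_de = 21_900        # German "Klimawandel"

# Ratio of "climate" hits to "climate change" hits in English.
ratio = climate_en / climate_change_en

# Assumption (as in the text): German has the same ratio as English.
klima_estimate = klimawandel_de * ratio

print(f"ratio: {ratio:.3f}")
print(f"estimated 'klima' hits: {klima_estimate:,.0f}")
```

The unrounded ratio is about 1.624, and the estimate comes out near 35,570, which rounds to the 35,600 used above.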

Yet another problem is that not all of the search results are in the intended language. This is partly due to the issue mentioned above, with Google Scholar matching author and journal names. There are also cases where another language has the same word (or one close enough for Google Scholar) with a different meaning, or where an author’s name matches the search word. The French search results, for example, include papers in other languages. According to the first result page (yes, I know it’s not a very big sample…), the French results are 20% non-French. This would reduce the number of French-language hits to 254,400 (318,000 × 0.8).
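As a rough check on what these two corrections do to the overall picture, here is a minimal Python sketch. It assumes the table rows are copied exactly as printed, and that the German and French rows are replaced by the corrected estimates discussed above (35,600 and 254,400). The dictionary keys and variable names are illustrative only.

```python
# Non-English rows copied from the table in this post.
NON_ENGLISH_HITS = {
    "Spanish/Italian/Portuguese": 954_000,
    "China-simple": 614_000,
    "Germany/Norway/Denmark": 350_000,
    "France/Romania": 318_000,
    "Russia/Serbia": 93_800,
    "Japan": 49_400,
    "Turkey": 43_600,
    "Sweden/Poland": 34_100,
    "Korea": 33_900,
    "China-traditional": 31_100,
    "Netherlands/Afrikaans": 24_100,
    "Ukraine/Belarus": 23_500,
    "Albania": 7_600,
    "Arabic": 6_610,
    "Lithuania": 6_270,
    "Finland": 3_980,
    "Persia": 3_850,
    "Greece": 3_500,
    "Esperanto": 3_480,
    "Czech": 2_830,
    "Vietnam": 1_390,
    "Azerbaijan": 883,
    "Hindi": 821,
    "Estonia": 584,
    "Slovenia": 575,
    "Slovakia": 346,
    "Thailand": 468,
    "Latvia": 255,
    "Hebrew": 244,
    "Iceland": 179,
    "Swahili": 113,
    "Yiddish": 83,
    "Welsh": 28,
    "Armenia": 18,
    "Irish": 12,
    "Urdu": 3,
    "Gujarati": 1,
}

# Sum of the rows exactly as printed.
raw_total = sum(NON_ENGLISH_HITS.values())

# Apply the two corrections discussed in the text.
corrected = dict(NON_ENGLISH_HITS)
corrected["Germany/Norway/Denmark"] = 35_600             # Klimawandel-based estimate
corrected["France/Romania"] = 318_000 * 80 // 100        # drop the 20% non-French share
corrected_total = sum(corrected.values())

print(f"raw non-English total:       {raw_total:,}")
print(f"corrected non-English total: {corrected_total:,}")
```

Run as written, the rows as printed sum to 2,613,623, and the corrected sum is 2,235,623, which is close to the non-English total quoted earlier.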

The Albanian word for climate is “klimë”, but almost all of the search results are for “klime”, so Google Scholar sometimes returns additional results for words that are merely close to the search term.

Search results might also not be climate related. The word “climate” has other, non-meteorological meanings, such as the political climate or a climate of fear. This source of error might be even worse for some other languages.

There are also duplicate entries for some papers, and these are probably not all of the error sources. Some non-English papers have also been published in English (or vice versa), so the ratio of non-English to English papers (2,235,364 / 2,550,000 ≈ 0.88) might not be accurate. Additionally, some non-English papers have English abstracts.

So, it seems that despite all of my search results, there are not 5 million climate papers out there. But there are a lot of them – and quite a few of them might be in a language other than the English and Finnish that I understand. It sure would be nice to be able to read all those papers when needed.

8 Responses to “Non-English climate science”

  1. jorgede said

    this is superb … how long (timewise) did it take ? Chapeau!

  2. Bill Everett said

    A very quick comparison of a few languages for a more restrictive search:

    climate atmosphere ocean sun cloud albedo — 14700
    Klima Atmosphäre Ocean Sun Wolkenalbedo — 7
    climatique atmosphère Ocean Sun albédo des nuages — 126
    ilmasto ilmapiiri valtameri aurinko pilvi albedo — 0
    Климат атмосферу океан солнце альбедо облаков — 180

    I cheated by changing the Google Translate result for Russian, which was Климат атмосферу Ocean Sun альбедо облаков. A similar change for the German and French versions would probably affect the search results.

  3. Ari Jokimäki said

    Jorgede, thanks! I didn’t time it, but it doesn’t take very long to do the searches and translations, less than one hour I think.

    Bill, as you already mentioned yourself, your search highlights some problems with this kind of approach. Google Translate has only a limited ability to translate between languages. This becomes more of a problem with more complex search terms (I have experimented with some myself, but not as complex as yours). For example, the Finnish search you did has one wrong word that you wouldn’t expect to find in climate-related papers: “ilmapiiri” can be used as a translation of atmosphere, but only when describing the mood among your friends or in some other group of people. The correct translation would be “ilmakehä”. Changing this gives 3 search results instead of 0.

    There’s a lot of interesting stuff you can do with Google Scholar. For example, a long time ago I checked the temporal distribution of “climate change” vs. “global warming” usage in scientific papers. The result is used in Skeptical Science’s rebuttal of the argument “they changed the name from global warming to climate change”. It’s this one:

  4. Ari Jokimäki said

    I checked Bill’s search results myself because I have been wondering whether Google Scholar gives a different number of hits when it is used in different locations. Here are my results:

    climate atmosphere ocean sun cloud albedo 18900
    Klima Atmosphäre Ocean Sun Wolkenalbedo 3
    (Klima Atmosphäre ozean sonne Wolkenalbedo 24)
    climatique atmosphère Ocean Sun albédo des nuages 126
    (climatique atmosphère océan soleil albédo des nuages 541)
    (climatique atmosphère océan soleil albédo 730; “albédo des nuages” means cloud albedo)
    Климат атмосферу океан солнце альбедо облаков 225

    There are some differences. Some of them might be due to running the searches at different times.

  5. Bill Everett said

    Further searching with Google Scholar can also yield different numbers:

    climate atmosphere ocean sun cloud albedo

    About 14,700 results
    Page 2 of about 15,300 results
    Page 3 of about 15,500 results
    Page 4 of about 15,700 results
    Page 5 of about 15,700 results
    Page 6 of about 15,800 results
    Page 7 of about 15,800 results
    Page 8 of about 15,800 results
    Page 9 of about 15,900 results
    Page 10 of about 15,900 results

    climatique atmosphère océan soleil nuages albédo

    About 538 results
    Page 2 of about 514 results
    Page 3 of about 521 results
    Page 4 of about 525 results
    Page 5 of about 528 results
    Page 6 of about 521 results
    Page 7 of about 523 results
    Page 8 of about 525 results
    Page 9 of about 526 results
    Page 10 of about 510 results

    Климат атмосферу океан солнце альбедо облаков

    About 180 results
    Page 2 of about 187 results
    Page 3 of about 190 results
    Page 4 of about 190 results
    Page 5 of about 189 results
    Page 6 of about 189 results
    Page 7 of about 190 results
    Page 8 of about 190 results
    Page 9 of about 191 results
    Page 10 of about 191 results
    Page 14 of 192 results
    Page 18 of 192 results
    Page 19 of 192 results

    I also notice that the Google Scholar results are only an indication of scientific papers. For example, the last page of Russian results includes:
    Чудо Корана — Х Яхья, А Октар, H Yahya – books.google.com
    Home> Фотодизайн> Современные системы закаливания. Плюсы и минусы — СВ Попов – referats.net
    Современные системы закаливания. Плюсы и минусы — СВ Попов – referatcollection.ru
    История неба — К Иванов – Логос, 2003 – magazines.russ.ru
    Проектирование и реализация профессионально-ориентированной системы обучения в профильной школе — ТЕ Лапшина – 2005 – dl1.lib.ua-ru.net
    Без пощады — А Зорич – books.google.com
    (The last hit appears to be a science fiction novel. Chapter 1 is set in a prisoner of war camp on the planet Glagol in an unknown system in February 2622.)

  6. Ari Jokimäki said

    Bill: “I also notice that the Google Scholar results are only an indication of scientific papers.”

    Yes. My post above goes through some of the problems there. I think Google has gone a bit too far in what they consider to be “scholar” documents.

    It is interesting that your search still results in 14,700 papers while I got a far larger number. It is possible that Google Scholar returns different results for different locations.

  7. Halsu said

    “Ilmapiiri” means the emotionally perceived atmosphere as in the atmosphere of, say, a concert. It does not relate to climate in any way. We have a separate word for that, “Ilmakehä”. So, in Finnish the search should be:

    ilmasto ilmakehä valtameri aurinko pilvi albedo

    …which gives me 3 results. That’s not much, but still better than zero. Albedo may be replaced with the Finnish word for reflectivity, “heijastavuus” or “heijastuskyky”, but with both of those the number of hits drops to 2.

  8. Ari Jokimäki said

    Yes, Halsu, we pretty much did that exercise above.
