Papyrus News
Website language stats

April 17, 2001: This message was distributed by Papyrus News. Feel free to forward this message to others, preferably with this introduction. For info on Papyrus News, including how to (un)subscribe or access archives, see <http://www.gse.uci.edu/markw/papyrus-news.html>.

Forwarded with permission....

mark

Date: Mon, 16 Apr 2001 16:28:54 -0400
From: Andy Carvin <acarvin@benton.org>
Subject: [DIGITALDIVIDE] website language stats - comparing web pages per language speakers

To: DIGITALDIVIDE@CDINET.COM

Hi everyone. Last year I posted a URL for a study by the Barcelona firm VilaWeb that attempted to calculate what share of the web's pages is written in each language. (http://cyberatlas.internet.com/big_picture/demographics/article/0,,5901_408521,00.html)

The study identified the most popular languages, breaking them down like this:

Web Pages and Languages, ranked by Web Pages (source: Internet.com CyberAtlas/VilaWeb)

Rank Language # of webpages % of all webpages

1. English 214,250,996 68.39%
2. Japanese 18,335,739 5.85%
3. German 18,069,744 5.77%
4. Chinese 12,113,803 3.87%
5. French 9,262,663 2.96%
6. Spanish 7,573,064 2.42%
7. Russian 5,900,956 1.88%
8. Italian 4,883,497 1.56%
9. Portuguese 4,291,237 1.37%
10. Korean 4,046,530 1.29%
11. Dutch 3,161,844 1.01%
12. Swedish 2,929,241 0.93%
13. Danish 1,374,886 0.44%
14. Norwegian 1,259,189 0.40%
15. Finnish 1,198,956 0.38%
16. Czech 991,075 0.32%
17. Polish 848,672 0.27%
18. Hungarian 498,625 0.16%
19. Catalan 443,301 0.14%
20. Turkish 430,996 0.14%
21. Greek 287,980 0.09%
22. Hebrew 198,030 0.06%
23. Estonian 173,265 0.06%
24. Romanian 141,587 0.05%
25. Icelandic 136,788 0.04%
26. Slovenian 134,454 0.04%
27. Arabic 127,565 0.04%
28. Lithuanian 82,829 0.03%
29. Latvian 60,959 0.02%
30. Bulgarian 51,336 0.02%
31. Basque 36,321 0.01%

I recently got a chance to compare these statistics to the total number of speakers per language. For example, the study suggests that there are fairly similar numbers of Arabic-language pages and Slovenian-language pages (127,565 and 134,454 pages, respectively). Considering how many more Arabic speakers there are than Slovenian speakers (at least 202 million versus 2.2 million, respectively), I thought it would be interesting to calculate ratios comparing the number of speakers of any given language with the number of web pages in that language. Here's what I found:

Web Pages and Languages, ranked by the number of speakers per web page (language populations source: sil.org Ethnologue Database)

Rank Language # of webpages # of speakers speakers per webpage

1. English 214,250,996 322,000,000 1.5 people/page
2. Icelandic 136,788 250,000 1.83 people/page
3. Swedish 2,929,241 9,000,000 3.07 people/page
4. Danish 1,374,886 5,292,000 3.85 people/page
5. Norwegian 1,259,189 5,000,000 3.86 people/page
6. Finnish 1,198,956 6,000,000 5.00 people/page
7. German 18,069,744 98,000,000 5.4 people/page
8. Dutch 3,161,844 20,000,000 6.3 people/page
9. Estonian 173,265 1,100,000 6.36 people/page
10. Japanese 18,335,739 125,000,000 6.8 people/page
11. Italian 4,883,497 37,000,000 7.58 people/page
12. French 9,262,663 72,000,000 7.77 people/page
13. Catalan 443,301 4,353,000 9.8 people/page
14. Czech 991,075 12,000,000 12.1 people/page
15. Basque 36,321 588,000 16.19 people/page
16. Slovenian 134,454 2,218,000 16.5 people/page
17. Korean 4,046,530 75,000,000 18.5 people/page
18. Latvian 60,959 1,550,000 25.4 people/page
19. Russian 5,900,956 170,000,000 28.8 people/page
20. Hungarian 498,625 14,500,000 29.1 people/page
21. Portuguese 4,291,237 170,000,000 39.6 people/page
22. Greek 287,980 12,000,000 41.67 people/page
23. Spanish 7,573,064 332,000,000 43.8 people/page
24. Lithuanian 82,829 4,000,000 48.29 people/page
25. Polish 848,672 44,000,000 51.8 people/page
26. Hebrew 198,030 12,000,000 60.6 people/page
27. Chinese 12,113,803 885,000,000 73.1 people/page
28. Turkish 430,996 59,000,000 136.9 people/page
29. Bulgarian 51,336 9,000,000 175.3 people/page
30. Romanian 141,587 26,000,000 183.6 people/page
31. Arabic 127,565 202,000,000 1583.5 people/page
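
For anyone who wants to reproduce the ratio column, here is a minimal sketch in Python. It assumes the ratio is simply the number of speakers divided by the number of web pages, which matches the figures above; the function name and the choice of six sample rows are illustrative only, not part of the original analysis.

```python
# Speakers-per-page ratio for a few rows of the table above.
# Page counts are the VilaWeb/CyberAtlas figures; speaker counts are the
# Ethnologue figures, both copied from the message. Only six languages are
# included here as a sample.
rows = {
    "English": (214_250_996, 322_000_000),
    "Icelandic": (136_788, 250_000),
    "Slovenian": (134_454, 2_218_000),
    "Spanish": (7_573_064, 332_000_000),
    "Chinese": (12_113_803, 885_000_000),
    "Arabic": (127_565, 202_000_000),
}


def people_per_page(pages, speakers):
    """Return how many speakers there are for each web page (hypothetical helper)."""
    return speakers / pages


# Sort from the fewest speakers per page (best represented) to the most.
ranked = sorted(
    ((lang, people_per_page(pages, speakers)) for lang, (pages, speakers) in rows.items()),
    key=lambda item: item[1],
)

for rank, (lang, ratio) in enumerate(ranked, start=1):
    print(f"{rank}. {lang}: {ratio:.1f} people/page")
```

Because only a sample of languages is included, the rank numbers it prints will not match the full table, but the ratios should.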

The people-per-webpage ratio is interesting because it shows how certain low-population languages have done very well online relative to languages spoken by much larger populations. Northern European languages do particularly well, with Icelandic having the second-best ratio after English, and with Swedish, Danish, Norwegian, Finnish and Estonian not far behind. Some of the Central European Slavic languages (Czech, Slovenian) appear to do better than their eastern linguistic cousins (Russian and Bulgarian), though Polish lags behind its Central European neighbors.

Arabic ended up with by far the highest ratio, more than 1,583 people per webpage in Arabic - and the true number is probably much higher, considering that the population estimate for Arabic speakers (202 million) is probably conservative. The largest linguistic group on the planet - Mandarin Chinese speakers - ranked near the bottom, with about 73 people per webpage.

One last item I found interesting was the state of Spanish-language content worldwide. Even though there's been a great increase in Spanish-language content lately, there are still about 44 Spanish speakers around the world for every Spanish-language web page.

At some point in the near future I hope to take this a step further by comparing the number of webpages to the number of language speakers who are also online, but that'll be a much more complex task.

Hope this might be of interest....

thanks,
ac
*****************************************
Andy Carvin
andy@benton.org
Senior Associate Benton Foundation
http://edweb.gsn.org/andy
http://www.DigitalDivideNetwork.org
*****************************************
Visit my new website, Anatolian Fortnight
http://edweb.gsn.org/anatolia
*****************************************




For comments, suggestions, or further information on this site, contact Vance Stevens, webmaster. Regarding content of Papyrus-News, contact Mark Warschauer.

Last updated: April 22, 2001 in Hot Metal Pro 6.0