Sunday, December 26, 2010

Chinese: The New Dominant Language of the Internet

Chinese: The New Dominant Language of the Internet (Infographic, created by The Next Web Asia)

"China gained 36 million additional internet users last year meaning there are now over 440 million internet users in the country. English has long been the most widely used language on the internet but with Chinese Internet growth rising at the rate it is, it could be less than five years before Chinese becomes the dominant language on the internet."

This has been picked up by numerous websites and blogs as "news", although it is based on dated information from the "Internet World Stats" website.

The Atlantic reported on it as well; the most interesting part is the reader discussion.

In the same vein, you might be interested in what Google's Eric Schmidt had to say in 2009 on "What the Web Will Look Like in 5 Years":

"Google CEO Eric Schmidt envisions a radically changed internet five years from now: dominated by Chinese-language and social media content, delivered over super-fast bandwidth in real time.

Highlighted comments include:

Five years from now the internet will be dominated by Chinese-language content.
Today's teenagers are the model of how the web will work in five years - they jump from app to app to app seamlessly. Five years is a factor of ten in Moore's Law, meaning that computers will be capable of far more by that time than they are today.
Within five years there will be broadband well above 100MB in performance - and distribution distinctions between TV, radio and the web will go away. "We're starting to make significant money off of Youtube", content will move towards more video."

I personally think that Mr. Schmidt has lived on the bleeding edge of technology for far too long to have an intact sense of reality, but there is some wisdom (and optimism) in his words, and I'd certainly take the 100 Mbps Internet connection. The FCC envisages the wonderful 100 Mbps world in 2020: link
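Schmidt's "factor of ten" remark is easy to check. The sketch below assumes the common 18-month doubling formulation of Moore's Law (my assumption, not something stated in the quote) and shows the more conservative 24-month variant for comparison. The function name is purely illustrative.

```python
def moore_factor(years: float, doubling_months: float = 18.0) -> float:
    """Growth factor after `years` if capacity doubles every `doubling_months`."""
    return 2 ** (years * 12 / doubling_months)


# Five years under an assumed 18-month and an assumed 24-month doubling period.
factor_18 = moore_factor(5)
factor_24 = moore_factor(5, 24.0)
print(f"{factor_18:.1f}x, {factor_24:.1f}x")  # prints "10.1x, 5.7x"
```

So the "factor of ten" only holds with the 18-month doubling period; with a 24-month period, five years gives a bit less than a factor of six.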

Going back to the source for the first Internet domination article, we can glean some interesting information:

Top Ten Languages Used "in" the Web


Number of Internet Users by language & Internet Penetration

English 536,564,837 42.0 %
Chinese 444,948,013 32.6 %
Spanish 153,309,074 36.5 %
Japanese 99,143,700 78.2 %
Portuguese 82,548,200 33.0 %
German 75,158,584 78.6 %
Arabic 65,365,400 18.8 %
French 59,779,525 17.2 %
Russian 59,700,000 42.8 %
Korean 39,440,000 55.2 %

I'll add these two as well:

Turkish 35,000,000 45%
Italian 30,000,000 51.7%

TOP 10 LANGUAGES 1,615,957,333 36.4 %
Rest of the Languages 350,557,483 14.6 %
WORLD TOTAL 1,966,514,816 28.7 %
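The "less than five years" claim can be sanity-checked against the table above. This is only a rough sketch: it assumes constant yearly gains, treats the 36 million new users in China as the yearly gain for Chinese as a whole, and uses made-up values for English growth, since the source does not give one.

```python
import math

ENGLISH_USERS = 536_564_837   # from the table above
CHINESE_USERS = 444_948_013   # from the table above
CHINESE_GAIN = 36_000_000     # yearly gain quoted in the infographic text


def years_to_overtake(leader, challenger, leader_gain, challenger_gain):
    """Whole years until challenger >= leader, assuming constant yearly gains.

    Returns None if the gap never closes.
    """
    gap = leader - challenger
    if gap <= 0:
        return 0
    closing = challenger_gain - leader_gain
    if closing <= 0:
        return None
    return math.ceil(gap / closing)


# The English yearly gains below are hypothetical scenarios, not source data.
for english_gain in (0, 15_000_000, 40_000_000):
    years = years_to_overtake(ENGLISH_USERS, CHINESE_USERS,
                              english_gain, CHINESE_GAIN)
    print(f"English +{english_gain:>10,}/year -> {years}")
```

With English flat, the gap closes in 3 years; with English gaining 15 million a year it takes 5; and if English grows faster than Chinese, it never closes. The headline depends entirely on the growth assumptions.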

It is interesting to compare these statistics with those from 1998:


Number of speakers (2010)

The Internet World Stats website states that "tallying the number of speakers of the world's languages is an increasingly complex task". I'd say that they have been very strict with Russian and perhaps a bit too generous with Chinese, Arabic and French.

1 English 1,277,528,133
2 Chinese 1,365,524,982
3 Spanish 420,469,703
4 French 347,932,305
5 Arabic 347,002,991
6 Portuguese 250,372,925
7 Russian 139,390,205
8 Japanese 126,804,433
9 German 95,637,049
10 Korean 71,393,343

TOP 10 LANGUAGES 4,442,056,069
Rest of the Languages 2,403,553,891
World Total 6,845,609,960
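The penetration percentages in the first table appear to be simply Internet users divided by speakers. A small sketch to check that, using only the figures quoted in this post (the variable and function names are mine):

```python
# Internet users by language (2010), as listed earlier in this post.
users = {
    "English": 536_564_837, "Chinese": 444_948_013, "Spanish": 153_309_074,
    "Japanese": 99_143_700, "Portuguese": 82_548_200, "German": 75_158_584,
    "Arabic": 65_365_400, "French": 59_779_525, "Russian": 59_700_000,
    "Korean": 39_440_000,
}

# Estimated number of speakers (2010), from the table above.
speakers = {
    "English": 1_277_528_133, "Chinese": 1_365_524_982, "Spanish": 420_469_703,
    "French": 347_932_305, "Arabic": 347_002_991, "Portuguese": 250_372_925,
    "Russian": 139_390_205, "Japanese": 126_804_433, "German": 95_637_049,
    "Korean": 71_393_343,
}


def penetration(language: str) -> float:
    """Internet users as a percentage of all speakers of `language`."""
    return 100 * users[language] / speakers[language]


for language in users:
    print(f"{language:<11}{penetration(language):5.1f} %")

top10_penetration = 100 * sum(users.values()) / sum(speakers.values())
print(f"{'TOP 10':<11}{top10_penetration:5.1f} %")
```

The results match the published column (for example 42.0 % for English and 32.6 % for Chinese), which also means that any doubts about the speaker estimates carry over directly to the penetration figures.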

One metric I find especially interesting is the number of pages available in a particular language. Unfortunately the available information is not very recent:

Chart of Web content (millions of webpages by language) in 2002

Language Percentage

English 56.4
German 7.7
French 5.6
Japanese 4.9
Spanish 3.0
Chinese 2.4
Italian 2.0
Dutch 1.9
Russian 1.7
Korean 1.5
Portuguese 1.5
Swedish 0.7
Polish 0.7
Danish 0.6
Czech 0.6
Turkish 0.2
Hungarian 0.2
Greek 0.1
Other 8.3

Source: Sprachen und ihre Verbreitung im World-Wide-Web (Languages and Their Distribution on the World Wide Web)

Alternate Link (in German)

EDIT: A few things have changed since the now-distant 2010 (and 2002):

1 English 53.6%
2 Russian 6.4%
3 German 5.6%
4 Japanese 5.1%
5 Spanish 4.9%
6 French 4.1%
7 Portuguese 2.5%
8 Italian 2.1%
9 Chinese 1.9%
10 Polish 1.8%
11 Turkish 1.8%
12 Dutch 1.4%
13 Persian 1.2%
14 Arabic 0.8%
15 Czech 0.8%

Estimated percentages of the top 10.1 million websites using various content languages as of May 6, 2016.
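To see what has changed, here is a quick sketch comparing the 2002 and 2016 shares for the languages that appear in both lists. Keep in mind that the two sources measure different things (webpages in 2002, the top 10.1 million websites in 2016), so the differences are only a rough indication.

```python
# Content share by language in percent, as listed in this post.
shares_2002 = {
    "English": 56.4, "German": 7.7, "French": 5.6, "Japanese": 4.9,
    "Spanish": 3.0, "Chinese": 2.4, "Italian": 2.0, "Dutch": 1.9,
    "Russian": 1.7, "Korean": 1.5, "Portuguese": 1.5, "Swedish": 0.7,
    "Polish": 0.7, "Danish": 0.6, "Czech": 0.6, "Turkish": 0.2,
    "Hungarian": 0.2, "Greek": 0.1,
}
shares_2016 = {
    "English": 53.6, "Russian": 6.4, "German": 5.6, "Japanese": 5.1,
    "Spanish": 4.9, "French": 4.1, "Portuguese": 2.5, "Italian": 2.1,
    "Chinese": 1.9, "Polish": 1.8, "Turkish": 1.8, "Dutch": 1.4,
    "Persian": 1.2, "Arabic": 0.8, "Czech": 0.8,
}

# Change in percentage points for languages present in both lists.
changes = {
    language: shares_2016[language] - shares_2002[language]
    for language in shares_2002
    if language in shares_2016
}

# Print the biggest gainers first.
for language, change in sorted(changes.items(), key=lambda item: -item[1]):
    print(f"{language:<11}{change:+5.1f}")
```

By this measure Russian gained the most (+4.7 points), while Chinese content actually lost half a point, which is hardly the domination predicted back in 2010.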