Monday, January 21, 2008


REPETITIO EST MATER STUDIORUM. There's a reason it's a Latin proverb, a reason it's a proverb at all, and a reason it was repeated ad nauseam to generations of pupils. Watching, reading or listening to something over and over again is very boring and consequently very hard work for most people. The brain tunes out. You need to trick it and give it something else to process while you're also absorbing the language. Because of the visual component, movies and series are handy for this purpose.

A possible course of action? In the beginning I'd choose a few favorite movies, the ones I've seen "a thousand times", and "cult movies" where you can perhaps score a few points by knowing some trivia (that's obviously secondary). It should be something entertaining that I don't mind watching repeatedly. Movies have less dialogue than series, but that might not be a bad thing for a beginner. Movies also generally hold up better to repeated viewing. I am now watching Army of Darkness in Italian, just for a blast. Dammi un po' di zucchero, baby :)

Then I'd move on to my favorite series. I would not waste time "refreshing" my knowledge of the "original" vocabulary. What counts is understanding the plot and the different situations. It's likely I've seen these episodes several times already, courtesy of the local station. It's been a while, so I don't mind seeing them again. A perfect occasion to recoup hundreds or even thousands of hours of otherwise wasted time.

Sunday, January 20, 2008

Top IMDB languages titles by genre

Top 6 IMDB languages titles by genre

English 240336 titles
63642 Short
53207 Drama
47350 Comedy
32921 Documentary
28926 Adult
12770 Animation
12056 Action
11399 Thriller
11242 Family
9749 Adventure

Spanish 38892 titles
11950 Short
10394 Drama
6547 Comedy
6178 Documentary
2381 Action
2229 Romance
1744 Thriller
1403 Musical
1382 Crime
1135 Adventure

German 31825 titles
5877 Drama
5059 Documentary
4781 Comedy
4673 Short
1440 Crime
1353 Adult
1261 Romance
968 Thriller
958 Family
833 Music

French 31472 titles
8414 Short
6642 Drama
4870 Comedy
4334 Documentary
1214 Romance
1060 Crime
916 Thriller
889 Adult
779 Animation
727 Adventure

Italian 15470 titles
3471 Drama
2964 Comedy
1668 Short
1179 Documentary
621 Adventure
588 Music
588 Thriller
582 Crime
524 Romance
519 Action

Japanese 15449 titles
3955 Drama
2480 Animation
2058 Action
1743 Comedy
1521 Short
1031 Adventure
996 Sci-Fi
994 Fantasy
780 Romance
681 Crime

Saturday, January 19, 2008

favorite books by language

The previous post was about millions of books in different languages and translations. What about the most popular books to read? How do we rate languages according to how much good stuff is available to read in each language?

Modern Library's list of top 100 English-language novels of all time is interesting. The eggheads voted for James Joyce and Ulysses. The top book on the readers' list is Ayn Rand. Brrrr.

Time's list is in a similar vein.

Madison's 100 Best Novels, English only.

The British reading public has somewhat different tastes. The list includes some foreign authors.

Top 100 novels of all time voted by regular people

Guardian's The top 100 books of all time as determined from a vote by 100 noted writers from 54 countries.

There's more, but you get the idea. Now, for the cool part. LibraryThing, advertised as the world's largest book club, lists over 22 million books (copies) catalogued by some 344,000 members. I say "advertised" because you can enter some 200 books for free, and there's a membership fee for extra features etc. What's really interesting here is the catalogue, which can be searched by language. The most popular languages, ranked by the number of copies (translations or originals) on the members' bookshelves, are:

French (403,907)
German (306,566)
Japanese (207,903)
Spanish (127,231)
Russian (111,171)
Italian (107,347)
Greek (Ancient) (96,300)
Latin (68,919)
Dutch (58,395)
Swedish (41,332)
Portuguese (34,583)
Chinese (29,015)
Norwegian (24,620)
Hebrew (22,920)
Danish (20,909)
Czech (17,141)
English (Middle) (17,046)
Arabic (14,924)
Polish (14,787)
Old English (7,386)
Finnish (7,085)
Sanskrit (6,031)
Turkish (5,813)
Persian (5,035)

Foreign languages, by my estimate, account for some 1.8 million copies, lol. Admittedly, the reading public is predominantly English-speaking, and for Japanese you'll find "books" like Fruits Basket and Death Note, but it's a jury of some 344,000 people who generally like books and reading. And manga IS fun. Some languages are in a funny situation: most of their copies come from very few writers.
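The 1.8 million estimate can be roughly checked by adding up the list above. This is only a sketch: it assumes the estimate is simply the sum of every listed entry (including Ancient Greek, Latin, Middle English and Old English), and the variable and function names are made up for illustration.

```python
# Copy counts per language, as listed above (LibraryThing, January 2008).
copies_by_language = {
    "French": 403_907,
    "German": 306_566,
    "Japanese": 207_903,
    "Spanish": 127_231,
    "Russian": 111_171,
    "Italian": 107_347,
    "Greek (Ancient)": 96_300,
    "Latin": 68_919,
    "Dutch": 58_395,
    "Swedish": 41_332,
    "Portuguese": 34_583,
    "Chinese": 29_015,
    "Norwegian": 24_620,
    "Hebrew": 22_920,
    "Danish": 20_909,
    "Czech": 17_141,
    "English (Middle)": 17_046,
    "Arabic": 14_924,
    "Polish": 14_787,
    "Old English": 7_386,
    "Finnish": 7_085,
    "Sanskrit": 6_031,
    "Turkish": 5_813,
    "Persian": 5_035,
}

# Total catalogue size quoted in the post ("over 22 million"), so the
# share computed below is an upper bound on the real share.
CATALOGUE_COPIES = 22_000_000


def total_copies(counts):
    """Return the total number of copies across all listed languages."""
    return sum(counts.values())


total = total_copies(copies_by_language)
print(f"Copies in the listed languages: {total:,}")
print(f"Share of the catalogue: {total / CATALOGUE_COPIES:.1%}")
```

The sum lands a little under 1.8 million, which is less than a tenth of the catalogue.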

Finally, "the intellectual works that have been judged to be worth owning by the "purchase vote" of libraries around the globe". In 2005 WorldCat was described as still heavily oriented toward North American libraries, but the list draws on 60,000 libraries, many of which are from all around the world.

Top 1000 Books Owned by Libraries Around the World

information exaflood

According to a Berkeley study, How much information?, published in 2003, about 5 exabytes of new information are stored each year across all types of media. One exabyte is roughly the equivalent of 50,000 years of DVD-quality video. The exabyte flood is TV and radio programming, movies, books, DVDs, documents, Internet content etc. English is the dominant language:

"The United States produces about 40% of the world's new stored information, including 33% of the world's new printed information, 30% of the world's new film titles, 40% of the world's information stored on optical media, and about 50% of the information stored on magnetic media..."

Ninety-two percent of new information is stored on magnetic and optical media, primarily hard disks, CDs and DVDs. The Berkeley study mentions that "the U.S. produces 37% of the world's audio CD titles, 50% of the CD ROM titles, and 40% of the DVD titles." Accumulated stock of audio CDs worldwide is some 1.5 million titles (560,000 are original US titles).
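The "50,000 years of DVD-quality video" comparison for one exabyte can be sanity-checked with a little arithmetic. The sketch below assumes a decimal exabyte (10^18 bytes) and a 365.25-day year. Neither assumption comes from the study, and the function name is hypothetical.

```python
# Assumptions (not from the study): decimal exabyte, 365.25-day year.
EXABYTE_BYTES = 10**18
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_bitrate_mbps(total_bytes, years):
    """Average bitrate in megabits per second needed to fill
    `total_bytes` with `years` of continuous video."""
    seconds = years * SECONDS_PER_YEAR
    return total_bytes * 8 / seconds / 1e6


rate = implied_bitrate_mbps(EXABYTE_BYTES, 50_000)
print(f"Implied video bitrate: {rate:.2f} Mbit/s")
```

The result is about 5 Mbit/s, which is a plausible average for DVD video, so the comparison holds up.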

Worldwide film production

"The number of motion pictures made around the world from 1890 to 2002 was approximately 328,530, divided into:

Animation Films and Series: 15,790
Documentary Films: 30,475
Silent Films: 49,417
Black and White Films: 113,992
Color: 254,538

Source: The International Film Index, 1895-1990.
How much information 2003

After accounting for some 10,000+ titles produced per year between 2003 and 2008 we come up with 400,000 titles.
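The 400,000 figure is a back-of-the-envelope extrapolation, and it can be written out as follows. This is a sketch under stated assumptions: 2003 through 2008 are counted as six full years, the yearly rate is flat, and the function names are invented for illustration.

```python
FILMS_THROUGH_2002 = 328_530  # figure quoted from the study


def estimate_total(base, per_year, first_year, last_year):
    """Add a flat yearly output to the base count, counting both
    the first and the last year."""
    years = last_year - first_year + 1
    return base + per_year * years


def rate_needed(base, target, first_year, last_year):
    """Yearly output required to reach `target` over the same span."""
    years = last_year - first_year + 1
    return (target - base) / years


low = estimate_total(FILMS_THROUGH_2002, 10_000, 2003, 2008)
print(f"At exactly 10,000 titles a year: {low:,}")
print(f"Rate needed to reach 400,000: "
      f"{rate_needed(FILMS_THROUGH_2002, 400_000, 2003, 2008):,.0f} a year")
```

At exactly 10,000 titles a year the total falls a little short of 400,000, so the round number leans on the "+" in "10,000+".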

"The most obvious trends are the phenomenal rise in filmmaking in the United States between 1991 and 2001, contrasted with the significant fall in filmmaking in at least three other film producing nations - Italy, the Soviet Union / Russia, and Mexico." Once short films and documentaries are counted, the US has "overtaken India as the major producer of film" with some 1,740 films vs India's 1,013. By the same measure, France is third and Germany fourth (2001).
How much information 2003


In 2001, the United States produced about 4,000 DVD titles per year (more than 100 per week). According to the DVD Entertainment Group, as reported by the study, as of January 2003 there were some 20,000 DVD-video titles available worldwide. DVD is now a mature format. More recent figures are too much work to come by, but according to the Content Delivery & Storage Association, CD and DVD replication in 2005 was split as follows: North America 30%, Asia 30%, Europe 31%, Japan 5%, South America 2%, Australasia 1% and Middle East 1%. Language learners will note that DVDs produced in North America are likely to have only French and Spanish audio tracks. For some reason even Australian DVDs come with more interesting language options. The new Blu-ray format provides additional possibilities for extra language soundtracks. Hopefully publishers and movie studios will take advantage of this. Properly stored quality DVDs and CDs may have a life expectancy of up to 200 years.

Printed information and books

A note of caution here, because a large percentage of printed information is plain old office junk. Without it, according to the Berkeley study, the U.S. accounts for around 10% of the world's original information flow in print. According to Graddol (based on UNESCO figures), in the 1990s English accounted for about 28% of the world's total book production. The total world stock of books, meaning all books ever written, is very difficult to measure, and estimates range from 65 to 100 million unique titles. More about this here

Sunday, January 6, 2008

The most widespread languages in the world

Literally and littorally. Languages officially spoken in countries covering the largest percentage of earth’s surface area:

English 39,500,000 km2, 29% of world’s surface, 53 countries
French 20,600,000 km2, 15% of world’s surface, 29 countries
Russian 20,500,000 km2, 15% of world’s surface, 4 countries
Spanish 12,200,000 km2, 9% of world’s surface, 21 countries
Portuguese 10,750,000 km2, 8% of world’s surface, 9 countries
Chinese 9,650,000 km2, 7% of world’s surface, 3 countries
The Arab world stretches across some 12-13 million square kilometers.

World’s emerged surface: 135,000,000 km2 (192 countries - Antarctica not included).

The languages of countries claiming Antarctica (some 14,000,000 km2) are English, Spanish, Norwegian and French. In addition there’s some overlap between French and English. After taking this into account French would cover about the same area as Spanish. The former Soviet Union covered over 22,400,000 km2.
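The percentages in the list above follow from dividing each area by the 135,000,000 km2 of emerged surface (Antarctica excluded). A minimal sketch to reproduce them is below. The areas are the ones quoted in the post, while the names and the rounding to whole percent are illustrative choices.

```python
WORLD_LAND_KM2 = 135_000_000  # emerged surface, Antarctica not included

# Area of the countries where each language is official, as listed above.
official_area_km2 = {
    "English": 39_500_000,
    "French": 20_600_000,
    "Russian": 20_500_000,
    "Spanish": 12_200_000,
    "Portuguese": 10_750_000,
    "Chinese": 9_650_000,
}


def share_percent(area_km2, total_km2=WORLD_LAND_KM2):
    """Share of the world's land surface, rounded to a whole percent."""
    return round(100 * area_km2 / total_km2)


for language, area in official_area_km2.items():
    print(f"{language:<11}{share_percent(area):>3}%")
```

The shares can add up to more than the area actually covered, because countries with more than one official language are counted once per language.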

Some significant languages are nowhere to be seen in this list. However, a natural path to world domination is control of the seas, and Italy has a longer coastline than Brazil, while Japan has the sixth-longest coastline in the world, longer than those of the US, Australia or China. Since we’ll be conquering the beaches in sandals rather than army boots, it is reassuring that these two countries are located in temperate climates. As for tourism, here’s the list of the most popular tourist destinations, which is as fickle as the target list of a prospective language learner. France, Italy and Germany were on the itinerary of the gentlemanly Grand Tour for over 300 years.

Top tourist destinations in 2006 (millions of arrivals)
1. France 79.1
2. Spain 58.5
3. United States 51.1
4. China 49.6
5. Italy 41.1
6. United Kingdom 30.7
7. Germany 23.6
8. Mexico 21.4
9. Austria 20.3
10. Russian Federation 20.2

Saturday, January 5, 2008

The most influential languages

George Weber’s classification first appeared in the article “Top Languages: The World’s 10 Most Influential Languages” in Language Today (Vol. 2, Dec 1997).

Weber awards points (in parentheses) based on six factors: number of first language speakers, number of second language speakers, number and population size of countries where the language is used, number of major fields using the language internationally, economic power of countries using the language, and socio-literary prestige.

The world's top ten most influential languages:

English (37)
French (23)
Spanish (20)
Russian (16)
Arabic (14)
Chinese (13)
German (12)
Japanese (10)
Portuguese (10)
Hindi/Urdu (9)

Although the ranking is based on now-outdated information, Weber wrote in 2006 that it remains unaffected.

An interesting classification of languages according to a composite index based on the number of speakers and the number of countries where a language is spoken:

1 English
2 Spanish
3 Chinese
4 Arabic
5 French
5 Hindi
7 Russian
7 Portuguese
9 Malay
10 German
11 Japanese

Louis-Jean Calvet Le marché aux langues (2002)