Translate the Web While Learning a New Language: A Profile of Luis von Ahn’s new company, Duolingo
This is the first in a series of profiles of entrepreneurs on The Huffington Post, a preview of my forthcoming book, Pursuit of the American Dream: Success Stories of Today’s Immigrant Entrepreneurs.
Luis von Ahn’s new company, Duolingo, has set two very large goals: help users learn a language for free while simultaneously translating the Web. The start-up may very well transform both the translation and language-learning industries in the process.
Needless to say, von Ahn has a stellar track record for Internet start-ups. He sold two previous companies to Google and solved massive challenges in the process – reducing spam, digitizing books, and improving the quality of image search – all by applying his research in crowdsourcing.
Luis von Ahn moved from his native Guatemala to the U.S. to study mathematics but found computer science more appealing, eventually obtaining his PhD in the subject from Carnegie Mellon University. Today, as an assistant professor of computer science at his alma mater, he juggles his entrepreneurial work with research and teaching.
With Duolingo, he hopes to solve two enormous problems for the developing world: to bring the incredibly high price of language-learning software down to nothing, and to improve the quality and quantity of online content in languages other than English.
Professor von Ahn says, “Growing up in Guatemala has definitely influenced my work. $500 dollars is an insane amount of money for somebody in Guatemala to spend on language-learning software, like Rosetta Stone. It isn’t possible unless you are super rich. It’s a lot here in the U.S. too, but most people can afford it. Because of that, I realized that the only way people would use it from countries like Guatemala is if it’s totally free. So we’re committed to making everything free.”
He also sees a need for more and better content online in Spanish. “If you ever use the web in Spanish, you realize it’s ten times worse: the websites, design, content. Everything’s a few years behind in comparison to the U.S.,” says von Ahn.
A quick glance at Wikipedia’s home page confirms a dearth of content in Spanish. The open-content encyclopedia features over 3,831,000 articles in English but only 854,000 in Spanish. In other words, the Spanish edition has only about 22 percent as many articles as the English one.
Duolingo currently offers beta users classes in Spanish and German and plans to roll out French, Italian and Mandarin shortly.
If you have ever re-typed two squiggly words on websites like Craigslist, Facebook, or Ticketmaster during an online registration or purchase process, then you already know one of von Ahn’s previous start-ups: reCAPTCHA. The service reduces spam on email and social media sites, as well as e-commerce fraud, by preventing automated bots from signing up for thousands of new accounts. Its name builds on the acronym CAPTCHA, which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart.”
Little do web users know that as they re-type those annoying, squiggly words, they are actually digitizing old printed material, one hard-to-read word at a time. Computer scanning systems for old books or newspapers, like those used by the New York Times, fail to recognize around ten percent of the words on scanned pages due to water or pencil marks or even scratches and tears. Fortunately, humans can interpret the words. More than 40,000 websites use the reCAPTCHA widget, and together their visitors transcribe over two hundred million words every day.
By “outsourcing” this tedious job to the “crowds” on the web (hence the name “crowdsourcing”), his system has been able to digitize old texts far more quickly and cost-effectively than any previous system. Google found von Ahn’s company so useful that it bought reCAPTCHA both for its security features and to speed up digitization for its Google Books project.