Posts Tagged ‘machine translation’
Sunday, February 8th, 2009
The iPhone has proved a game-changer in many regards and speech is no exception. Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone.
Today I came across another sighting of iPhone speech recognition: Vocalia by Creaceed, which uses the open-source ASR engine Julius as its back end. There is no “push to talk” button but a “shake to retry”, which may prove useful when recognition goes awry. The app supports French, English and German for now and costs €2.99. Dictation is not available at this point, though Julius is certainly capable of it architecturally.
Other speech and language related iPhone apps:
Has anyone used these extensively? What is your experience with speech on the iPhone?
Posted in Brands, Vendors | Tags: Apple, ASR, Google, iPhone, machine translation, open-source, TTS, vlingo, Vocalia | No Comments »
Tuesday, November 18th, 2008
Google released a new feature for its Google Mobile iPhone Application yesterday: voice search. Users speak a query and the application returns search results formatted for the iPhone. This is similar to the GOOG411 directory assistance application, which allows users to call a phone number, speak a query and receive information about local listings in voice or SMS formats. However, the new application apparently performs recognition locally on the iPhone, meaning it comes bundled with an embedded speech recognition engine.
Aside from GOOG411, during the US presidential election Google released Gaudi, a voice indexing technology for video. That makes the iPhone app the third official Google service to make use of speech recognition, leaving one guessing when Google’s speech technology will become available as an API, like the Google AJAX Language API for translation and transliteration, rather than bundled into software services. An Android version is probably in the works as well.
All applications are available in US English for now.
Posted in Brands, Services | Tags: Apple, ASR, Google, iPhone, machine translation | No Comments »
Monday, September 8th, 2008
In the wake of Google’s release of its Chrome web browser, speculation about plans for Chrome on other platforms, including Android, has drifted ashore. Naturally, this has washed aside much recent IE8 news; IE8, though not a game-changer, is said to introduce many of the much-needed improvements everyone has been looking for from Microsoft.
In light of the browser war raging, a little add-on for Microsoft’s Live Messenger may not stir many waters, even if it promises real-time chat translation between English and 14 other languages. However, it is still refreshing to read about technology that is geared toward opening channels of communication rather than capturing market share.
What are Google’s plans with Chrome and Android vis-à-vis Microsoft IE on Windows Mobile? Will Microsoft leverage its non-browser language services, such as translation and speech recognition, as Google has been doing?
Posted in Brands, Services | Tags: Android, Chrome, Google, machine translation, Microsoft | No Comments »
Monday, May 5th, 2008
The not-so-subtle truth is, of course, that we all speak English. Yet localization and internationalization are at once a prerequisite and a stumbling block for many web-based endeavors.
In my own backyard, two examples illustrate the effect of and the need for internationalization, respectively. German professional social network XING has internationally outperformed competitors like LinkedIn through early and aggressive internationalization. StudiVZ, the “German Facebook”, had gained much of the student social network market before Facebook decided to release a German version of its web app, making this a tough-to-crack market.
Ironically, as these two examples underline, the need for localization remains even in cases where the demands on usability are low (join group/contact person/send message) and the target audience can largely be expected to speak sufficient English (read this for an interesting take on the same issues and solutions in online gaming). Moreover, localization is an effort far greater than providing an interface in the local language.
As one expects, localization, internationalization and speech technology are inextricably linked; in a sense, developing speech technologies is internationalization. And using such technology in professional service projects is akin to building an internationalized web application. Here are some of the oddities I’ve observed while working with speech technologies in an international environment:
Translation is not enough. When you write software that speaks or wants to be spoken to, there is more at stake than providing interface text. Can you expect all your users to spell input when your system doesn’t understand the raw speech input? Can you be sure that all your translated content will generate well-formed speech-synthesis output? Language and culture are sensitive issues, so a well-localized speech application must do more than provide a translated user interface. Employing local staff is usually the minimum requirement for building a speech application for a new market.
The cost shifts. Re-usability of resources from previous speech projects is usually low. So unlike localizing a web application, porting a speech application requires grunt work that you thought you had done the first time around. Moreover, speech applications in new languages almost always come with additional licensing burdens and questions about the appropriate technology partner. Expect to pay for things you didn’t expect.
There is no long tail. The buy-in costs for developing a new language in almost any speech or language technology (recognition, synthesis, translation) remain constant. This makes every newly developed language a strategic decision and translates into a two-tier localization effort: one developing basic technologies, one employing such technology in professional service projects.
As an example, consider the world’s most successful dictation software packages: Dragon NaturallySpeaking ships in five flavors of English and six European languages; Philips SpeechMagic ships in 23 dialects of 11 languages. Both are a far cry from world coverage.
The enormous cost of development has a decided effect on the development of speech technology for lesser-spoken languages. It has also posed a significant hurdle for open-source speech technology initiatives that aim to provide such resources for free.
Posted in How To, Research | Tags: ASR, Facebook, internationalization, LinkedIn, machine translation, Nuance, open-source, Philips, TTS, Voiceforge, Voxforge, XING | No Comments »
Tuesday, December 4th, 2007
I stumbled across some “traditional” news bits this week for speech and language technologies, representing most of the major and a few interesting minor market players. Yahoo is offering some kind of NLP-driven structured search for e-commerce solutions starting next year. A new bundled automatic translation software with automatic learning capabilities was announced by across Systems GmbH and Language Weaver. Loquendo is sponsoring a speech-for-in-car-navigation industry event. Persay, maker of voice authentication software, is shipping solutions securing Planet Payment’s voice-enabled payment processing. Lastly, Nuance, continuing its acquisition spree, buys Viecore, a contact-center integration consulting company, indicating a clear focus on strengthening its traditional speech and telephony market position.
Recently I stumbled across and blogged about VoiceGlue, an integration of various GPL-licensed pieces of software, providing full IVR capabilities (including rudimentary speech synthesis but not recognition). Well, last night, together with Christoph, I finally had a stab at it myself.
Our test setup involved running Fedora 9 virtualized on Mac OS X. Our Fedora installation was missing a few pieces of software beyond the indicated prerequisites, but after about an hour everything was under way.
The trickiest bit proved to be building the various modules required for the XML parser (which I presume is needed later for the VoiceGlue-customized DTMF grammar parser). For some reason CPAN’s console kept conking out on us (claiming inexplicably missing/unbuildable prereqs), so after wrestling with that for some time, we decided to manually build all the modules ourselves (hoorah, makefiles).
This worked like a charm, though we hit a snag with the Module::Build Perl module, which required C_Support, which in turn required another Perl module (ExtUtils::CBuilder) not mentioned in any documentation (scant across the board, though that’s half the fun, isn’t it).
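For anyone retracing the manual route, here is a minimal sketch of the conventional build steps for an unpacked CPAN distribution. It is not a record of the exact commands we ran: the helper name build_steps is made up for illustration, and it only prints the steps (a dry run) rather than executing them. It assumes the usual convention that a distribution ships either a Build.PL (Module::Build) or a Makefile.PL (ExtUtils::MakeMaker).

```shell
#!/bin/sh
# Hypothetical helper: print the conventional build steps for an
# unpacked CPAN distribution directory, without running anything.
build_steps() {
    dir="$1"
    if [ -f "$dir/Build.PL" ]; then
        # Module::Build style distribution
        printf '%s\n' "perl Build.PL" "./Build" "./Build test" "./Build install"
    elif [ -f "$dir/Makefile.PL" ]; then
        # ExtUtils::MakeMaker style distribution
        printf '%s\n' "perl Makefile.PL" "make" "make test" "make install"
    else
        echo "no Build.PL or Makefile.PL in $dir" >&2
        return 1
    fi
}
```

Calling build_steps on an unpacked module directory lists the commands to run inside it, with the install step typically needing root. Prerequisites have to be built in dependency order, which in our case meant ExtUtils::CBuilder before Module::Build.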
After that, the VoiceGlue installation completed swiftly and all services started running after a minimal bit of configuration.
Next week we’ll be back with some test calls and our first impressions. In the meantime we’ll keep our eyes peeled for ASR integration (LumenVox/Sphinx), which will make this a truly valuable stab at open sourcing some of the most expensive carrier-grade technology out there.
Posted in How To, News | Tags: across Systems, ASR, Language Weaver, Loquendo, machine translation, Nuance, open-source, Persay, TTS, Viecore, VoiceGlue, Yahoo | 1 Comment »
Wednesday, July 25th, 2007
Very quiet recently. No big acquisitions, no speech-tech revolution.
Most interesting: Google announced Mike Cohen (formerly of Nuance) will appear as keynote speaker at SpeechTek in August to reveal Google’s speech technology strategy. Google has already moved into the speech application market with GOOG411, an automatic directory assistance application leveraging business search and Google Maps.
UBC researchers have announced a speech learning system that doesn’t use a traditional data-driven model to learn the sounds of a language. Instead it is said to represent more experience-driven learning, much like that of infants. So far, the system has acquired English and Japanese vowels.
Some product reviews/announcements: a quick history of desktop dictation, uses of TextAloud for the iPhone, and Nuance’s new South African voice “Tessa”.
Also on the web: NIST evaluates DARPA automatic translation software in military contexts, and What Semantic Search is Not.
I may post less frequently in coming weeks. Stay tuned.
Posted in Brands, News, Services | Tags: Acapela, ASR, Google, machine translation, Nuance, semantic web, TTS | No Comments »
Friday, April 20th, 2007
Posted in News | Tags: embedded, machine translation, mobile, TTS | No Comments »
Thursday, April 5th, 2007
Posted in News | Tags: machine translation, TTS, Wizzard | No Comments »
Tuesday, April 3rd, 2007
Daily News Redux:
Questions of the day:
- Web X.0 IEEE workshop. What role will NLP play?
- Are GPS navigation systems driving the TTS market (links randomly chosen from recent navigation system releases)?
Posted in News | Tags: ASR, Cognition, machine translation, NLP, search engines, TTS, Wizzard, ZoomInfo | 3 Comments »
Tuesday, April 3rd, 2007
On the WWW today:
- CallMiner announces Eureka product for call center speech analytics and QA.
- Envox CT Connect 7 VXML/CTI platform now Avaya telephony compliant.
- Some blogging about the role of symbolic vs brute-force statistics in artificial intelligence, NLP, Google’s machine translation vision.
Posted in News | Tags: ASR, machine translation, NLP, search engines | No Comments »