<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Okko in Speech &#187; ASR</title>
	<atom:link href="http://www.okkoblog.com/tag/asr/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.okkoblog.com</link>
	<description>Working with speech and language technology</description>
	<lastBuildDate>Tue, 20 Jul 2010 08:09:21 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>A More Optimistic Outlook on the Future of Speech</title>
		<link>http://www.okkoblog.com/2010/06/30/a-more-optimistic-outlook-on-the-future-of-speech/</link>
		<comments>http://www.okkoblog.com/2010/06/30/a-more-optimistic-outlook-on-the-future-of-speech/#comments</comments>
		<pubDate>Wed, 30 Jun 2010 09:47:04 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[search engines]]></category>
		<category><![CDATA[Siri]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://www.okkoblog.com/?p=187</guid>
		<description><![CDATA[The speech application industry got some critical press in recent months (here are some spirited responses, respectively.) All the more refreshing to come across this New York Times article presenting current work in speech and artificial intelligence. The article highlights broadly what kind of AI applications have moved into the mainstream (or have potential to [...]]]></description>
			<content:encoded><![CDATA[<p>The speech application industry got some <a href="http://robertfortner.posterous.com/the-unrecognized-death-of-speech-recognition" target="_blank">critical</a> <a href="http://www.signalprocessingsociety.org/technical-committees/list/sl-tc/spl-nl/2010-04/suendermann/">press</a> in recent months (here are some <a href="http://robertopieraccini.blogspot.com/2010/05/un-rest-in-peas-unrecognized-life-of.html">spirited</a> <a href="http://languagelog.ldc.upenn.edu/nll/?p=2275">responses</a>, respectively.)</p>
<p>All the more refreshing to come across this New York Times <a href="http://www.nytimes.com/2010/06/25/science/25voice.html">article</a> presenting <a href="http://research.microsoft.com/en-us/um/people/horvitz/">current</a> <a href="http://siri.com/">work</a> in speech and artificial intelligence. The article highlights broadly what kind of AI applications have moved into the mainstream (or have potential to do so). Speech and natural language understanding, the article claims, have gone furthest.</p>
<p>One thing that is generalizable from both criticisms above is that development of speech-enabled applications has stagnated, in various ways<sup>1</sup>. The underlying technology – speech recognition (ASR) – has gone as far as it can. Application designers and developers haven&#8217;t adopted. Dictation has learned to understand doctors and lawyers better, but still struggles with conversational speech.</p>
<p>This point may have to be conceded. In terms of commercial applications, however, especially speech-enabled voice (IVR) systems, the root cause of stagnation is not necessarily a failure of AI but rather a maturing of standards and best practices. Fulfilling the expectation that voice applications, much like websites, behave according to certain rules is much to the advantage of the millions who interact with such systems every day.</p>
<p>What I take away from the generalized criticism, as well as from the Times&#8217; optimistic perspective, is that, short of a revolution in underlying technologies (which hardly anyone expects), filling practical, everyday niches is where things can still move forward for speech and language processing.  These niches have certainly not been fully uncovered.</p>
<p>Thoughts?</p>
<hr /><p><sup>1</sup> Roughly summarized, Robert Fortner: &#8220;development in speech technology has flat-lined since 2001&#8221;; David Suendermann: &#8220;(statistical) engineering methods are more efficient than traditional symbolic linguistic approaches to language processing.&#8221;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2010/06/30/a-more-optimistic-outlook-on-the-future-of-speech/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Speech and Dialog Conferences / Speech for iPhone and Android</title>
		<link>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/</link>
		<comments>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/#comments</comments>
		<pubDate>Sat, 11 Jul 2009 08:24:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Services]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=54</guid>
		<description><![CDATA[Conference time: I will be spending a couple of days in London and Brighton from September 5th attending Interspeech, SIGDIAL as well as a researcher round-table. Anyone interested in meeting up, feel free to get in touch. Also, here is some more or less recent, interesting news for Android (at about 6:20, thanks Schamai) and [...]]]></description>
			<content:encoded><![CDATA[<p>Conference time:  I will be spending a couple of days in London and Brighton from September 5th attending <a href="http://www.interspeech2009.org/">Interspeech</a>, <a href="http://www.sigdial.org/workshops/workshop10/index.html">SIGDIAL</a> as well as a researcher <a href="http://www.yrrsds.org/">round-table</a>.  Anyone interested in meeting up, feel free to <a href="http://www.voxarca.de/app/main/contact">get in touch</a>.</p>
<p>Also, here is some more or less recent, interesting news for <a href="http://www.youtube.com/watch?v=uX9nt8Cpdqg">Android</a> (at about 6:20, thanks Schamai) and <a href="http://prmac.com/release-id-6453.htm">iPhone</a> speech developers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tim O&#8217;Reilly: Google Voice Search Key Technology</title>
		<link>http://www.okkoblog.com/2009/04/02/tim-oreilly-google-voice-search-key-technology/</link>
		<comments>http://www.okkoblog.com/2009/04/02/tim-oreilly-google-voice-search-key-technology/#comments</comments>
		<pubDate>Thu, 02 Apr 2009 09:47:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Services]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Gaudi]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[vlingo]]></category>
		<category><![CDATA[Yahoo]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=52</guid>
		<description><![CDATA[ReadWriteWeb reports Tim O&#8217;Reilly addressed attendees at the San Francisco Web 2.0 Expo this week, talking about key technologies for the Web >2.0. Voice search (Google iPhone App), he claimed, was a tipping point in terms of &#8220;sensor based interfaces&#8221;. While not the only vendor to provide voice search (e.g. Yahoo oneSearch powered by Vlingo) Google [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.readwriteweb.com/archives/five_applications_tim_oreilly_says_point_past_web20.php">ReadWriteWeb reports</a> Tim O&#8217;Reilly addressed attendees at the San Francisco Web 2.0 Expo this week, talking about key technologies for the Web >2.0.  Voice search (<a href="http://googlesystem.blogspot.com/2008/11/google-voice-search-for-iphone.html">Google iPhone App</a>), he claimed, was a <a href="http://radar.oreilly.com/2008/11/voice-in-google-mobile-app-tipping-point.html">tipping point</a> in terms of &#8220;sensor based interfaces&#8221;.</p>
<p>While not the only vendor to provide voice search (e.g. <a href="http://mobile.yahoo.com/onesearch">Yahoo oneSearch</a> <a href="http://gigaom.com/2008/04/02/vlingo-gets-20m-and-exclusive-yahoo-deal/">powered by Vlingo</a>), Google certainly seems ahead of the game in what appears to be a gradual unfolding of a broad voice strategy, which includes Voice Search and the recent rebranding of a feature-enhanced GrandCentral as <a href="http://www.google.com/voice/about">Google Voice</a>.  Future work we can expect on the voice front includes promotion of its own speech recognition capacities through <a href="http://code.google.com/android/">Android</a>, <a href="http://gears.google.com/">Google Gears</a> <a href="http://www.chromeexperiments.com/detail/browsertalk/">bringing speech capacities to all browsers</a>, tighter integration of <a href="http://labs.google.com/gaudi">Gaudi</a> (audio indexing) with other services and perhaps one day opening up voice services over APIs.</p>
<p>As I&#8217;ve <a href="http://okkobuss.blogspot.com/2008/01/goog-we-need-more-data.html">previously pointed out</a>, to Google, voice is just another form of data, but what is slowly beginning to emerge is a central role for speech and voice technologies in coming developments for the web and in how we search and interface with it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/04/02/tim-oreilly-google-voice-search-key-technology/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Language Technology April Fools</title>
		<link>http://www.okkoblog.com/2009/04/01/language-technology-april-fools/</link>
		<comments>http://www.okkoblog.com/2009/04/01/language-technology-april-fools/#comments</comments>
		<pubDate>Wed, 01 Apr 2009 09:57:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Fun]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[NLP]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=51</guid>
		<description><![CDATA[Just posting some gems from today concerning speech and language technology, such as natural language generation, speech recognition and natural language processing. Have you found any others?]]></description>
			<content:encoded><![CDATA[<p>Just posting some gems from today concerning speech and language technology, such as <a href="http://mail.google.com/mail/help/autopilot/index.html">natural language generation</a>, <a href="http://www.thinkgeek.com/stuff/41/buzzword.shtml?cpg=cj">speech recognition</a> and <a href="http://www.google.com/intl/en/landing/cadie/index.html">natural language processing</a>.</p>
<p>Have you found any others?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/04/01/language-technology-april-fools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Microsoft Recite Preview &#8211; Note Dictation and Voice Search</title>
		<link>http://www.okkoblog.com/2009/02/16/microsoft-recite-preview-note-dictation-and-voice-search/</link>
		<comments>http://www.okkoblog.com/2009/02/16/microsoft-recite-preview-note-dictation-and-voice-search/#comments</comments>
		<pubDate>Mon, 16 Feb 2009 07:43:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Services]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[mobile]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=49</guid>
		<description><![CDATA[Arstechnica reports today on the release of Microsoft Recite &#8220;Technology Preview&#8221; for Windows Mobile. The application lets users record short notes as audio snippets, which can later be searched for content by speaking key words. Apparently it does not entail speech recognition but rather simpler pattern matching, meaning it cannot be searched in text form [...]]]></description>
			<content:encoded><![CDATA[<p>Arstechnica <a href="http://arstechnica.com/microsoft/news/2009/02/microsoft-recite-for-windows-mobile-previewed.ars">reports</a> today on the release of <a href="http://recite.microsoft.com/">Microsoft Recite</a> &#8220;Technology Preview&#8221; for Windows Mobile.   The application lets users record short notes as audio snippets, which can later be searched for content by speaking key words.  Apparently it does not entail speech recognition but rather simpler pattern matching, meaning it cannot be searched in text form but may work more robustly, eliminating the training effort needed for speaker independence.</p>
<p>While not a full product yet, this sounds like a nifty little application for cognitive off-loading.</p>
<p>Have you tried Microsoft Recite?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/02/16/microsoft-recite-preview-note-dictation-and-voice-search/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More speech on the iPhone</title>
		<link>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/</link>
		<comments>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/#comments</comments>
		<pubDate>Sun, 08 Feb 2009 09:34:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[machine translation]]></category>
		<category><![CDATA[open-source]]></category>
		<category><![CDATA[TTS]]></category>
		<category><![CDATA[vlingo]]></category>
		<category><![CDATA[Vocalia]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=48</guid>
		<description><![CDATA[The iPhone has proved a game-changer in many regards and speech is no exception. Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone. Today I came across another sighting of iPhone speech recognition, Vocalia by Creaceed, employing open-source ASR engine Julius for back-end technology. There is no &#8220;push to talk&#8221; button [...]]]></description>
			<content:encoded><![CDATA[<p>The iPhone has proved a game-changer in many regards and speech is no exception.  Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone.<br />Today I came across another sighting of iPhone speech recognition, <a href="http://www.creaceed.com/vocalia/">Vocalia</a> by Creaceed, employing open-source ASR engine <a href="http://julius.sourceforge.jp/en_index.php">Julius</a> for back-end technology.  There is no &#8220;push to talk&#8221; button but a &#8220;shake to retry&#8221;, which may prove useful when recognition goes awry.  The app supports French, English and German for now and costs €2.99.  Dictation is not available at this point, though Julius is certainly capable of it from an architecture point of view.</p>
<p>Other speech and language related iPhone apps:</p>
<ul>
<li><a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">Google Mobile</a> &#8211; voice search app</li>
<li><a href="http://vlingo.com/">Vlingo</a> &#8211; speech-enables your phone</li>
<li><a href="http://www.innovativelanguage.com/products/pocket">Pocket</a> &#8211; language learning app</li>
<li><a href="http://www.makayama.com/iphonevoicedial.html">Voice Dial</a> &#8211; speech-enabled dialer</li>
<li><a href="http://www.voicethis.com/">VoiceThis</a> &#8211; speech-enabled dialer</li>
<li><a href="http://www.future-apps.net/iSpeak/iSpeak.html">iSpeak</a> &#8211; multi-language translator with synthesized output</li>
<li>A <a href="http://www.crunchgear.com/2009/02/03/iphone-app-helps-reduce-stuttering/">stuttering aid</a> (not yet available)</li>
</ul>
<p>Has anyone used these extensively?  What is your experience with speech on the iPhone?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Zumba Lumba &#8211; iPhone killer or simply a hoax?</title>
		<link>http://www.okkoblog.com/2009/02/02/zumba-lumba-iphone-killer-or-simply-a-hoax/</link>
		<comments>http://www.okkoblog.com/2009/02/02/zumba-lumba-iphone-killer-or-simply-a-hoax/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 13:56:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[accessibility]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[embedded]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[mobile]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[usability]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=47</guid>
		<description><![CDATA[A no-frills phone with the unlikely name of Zumba Lumba has recently received some attention from the BBC. The phone is said to be top-secret, developed by a defense-aviation company. It does without frills like a camera or an applications platform, but touts some interesting security and computational features, (not only) related to speech technology: [...]]]></description>
			<content:encoded><![CDATA[<p>A no-frills phone with the unlikely name of <a href="http://www.zumbalumba.com/">Zumba Lumba</a> has recently received some <a href="http://news.bbc.co.uk/2/hi/uk_news/england/7859562.stm">attention from the BBC</a>.  The phone is said to be top-secret, developed by a <a href="http://www.iatechnology.co.uk/">defense-aviation company</a>.  It does without frills like a camera or an applications platform, but touts some interesting security and computational features, (not only) related to speech technology:</p>
<ul>
<li>Cloud computing &#8211; the phone uses no local storage for contacts, data.</li>
<li>Network speech recognition &#8211; user input is recognized over the internet.  This should avoid hardware intensive local computing for voice input, but requires internet access.</li>
<li>Voice identification &#8211; enhanced security, because the phone will only respond to a single user&#8217;s voice.</li>
</ul>
<p><a href="http://www.itpro.co.uk/609731/top-secret-zumba-phone-could-boost-comms">Some</a> <a href="http://uk.i4u.com/article23027.html">seem</a> to think this is a potential iPhone killer, at least in terms of making use of innovative input modalities (though Google already released a <a href="http://okkobuss.blogspot.com/2008/11/google-mobile-iphone-app-with-speech.html">speech recognition app for the iPhone</a>).  <a href="http://www.crunchgear.com/2009/01/30/bbc-suckered-by-some-crazy-fake-cellphone/">Others</a> simply think it&#8217;s a hoax.</p>
<p>Either way, the idea of joining mobile with cloud computing is interesting.  Using voice identification for security has its appeal as well, even if it&#8217;s unclear whether keeping data in the cloud and sending voice data over the internet is any more secure than simply keeping data on your phone, locally.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/02/02/zumba-lumba-iphone-killer-or-simply-a-hoax/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SVOX purchases Siemens AG speech-related IP</title>
		<link>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/</link>
		<comments>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 18:59:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Siemens]]></category>
		<category><![CDATA[SVOX]]></category>
		<category><![CDATA[TellMe]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=46</guid>
		<description><![CDATA[Following Nuance&#8217;s acquisition of IBM speech technology intellectual property two weeks ago, Zurich-based SVOX today announced the purchase of the Siemens AG speech recognition technology group. The deal aims at creating &#8220;obvious synergies of developing TTS, ASR and speech dialog solutions&#8221; and enhances SVOX&#8217;s portfolio of technologies, which to date included only highly specialized speech [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: justify;">Following <a href="http://okkobuss.blogspot.com/2009/01/nuance-acquires-ibm-speech-patents.html">Nuance&#8217;s acquisition of IBM speech technology</a> intellectual property two weeks ago,  Zurich-based <a href="http://www.svox.com/News-Items-SVOX-acquires-Speech-Processing-unit-of-Siemens-AG.aspx">SVOX today announced</a> the purchase of the Siemens AG  speech recognition technology group.  The deal aims at creating &#8220;<span id="lblBodyText">obvious synergies of developing TTS, ASR and speech dialog solutions&#8221; and enhances SVOX&#8217;s portfolio of technologies, which to date included only highly specialized speech synthesis solutions,</span><span id="lblBodyText"> to now include speech recognition.</span><br /><span id="lblBodyText">Like the Nuance-IBM deal (and unlike the <a href="http://www.microsoft.com/presspass/press/2007/mar07/03-14powerofspeechpr.mspx">Microsoft acquisition of TellMe</a>), this merger breaks with the obvious big-fish small-fish paradigm.  Here, </span><span id="lblBodyText">a larger company&#8217;s (IBM, Siemens) R&amp;D</span><span id="lblBodyText"> division was sold to a smaller, more specialized company (Nuance, SVOX).<br />Both transactions come with an intent to pursue development of novel interactive voice applications.  
However, while Nuance announced the potential development of applications across platforms and environments with IBM expertise and IP, SVOX appears to stay on course with its successful line of automotive solutions to build </span>&#8220;a commanding market share in speech solutions for premium cars<span id="lblBodyText">&#8221;.</span><br /><span id="lblBodyText"></span><br />This deal adds SVOX to a list of companies offering network and embedded speech recognition technologies, also including <a href="http://www.nuance.com/">Nuance</a>, <a href="http://www.telisma.com/">Telisma</a>, <a href="http://www.loquendo.com/">Loquendo</a> and <a href="http://www.microsoft.com/">Microsoft</a>.  Financial terms of the deal were not announced.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IBM Predicts Talking Web</title>
		<link>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/</link>
		<comments>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/#comments</comments>
		<pubDate>Fri, 28 Nov 2008 08:19:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=44</guid>
		<description><![CDATA[IBM&#8217;s annual crystal ball list of Innovations That Will Change Our Lives in the Next Five Years includes a forecast of a voice-enabled talking web. &#8220;You will be able to sort through the Web verbally to find what you are looking for and have the information [...]]]></description>
			<content:encoded><![CDATA[<p>IBM&#8217;s annual crystal ball list of <a href="http://www-03.ibm.com/press/us/en/pressrelease/26170.wss">Innovations That Will Change Our Lives in the Next Five Years</a> includes a forecast of a voice-enabled talking web.  &#8220;<span>You will be able to sort through the Web verbally to find what you are looking for and have the information read back to you,&#8221; the article predicts.<br />IBM itself has launched several voice-enabled products and initiatives over the years, most notably the <a href="http://www-01.ibm.com/software/voice/">WebSphere Voice</a> family of web servers, which adds various voice functionality to its flagship WebSphere platform, leveraging it in areas such as unified messaging and call-center automation.<br />Some problems exist with a vision such as the one advocated by the article.  Speech recognition accuracy and noise filtering have obviously come a long way and may only pose a minor impediment.</span> The user&#8217;s desire to speak rather than type or click is another problem. Issuing voice commands in the presence of others may not always be desirable and can be disruptive, for instance at work or on public transport.  Lastly, there are usability concerns, beyond the quality of speech technology, when converting a visual 2- or even 3-dimensional representation of information into a 1-dimensional audio stream.  The cognitive load increases significantly with tasks more complex than, for instance, obtaining time-table information or finding the nearest Italian restaurant.<br />The effort behind the vision, to put voice technology to uses beyond call-center automation, is laudable.  Mobile internet access and computing on-the-road may indeed do their parts to make this vision come true.  
And clearly, there are use cases, such as improved accessibility for users with impairments, that on their own merit making the web voice-accessible.  Widespread usage of a voice-enabled web, however, may be more than five years off.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Google Mobile iPhone App with Speech Recognition</title>
		<link>http://www.okkoblog.com/2008/11/18/google-mobile-iphone-app-with-speech-recognition/</link>
		<comments>http://www.okkoblog.com/2008/11/18/google-mobile-iphone-app-with-speech-recognition/#comments</comments>
		<pubDate>Tue, 18 Nov 2008 06:51:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Services]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[machine translation]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=43</guid>
		<description><![CDATA[Google released a new feature for its Google Mobile iPhone Application yesterday: voice search. Users speak a query and the application returns search results formatted for the iPhone. This is similar to the GOOG411 directory assistance application, which allows users to call a phone number, speak [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.google.com/">Google</a> released a new feature for its <a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">Google Mobile iPhone Application</a> yesterday: voice search.  Users speak a query and the application returns search results formatted for the iPhone.  This is similar to the <a href="http://www.google.com/goog411/">GOOG411</a> directory assistance application, which allows users to call a phone number, speak a query and receive information about local listings in voice or SMS formats. However, the new application apparently performs recognition locally on the iPhone, meaning it comes bundled with an embedded speech recognition engine.</p>
<p>Aside from GOOG411, during the US presidential election Google released <a href="http://labs.google.com/gaudi">Gaudi</a>, a voice indexing technology for video.  That makes the iPhone app the third official service the company has released that makes use of speech recognition, leaving one guessing when Google&#8217;s speech technology will become available as an API, like the <a href="http://code.google.com/apis/ajaxlanguage/">Google AJAX Language API</a> for translation and transliteration, rather than bundled as software services.  Also, an Android version is probably in the works.</p>
<p>All applications are available in US English for now.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2008/11/18/google-mobile-iphone-app-with-speech-recognition/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
