<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Okko in Speech &#187; Vendors</title>
	<atom:link href="http://www.okkoblog.com/category/brands/vendors/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.okkoblog.com</link>
	<description>Working with speech and language technology</description>
	<lastBuildDate>Thu, 29 Sep 2011 12:37:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Roger Ebert TTS</title>
		<link>http://www.okkoblog.com/2010/03/10/roger-ebert-tts/</link>
		<comments>http://www.okkoblog.com/2010/03/10/roger-ebert-tts/#comments</comments>
		<pubDate>Wed, 10 Mar 2010 16:57:50 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[accessibility]]></category>
		<category><![CDATA[Cepstral]]></category>
		<category><![CDATA[CereProc]]></category>
		<category><![CDATA[NeoVoice]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://www.okkoblog.com/?p=181</guid>
		<description><![CDATA[Roger Ebert, who lost his lower jaw to cancer, has been given his old voice back. Or at least a version of it. Edinburgh-based CereProc has built a custom voice for its own speech synthesis engine based on old recordings such as TV appearances and DVD commentary tracks. This is of course not the first case [...]]]></description>
			<content:encoded><![CDATA[<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="350" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://www.youtube.com/v/lJI87Ivk0PM&amp;feature" /><embed type="application/x-shockwave-flash" width="425" height="350" src="http://www.youtube.com/v/lJI87Ivk0PM&amp;feature"></embed></object></p>
<p>Roger Ebert, who lost his lower jaw to cancer, has been given his old voice back. Or at least a version of it. Edinburgh-based <a href="http://www.cereproc.com/" target="_blank">CereProc</a> has built a custom voice for its own speech synthesis engine based on old recordings such as TV appearances and DVD commentary tracks.</p>
<p>This is of course not the first case of text-to-speech (TTS) being used for essential day-to-day communication. Most prominently, Professor Stephen Hawking has been doing so since 1985, initially using <a href="http://en.wikipedia.org/wiki/DECtalk" target="_blank">DECtalk</a> and, since 2009, <a href="http://www.neospeech.com" target="_blank">NeoSpeech</a>. The poor quality of his voice prior to the switch was of course a bit of a trademark. The anecdote goes that Professor Hawking stuck with his old voice out of attachment. While many speech and language technologies suffer a wow-but-who-really-needs-it existence, these cases are wonderful examples exhibiting real utility.</p>
<p>Mr. Ebert&#8217;s voice is novel in one regard: he got his own voice back. I have half-seriously mused in the past whether this wasn&#8217;t becoming a real option. Typically, new voice development for general-purpose speech synthesis is a costly affair, mostly due to time- and labor-intensive data preprocessing (studio recording, annotation, hand alignment, etc.). However, as the &#8220;grunt work&#8221; becomes more streamlined and automated, the buy-in cost for a new voice drops. Mr. Ebert was &#8220;lucky&#8221; in the sense that large amounts of his voice had already been recorded in good enough quality to enable building his custom voice. Another player on the TTS market, <a href="http://www.cepstral.com" target="_blank">Cepstral</a>, has recently launched its <a href="http://www.voiceforge.com" target="_blank">VoiceForge</a> offering, which aims to lower the entry threshold for home-grown TTS developers.</p>
<p>Another option that seems to be more and more realistic is employing &#8220;voice-morphing&#8221; and &#8220;voice transformation&#8221;. The idea here is to simply apply changes to an already existing, high-quality TTS voice. The following is a demonstration of how the latter can be done by changing purely acoustic properties (timbre, pitch, rate) of a voice signal:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="425" height="350" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://www.youtube.com/v/-pA7cW0UV88" /><embed type="application/x-shockwave-flash" width="425" height="350" src="http://www.youtube.com/v/-pA7cW0UV88"></embed></object></p>
<p>Voice morphing changes one voice to another. A Cambridge University <a href="http://mi.eng.cam.ac.uk/~hy216/VoiceMorphingPrj" target="_blank">research project</a> demonstrated how recordings of one speaker could be made to sound like those of another using relatively little training data. The following are some examples:</p>
<p>Original Speaker 1:</p>
<p><object style="width: 100px; height: 25px;" classid="clsid:02bf25d5-8c17-4b23-bc80-d3488abddc6b" width="100" height="25" codebase="http://www.apple.com/qtactivex/qtplugin.cab#version=6,0,2,0"><param name="autoplay" value="false" /><param name="src" value="http://mi.eng.cam.ac.uk/~hy216/prjwaves/src01.wav" /><embed style="width: 100px; height: 25px;" type="video/quicktime" width="100" height="25" src="http://mi.eng.cam.ac.uk/~hy216/prjwaves/src01.wav" autoplay="false"></embed></object></p>
<p>Target Speaker 2:</p>
<p><object style="width: 100px; height: 25px;" classid="clsid:02bf25d5-8c17-4b23-bc80-d3488abddc6b" width="100" height="25" codebase="http://www.apple.com/qtactivex/qtplugin.cab#version=6,0,2,0"><param name="autoplay" value="false" /><param name="src" value="http://mi.eng.cam.ac.uk/~hy216/prjwaves/tgt01.wav" /><embed style="width: 100px; height: 25px;" type="video/quicktime" width="100" height="25" src="http://mi.eng.cam.ac.uk/~hy216/prjwaves/tgt01.wav" autoplay="false"></embed></object></p>
<p>Converted Speaker 1 to Speaker 2:</p>
<p><object style="width: 100px; height: 25px;" classid="clsid:02bf25d5-8c17-4b23-bc80-d3488abddc6b" width="100" height="25" codebase="http://www.apple.com/qtactivex/qtplugin.cab#version=6,0,2,0"><param name="autoplay" value="false" /><param name="src" value="http://mi.eng.cam.ac.uk/~hy216/prjwaves/vc01.wav" /><embed style="width: 100px; height: 25px;" type="video/quicktime" width="100" height="25" src="http://mi.eng.cam.ac.uk/~hy216/prjwaves/vc01.wav" autoplay="false"></embed></object></p>
<p>Similar technology was also <a href="http://www.interspeech2009.org/conference/programme/session.php?id=2710" target="_blank">showcased extensively</a> during the 2009 Interspeech Conference. Perhaps this will one day enable those who have lost their voice, and who do not have hours (or days) of recordings of it at their disposal, to have their own custom voices to talk to their loved ones.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2010/03/10/roger-ebert-tts/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
<enclosure url="http://mi.eng.cam.ac.uk/~hy216/prjwaves/vc01.wav" length="159788" type="audio/x-wav" />
<enclosure url="http://mi.eng.cam.ac.uk/~hy216/prjwaves/src01.wav" length="200748" type="audio/x-wav" />
<enclosure url="http://mi.eng.cam.ac.uk/~hy216/prjwaves/tgt01.wav" length="159788" type="audio/x-wav" />
		</item>
		<item>
		<title>SpinVox, Voice-to-Text and Some Terminology</title>
		<link>http://www.okkoblog.com/2010/01/18/spinvox-voice-to-text-and-some-terminology/</link>
		<comments>http://www.okkoblog.com/2010/01/18/spinvox-voice-to-text-and-some-terminology/#comments</comments>
		<pubDate>Mon, 18 Jan 2010 11:14:45 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Services]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[SpinVox]]></category>

		<guid isPermaLink="false">http://www.okkoblog.com/?p=156</guid>
		<description><![CDATA[The recent acquisition of SpinVox by Nuance not only represents another major step towards market consolidation by the latter company, but also prompted me to have a look at the voice-to-text market. Being a &#8220;late adopter power user&#8221; – out of some combination of complacency with existing workflows – and refusing to pay for certain [...]]]></description>
			<content:encoded><![CDATA[<p>The recent <a href="http://www.nuance.com/spinvox/" target="_blank">acquisition</a> of <a href="http://www.spinvox.com" target="_blank">SpinVox</a> by <a href="http://www.nuance.com" target="_blank">Nuance</a> not only represents another major step towards market consolidation by the latter company, but also prompted me to have a look at the voice-to-text market.  Being a &#8220;late adopter power user&#8221; – out of some combination of complacency with existing workflows – and refusing to pay for certain conveniences, I have refrained from using such services until now. Shameful for one whose bread and butter is working with speech technology, I admit.</p>
<p>Luckily I came across some <a href="http://www.readwriteweb.com/archives/voice-to-text-speech-to-text.php" target="_blank">useful</a> <a href="http://baratunde.posterous.com/this-is-a-test-of-the-google-voice-messaging" target="_blank">reviews</a> of the most prominent providers to get me up to snuff. I won&#8217;t go into them, as I&#8217;m sure others have more to say about the actual user experience. However, as &#8220;mobile&#8221; is the way speech and language technology seems to want to go, and as I finally plan to use more personal mobile computing resources (especially various gadgets starting with &#8220;i&#8221;) for speech technology, I may give some of these a whirl in the near future…</p>
<p>SpinVox caused something of a stir when launching their voice-to-text service in 2004 and another when the BBC &#8220;<a href="http://news.bbc.co.uk/2/hi/8163511.stm" target="_blank">uncovered</a>&#8221; that the company used a combination of human and machine intelligence. To anyone working in speech and language technology this would have been obvious from the get-go, as well as to anyone reading the company&#8217;s patent or patent applications, in which the use of human operators is mentioned explicitly. However, regular users would probably have been duped into thinking a machine was doing all the typing.  Failure to understand/communicate this caused a wholly avoidable privacy debacle.</p>
<p>One thing that&#8217;s clear from last year&#8217;s privacy debacle is that there&#8217;s a bit of a mess of terminology when it comes to voice and speech technologies.  So here&#8217;s an attempt at shedding some light on what&#8217;s what:</p>
<p style="padding-left: 30px;"><em>Speech Recognition</em> &#8211; also <em>ASR</em> (automatic speech recognition) for short. This is the general term used to refer to the technology that automatically turns spoken words into machine-readable text. However, there are different dimensions along which to describe this technology, such as the models employed (HMM-based vs connectionist) and who it&#8217;s for (one single speaker or all speakers of a dialect or language).  Also, there is a host of applications that employ it (dictation, IVR/telephone systems, voice-to-text services), each with different requirements. Hence ASR is really an umbrella term.</p>
<p style="padding-left: 30px;"><em>Voice Recognition</em> &#8211; often confused with speech recognition.  Usually voice recognition refers to software that works for only a single speaker.  However, this is anecdotal, and in marketing the two are used synonymously.</p>
<p style="padding-left: 30px;"><em>Voice-to-Text</em> &#8211; a service that converts spoken words into text. Some ASR may be used to help do so, as well as human transcribers; however, the label itself makes no claim as to whether the process is fully automated.</p>
<p style="padding-left: 30px;"><em>Speaker Recognition</em> &#8211; this is a security technology typically used to perform one of two tasks: (1) identifying a speaker from a group of known speakers or (2) determining whether a speaker is really who s/he claims. These are very similar tasks that people often confuse.  Think of the first one as picking a person out of a crowd and the second as a kind of &#8220;voice fingerprint matching&#8221;.</p>
<p style="padding-left: 30px;"><em>Text-to-Speech</em> &#8211; or <em>TTS</em> for short, another term for speech synthesis.  This technology is used to turn written text into an audio signal (such as an MP3).  This should be an obvious label, but surprisingly people seem to <a href="http://www.youtube.com/watch?v=N9GyPXJGZsU" target="_blank">confuse</a> it with Voice-to-Text services frequently (purely my own anecdote).</p>
<p>I&#8217;m also told SpinVox&#8217;s sales price of $102m is a bit of a disappointment, representing just over 50% of the initial $200m that SpinVox raised in 2003. But that&#8217;s something I&#8217;ll let others address. Let&#8217;s see where Nuance goes with this, in terms of trying to fully automate the whole transcription process…</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2010/01/18/spinvox-voice-to-text-and-some-terminology/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Speech and Dialog Conferences / Speech for iPhone and Android</title>
		<link>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/</link>
		<comments>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/#comments</comments>
		<pubDate>Sat, 11 Jul 2009 08:24:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Research]]></category>
		<category><![CDATA[Services]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[Android]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=54</guid>
		<description><![CDATA[Conference time: I will be spending a couple of days in London and Brighton from September 5th attending Interspeech, SIGDIAL as well as a researcher round-table. Anyone interested in meeting up, feel free to get in touch. Also, here is some more or less recent, interesting news for Android (at about 6:20, thanks Schamai) and [...]]]></description>
			<content:encoded><![CDATA[<p>Conference time:  I will be spending a couple of days in London and Brighton from September 5th attending <a href="http://www.interspeech2009.org/">Interspeech</a>, <a href="http://www.sigdial.org/workshops/workshop10/index.html">SIGDIAL</a> as well as a researcher <a href="http://www.yrrsds.org/">round-table</a>.  Anyone interested in meeting up, feel free to <a href="http://www.voxarca.de/app/main/contact">get in touch</a>.</p>
<p>Also, here is some more or less recent, interesting news for <a href="http://www.youtube.com/watch?v=uX9nt8Cpdqg">Android</a> (at about 6:20, thanks Schamai) and <a href="http://prmac.com/release-id-6453.htm">iPhone</a> speech developers.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/07/11/speech-and-dialog-conferences-speech-for-iphone-and-android/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kindle Speech Synthesis</title>
		<link>http://www.okkoblog.com/2009/02/26/kindle-speech-synthesis/</link>
		<comments>http://www.okkoblog.com/2009/02/26/kindle-speech-synthesis/#comments</comments>
		<pubDate>Thu, 26 Feb 2009 13:35:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[amazon]]></category>
		<category><![CDATA[audio books]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=50</guid>
		<description><![CDATA[News about speech and language technology tends to be an in-industry affair, interesting largely to those who need and use it on a daily basis or those who produce (develop or market) it. Every so often, however, mainstream news surfaces that raises issues of broad interest. Google&#8217;s efforts with speech recognition are an example of [...]]]></description>
			<content:encoded><![CDATA[<p>News about speech and language technology tends to be an in-industry affair, interesting largely to those who need and use it on a daily basis or those who produce (develop or market) it.  Every so often, however, mainstream news surfaces that raises issues of broad interest.  Google&#8217;s efforts with speech recognition are an example of this. Last month, Amazon&#8217;s Kindle 2 e-book reader created a <a href="http://www.nytimes.com/2009/02/25/opinion/25blount.html?_r=1&amp;partner=rss&amp;emc=rss&amp;pagewanted=all">buzz</a> <a href="http://news.cnet.com/8301-1023_3-10172412-93.html">with</a> its text-to-speech &#8220;audio book&#8221; functionality.</p>
<p>The underlying issue is that Amazon is selling e-books, which can be listened to using speech synthesis, without owning the rights to produce audio book versions.  The Authors Guild argues that this undermines the lucrative audio book market.  While it is debatable whether listening to a synthesized voice is comparable to the experience of listening to a well-produced audio book, Amazon decided <a href="http://www.crunchgear.com/2009/02/28/authors-guild-successfully-kills-kindle-2-text-to-speech-feature-its-now-optional-for-publishers/">not to fight this one out</a>.</p>
<p>What do you think?  Can synthesized audio books provide an experience comparable to real voice productions?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/02/26/kindle-speech-synthesis/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>More speech on the iPhone</title>
		<link>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/</link>
		<comments>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/#comments</comments>
		<pubDate>Sun, 08 Feb 2009 09:34:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[Apple]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[iPhone]]></category>
		<category><![CDATA[machine translation]]></category>
		<category><![CDATA[open-source]]></category>
		<category><![CDATA[TTS]]></category>
		<category><![CDATA[vlingo]]></category>
		<category><![CDATA[Vocalia]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=48</guid>
		<description><![CDATA[The iPhone has proved a game-changer in many regards and speech is no exception. Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone. Today I came across another sighting of iPhone speech recognition, Vocalia by Creaceed, employing open-source ASR engine Julius for back-end technology. There is no &#8220;push to talk&#8221; button [...]]]></description>
			<content:encoded><![CDATA[<p>The iPhone has proved a game-changer in many regards and speech is no exception.  Both Google and Yahoo (with vlingo) have deployed mobile speech applications for the iPhone.<br />Today I came across another sighting of iPhone speech recognition, <a href="http://www.creaceed.com/vocalia/">Vocalia</a> by Creaceed, employing open-source ASR engine <a href="http://julius.sourceforge.jp/en_index.php">Julius</a> for back-end technology.  There is no &#8220;push to talk&#8221; button but a &#8220;shake to retry&#8221;, which may prove useful when recognition goes awry.  The app supports French, English and German for now and costs €2.99.  Dictation is not available at this point, though Julius is certainly capable of it from an architecture point of view.</p>
<p>Other speech- and language-related iPhone apps:</p>
<ul>
<li><a href="http://googlemobile.blogspot.com/2008/11/google-mobile-app-for-iphone-now-with.html">Google Mobile</a> &#8211; voice search app</li>
<li><a href="http://vlingo.com/">Vlingo</a> &#8211; speech-enables your phone</li>
<li><a href="http://www.innovativelanguage.com/products/pocket">Pocket</a> &#8211; language learning app</li>
<li><a href="http://www.makayama.com/iphonevoicedial.html">Voice Dial</a> &#8211; speech-enabled dialer</li>
<li><a href="http://www.voicethis.com/">VoiceThis</a> &#8211; speech-enabled dialer</li>
<li><a href="http://www.future-apps.net/iSpeak/iSpeak.html">iSpeak</a> &#8211; multi-language translator with synthesized output</li>
<li>A <a href="http://www.crunchgear.com/2009/02/03/iphone-app-helps-reduce-stuttering/">stuttering aid</a> (not yet available)</li>
</ul>
<p>Has anyone used these extensively?  What is your experience with speech on the iPhone?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/02/08/more-speech-on-the-iphone/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SVOX purchases Siemens AG speech-related IP</title>
		<link>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/</link>
		<comments>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/#comments</comments>
		<pubDate>Mon, 26 Jan 2009 18:59:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Siemens]]></category>
		<category><![CDATA[SVOX]]></category>
		<category><![CDATA[TellMe]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=46</guid>
		<description><![CDATA[Following Nuance&#8217;s acquisition of IBM speech technology intellectual property two weeks ago, Zurich-based SVOX today announced the purchase of the Siemens AG speech recognition technology group. The deal is geared toward creating &#8220;obvious synergies of developing TTS, ASR and speech dialog solutions&#8221; and enhances SVOX&#8217;s portfolio of technologies, which to date included only highly specialized speech [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: justify;">Following <a href="http://okkobuss.blogspot.com/2009/01/nuance-acquires-ibm-speech-patents.html">Nuance&#8217;s acquisition of IBM speech technology</a> intellectual property two weeks ago,  Zurich-based <a href="http://www.svox.com/News-Items-SVOX-acquires-Speech-Processing-unit-of-Siemens-AG.aspx">SVOX today announced</a> the purchase of the Siemens AG  speech recognition technology group.  The deal is geared toward creating &#8220;<span id="lblBodyText">obvious synergies of developing TTS, ASR and speech dialog solutions&#8221; and enhances SVOX&#8217;s portfolio of technologies, which to date included only highly specialized speech synthesis solutions,</span><span id="lblBodyText"> to now include speech recognition.</span><br /><span id="lblBodyText">Like the Nuance-IBM deal (and unlike the <a href="http://www.microsoft.com/presspass/press/2007/mar07/03-14powerofspeechpr.mspx">Microsoft acquisition of TellMe</a>), this merger breaks with the obvious big-fish small-fish paradigm.  Here, </span><span id="lblBodyText">a larger company&#8217;s (IBM, Siemens) R&amp;D</span><span id="lblBodyText"> division was sold to a smaller, more specialized company (SVOX, Nuance).<br />Both transactions come with an intent to pursue development of novel interactive voice applications.  
However, while Nuance announced the potential development of applications across platforms and environments with IBM expertise and IP, SVOX appears to stay on course with its successful line of automotive solutions to build </span>&#8220;a commanding market share in speech solutions for premium cars<span id="lblBodyText">&#8221;.</span><br /><br />This deal adds SVOX to a list of companies offering network and embedded speech recognition technologies, also including <a href="http://www.nuance.com/">Nuance</a>, <a href="http://www.telisma.com/">Telisma</a>, <a href="http://www.loquendo.com/">Loquendo</a> and <a href="http://www.microsoft.com/">Microsoft</a>.  Financial terms of the deal were not announced.</div>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/01/26/svox-purchases-siemens-ag-speech-related-ip/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Nuance acquires IBM speech patents</title>
		<link>http://www.okkoblog.com/2009/01/16/nuance-acquires-ibm-speech-patents/</link>
		<comments>http://www.okkoblog.com/2009/01/16/nuance-acquires-ibm-speech-patents/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 07:42:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[Nuance]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=45</guid>
		<description><![CDATA[Nuance yesterday announced the acquisition of speech-related patents from IBM. The deal encompasses a &#8220;licensing and technical services agreement&#8221;, with IBM continuing to support existing customers. Integrated solutions of the two companies&#8217; technologies are expected in two years&#8217; time, according to the press release. This deal represents a further step in market consolidation, which Nuance [...]]]></description>
			<content:encoded><![CDATA[<p>Nuance yesterday <a href="http://www.nuance.com/news/pressreleases/2009/20090115_ibm.asp">announced the acquisition of speech-related patents from IBM</a>.  The deal encompasses a &#8220;licensing and technical services agreement&#8221;, with IBM continuing to support existing customers.  Integrated solutions of the two companies&#8217; technologies are expected in two years&#8217; time, according to the press release.</p>
<p>This deal represents a further step in market consolidation, which Nuance has pursued via a number of mergers and acquisitions over the past years.  Friends in the industry tell me IBM has been trying to market their suite of IVR voice application server software more aggressively; however, speech research activity, once part of the company&#8217;s &#8220;pervasive computing&#8221; vision, has declined lately.</p>
<p>Perhaps the IBM vision will bear fruit at Nuance, as the announcement comes with a commitment &#8220;to proliferate advanced speech capabilities across a broad range of devices and environments&#8221;.  One thing is sure:  much like <a href="http://okkobuss.blogspot.com/2008/10/nuance-buys-philips-speech-recognition.html">Nuance&#8217;s recent acquisition of Philips voice products</a>, years after taking over Philips IVR products and solutions, this deal represents another closure, as Nuance has been marketing and supporting IBM&#8217;s ViaVoice product line for years.  The de facto number of competitors on the speech and voice technology market is shrinking, as applications become more mainstream.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2009/01/16/nuance-acquires-ibm-speech-patents/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>IBM Predicts Talking Web</title>
		<link>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/</link>
		<comments>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/#comments</comments>
		<pubDate>Fri, 28 Nov 2008 08:19:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[TTS]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=44</guid>
		<description><![CDATA[IBM&#8217;s annual crystal ball list of Innovations That Will Change Our Lives in the Next Five Years includes a forecast of a voice-enabled talking web. &#8220;You will be able to sort through the Web verbally to find what you are looking for and have the information [...]]]></description>
			<content:encoded><![CDATA[<p>IBM&#8217;s annual crystal ball list of <a href="http://www-03.ibm.com/press/us/en/pressrelease/26170.wss">Innovations That Will Change Our Lives in the Next Five Years</a> includes a forecast of a voice-enabled talking web.  &#8220;<span>You will be able to sort through the Web verbally to find what you are looking for and have the information read back to you,&#8221; the article predicts.<br />IBM itself has launched several voice-enabled products and initiatives over the years, most notably the <a href="http://www-01.ibm.com/software/voice/">WebSphere Voice</a> family of web servers, which adds various voice functionality to its flagship WebSphere platform, leveraging it in areas such as unified messaging and call-center automation.<br />Some problems exist with a vision such as the one advocated by the article.  Speech recognition accuracy and noise filtering have obviously come a long way and may only pose a minor impediment.</span> The user&#8217;s desire to speak rather than type or click is another problem. Issuing voice commands in the presence of others may not always be desirable and can be disruptive, for instance at work or on public transport.  Lastly, there are usability concerns, beyond the quality of speech technology, when converting a visual 2- or even 3-dimensional representation of information into a 1-dimensional audio stream.  The cognitive load increases significantly with tasks more complex than, for instance, obtaining time-table information or finding the nearest Italian restaurant.<br />The effort that stands behind the vision, to put voice technology to uses beyond call-center automation, is laudable.  Mobile internet access and computing on-the-road may indeed do their parts to make this vision come true.  
And clearly, there are use cases, such as improved accessibility for users with impairments, that on their own merit making the web voice-accessible.  Widespread usage of a voice-enabled web, however, may be more than five years off.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2008/11/28/ibm-predicts-talking-web/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Nuance buys Philips Speech Recognition Systems</title>
		<link>http://www.okkoblog.com/2008/10/02/nuance-buys-philips-speech-recognition-systems/</link>
		<comments>http://www.okkoblog.com/2008/10/02/nuance-buys-philips-speech-recognition-systems/#comments</comments>
		<pubDate>Thu, 02 Oct 2008 09:11:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Nuance]]></category>
		<category><![CDATA[Philips]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=42</guid>
		<description><![CDATA[Nuance announced this week its acquisition of Philips Speech Recognition Systems. This represents another step in a series of acquisitions by the speech technology giant towards market and portfolio expansion. In 2002, Scansoft Inc., which through further mergers and acquisitions became today&#8217;s Nuance, had already acquired Philips&#8217; network speech processing group, though not its dictation unit. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.nuance.com/">Nuance</a> announced this week its <a href="http://www.nuance.com/news/pressreleases/2008/20081001_philips.asp">acquisition of Philips Speech Recognition Systems</a>.  This represents another step in a series of acquisitions by the speech technology giant towards market and portfolio expansion.  <a href="http://www.commsdesign.com/news/market_news/showArticle.jhtml?articleID=16505979">In 2002, Scansoft Inc.</a>, which through further mergers and acquisitions became today&#8217;s Nuance, had already acquired Philips&#8217; network speech processing group, though not its dictation unit.  With this week&#8217;s acquisition, the dictation unit will be incorporated into Nuance&#8217;s already strong dictation portfolio, expanding especially in European healthcare markets, the company announced.  Highlights of the purchase include an expanded customer base, language &amp; solutions portfolios, and distribution channels, as well as a great leap forward in international expansion.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2008/10/02/nuance-buys-philips-speech-recognition-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OnMobile buys Telisma</title>
		<link>http://www.okkoblog.com/2008/05/19/onmobile-buys-telisma/</link>
		<comments>http://www.okkoblog.com/2008/05/19/onmobile-buys-telisma/#comments</comments>
		<pubDate>Mon, 19 May 2008 08:30:00 +0000</pubDate>
		<dc:creator>Okko</dc:creator>
				<category><![CDATA[Brands]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Vendors]]></category>
		<category><![CDATA[ASR]]></category>
		<category><![CDATA[IBM]]></category>
		<category><![CDATA[India]]></category>
		<category><![CDATA[internationalization]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[OnMobile]]></category>
		<category><![CDATA[Telisma]]></category>

		<guid isPermaLink="false">http://okkoblog.com/blog/?p=37</guid>
		<description><![CDATA[OnMobile Global Ltd today acquired France-based Telisma, a producer of speech recognition software for network/telephony environments. The acquisition comes after OnMobile recently partnered with Nuance, a Telisma competitor for speech recognition markets, to deploy voice search applications for its home market, India. India&#8217;s multilingual market has made it a tough one to crack [...]]]></description>
			<content:encoded><![CDATA[<div style="text-align: justify;"><p><a href="http://www.onmobile.com/">OnMobile Global Ltd</a> <a href="http://www.topnews.in/onmobile-global-acquires-100-stake-telisma-sa-france-241964">today</a> <a href="http://www.business-standard.com/common/storypage_c_online.php?leftnm=10&amp;bKeyFlag=IN&amp;autono=37650">acquired</a> France-based <a href="http://www.telisma.com/">Telisma</a>, a producer of speech recognition software for network/telephony environments.<br />The acquisition comes after OnMobile <a href="http://www.onmobile.com/news-202.html">recently partnered</a> with <a href="http://www.nuance.com/">Nuance</a>, a Telisma competitor for speech recognition markets, to deploy voice search applications for its home market, India.  India&#8217;s multilingual market has made it a tough one to crack for speech technology companies, though a lucrative one, as India has <a href="http://gigaom.com/2008/05/18/yulop/">recently surpassed</a> the U.S. as the second-largest mobile market in the world, according to Om Malik at <a href="http://www.gigaom.com/">GigaOm</a>.<br />I suspect issues specific to speech technology and India&#8217;s multilingualism have something to do with this deal.  As I <a href="http://okkobuss.blogspot.com/2008/05/internationalization-and-speech.html">recently pointed out</a>, internationalization of speech and language technologies comes at a steep entry cost, due to the high demands on expertise and data required for building language-specific models.  In addition, speech recognition companies like Nuance have long kept their language models under wraps.  
In other words, if your language isn&#8217;t catered to, reaching that language&#8217;s customer base becomes a very pricey affair.<br />While <a href="http://www.voxforge.org/">open-source aspirations</a> to build freely available language models for speech recognition exist, Telisma has opted for a middle ground in this matter by allowing partners/customers to <a href="http://www.telisma.com/Language_Development_Kit.html">build their own models</a><span style="text-decoration: line-through">, but selling the tools to do so at a price</span>.  In a market like India, the ability to cater to a multilingual customer base without purchasing expensive proprietary software (or paying someone else to develop proprietary software for you to purchase) may have made a big difference in this deal.</p>
<p>On a different note, this acquisition is the latest in a series of acquisitions consolidating the speech technology market.  While five years ago telephony speech technology was a highly redundant market of small companies building similar products, today most of those companies have been acquired by or merged with bigger players.  In the meantime, companies like Microsoft, IBM, Siemens and Google are making their own moves to enter the market.</p>
<p><span style="font-weight: bold;">Update:</span><br />Telisma&#8217;s acoustic modelling toolkit is in fact not sold but offered for free, as one reader has pointed out.  Thanks!</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.okkoblog.com/2008/05/19/onmobile-buys-telisma/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

