Friday, April 13, 2007

Google launches voice-controlled search engine


So, just after I mentioned that Google needs to push into this space, Google has unveiled the latest venture from its Labs: a voice-based local search tool for phones.

Google Voice Local Search lets you search for a local business in the US simply by speaking the search term. Users call a freephone number and can then be connected to the business or can choose to receive the details via a text message.

The company notes that the service is still experimental, as most of Google's initial launches are, and may not be available at all times or work for all users (for example, those with thick Irish accents). It is currently restricted to the US and only returns results for related businesses but, like the desktop version of Google Local search, it is likely to become more widely available as testing progresses.

Check it out at labs.google.com/goog411/

The world's largest internet players (Google, Yahoo! and Microsoft) are jockeying for position in voice-activated search services as they strive to extend their reach beyond computers to mobile phones. Let's face it, there are a lot more phones than computers out there, and the need for information often arises on the go.

Who can blame them for aggressively pursuing this strategy? Mobile advertising revenues are forecast to grow eightfold in the next four years, to $11.5 billion, and the market for directory enquiries is worth $8 billion a year in the US alone.

GigaOM has reported that Gary Clayton, the chief creative officer of Tellme, and Victor Chen, a senior Tellme executive, have recently joined Yahoo!. Gary Clayton has joined as a vice president in the R&D division of Yahoo!. This is a coup for Yahoo! and must annoy Microsoft no end.

It seems that Yahoo! is trying to develop its own Tellme in-house, and it seems clear that Yahoo! is planning to add voice to its web services.

This space is now reaching boiling point.

Thursday, April 12, 2007

Circumventing the standards


Recently I encountered an annoyance with the VXML standard: I was unable to use <if> within a <prompt>. I wanted to do this to adjust the prompt based on a variable. The prompt was fairly long, with just a small part varying with the value of this variable. It was possible to do <if><prompt>...</prompt><elseif/><prompt>...</prompt><elseif/> and so on, but this just made the document bigger and bigger, and it seemed to me that this arbitrary limitation was ridiculous.
So imagine my surprise when I was able to circumvent this limitation while remaining within the standard. How was this possible? Simple: I used the <foreach> tag (added in VXML 2.1), which is allowed to be a child of <prompt>. I created a dummy array of one element, enabling me to do the following:

<prompt>
   this is the start of this prompt
   <foreach array="dummyArrayOfOneElement" item="foo">
      <if cond="cond1">
         middle prompt 1
      <elseif cond="cond2"/>
         middle prompt 2
      <else/>
         middle prompt default
      </if>
   </foreach>
   this is the end of this prompt
</prompt>
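
For completeness, the dummy array has to be declared somewhere in scope before the prompt is queued. A minimal sketch, assuming a VXML 2.1 interpreter and reusing the variable name from the snippet above (the element's value is never read, so any value will do):

```xml
<!-- One-element array: <foreach> runs its body exactly once. -->
<var name="dummyArrayOfOneElement" expr="[0]"/>
```

Platforms may differ in how strictly they validate the children of a <foreach> that sits inside a <prompt>, so this is worth testing on each target platform.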

Can someone tell me why the limitation on <if> exists, and what the point of it is if it can be circumvented while still remaining within the standard?

Wednesday, April 04, 2007

The VXML, CCXML Great Divide, oops I mean Great Collide


So in a prior post I went through some of the great pains of trying to get CCXML and VXML to play nicely together. Well, I have news from a reliable source that this pain is soon to be a thing of the past, at least on the Voxeo platform.
Voxeo have taken great strides in this area and will soon be introducing some great new features that will certainly make my life a lot easier.
Soon you will be able to move data transparently from CCXML to VXML and back again. With the use of JSON, complex scripted objects (arrays, structures, etc.) can be passed, and the arbitrary size limits on these data structures will be removed. What does this mean to the developer? Well, a lot in fact. Voice applications will now be able to incorporate all the features of CCXML and VXML seamlessly. In fact, the seam is going to be so tight that as a developer you can more or less consider them one. Ah, the complete control of call and dialog. The voice application holy grail has been found.

Microsoft Acquires Tellme


Microsoft announced the acquisition of Tellme. The price, previously rumored to be in the $800 million range, is undisclosed.
This is an interesting development and continues to show the consolidation in this space (Nuance snapped up BeVocal and Genesys Lab/Alcatel-Lucent snapped up VoiceGenie). Some of the big companies are realizing that the phone has much more penetration than computers, and with cell phones their use is ubiquitous.
Tellme has done some interesting things recently with cell phones, blurring the lines between voice user interfaces and visual user interfaces.
Also, Microsoft can use Tellme to do the same thing as free411.com with Live Local.
Now, are there more acquisitions ahead from the likes of Google, who certainly don't want to miss the boat? In fact, at the moment Google is falling further and further behind Microsoft and Yahoo (who seem to be leading the charge in this area).

Friday, August 04, 2006

VXML, does it speak many tongues?

According to the VXML 2.0 spec, the grammar tag has the attribute xml:lang. The spec describes this attribute as "The language identifier of the grammar". Does this mean I can have an English document with a Spanish grammar? I'm not sure, as the description is vague, but why else would the xml:lang attribute exist if this is not the case? Anyway, I haven't come to my point yet. If the grammar tag has an xml:lang attribute, why doesn't the field tag? Does this mean I can only have fields in the language of the document? Since many fields are just built-in grammars (e.g. boolean, date), wouldn't it make sense for the attributes to match?

Much of this depends on the first point being true, but can you imagine having a VXML document that starts in English but upon recipient request becomes Spanish ("press 1 to continue in Spanish"). Thus all future grammars are Spanish (using xml:lang="es-MX" or something like that). Now we need to grab a date. Does this mean we have to jump to a new document with Spanish as the default language?
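
To make the scenario concrete, here is a rough sketch of what such a document might look like. It assumes the platform really does honor xml:lang on <grammar> and <prompt> inside an English document; the grammar file colores.grxml and the field names are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="spanishBranch">
    <!-- A field with an explicit grammar can declare the grammar's language. -->
    <field name="color">
      <prompt xml:lang="es-MX">¿Cuál es su color favorito?</prompt>
      <!-- colores.grxml is a hypothetical Spanish SRGS grammar -->
      <grammar xml:lang="es-MX" src="colores.grxml" type="application/srgs+xml"/>
    </field>

    <!-- A built-in field has no xml:lang attribute of its own, so the
         date grammar presumably follows the document language (en-US). -->
    <field name="when" type="date">
      <prompt xml:lang="es-MX">¿En qué fecha?</prompt>
    </field>
  </form>
</vxml>
```

One thing that might be worth trying is referencing the built-in explicitly, e.g. <grammar src="builtin:grammar/date" xml:lang="es-MX"/>, though whether a given platform honors xml:lang there is an open question.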

Very messy indeed.

Wednesday, July 12, 2006

Forget the horse, just give me a king DOM

So here's the problem. You start a VXML session and set up all the application scope variables you'll need for the entire length of the application. One of these application variables is an array of colors the recipient likes. On a subsequent document in the application, you need to create a grammar that asks the recipient what their absolute favorite color is. You'd like to create the grammar based on the application variable described earlier, but alas you cannot.
Of course in HTML one could do this using a combination of JavaScript and DOM manipulation. So should something like this be possible in VXML?
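
There is no DOM to manipulate, but one partial workaround is to let the server build the grammar. The sketch below assumes a VXML 2.1 interpreter (which adds srcexpr to <grammar>), an application variable named colors declared in a root document called root.vxml, and a hypothetical server endpoint /grammars/colors that returns an SRGS grammar for the list it is given. None of these names come from a real deployment.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" application="root.vxml">
  <form id="favorite">
    <field name="favoriteColor">
      <prompt>Which of those colors is your absolute favorite?</prompt>
      <!-- srcexpr is an ECMAScript expression that yields the grammar URI;
           the server side (hypothetical) turns the list into a grammar. -->
      <grammar type="application/srgs+xml"
               srcexpr="'/grammars/colors?list=' + encodeURIComponent(application.colors.join(','))"/>
      <filled>
        <prompt>You chose <value expr="favoriteColor"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```

This still costs a round trip to the server for data the interpreter already holds, which is exactly the kind of thing client-side DOM manipulation would avoid.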

Monday, July 10, 2006

The VXML, CCXML Great Divide

Often I ask myself why both CCXML and VXML exist. Couldn't one magnificent XML language support both advanced call control and dialog control? As anyone knows, depending on different languages running on disparate systems is an open invitation for disaster. Consider the following conditions:

1. CCXML invokes a VXML session; next, this session exits and a transfer occurs in CCXML to another recipient. When this transfer is finished, another VXML session is requested. But before this session can start, VXML resources must be allocated. What happens if all VXML resources are exhausted? We are now faced with the dilemma of dealing with an existing call that has to terminate abruptly. It is this dependency on resources that are not controlled as a single entity that leads to disaster.

2. CCXML invokes a VXML session. This VXML session exits for some call control reason (a transfer, etc.). Next, a *NEW* VXML session is started to continue the dialog the original VXML session was handling. But hold on a minute, we have now committed a cardinal application sin. Sessions are becoming woven into a tapestry of confusion (yes, I did make up this line). The power of VXML application and session management has been disregarded. Could something this simple have been overlooked? Should the VXML session remain active somehow? Since CCXML creates the dialog, shouldn't it be responsible for terminating it? What happened to separation of powers? So how can one solve this? Well, with a new VXML tag that pauses the session specifically so control can be returned to the parent (owner) session. So now we are introducing tags specifically to deal with the separate layers.
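
Both conditions can be seen in a stripped-down CCXML document. This is only a sketch, assuming CCXML 1.0 syntax (earlier drafts and individual platforms differ); the dialog URIs, the state names, and the omitted transfer step are placeholders.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<ccxml version="1.0" xmlns="http://www.w3.org/2002/09/ccxml">
  <var name="state0" expr="'init'"/>
  <eventprocessor statevariable="state0">
    <transition event="connection.alerting">
      <accept/>
    </transition>

    <transition event="connection.connected" state="init">
      <assign name="state0" expr="'dialog1'"/>
      <dialogstart src="'first.vxml'"/>
    </transition>

    <!-- The first VXML session has ended; the transfer would happen here.
         The second session starts from scratch with no memory of the
         first (condition 2). -->
    <transition event="dialog.exit" state="dialog1">
      <assign name="state0" expr="'dialog2'"/>
      <dialogstart src="'second.vxml'"/>
    </transition>

    <!-- No VXML resource was available: the live call is stranded and
         all we can do is tear it down (condition 1). -->
    <transition event="error.dialog.notstarted">
      <log expr="'dialog could not be started'"/>
      <disconnect/>
    </transition>

    <transition event="connection.disconnected">
      <exit/>
    </transition>
  </eventprocessor>
</ccxml>
```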

So, back to my original point: if we need to introduce VXML tags to support two disparate systems, why not just put all the tags there in the first place?

Monday, July 03, 2006

Did somebody say bonjour Voxeo?

Europeans must be licking their lips with the recent arrival of Voxeo on their shores (see Voxeo Launches European Subsidiary). It is good news indeed that Voxeo are spreading their ever-increasing wings. While I'm sure many readers out there may favor one provider or another in this space, Voxeo has really become the de facto standard.
Voxeo's laser-like focus on new technologies has really shown others that in this space you must innovate or be eaten. To my knowledge Voxeo is the only standards-based IVR company that is profitable. Calling them an IVR company is probably a great disservice, as they are and continue to be a great innovator in the "interactive voice" space.
At the same time, a relationship between Skype and Voxeo has made possible a revolutionary service that gives Skype users the ability to talk instantly to anyone, anywhere in the world, at any time, regardless of language. This power should not be underestimated; in fact, in the Bible the speaking of many tongues was seen as a miracle. Now the European Union can have a single language as well as a single currency. Perhaps the United Nations can just use Skype/Voxeo now; why the need for expensive personal translators?
I don't mean to be personally endorsing Voxeo, but it is hard to see anybody else competing with them. Now, while Tim O'Reilly may have coined the term "Web 2.0", I hereby take credit for coining the term "Voice 2.0". And what is Voice 2.0? Well, it's Voxeo.