Quotes by Brendan
Quote:
Another example:
U: Givus sum nopad
NP: Huh
U: I wanna write a letta
NP: A letta?
U: A letta for me, see?
NP: Let us format C? Ok, formatting...
LOL... It looks like a conversation between two deaf people! I don't know whether current speech recognition technologies make this type of mistake, but it's funny!
If you are really worried about that, an Esperanto speech recogniser could be implemented and everyone required to use an Esperanto-like pronunciation (which is VERY regular and not difficult to use after some practice)...
Apart from that, I can only say that the code that generates the text would do its *best* to work out how the user would have written their orders (for example by comparing how the words sound with common ones), with no fear of consuming too much time...
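To make the "compare with common words" idea a bit more concrete, here is a rough sketch in Python. Note that it compares spelling (via the standard difflib module) as a crude stand-in for comparing sounds; a real recogniser would work on phonemes. The vocabulary list and the function name guess_word are made up for illustration.

```python
import difflib

# Hypothetical vocabulary of words the system expects to hear.
KNOWN_WORDS = ["notepad", "letter", "format", "bold", "title"]

def guess_word(heard, vocabulary=KNOWN_WORDS, cutoff=0.6):
    """Return the known word closest to what was heard, or None.

    Spelling similarity is used here as a stand-in for sound similarity.
    """
    matches = difflib.get_close_matches(heard.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    for heard in ["nopad", "letta", "xyzzy"]:
        print(heard, "->", guess_word(heard))
```

With something like this, "letta" would be matched against the known words first, and the system could ask for confirmation when nothing is close enough instead of guessing.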
Quote:
Speech recognition alone is hard enough (even though there have been some impressive advances in this area - you can probably do a patent search to find details). AFAIK the current "state of the art" for Windows systems is a product called NaturallySpeaking. Here's someone else's comments...
I'll take a look at that...
Quote:
The next problem is parsing the English language and forming sensible English language responses. To date, this has been a major stumbling block. This is partly because English is unstructured and ambiguous, and partly because it relies on context a lot.
It's the same in every language...
Quote:
For e.g.:
U: Make the title bold
NP: Which title
U: The main title
NP: The main title?
U: Make it bold
NP: Make what bold?
U: Make the main title bold
NP: Ok - changing main title to "Bold"
U: Undo
NP: Ok
U: Use bold lettering for the main title
NP: Ok - changing main title to "Bold Lettering"
Either the user is not allowed to use this type of ambiguous sentence, or the kernel does just what human beings do: store the last noun explicitly specified and feed the sentence to the program with the pronoun replaced by it...
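A minimal sketch of that "remember the last noun" idea, in Python. Everything here is an assumption for illustration: the class name ReferenceTracker, the short pronoun list, and the idea that the kernel already has a list of object names it knows about.

```python
import re

# Pronouns the sketch knows how to replace (an illustrative, incomplete list).
PRONOUNS = {"it", "that", "this"}

class ReferenceTracker:
    """Remembers the last explicitly named object and substitutes it for pronouns."""

    def __init__(self, known_objects):
        # Longest names first, so "main title" wins over "title".
        self.known_objects = sorted(known_objects, key=len, reverse=True)
        self.last_object = None

    def resolve(self, sentence):
        lowered = sentence.lower()
        for name in self.known_objects:
            if re.search(r"\b" + re.escape(name) + r"\b", lowered):
                # An explicit noun: remember it and pass the sentence through.
                self.last_object = name
                return sentence
        if self.last_object is None:
            # Nothing to refer back to; the program has to ask the user.
            return sentence

        def swap(match):
            word = match.group(0)
            return "the " + self.last_object if word.lower() in PRONOUNS else word

        return re.sub(r"[A-Za-z]+", swap, sentence)

if __name__ == "__main__":
    tracker = ReferenceTracker(["main title", "title"])
    tracker.resolve("The main title")
    print(tracker.resolve("Make it bold"))
```

In the quoted dialogue, "Make it bold" would reach the program as "Make the main title bold", so the program never sees the pronoun at all.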
Quote:
To solve these problems you need to make the language structured - like a programming language rather than normal speech:
U: Select main title
NP: Ok
U: Enable bold
NP: Ok
You could use a less "technical" language even if it's not free-form normal speech... you don't really need to select text to bold it, just consider the following example:
U: Apply Bold to the main title.
Here the application, which is expected to identify nouns such as "main title" as objects, adjectives like "bold" as attributes, and the verb "apply" as an assignment verb, would make *all the words of the main title* bold...
It would know perfectly well that the main title is a set of words, and that words are sets of letters... so it would make all the words bold, and consequently all the letters bold...
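Roughly, and only as a sketch, this could look like the Python below. The classes Letter, Word and TextObject, the single "Apply X to the Y" sentence pattern and the run_command function are all hypothetical; a real application would need a much richer grammar.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Letter:
    char: str
    attributes: set = field(default_factory=set)

@dataclass
class Word:
    letters: list

    def apply(self, attribute):
        # A word is a set of letters: pass the attribute down to each one.
        for letter in self.letters:
            letter.attributes.add(attribute)

@dataclass
class TextObject:
    words: list

    def apply(self, attribute):
        # An object such as a title is a set of words.
        for word in self.words:
            word.apply(attribute)

def make_text_object(text):
    return TextObject([Word([Letter(c) for c in w]) for w in text.split()])

# "apply" is the assignment verb, then the attribute, then the object.
COMMAND = re.compile(
    r"^apply\s+(?P<attribute>\w+)\s+to\s+(?:the\s+)?(?P<object>.+?)\.?$",
    re.IGNORECASE,
)

def run_command(sentence, objects):
    """Apply the attribute to the named object; return False if not understood."""
    match = COMMAND.match(sentence.strip())
    if match is None:
        return False
    name = match.group("object").lower()
    if name not in objects:
        return False
    objects[name].apply(match.group("attribute").lower())
    return True

if __name__ == "__main__":
    objects = {"main title": make_text_object("Speech Driven Interfaces")}
    print(run_command("Apply Bold to the main title.", objects))
```

The point is that the user never selects anything: naming the object is enough, and the attribute trickles down from the title to its words and letters.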
Quote:
This would involve having unique names for everything that can be selected or changed (icons, scroll bars, buttons, etc), which would mean forcing all applications to provide suitable names for everything.
Yes, of course... and if they had a widget library, like in any "normal" operating system, it could have some functions to help identify the objects that are being referred to...
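For example, the widget library could keep a table of spoken names. The sketch below is purely hypothetical (the WidgetRegistry class and its methods are not taken from any real toolkit), and widgets are represented by plain strings to keep it short.

```python
class WidgetRegistry:
    """Maps spoken names (and aliases) to widgets, ignoring case and a leading 'the'."""

    def __init__(self):
        self._by_name = {}

    def register(self, widget, name, *aliases):
        # Every widget gets one main name plus any number of aliases.
        for label in (name, *aliases):
            self._by_name[self._normalise(label)] = widget

    def find(self, spoken_name):
        # Returns None when no widget carries that name.
        return self._by_name.get(self._normalise(spoken_name))

    @staticmethod
    def _normalise(label):
        words = label.lower().split()
        if words and words[0] == "the":
            words = words[1:]
        return " ".join(words)

if __name__ == "__main__":
    registry = WidgetRegistry()
    registry.register("ok_button_widget", "OK button", "confirm")
    print(registry.find("the ok button"))
    print(registry.find("cancel button"))
```

If the library did this by default (say, using each button's caption as its name), applications would get most of their names for free.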
Quote:
Lastly, if you get everything working perfectly people will play with it for 10 minutes, say "That was cool" and then disable it.
Not if it's really useful...
Quote:
Why? Because it takes much longer to say something and get an audible response than it does to click or type,
No! I'm much faster when I speak than when I write... Aren't you?
And the response could be some GUI event or something like that... If you are dictating text, it would appear in the application's screen/window (which would not disappear as a UI object)...
Quote:
home users like to listen to music or TV (which would be picked up by the microphone)
The OS could also be designed to allow a "classic" GUI-only interface...
But the immediate solution would be typing in the sentences...
JJ