Talk:Speech Recognition: Difference between revisions

From Citizendium
imported>Howard C. Berkowitz
(I'm finding consistently inconsistent usage in different contexts. :-()
imported>Howard C. Berkowitz
(Ambiguous definitions seem everywhere.)
Revision as of 17:56, 26 July 2008

This article is developed but not approved.
 Definition The ability to recognize and understand human speech, especially when done by computers.
Checklist and Archives
 Workgroup categories Computers and Linguistics [Categories OK]
 Talk Archive none  English language variant American English

This article was started in July, 2008, as part of the Eduzendium Project. We will complete the article by August 14, 2008.

Consistency of terms

Right now, we have articles dealing with "spoken language" rather than "speech", and at least definitions of musical-quality and telephone-quality voice, among others. Shall we try to get some consistency before too much is embedded?

I wonder if this would better be titled "voice recognition", both to be consistent with VoIP and to cover the use of an individual's voice for biometric identification rather than communication? Howard C. Berkowitz 15:17, 25 July 2008 (CDT)

This subject is a new area for me. In the research that I have done so far, the professionals divide it up as "speech recognition" for what I am talking about, and "speaker recognition" for biometric identification. No expert appears to use "voice recognition", though that is the term that I was originally going to use. Perhaps it is a Brontosaurus/Apatosaurus sort of divide.
Just to add to the confusion, AI people talk about natural language processing, which appears to be the equivalent of what I am calling "computer speech technology"; that is, speech recognition plus responding to the recognized speech; and also including speech synthesis.
As for "speech" versus "spoken language", I cannot think of a distinction between the two, so I would prefer the shorter term. I do see a distinction between speech and voice, where I would consider voice to be a broader term encompassing all the sounds that a human could generate: Donna Summer singing "I Feel Love" is voice, but not really speech.

Samuel C. Smith 14:37, 26 July 2008 (CDT)

On doing some reference checking, the situation appears to be very confused. You're right that the specific person identification term is "speaker". On checking some recent literature on products that do "mouth noise recognition" for computer input, DragonDictate and NaturallySpeaking (http://www.ddwin.com/overview.htm) seem to use speech rather than voice. You seem to be right that "speech" is the current term in computer-based recognition, since the last document I have on VoiceNavigator, the maker's website being down, calls it "speech recognition".
On the other hand, the standard term for sending "mouth noise" over an IP-based telephone system is definitely "voice over IP". The International Telecommunication Union still appears to use "voice" for the signal from which Mean Opinion Score is calculated: http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html#mos.
With respect to Donna Summer, voice is definitely the musical term. In analog telephony, we'd speak of "toll quality" voice as a 4 kHz bandwidth channel, whereas a dedicated line for an FM radio station to pick up a concert for live broadcast has 16 kHz of bandwidth.
Perhaps we need a speech-vs.-voice article. I've already put up a disambiguation page for voice as in VoIP as opposed to the musical context. Howard C. Berkowitz 18:56, 26 July 2008 (CDT)