Researchers at MIT have developed the first integrated-circuit vocal tract, which could eventually make its way into high-end PDAs. The design is biologically inspired and is combined with a bionic-ear processor in a feedback loop, which means it can be used not only for producing speech but also for recognizing it: the vocal tract can help model what the ear thinks it is hearing, to verify whether that interpretation is likely to be right.
Essentially, the system is a biological model of our own method of producing speech, implemented in silicon. It is good not just at synthesizing speech but (in conjunction with the ear and the feedback loop) at interpreting it, because it can literally figure out which muscles would have to be used to produce a particular sound, and so can ‘reverse engineer’ what sound was actually intended. F and S may sound similar, but the ways the sounds are produced, in terms of muscles, are quite different. So if you can get at the physiology from small differences in what you hear, you can make much better guesses at what is being said.
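The idea above is essentially "analysis by synthesis": try candidate articulations, synthesize what each would sound like, and pick the one closest to what the ear reports hearing. Here is a minimal toy sketch of that loop; all names, parameters, and feature vectors are hypothetical illustrations, not the actual interface of the MIT chips.

```python
import math

# Toy articulatory configurations: each phoneme maps to a vector of
# "muscle" parameters (front constriction, back constriction, voicing).
# Purely illustrative numbers.
ARTICULATIONS = {
    "f": (0.9, 0.1, 0.0),  # labiodental constriction, unvoiced
    "s": (0.2, 0.8, 0.0),  # alveolar constriction, unvoiced
    "v": (0.9, 0.1, 1.0),  # like "f" but voiced
}

def synthesize(params):
    """Stand-in for the vocal-tract model: articulation -> acoustic features.
    Returns a crude (noise-centroid-in-Hz, voicing) feature pair."""
    front, back, voicing = params
    return (4000 * front + 6000 * back, voicing)

def recognize(heard_features):
    """Analysis by synthesis: return the phoneme whose synthesized output
    is closest to what the 'ear' reports hearing."""
    return min(
        ARTICULATIONS,
        key=lambda ph: math.dist(synthesize(ARTICULATIONS[ph]), heard_features),
    )

# A noisy observation whose synthesized match is "f" rather than "s":
print(recognize((3900.0, 0.0)))  # -> f
```

Even with heavy noise in the heard features, the articulatory hypotheses remain well separated, which is the intuition behind why modeling production helps recognition.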
The chips used (one for the vocal tract and one for the ear) are both based on human physiology and have been implemented in custom analog circuitry. Digital computers could be used instead, but the computational complexity of the problem means that the analog solutions are drastically smaller, faster, and less power-hungry.
There are a number of really interesting commercial applications. The most obvious is robust speech, speaker, and language recognition in noisy environments. The team is also building a glove that can drive the chip, a brain-machine interface that could be implanted for speech-impaired users, and muscle interfaces that would allow silent phone calls: you talk silently on the train, and the system figures out what sounds you are trying to make and produces them for you down the phone.
Originally posted on Brains and Machines.