Speech recognition, also called voice recognition or automatic speech recognition (ASR), is the recognition of spoken language and its conversion into text.
SpeechFoundry™, Inferret’s commercial voice software, integrates recent techniques from AI research, notably deep learning. It combines Deep Neural Network (DNN) acoustic models with our proprietary, patented Weighted Finite-State Transducer (WFST) compression method to provide high-accuracy voice recognition with fast response times and a very small memory footprint.
Natural Voice Interface
Speech recognition enables human users to speak to machines, but it does not, by itself, enable machines to understand the meaning of what is said. This gap is closed by Natural Language Understanding (NLU), also known as Casual Speech; at Inferret, we call this technology Natural Voice Interface (NVI).
NVI lets human users talk to a machine the same way they talk to another person: they can say what they want, as it comes to mind, and expect the machine to react properly, for example by activating a function or providing information. Many voice recognition systems do not offer this degree of freedom: users are limited to a fixed set of specific commands and have to remember exactly what they can say in order to use voice with a machine.
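
To make the contrast concrete, here is a minimal Python sketch of the two interaction styles. It is a toy illustration only and not how SpeechFoundry or NVI is implemented: the command list, the intent names, the keyword sets, and the functions `match_fixed_command` and `match_intent` are all hypothetical, and simple keyword overlap stands in for real language understanding.

```python
import re

# Hypothetical fixed command set: the user must say one of these phrases exactly.
FIXED_COMMANDS = {
    "radio on": "radio.on",
    "navigate home": "navigation.home",
}

# Hypothetical intents, each described by a few keywords that hint at it.
INTENT_KEYWORDS = {
    "radio.on": {"radio", "music", "play", "listen"},
    "navigation.home": {"home", "navigate", "drive", "take"},
}


def normalize(utterance):
    """Lowercase the recognized text and split it into words."""
    return re.findall(r"[a-z']+", utterance.lower())


def match_fixed_command(utterance):
    """Command-style interface: succeeds only on an exact known phrase."""
    return FIXED_COMMANDS.get(" ".join(normalize(utterance)))


def match_intent(utterance):
    """Free-form interface (toy version): pick the intent whose keywords
    overlap most with the utterance, or None if nothing overlaps."""
    words = set(normalize(utterance))
    best_intent, best_score = None, 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words & keywords)
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent
```

With these assumptions, the exact phrase "Radio on" is accepted by `match_fixed_command`, but a request such as "I'd like to listen to some music" is rejected by it and is only resolved by `match_intent`. A production NLU component would rely on much richer models than keyword overlap; the sketch only shows why a fixed command list forces users to memorize phrases.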