What is Sphinx?
Sphinx is an open source project by Carnegie Mellon University that deals with Natural Language Processing. I primarily use it for speech recognition.
Read more about it on the CMU Sphinx project page.
A four-page PDF overview of the Sphinx-4 system.
There are several versions, the latest being written in Java, which is what I’m going to walk through below.
Getting Sphinx
You can get the Sphinx code or binaries from SourceForge. If you’re feeling adventurous, get the source or check it out from Subversion, but if you just want to use the engine for speech recognition, download the binaries. They come as jars, and that is the route we’ll step through in this post.
When you go to the above SourceForge link, select sphinx4, then download the sphinx4 bin file. Once you download and unzip it, you’ll see a few jars in the bin folder, a demo folder, some documentation, and a lib folder (among other things). The lib folder has the sphinx4 jar in it. What you want is the entire lib folder because it has everything you need to do some speech recognition.
Running Sphinx
Get an IDE. You can use whatever IDE you want (IntelliJ, NetBeans) but I will step you through eclipse, which is free. You can read about how to get eclipse in my previous post. In eclipse, create a new Java project (file->new->Java Project) and give it a name (I called mine sphinx4). You’ll see that it made a src folder. Copy the lib folder from the sphinx4 download folder you just unzipped by pasting it into the root folder of the project. Also go into the demo folder and copy the wavfile folder and paste it into the src folder in your eclipse project.
There’s one more file you need. The jsapi.jar file is necessary, but it isn’t included as a jar. There is a legal issue about just downloading the jar file, so in the lib folder you’ll see the jsapi.exe file instead. Run that and the jsapi.jar file will appear in the same folder as the jsapi.exe file. In Linux, run the jsapi.sh file and it should have the same result. If you can’t get it, Google for it and you should be able to find it. If all else fails, let me know and I’ll help you get it. It must be in your lib folder before we move on.
With the wavfile folder in the src folder and the lib folder under your project root (and with the jsapi.jar file in the lib folder), you can start to link in the jars that you will need to do some simple speech recognition. Expand the lib folder and you’ll see the following jar files in it:
js.jar
jsapi.jar
sphinx4.jar
TIDIGITS_.jar
Right-click on the sphinx4 jar->build path->add to build path. This adds (links) the jar to your build path, which allows the IDE to use code from the jar in your project. Do the same for all of the jars above.
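If you want to double-check the lib folder before going further, a small stand-alone program can list what is missing. This is only a sketch: the class name LibCheck is made up for this post, and I am assuming the TIDIGITS jar can be recognized by its TIDIGITS_ prefix, since its full file name depends on the download.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of Sphinx): reports which expected jars are absent from a folder. */
final class LibCheck {
    // Jar names taken from the list above.
    static final List<String> EXACT = Arrays.asList("js.jar", "jsapi.jar", "sphinx4.jar");
    // Assumption: the acoustic model jar is identified by this prefix only.
    static final String TIDIGITS_PREFIX = "TIDIGITS_";

    static List<String> missing(File libDir) {
        String[] names = libDir.list(); // null when libDir is not a directory
        List<String> present = names == null ? new ArrayList<String>() : Arrays.asList(names);
        List<String> missing = new ArrayList<String>();
        for (String jar : EXACT) {
            if (!present.contains(jar)) {
                missing.add(jar);
            }
        }
        boolean hasModel = false;
        for (String name : present) {
            if (name.startsWith(TIDIGITS_PREFIX) && name.endsWith(".jar")) {
                hasModel = true;
            }
        }
        if (!hasModel) {
            missing.add(TIDIGITS_PREFIX + "*.jar");
        }
        return missing;
    }

    public static void main(String[] args) {
        File lib = new File(args.length > 0 ? args[0] : "lib");
        System.out.println("Missing from " + lib + ": " + missing(lib));
    }
}
```

Run it from the project root (or pass the path to the lib folder as the first argument). An empty list means all four jars are in place.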
When that is done, your folder structure should look something like this:
Notice that there are a few wav files, a .gram file, and a config.xml file. You’ll need to open the config.xml file (right-click on the file->open with->text editor, otherwise it’ll open an xml editor that is hard to understand). Find the part of the file that looks like this:
<component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
<property name="dictionary" value="dictionary"/>
<property name="grammarLocation"
value="resource:/demo.sphinx.wavfile.WavFile!/demo/sphinx/wavfile/"/>
<property name="grammarName" value="digits"/>
<property name="logMath" value="logMath"/>
</component>
It’s about halfway into the file. You need to make some changes here. Instead, it should look like this (you can paste this in or just remove the demo.sphinx from the first part and the /demo/sphinx from the second part of the middle line):
<component name="jsgfGrammar" type="edu.cmu.sphinx.jsapi.JSGFGrammar">
<property name="dictionary" value="dictionary"/>
<property name="grammarLocation"
value="resource:/wavfile.WavFile!/wavfile/"/>
<property name="grammarName" value="digits"/>
<property name="logMath" value="logMath"/>
</component>
Save the file (ctrl-s). Now you’re ready to run sphinx and recognize some simple speech.
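A note on why that edit matters. Judging by the ClassNotFoundException for wavfile.WavFile that appears when this value is wrong, the part before the ! seems to be treated as a class name and the part after it as a path looked up relative to that class. The sketch below only illustrates that reading: ResourceLocation is a made-up class, and the pattern is my assumption about the format, not something taken from the Sphinx documentation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of how a "resource:/Class!/path" value splits into two parts. */
final class ResourceLocation {
    // Assumed format: "resource:/" + fully qualified class name + "!" + resource path.
    private static final Pattern FORMAT = Pattern.compile("resource:/([.\\w]+)!(.*)");

    final String className;
    final String path;

    private ResourceLocation(String className, String path) {
        this.className = className;
        this.path = path;
    }

    static ResourceLocation parse(String value) {
        Matcher m = FORMAT.matcher(value);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a resource:/Class!/path value: " + value);
        }
        return new ResourceLocation(m.group(1), m.group(2));
    }

    /** True when the named class can be loaded from the current classpath. */
    boolean classIsLoadable() {
        try {
            Class.forName(className, false, ResourceLocation.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ResourceLocation loc = parse("resource:/wavfile.WavFile!/wavfile/");
        System.out.println("class: " + loc.className + ", path: " + loc.path
                + ", loadable: " + loc.classIsLoadable());
    }
}
```

If that reading is right, the class name in the value has to match the package your WavFile.java actually lives in, which is why the demo.sphinx prefix has to go once the file sits in src/wavfile.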
Go ahead and open up the wav file named 12345.wav and listen to it. Notice that the spoken words are just that: one two three four five. If all goes well, that’s what sphinx should recognize.
You can run the program by right-clicking on the WavFile.java file (located in the src/wavfile folder)->Run As->Java Application. This should run the recognizer. After a few seconds, you’ll see the text “one two three four five” show up in the Console portion of eclipse. If you got that far, nice work. You were able to perform some speech recognition with sphinx.

{ 61 } Comments
Good work! I was able to get the WavFile working on Linux using NetBeans.
At first, I ran into a NullPointerException, choking the program right there. This line:
URL configURL = WavFile.class.getResource("config.xml");
was returning null, because it couldn’t actually find the “config.xml” file in the directory my WavFile.class compiled class file was being run from. This may just be a NetBeans idiosyncrasy. I’m not sure why the .wav and other files ended up being placed in a folder different from the .class file.
I figured out where NetBeans was placing everything after the compile, and I changed the package declaration at the top of the WavFile.java file to read:
package wavfile;
This placed the .class file in the same directory as the config.xml file, and voila! “one two three four five”
I’m glad to hear that things worked out. Something about the working path in NetBeans was probably the culprit. When you type WavFile.class it references from the folder where the class file is located, so the config.xml file should have been in there, but you knew that. It was probably putting class files into a compiled folder and not moving the config.xml because it wasn’t importing anything from it. I’m glad to hear you got it working.
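Since Class.getResource returns null rather than throwing when the file is not there, it can help to fail early with a readable message instead of a NullPointerException deeper in the stack. A minimal sketch, with ConfigLookup being a name I made up for illustration:

```java
import java.net.URL;

/** Hypothetical helper: looks up a resource next to a class and fails with a clear message. */
final class ConfigLookup {
    static URL require(Class<?> owner, String name) {
        // A relative name is resolved against the package folder of the owner class.
        URL url = owner.getResource(name);
        if (url == null) {
            throw new IllegalStateException("Cannot find '" + name + "' relative to "
                    + owner.getName() + "; check that it was copied to the compiled output folder");
        }
        return url;
    }
}
```

In the demo you would call it as URL configURL = ConfigLookup.require(WavFile.class, "config.xml"); and read the message if the file did not travel with the .class file.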
I got “one two three four five”.
Thanks a lot!
Right now, I want to integrate Asterisk, Festival, and Sphinx. Any suggestion?
Integrating those three things is no easy task! I know little about Festival and even less about Asterisk. I did download the latter. Neither looks too hard to get running alone, but since they are written in different programming languages, the integration of the three will be tricky. Give me some time to get to know Asterisk and Festival and I’ll get back to you.
You right!
Special sphinx4+Asterisk.
Now I am learning Perl.
hi,
it was a nice tutorial, thanks for uploading.
I just want to know whether any of you has successfully integrated Asterisk with Sphinx 4.
Please help me do that; I would be very thankful.
Regards,
I got this error. Can anyone help me fix it?
kundan@dev-desktop:~/wavefile$ java FileRecognizer 12345.wav
file:/home/kundan/wavefile/12345.wav
URL:file:/home/kundan/wavefile/config.xml
Loading Recognizer…
cm:edu.cmu.sphinx.util.props.ConfigurationManager@1
Recognise: Recognizer: recognizer State: Deallocated
Recognizer : Recognizer: recognizer State: Ready
reader: null
Exception in thread “main” java.lang.NoClassDefFoundError: com/jcraft/jogg/SyncState
at org.tritonus.sampled.file.jorbis.JorbisAudioFileReader.getAudioFileFormat(JorbisAudioFileReader.java:73)
at org.tritonus.share.sampled.file.TAudioFileReader.getAudioInputStream(TAudioFileReader.java:366)
at org.tritonus.share.sampled.file.TAudioFileReader.getAudioInputStream(TAudioFileReader.java:283)
at javax.sound.sampled.AudioSystem.getAudioInputStream(AudioSystem.java:1128)
at FileRecognizer.main(FileRecognizer.java:101)
Caused by: java.lang.ClassNotFoundException: com.jcraft.jogg.SyncState
at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
at sun.misc.Launcher$ExtClassLoader.findClass(Launcher.java:229)
at java.lang.ClassLoader.loadClass(ClassLoader.java:303)
at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:316)
… 5 more
My code is:
/*
* Copyright 1999-2004 Carnegie Mellon University.
* Portions Copyright 2004 Sun Microsystems, Inc.
* Portions Copyright 2004 Mitsubishi Electric Research Laboratories.
* All Rights Reserved. Use is subject to license terms.
*
* See the file “license.terms” for information on usage and
* redistribution of this file, and for a DISCLAIMER OF ALL
* WARRANTIES.
*
*/
import edu.cmu.sphinx.frontend.util.StreamDataSource;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import edu.cmu.sphinx.util.props.PropertyException;
import java.io.File;
import java.io.IOException;
import java.net.URL;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
/**
* A simple Sphinx-4 application that decodes a .WAV file containing
* connnected-digits audio data. The audio format
* itself should be PCM-linear, with the sample rate, bits per sample,
* sign and endianness as specified in the config.xml file.
* “${file_prompt}”
*
* Set up the default eclipse jre to be the one included
*
* Classpath lib order
* JRE System Lib (custom 5.x version)
* jl1.0.jar
* tritonus.jar
* tritonus_share.jar
* tritonus_remaining.jar
* tritonus_mp3.jar
* sphinx4.jar
* tools.jar
* jsapi.jar
* junit4.1.jar
* javalayer.jar
* corpora
* sphinx4
*
*/
public class FileRecognizer {
/**
* Main method for running the WavFile demo.
*
*
*/
public boolean convertedFile = false;
public static void main(String[] args) {
try {
URL audioFileURL;
if (args.length > 0) {
audioFileURL = new File(args[0]).toURI().toURL();
System.out.println(audioFileURL);
} else {
//if the ${file_prompt} isn’t in the program arguments, it’ll go with this:
audioFileURL = FileRecognizer.class.getResource("");
}
URL configURL = FileRecognizer.class.getResource("config.xml");
System.out.println("URL:"+configURL);
System.out.println("Loading Recognizer…\n");
ConfigurationManager cm = new ConfigurationManager(configURL);
System.out.println("cm:"+cm);
Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
System.out.println("Recognise: "+recognizer);
/* allocate the resource necessary for the recognizer */
recognizer.allocate();
System.out.println("Recognizer : "+recognizer);
// System.out.println("Decoding " + audioFileURL.getFile());
// System.out.println(AudioSystem.getAudioFileFormat(audioFileURL));
StreamDataSource reader = (StreamDataSource) cm.lookup("streamDataSource");
System.out.println("reader: "+reader);
AudioInputStream ais = AudioSystem.getAudioInputStream(audioFileURL);
System.out.println("ais: "+ais);
FileRecognizer wavFile = new FileRecognizer();
System.out.println(wavFile);
// Convert it to the proper format
AudioFormat targetFormat =
new AudioFormat(16000f,
16, // sample size in bits
1, // mono
true, // signed
true);
System.out.println(targetFormat);
//new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 16000, 16, 1, 2, 16000, false);
AudioInputStream convertedAis = wavFile.convertAudioInputStream(ais, targetFormat);
File newFile = null;
if (wavFile.convertedFile)
{
newFile = wavFile.writeConvertedFile(convertedAis, audioFileURL.toString());
audioFileURL = newFile.toURI().toURL();
ais = AudioSystem.getAudioInputStream(audioFileURL);
}
/* set the stream data source to read from the audio file */
reader.setInputStream(ais, audioFileURL.getFile());
/* decode the audio file */
Result result = recognizer.recognize();
/* print out the results */
if (result != null) {
System.out.println("\nRESULT: " +
result.getBestFinalResultNoFiller() + "\n");
} else {
System.out.println("Result: null\n");
}
if (newFile != null)
newFile.delete();
} catch (IOException e) {
System.err.println("Problem when loading WavFile: " + e);
e.printStackTrace();
} catch (PropertyException e) {
System.err.println("Problem configuring WavFile: " + e);
e.printStackTrace();
}
// catch (InstantiationException e) {System.err.println("Problem creating WavFile: " + e); e.printStackTrace();}
catch (UnsupportedAudioFileException e) {
System.err.println("Audio file format not supported: " + e);
e.printStackTrace();
}
}
private AudioInputStream convertAudioInputStream(AudioInputStream sourceAis, AudioFormat targetFormat) {
AudioFormat baseFormat = sourceAis.getFormat();
AudioFormat intermediateFormat;
AudioInputStream convertedAis = sourceAis;
// First convert the encoding, if necessary
if (!baseFormat.getEncoding().equals(targetFormat.getEncoding())) {
intermediateFormat = new AudioFormat(
targetFormat.getEncoding(),
baseFormat.getSampleRate(), baseFormat.getSampleSizeInBits(), baseFormat.getChannels(),
baseFormat.getChannels() * 2, baseFormat.getSampleRate(),
false);
convertedAis = AudioSystem.getAudioInputStream(intermediateFormat, sourceAis);
//this.writeConvertedFile(convertedAis, “C:\\encoding.wav”);
baseFormat = intermediateFormat;
sourceAis = convertedAis;
convertedFile = true;
}
// Then convert the sample rate
if (baseFormat.getSampleRate() != targetFormat.getSampleRate()) {
intermediateFormat = new AudioFormat(
baseFormat.getEncoding(),
targetFormat.getSampleRate(), baseFormat.getSampleSizeInBits(), baseFormat.getChannels(),
baseFormat.getChannels() * 2, targetFormat.getSampleRate(),
false);
convertedAis = AudioSystem.getAudioInputStream(intermediateFormat, sourceAis);
//this.writeConvertedFile(convertedAis, “C:\\sample.wav”);
baseFormat = intermediateFormat;
sourceAis = convertedAis;
convertedFile = true;
}
// Then convert the number of channels
if (baseFormat.getChannels() > targetFormat.getChannels()) {
intermediateFormat = new AudioFormat(
baseFormat.getEncoding(),
baseFormat.getSampleRate(), baseFormat.getSampleSizeInBits(), targetFormat.getChannels(),
targetFormat.getChannels() * 2, baseFormat.getSampleRate(),
false);
convertedAis = AudioSystem.getAudioInputStream(intermediateFormat, sourceAis);
//this.writeConvertedFile(convertedAis, “C:\\channels.wav”);
baseFormat = intermediateFormat;
sourceAis = convertedAis;
convertedFile = true;
}
return convertedAis;
}
private File writeConvertedFile(AudioInputStream sourceAis, String fileName)
{
File tempfile = null;
fileName = "tempwavfile.wav";
//fileName = fileName.substring(6, fileName.length()-4) + “_new.wav”;
try
{
//This just takes an audio stream, writes it to disk, then plays it the way TALL usually does.
//it’s a test to see if the input stream is readable by the Java audio providers like Tritonus
//System.out.println(fileName);
tempfile = new File(fileName);
AudioSystem.write(sourceAis, AudioFileFormat.Type.WAVE, tempfile);
}
catch (Exception e)
{
System.out.println(e);
}
return tempfile;
}
}
The exception looks like an audio provider problem: your Ogg Vorbis audio provider jar (the one containing com.jcraft.jogg) probably isn’t on your classpath. Add that (link to the site in the MP3s in Sphinx post) and you should be good to go.
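A quick way to confirm that kind of problem is to ask the class loader directly whether the class from the stack trace can be found. This is a sketch only; ClasspathProbe is a made-up name, and the two class names are simply the ones that appear in the stack trace and imports above.

```java
/** Hypothetical helper: reports whether classes can be found on the current classpath. */
final class ClasspathProbe {
    static boolean isAvailable(String className) {
        try {
            // 'false' means: locate the class but do not run its static initializers.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] needed = {
            "com.jcraft.jogg.SyncState",
            "edu.cmu.sphinx.recognizer.Recognizer"
        };
        for (String name : needed) {
            System.out.println(name + " -> " + (isAvailable(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath you use for FileRecognizer. Anything reported as MISSING points at a jar that still has to be added.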
i tried configuring it. but not working.
it gives a error like this
java.lang.ExceptionInInitializerError
Caused by: java.lang.RuntimeException: Uncompilable source code – package edu.cmu.sphinx.frontend.util does not exist
at wavfile.WavFile.(WavFile.java:15)
Could not find the main class: wavfile.WavFile. Program will exit.
Exception in thread “main” Java Result: 1
BUILD SUCCESSFUL (total time: 0 seconds)
pls help me if you can
now my error is in
Exception in thread “main” java.lang.Error: Unresolved compilation problem:
at wavfile.WavFile.main(WavFile.java:28)
can someone help me out
thanks.. its working after one day of configuration..
I want to know how to extend it to words. Only numbers are being recognized, and only about 4 out of 20 numbers are correct.
Hey, I did as you said above but I’m having a problem: the jsapi.exe file runs in Eclipse but the jsapi.jar file does not appear. If I copy the jar file from lib (by extracting the jar file there in the lib folder and then copying it into Eclipse), it does not get added to the build path. Can you please help me out with this? I know very little about this.
First, you just run the jsapi.exe file to extract the jar. Then you put the jar in a folder somewhere in your eclipse project. Then in eclipse, you may need to right-click on the folder you put it in and hit “refresh”. Then go to project->properties->build path->and add the jar.
Are there any changes that we have to make in WavFile.java? It’s showing me an error there:
Description The declared package “demo.sphinx.wavfile” does not match the expected package “wavfile”
Resource WavFile.java
Path /speechreg/src/wavfile
Location line 13
Type Java Problem
This just means that the package declared in WavFile.java is different from the folder the file sits in. Change the package line at the top of the file from demo.sphinx.wavfile to wavfile and it might work.
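The rule Eclipse is enforcing is that the package name has to mirror the folder path below the source root. A small sketch of that mapping (PackageFromPath is a hypothetical name, and the paths in main are just the ones from this thread):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: derives the package a .java file must declare from where it sits. */
final class PackageFromPath {
    static String expectedPackage(Path sourceRoot, Path javaFile) {
        Path parent = sourceRoot.relativize(javaFile).getParent();
        if (parent == null) {
            return ""; // file sits directly in the source root: default package
        }
        StringBuilder name = new StringBuilder();
        for (Path part : parent) {
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(part.toString());
        }
        return name.toString();
    }

    public static void main(String[] args) {
        // With the layout from this post, the file is at src/wavfile/WavFile.java.
        System.out.println(expectedPackage(Paths.get("src"), Paths.get("src", "wavfile", "WavFile.java")));
    }
}
```

So a file at src/wavfile/WavFile.java needs the line package wavfile; at the top, while the original demo location would need package demo.sphinx.wavfile;.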
Thanks for your help. As you said, I tried removing the demo.sphinx and keeping wavfile as it is, and it still hasn’t worked.
I have done the changes in the configuration as above but still after running I am getting this exception please help me out..
Exception in thread “main” java.lang.RuntimeException: java.io.FileNotFoundException: D:\sphinx4-1.0beta3-src\speech\config.xml (The system cannot find the file specified)
at edu.cmu.sphinx.util.props.ConfigurationManager.(ConfigurationManager.java:61)
at edu.cmu.sphinx.demo.wavfile.WavFile.main(WavFile.java:40)
Nice job, but after doing all steps Im getting this error message:
Exception in thread “main” Property Exception component:’grammarLocation’ property:’grammarLocation’ – Can’t locate resource:/wavfile.WavFile
edu.cmu.sphinx.util.props.InternalConfigurationException: java.lang.ClassNotFoundException: wavfile.WavFile
…..
Im newbie to eclipse and sphinx so it is possible Im missing something, Ive checked file locations and everything seems to be in right place.
Thx for any advice!
pradeep:
now my error is in:
Exception in thread “main” java.lang.Error: Unresolved compilation problem:
at wavfile.WavFile.main(WavFile.java:28)
Please, help me.. how you fix your problem??
I’m sorry.. already fixed my problem.. into conf.xml had a location wrong
resource:/wavefile.WavFile!/wavfile/” //
resource:/wavfile.WavFile!/wavfile/” // ok
@naxo
is it working now. it should work. but have to learn how to make acoustic model. otherwise the accuracy is pretty low
Hi, Anybody found how to integrate sphinx-4 with asterisk ?
Thanks!!!!
hi,
i don’t find sphinx4.jar
Check the CMU sphinx website, it has more up-to-date information and jars. I have to admit that my post on sphinx4 is somewhat out of date. I hope to post an update soon.
Hi, i got the following error:
Problem when loading WavFile: java.io.IOException: Error while parsing line 102 of file:/D:/Books%20&%20Notes/MCA/SEM%205/Project/sphinx4/bin/wavfile/config.xml: Open quote is expected for attribute “{1}” associated with an element type “name”.
java.io.IOException: Error while parsing line 102 of file:/D:/Books%20&%20Notes/MCA/SEM%205/Project/sphinx4/bin/wavfile/config.xml: Open quote is expected for attribute “{1}” associated with an element type “name”.
at edu.cmu.sphinx.util.props.SaxLoader.load(SaxLoader.java:70)
at edu.cmu.sphinx.util.props.ConfigurationManager.loader(ConfigurationManager.java:383)
at edu.cmu.sphinx.util.props.ConfigurationManager.(ConfigurationManager.java:115)
at wavfile.WavFile.main(WavFile.java:61)
What could be wrong?I really need this for a project.Please help.
Look at your config xml file. It looks like something in there on line 102 that is causing the error.
My mistake,i didn’t correct the quotes aftr copying, it worked.thankyou
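For anyone else who pastes XML from a web page: typographic (curly) quotes are not valid attribute delimiters, and they produce exactly this kind of parser error. Below is a rough sketch of a cleanup step. QuoteFixer is a made-up name, and I am assuming the config file is UTF-8 encoded.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: replaces typographic quotes with plain ASCII quotes. */
final class QuoteFixer {
    static String straighten(String text) {
        return text
                .replace('\u201C', '"')   // left double quotation mark
                .replace('\u201D', '"')   // right double quotation mark
                .replace('\u2018', '\'')  // left single quotation mark
                .replace('\u2019', '\''); // right single quotation mark
    }

    static boolean hasCurlyQuotes(String text) {
        return !straighten(text).equals(text);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to config.xml; assumed to be UTF-8
        String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        if (hasCurlyQuotes(xml)) {
            Files.write(file, straighten(xml).getBytes(StandardCharsets.UTF_8));
            System.out.println("Replaced curly quotes in " + file);
        } else {
            System.out.println("No curly quotes found in " + file);
        }
    }
}
```

Keep a copy of the file before running it, since it rewrites the file in place.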
THANKS A LOT!
i’ve been searchin from a long time how to make it work,its working now
I know this is a fairly old thread but I REALLY need help getting this error fixed. Thank you for any and all help in advance
Exception in thread “main” java.lang.NullPointerException
at edu.cmu.sphinx.util.props.SaxLoader.load(SaxLoader.java:74)
at edu.cmu.sphinx.util.props.ConfigurationManager.(ConfigurationManager.java:58)
at robottest.Robottest.main(Robottest.java:28)
74: InputStream is = url.openStream();
58: rawPropertyMap = new SaxLoader(url, globalProperties).load();
28: cm = new ConfigurationManager(Robottest.class.getResource("robottest.config.xml"));
s/n: I am using netbeans to build and compile and my configuration file is there.
@msmoon – Try using an absolute path to your config file. It’s probably a netbeans relative path issue.
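One way to try the absolute path is to build the URL from a File rather than from getResource, checking that the file exists first. This is a sketch under that assumption; AbsoluteConfig is a hypothetical name.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

/** Hypothetical helper: turns a file system path into a URL, failing early if the file is absent. */
final class AbsoluteConfig {
    static URL toUrl(String path) {
        File file = new File(path).getAbsoluteFile();
        if (!file.isFile()) {
            throw new IllegalArgumentException("No config file at " + file);
        }
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Cannot turn " + file + " into a URL", e);
        }
    }
}
```

The result can be passed to the ConfigurationManager constructor in place of the getResource call, for example with the full path to robottest.config.xml in your NetBeans project folder.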
Thanks for the fast response. Unfortunately that did not correct the error.
/home/dekita/NetBeansProjects/robottest/build.xml:13: The following error occurred while executing this line:
/home/dekita/NetBeansProjects/robottest/robottest.config.xml:11: Unexpected element “{}config” {antlib:org.apache.tools.ant}config
13:
11:
?????
I even tried spelling out the word configuration, taking out the configuration.
13:
11:
sorry, it didnt show up(tried to copy and paste from virtual box to Mac)
Thank you again!
line 13 is the
and line 11 is the
ok, one more time..I apologize again for this 13 is import and 11 is just the config line
Any chance you can paste in the lines in your build.xml files? Paste in the surrounding lines, as well.
Hi, I’m doing a project on Sphinx 4 with NetBeans. My project recognizes two numbers, and when I say “next” it adds that number to the text area. My problem is that it keeps recognizing continuously even when I’m not saying anything. Please help me with the config file; I am using wsjLoader. Please reply soon.
Romika, I sent you an email, but the short answer is that there is a while loop that just picks up microphone input continuously. You need to remove the while loop. For an example, look in HelloWorld.java.
Hi, thanks for the answer, but it does not work. My problem is that I have to speak two numbers first and then say “next” to display those two numbers as one whole number in the text area, then again speak two numbers and say “next” to display that number in the text area, and so on. I am sending you my code; please look at it and reply soon because I have to demonstrate it tomorrow.
package demo.sphinx;
import java.awt.event.ActionEvent;
import javax.swing.JFrame;
import javax.swing.JTextField;
import javax.swing.JTextArea;
import edu.cmu.sphinx.frontend.util.Microphone;
import edu.cmu.sphinx.linguist.dictionary.Pronunciation;
import edu.cmu.sphinx.linguist.dictionary.Word;
import edu.cmu.sphinx.recognizer.Recognizer;
import edu.cmu.sphinx.result.Result;
import edu.cmu.sphinx.util.props.ConfigurationManager;
import edu.cmu.sphinx.util.props.PropertyException;
import java.awt.ComponentOrientation;
import java.awt.Dimension;
import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.GridLayout;
import java.awt.event.ActionEvent;
import javax.swing.DefaultListModel;
import javax.swing.JList;
import javax.swing.JLabel;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JOptionPane;
import javax.swing.JPanel;
import javax.swing.JTabbedPane;
import javax.swing.JScrollPane;
import javax.swing.JSplitPane;
import java.awt.event.ActionListener;
import java.io.IOException;
import java.util.ArrayList;
import javax.swing.JTable;
import javax.swing.JTextArea;
public class Testing extends JFrame implements ActionListener{
JList ul,ll,fi;
DefaultListModel d1,d2,d3;
JLabel msg,lblul,lblfi;
JButton btnst,btnso,btnclr;
JTable tb1;
JTextArea ta;
JPanel mainpane,mainpane1,fc1,h1,fp;
JSplitPane sp,sp1;
JTabbedPane main;
JScrollPane listll;
Recognizer recognizer;
ConfigurationManager cm;
Result result;
String resultText;
Testing()
{
mainpane=new JPanel();
mainpane1=new JPanel();
main=new JTabbedPane();
sp=new JSplitPane();
btnst=new JButton("Start");
btnso=new JButton("Stop");
btnclr=new JButton("Show");
btnst.addActionListener(this);
btnso.addActionListener(this);
btnclr.addActionListener(this);
ta=new JTextArea(10,10);
ta.setEnabled(true);
ta.setEditable(true);
sp1=new JSplitPane(JSplitPane.VERTICAL_SPLIT);
add(sp);
sp.setLeftComponent(sp1);
sp.setRightComponent(main);
sp1.setTopComponent(mainpane);
sp1.setBottomComponent(mainpane1);
mainpane1.add(new JLabel("HELPSpeak :Start to start the Microphone."
+ "L For Lower Limit.U For Upper Limit.F for Frequency."
+ "N for Next.C for Clear."
+ "Stop to stop the Microphone."));
mainpane1.add(ta);
mainpane.setLayout(new GridBagLayout());
GridBagConstraints c = new GridBagConstraints();
d1=new DefaultListModel();
d2=new DefaultListModel();
d3=new DefaultListModel();
ul = new JList(d2);
ll = new JList(d1);
fi = new JList(d3);
JScrollPane listul = new JScrollPane(ul);
listll = new JScrollPane(ll);
JScrollPane listfi = new JScrollPane(fi);
listul.setPreferredSize(new Dimension(60, 200));
listll.setPreferredSize(new Dimension(60, 200));
listfi.setPreferredSize(new Dimension(60, 200));
c.gridx = 0;
c.gridy = 0;
mainpane.add(new JLabel("Lower Limit "),c);
c.gridx = 1;
c.gridy = 0;
mainpane.add(new JLabel("Upper Limit "),c);
c.gridx = 2;
c.gridy = 0;
mainpane.add(new JLabel("Frequency"),c);
c.gridx = 0;
c.gridy = 1;
mainpane.add(listll,c);
c.gridx = 1;
c.gridy = 1;
mainpane.add(listul,c);
c.gridx = 2;
c.gridy = 1;
mainpane.add(listfi,c);
c.gridx = 0;
c.gridy = 2;
mainpane.add(btnst,c);
c.gridx = 1;
c.gridy = 2;
mainpane.add(btnso,c);
c.gridx = 2;
c.gridy = 2;
mainpane.add(btnclr,c);
}
public void actionPerformed(ActionEvent ae)
{
String str=ae.getActionCommand();
String digit="";
if(str.equals("Start"))
{
ta.append("hi");
}
//ta.append(digit);
if(str.equals("Show"))
{
btnclr.setEnabled(false) ;
main.addTab("Histogram",new Histogram());
main.addTab("Frequency Polygon",new FrequencyPolygon());
}
}
}
public void Voice()
{
try{
ConfigurationManager cm = new ConfigurationManager(Testing.class.getResource("gui.config.xml"));
Recognizer recognizer = (Recognizer) cm.lookup("recognizer");
Microphone microphone = (Microphone) cm.lookup("microphone");
recognizer.allocate();
if (microphone.startRecording()) {
while(true){
Result result = recognizer.recognize();
if (result != null) {
String resultText = result.getBestFinalResultNoFiller();
ta.append("You said: " + resultText + "\n");
} else {
ta.append("I can’t hear what you said.\n");
}
}
} else {
ta.append("Cannot start microphone.");
recognizer.deallocate();
System.exit(1);
}
} catch (IOException e) {
System.err.println("Problem when loading HelloWorld: " + e);
e.printStackTrace();
} catch (PropertyException e) {
System.err.println("Problem configuring HelloWorld: " + e);
e.printStackTrace();
} catch (InstantiationException e) {
System.err.println("Problem creating HelloWorld: " + e);
e.printStackTrace();
}
}
public static void main(String[] args)
{
Testing t=new Testing();
t.setVisible(true);
t.setSize(500,500);
t.setTitle("Statistical Voice Calculator for Graph Plotting");
t.setExtendedState(JFrame.MAXIMIZED_BOTH);
t.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
t.Voice();
}
}
I would need your config.xml file and your grammar. You should just be using a grammar that has all the numbers separated by |. That will restrict the recognized input to numbers.
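Once the grammar only lets digit words through, turning what was heard into one number is plain string handling. Here is a minimal sketch, assuming the recognizer hands back the words separated by spaces (for example “one two”). DigitWords is a made-up name and nothing in it comes from Sphinx.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper: converts recognized digit words such as "one two" into "12". */
final class DigitWords {
    private static final Map<String, Character> DIGITS = new HashMap<String, Character>();

    static {
        String[] words = {"zero", "one", "two", "three", "four",
                          "five", "six", "seven", "eight", "nine"};
        for (int i = 0; i < words.length; i++) {
            DIGITS.put(words[i], (char) ('0' + i));
        }
    }

    /** Returns the digits as text so that a leading zero is not lost. */
    static String toDigits(String recognized) {
        StringBuilder out = new StringBuilder();
        for (String word : recognized.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens when the input is empty or only whitespace
            }
            Character digit = DIGITS.get(word);
            if (digit == null) {
                throw new IllegalArgumentException("Not a digit word: " + word);
            }
            out.append(digit.charValue());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDigits("ONE TWO"));
    }
}
```

Command words such as NEXT would be handled separately, before the text reaches toDigits, since the sketch rejects anything that is not a digit word.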
hi,i’m sending you the config and grammar files you asked for.please reply soon
gui.config.xml
accuracyTracker
speedTracker
memoryTracker
microphone
premphasizer
windower
fft
melFilterBank
dct
liveCMN
featureExtraction
microphone
speechClassifier
speechMarker
nonSpeechDataFilter
premphasizer
windower
fft
melFilterBank
dct
liveCMN
featureExtraction
grammar file:=
#JSGF V1.0;
/**
* JSGF Grammar for Hello World example
*/
grammar Test_1;
public = ( NEXT | BACK | CLEAR | CURVE | HISTOGRAM | UPPER | LOWER | FREQUENCY | SHOW );
public = ( ONE | TWO | THREE | FOUR | FIVE | SIX | SEVEN | EIGHT | NINE | ZERO );
I still think it’s a problem with the while loop. It will recognize speech until there is an apparent pause, then it transcribes that, then does it all over again. Can you add some action events to your code that start the microphone, with another button to turn it off when you are done? That’s the way it’s usually done.
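The start/stop idea can be sketched without any Sphinx classes. In the code below, RecognitionLoop is a hypothetical class, and the Supplier stands in for a single blocking recognition call (something that wraps recognizer.recognize() and returns the text); the Consumer is whatever should receive each result.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch: a recognition loop that buttons can start and stop. */
final class RecognitionLoop {
    private final Supplier<String> recognizeOnce; // stand-in for one recognize() call
    private final Consumer<String> onResult;      // receives each recognized text
    private final AtomicBoolean running = new AtomicBoolean(false);
    private volatile Thread worker;

    RecognitionLoop(Supplier<String> recognizeOnce, Consumer<String> onResult) {
        this.recognizeOnce = recognizeOnce;
        this.onResult = onResult;
    }

    /** Call from the Start button. Does nothing if the loop is already running. */
    void start() {
        if (!running.compareAndSet(false, true)) {
            return;
        }
        Thread t = new Thread(() -> {
            // Runs off the GUI thread, so the window stays responsive.
            while (running.get()) {
                String text = recognizeOnce.get();
                if (running.get() && text != null && !text.isEmpty()) {
                    onResult.accept(text);
                }
            }
        }, "recognition-loop");
        t.setDaemon(true);
        worker = t;
        t.start();
    }

    /** Call from the Stop button. The loop ends after the current recognition call returns. */
    void stop() {
        running.set(false);
        Thread t = worker;
        if (t != null && t != Thread.currentThread()) {
            try {
                t.join(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    boolean isRunning() {
        return running.get();
    }
}
```

In a Swing program the Start and Stop buttons would call start() and stop() from actionPerformed, and the result handler should hand the text to the text area through SwingUtilities.invokeLater, because the loop runs on its own thread.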
Hi, thanks for the reply. Can you send me the code for that? We tried to do exactly that but it didn’t work out. We tried all possible ways but none is working. My problem is that if I don’t say anything it automatically takes 8, and it doesn’t recognize 6 and 7. We even tried removing the while loop, but then it recognizes only a single digit.
hi,
I’ve tried to execute the program but I’m getting the following errors. Please help me.
Loading Recognizer as defined in ‘file:/C:/Program%20Files/Java/speechtotext/sphinx4/bin/wavfile/config.xml’…
Exception in thread “main” java.lang.NoSuchFieldError: engineListeners
at com.sun.speech.engine.recognition.BaseRecognizer.fireRecognizerSuspended(BaseRecognizer.java:922)
at com.sun.speech.engine.recognition.BaseRecognizer.dispatchSpeechEvent(BaseRecognizer.java:1262)
at com.sun.speech.engine.SpeechEventUtilities.postSpeechEvent(SpeechEventUtilities.java:201)
at com.sun.speech.engine.SpeechEventUtilities.postSpeechEvent(SpeechEventUtilities.java:132)
at com.sun.speech.engine.recognition.BaseRecognizer.postRecognizerSuspended(BaseRecognizer.java:912)
at com.sun.speech.engine.recognition.BaseRecognizer.commitChanges(BaseRecognizer.java:358)
at edu.cmu.sphinx.jsapi.JSGFGrammar.commitChanges(JSGFGrammar.java:536)
at edu.cmu.sphinx.jsapi.JSGFGrammar.createGrammar(JSGFGrammar.java:243)
at edu.cmu.sphinx.linguist.language.grammar.Grammar.allocate(Grammar.java:101)
at edu.cmu.sphinx.linguist.flat.FlatLinguist.allocate(FlatLinguist.java:229)
at edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager.allocate(SimpleBreadthFirstSearchManager.java:603)
at edu.cmu.sphinx.decoder.AbstractDecoder.allocate(AbstractDecoder.java:67)
at edu.cmu.sphinx.recognizer.Recognizer.allocate(Recognizer.java:157)
at wavfile.WavFile.main(WavFile.java:46)
It could mean that “engineListeners” is not defined in your config.xml file. Can you search the text of your config file and find engineListeners? You can look at some of the demos in sphinx for some other sample config.xml files.
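A NoSuchFieldError can also come from the classpath rather than the config: Java throws it when a class refers to a field that the loaded version of another class does not have, which usually means two jars come from different releases. A small probe can show which jar each class in the stack trace was loaded from. This is a sketch; JarLocator is a made-up name, and the class names in main are copied from the stack trace above.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports where a class was loaded from. */
final class JarLocator {
    static String whereIs(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(JDK runtime)"; // classes that ship with Java report no code source
        }
        return source.getLocation().toString();
    }

    static String whereIs(String className) {
        try {
            return whereIs(Class.forName(className, false, JarLocator.class.getClassLoader()));
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "com.sun.speech.engine.recognition.BaseRecognizer",
            "edu.cmu.sphinx.jsapi.JSGFGrammar"
        };
        for (String name : names) {
            System.out.println(name + " <- " + whereIs(name));
        }
    }
}
```

If the two classes turn out to come from jars of different Sphinx downloads, replacing the whole lib folder with the one from a single release is the first thing I would try.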
hi,
this is our config.xml file for wavFile demo
audioFileDataSource
preemphasizer
dither
windower
fft
melFilterBank
dct
batchCMN
featureExtraction
we didn’t find the engine listeners
Okay, the problem might be in the Java code that is getting the elements from the .xml file. I would suggest modeling your code after the latest sphinx4 code. In fact, get the latest sphinx4 with subversion, then run the edu.cmu.sphinx.demo.hellongram.HelloNGram file. That shows you how to access a microphone directly and how the config xml file should be.
Hi,
we have tried using sphinx4-.0-beta6 version and executed the hellongram program, but we are getting the following error
Exception in thread “main” Property exception component:’trigramModel’ property:’location’ – Can’t locate resource:/edu/cmu/sphinx/demo/hellongram/hellongram.trigram.lm
edu.cmu.sphinx.util.props.InternalConfigurationException: Can’t locate resource:/edu/cmu/sphinx/demo/hellongram/hellongram.trigram.lm
at edu.cmu.sphinx.util.props.ConfigurationManagerUtils.getResource(ConfigurationManagerUtils.java:483)
at edu.cmu.sphinx.linguist.language.ngram.SimpleNGramModel.newProperties(SimpleNGramModel.java:93)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.linguist.lextree.LexTreeLinguist.newProperties(LexTreeLinguist.java:311)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.search.WordPruningBreadthFirstSearchManager.newProperties(WordPruningBreadthFirstSearchManager.java:204)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.decoder.AbstractDecoder.newProperties(AbstractDecoder.java:65)
at edu.cmu.sphinx.decoder.Decoder.newProperties(Decoder.java:37)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.PropertySheet.getComponent(PropertySheet.java:287)
at edu.cmu.sphinx.recognizer.Recognizer.newProperties(Recognizer.java:90)
at edu.cmu.sphinx.util.props.PropertySheet.getOwner(PropertySheet.java:505)
at edu.cmu.sphinx.util.props.ConfigurationManager.lookup(ConfigurationManager.java:161)
at hellongram.HelloNGram.main(HelloNGram.java:38)
Did you just download the jar, or check it out from svn? I didn’t have any problems checking it out directly in eclipse with svn. You can download the trigram language model (along with everything else that HelloNGram needs) directly from the svn repository:
https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/sphinx4/src/apps/edu/cmu/sphinx/demo/hellongram/
And you can use your own language model, just open the config xml file and change where it points to.
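The "Can’t locate resource" message has a resource: prefix, which suggests the location is looked up on the Java classpath rather than as a plain file path. That is my assumption here. Under that assumption, a small check like the one below can tell you whether the file is visible to your program at all. ResourceCheck is a made-up class for illustration, not something that ships with sphinx4.

```java
import java.net.URL;

// Hypothetical helper, not part of sphinx4: checks whether a resource is on the classpath.
class ResourceCheck {

    /** Drops a leading "resource:" so the rest can be used as a classpath path. */
    static String stripPrefix(String location) {
        String prefix = "resource:";
        return location.startsWith(prefix) ? location.substring(prefix.length()) : location;
    }

    /** Returns the classpath URL for an absolute resource path, or null if it is missing. */
    static URL find(String path) {
        return ResourceCheck.class.getResource(path);
    }

    public static void main(String[] args) {
        String location = args.length > 0
            ? args[0]
            : "resource:/edu/cmu/sphinx/demo/hellongram/hellongram.trigram.lm";
        String path = stripPrefix(location);
        URL url = find(path);
        System.out.println(url == null ? "NOT on classpath: " + path : "found at " + url);
        // Printing the classpath shows which folders and jars were searched.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
    }
}
```

If it reports that the file is not on the classpath, check that the folder or jar holding the language model is part of the project's build path. Resource names are matched case-sensitively inside jars, so the spelling in the config has to match the file name exactly.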
I just downloaded the sphinx4-1.0beta6 bin, unzipped it, and found the hellongram folder in the src folder.
It has these files:
HelloNGram.java, HelloNGram.config, HelloNGram.manifest, HelloNgram.test, HelloNgram.trigram.lm
Right, so follow the instructions on the CMU Sphinx site to run it:
http://cmusphinx.sourceforge.net/sphinx4/src/apps/edu/cmu/sphinx/demo/hellongram/README.html
I just downloaded it and it ran just fine. But this goes back to your original problem with your config. My suggestion is that you look at the HelloNGram.config file and use that as a basis for your config file.
What you described runs the jar in the bin directory, but how do I run the program in the src\apps folder?
For that, you need to get the source code. I suggest downloading eclipse, adding eclipse subversive (see previous post about eclipse) and then checking out the sphinx4 code from https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/sphinx4/. You can follow similar instructions as in this post. If you download and extract eclipse, run it, install subversive, then check out the project you will have all the sphinx4 code and you should be able to run the HelloNGram.java file directly from eclipse.
I tried it using Subversive, but I’m still getting the same error:
Exception in thread “main” Property exception component:’trigramModel’ property:’location’ – Can’t locate resource:/edu/cmu/sphinx/demo/hellongram/hellongram.trigram.lm
edu.cmu.sphinx.util.props.InternalConfigurationException: Can’t locate resource:/edu/cmu/sphinx/demo/hellongram/hellongram.trigram.lm
Hey, I got the program to run. It was a problem in config.xml.
Hi,
My program ran in Eclipse, but the speech is not recognised and there’s no output.
I want to create an application which takes a .wav (or any other standard audio file format) and converts it to text.
For speech recognition I have decided to use sphinx4. I have tried to run the demo Transcriber.jar provided with sphinx. It’s good, but it only works for a specific grammar (written in .gram and .gxml files). How can I develop a similar program for US English?
How do I proceed? Where do I get the language model for US English that can be used with Sphinx4?
Any step-by-step tutorial, blog post, or answer would be of great help.
You can either make your own, or get an existing one. You can make a simple trigram language model with the online tool by CMU: http://www.speech.cs.cmu.edu/tools/lmtool-adv.html
You can use SRILM, IRSTLM, or whatever else to train an n-gram language model on a corpus of English text and output it in ARPA format, which is readable by Sphinx. There are language models that go with the corresponding acoustic models that you can download (like Hub4 and WSJ). Those can be found at the bottom of the list here: http://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/
Make sure the pronunciation dictionaries, etc, are all there too. If you download the acoustic model and language model, then you should have everything you need. Just make sure the config.xml points to the right places.
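Before pointing the config at a language model, it can help to confirm the file really is in ARPA format. An ARPA file has a \data\ section that declares how many n-grams of each order it contains (lines like "ngram 1=7"), followed by sections such as \1-grams:. Here is a minimal sketch that reads only those declared counts. ArpaHeader is a name I made up for this example; it does not validate the n-gram entries themselves.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of sphinx4: reads the counts declared in an ARPA header.
class ArpaHeader {

    private static final Pattern COUNT = Pattern.compile("ngram\\s+(\\d+)\\s*=\\s*(\\d+)");

    /** Reads the \data\ section and returns n-gram order mapped to declared count. */
    static Map<Integer, Integer> readCounts(Reader source) {
        Map<Integer, Integer> counts = new TreeMap<>();
        boolean inData = false;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.equals("\\data\\")) {
                    inData = true;
                    continue;
                }
                if (!inData) {
                    continue; // skip any free-form text before \data\
                }
                Matcher m = COUNT.matcher(line);
                if (m.matches()) {
                    counts.put(Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2)));
                } else if (line.startsWith("\\")) {
                    break; // reached the first n-gram section, header is done
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the language model file
        Map<Integer, Integer> counts = readCounts(Files.newBufferedReader(Paths.get(args[0])));
        System.out.println(counts.isEmpty()
            ? "No \\data\\ header found, this may not be an ARPA file"
            : "Declared n-gram counts: " + counts);
    }
}
```

A trigram model should report counts for orders 1, 2, and 3. If the map comes back empty, the file is probably in a binary or otherwise different format.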