About

The Japanese word “bakuzen” means “vague” in English.

This site is a resource and how-to collection for NLP and computational linguistics. If you have requests or questions, please let me know and I’ll try to post answers as soon as possible.

The target audience is anyone who wants to get into NLP: linguists who want to work more on the computer side of things, and programmers who want to contribute to the open source NLP community. Posts vary in their level of detail, and some assume more computer knowledge than others. If you do have questions, post them in the comments area and I’ll answer them as soon as I can.

The primary languages I will keep adding material for on this site are Japanese, French, and German. If you have anything to contribute (links, how-tos), I’m happy to post it here.

2 Comments

  1. Meir:

    Hi,

    I tried to use the tutorial at http://www.bakuzen.com/?p=16 to teach Sphinx 3 Hebrew words. Unfortunately, it fails to recognize any of the words, even when I give the decoder the original training WAV file. Maybe you can help me here :)

    I would be thankful if you could look at http://dl.dropbox.com/u/344251/hebrew.zip. If you want to try it, place the contents of the zip under C:\sphinx\tutorial\hebrew\ and run prepare.cmd. The script runs all the prerequisites and finally tries to decode a WAV file. It yields 0% accuracy even though it is the same WAV file used to train the system.

    Thanks in advance,
    Meir

  2. admin:

    I left a reply for you in the comments section of http://www.bakuzen.com/?p=16.
