
Second Iteration: Word Recognition
Details

To train the KNN classifier on the MFCC, we treated each window of the MFCC as a separate data entry. To classify a word, we had the classifier predict a score for each window, added up the scores over all windows belonging to that word, and chose the label with the highest total as the most likely label. We also changed how we divided the training and testing data. For the following confusion matrix we trained with 40 samples of each digit from each person and verified with 10 samples of each digit from each person. The results compare favorably with those of the averaged spectrogram. The spectrogram has high accuracy for some digits but falls short for others, such as zero, which it identifies correctly only about 60% of the time. Using the MFCC for feature extraction, by contrast, gives 97.5% accuracy in the worst case.
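The per-window scoring described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the choice of k = 3, the Euclidean distance, and the toy two-dimensional "windows" are all assumptions, and the per-window score is taken here to be the fraction of the k nearest training windows carrying each label.

```python
import math
from collections import Counter

def knn_window_scores(train_windows, train_labels, window, k=3):
    """Score each label for one window as the fraction of its k nearest
    training windows (Euclidean distance) that carry that label."""
    order = sorted(range(len(train_windows)),
                   key=lambda i: math.dist(train_windows[i], window))
    votes = Counter(train_labels[i] for i in order[:k])
    return {label: count / k for label, count in votes.items()}

def classify_word(train_windows, train_labels, word_windows, k=3):
    """Sum the per-window scores over every window of a word and return
    the label with the highest total, together with the totals."""
    totals = Counter()
    for window in word_windows:
        totals.update(knn_window_scores(train_windows, train_labels, window, k))
    best = max(totals, key=totals.get)
    return best, dict(totals)

# Toy data (hypothetical): each "window" is a 2-element feature vector.
train_windows = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                 [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
train_labels = ["zero"] * 3 + ["one"] * 3

# One spoken word made of three windows; two of them lie near "zero".
word_windows = [[0.05, 0.05], [0.2, 0.1], [0.9, 0.9]]

label, totals = classify_word(train_windows, train_labels, word_windows, k=3)
print(label, totals)
```

Summing scores rather than taking a majority vote over per-window labels lets windows where the classifier is confident count for more than windows where the neighbors are split.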


Mel-Frequency Cepstral Coefficients (MFCC)

Mel-frequency cepstral coefficients are calculated using the following steps:

  1. Separate the signal into windows

  2. For each window calculate the Discrete Fourier Transform and obtain its magnitude

  3. Pass these magnitudes through a mel-spaced triangular filter bank with 20 to 40 filters

  4. Take the logarithm of the resulting energy

  5. Take the discrete cosine transform (DCT) of the log energies (similar to the DFT, except that cosines are used instead of complex exponentials)
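The steps above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the project: the frame length, hop size, the small `eps` added before the logarithm, and the toy three-band filter bank are all hypothetical, no window function (such as Hamming) is applied, and the cosine transform is written as an unnormalised DCT-II. A practical version would normally use an FFT library instead of the direct DFT shown here.

```python
import cmath
import math

def frame_signal(signal, frame_len, hop):
    """Step 1: split the signal into (possibly overlapping) windows."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]

def dft_magnitude(frame):
    """Step 2: magnitude of the DFT, keeping bins 0 .. N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def dct2(values):
    """Step 5: unnormalised DCT-II of a list of values."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n)]

def mfcc(signal, filter_bank, frame_len=64, hop=32, eps=1e-10):
    """Return one list of cepstral coefficients per window."""
    coeffs = []
    for frame in frame_signal(signal, frame_len, hop):
        mags = dft_magnitude(frame)
        # Step 3: weight the magnitudes by each filter and sum.
        energies = [sum(w * m for w, m in zip(filt, mags))
                    for filt in filter_bank]
        # Step 4: logarithm of each filter output (eps avoids log(0)).
        log_energies = [math.log(e + eps) for e in energies]
        coeffs.append(dct2(log_energies))
    return coeffs

# Toy input: a sine wave and three flat bands standing in for a real
# mel-spaced triangular filter bank (hypothetical, for illustration only).
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(256)]
n_bins = 64 // 2 + 1
toy_bank = [[1.0 if 11 * j <= k < 11 * (j + 1) else 0.0 for k in range(n_bins)]
            for j in range(3)]

features = mfcc(signal, toy_bank)
print(len(features), "windows,", len(features[0]), "coefficients each")
```

Each row of `features` corresponds to one window, which matches the way the classifier above treats every MFCC window as its own data entry.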


The m-th filter is defined as follows:

(Figure: filterBank.PNG)

where f() is the list of M + 2 mel-spaced frequencies and M is the number of filters.

The mel scale is defined as:

(Figure: melScale.PNG)
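Since the two definitions are given only as images, here is a hedged sketch of one common formulation. It assumes the mel scale M(f) = 1125 ln(1 + f/700) and a triangular filter that rises linearly from f(m-1) to a peak of 1 at f(m) and falls linearly to 0 at f(m+1); the original figures may use an equivalent variant (for example, the base-10 form with a constant of 2595). The function names and the parameter values (26 filters, a 512-point DFT, a 16 kHz sample rate) are hypothetical.

```python
import math

def hz_to_mel(f):
    """Mel scale (one common form): M(f) = 1125 * ln(1 + f / 700)."""
    return 1125.0 * math.log(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(m / 1125.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate, f_low=0.0, f_high=None):
    """Build n_filters triangular filters over DFT bins 0 .. n_fft // 2."""
    f_high = sample_rate / 2 if f_high is None else f_high
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    # n_filters + 2 points equally spaced on the mel scale ...
    mel_points = [lo + i * (hi - lo) / (n_filters + 1)
                  for i in range(n_filters + 2)]
    # ... converted back to Hz and then to the nearest lower DFT bin.
    last_bin = n_fft // 2
    f = [min(last_bin, math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
         for m in mel_points]

    bank = []
    for m in range(1, n_filters + 1):
        left, centre, right = f[m - 1], f[m], f[m + 1]
        row = [0.0] * (last_bin + 1)
        for k in range(left, centre):           # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right + 1):      # peak and falling edge
            row[k] = (right - k) / (right - centre) if right > centre else 1.0
        bank.append(row)
    return bank

bank = mel_filter_bank(n_filters=26, n_fft=512, sample_rate=16000)
print(len(bank), "filters over", len(bank[0]), "DFT bins")
```

Because the points are equally spaced in mel rather than in Hz, the filters are narrow at low frequencies and wider at high frequencies, which is what gives the features their resemblance to human hearing.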

This method of feature extraction is well suited to audio applications. It is designed around how human hearing works and can pick out phonemes, the basic sound units that make up words. It is tailored to give features of sound that distinguish between words and between phonemes. We decided to use it because the words were closely similar to one another in both the time and frequency domains, and calculating features this way increased the feature difference between words.


Mel Frequency Cepstral Coefficient (MFCC) tutorial (n.d.). In Practical Cryptography. Retrieved from http://practicalcryptography.com/miscellaneous/machine-learning/guide-mel-frequency-cepstral-coefficients-mfccs/
