Education has been reinvented over the last 15 years by the Internet. Educational portals such as coursera.com and udemy.com, along with online universities, provide excellent opportunities to learn a wide range of topics. But some processes, such as testing, are hard to automate. It’s easy to check mathematical calculations or grammar automatically, but checking essay writing or spoken language still demands the participation of a coach.
Data science can open up additional possibilities in this area, for example assessing how clearly a language is spoken. This is a well-known problem in language learning. Speaking clearly, without a strong native accent such as a Norwegian or Indian one, is one of the professional language skills. It matters to language schools and to companies such as call centers.
The goal of the project is to recognize the accent in an audio recording.
A Deep Neural Network approach was chosen to solve this task. The basic solution was built to distinguish “native English” from “non-native English” speakers, but during the research it was extended to classify different accents (e.g. French, Arabic).
The Stella dataset, which provides 30-second audio files, each from a single speaker, across a few accents, was used to train the Deep Learning model. The BeautifulSoup library was used to scrape the data from the webpage.
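The scraping step can be sketched roughly as follows. This is a minimal illustration and not the project's actual scraper: the HTML fragment, the file names, the `https://example.com` base URL and the `extract_audio_links` helper are all hypothetical, since the text does not describe the markup of the real page. Only `BeautifulSoup(..., "html.parser")` and `find_all` are actual BeautifulSoup calls.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical page fragment; the real site's markup is not described in the text.
SAMPLE_HTML = """
<ul>
  <li><a href="/audio/english1.mp3">english1</a></li>
  <li><a href="/audio/french3.mp3">french3</a></li>
  <li><a href="/about.html">about</a></li>
</ul>
"""


def extract_audio_links(html, base_url, extensions=(".mp3", ".wav")):
    """Return absolute URLs of all links that point to audio files."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # Only consider anchors that actually carry an href attribute.
    for anchor in soup.find_all("a", href=True):
        href = anchor["href"]
        if href.lower().endswith(extensions):
            # Resolve relative paths against the (assumed) site root.
            links.append(urljoin(base_url, href))
    return links


if __name__ == "__main__":
    for url in extract_audio_links(SAMPLE_HTML, "https://example.com"):
        print(url)
```

In a real run the HTML would come from an HTTP request to the dataset page, and each collected URL would then be downloaded to disk.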
Data preprocessing converts each audio file into vectors of 13 features. All samples were processed with the Mel-frequency cepstral coefficients (MFCC) technique, which is available in the Librosa Python library.
The processed data was used as training data for a Convolutional Neural Network with 6 layers.
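The text does not say which framework or which layers were used, so the following Keras model is only one possible reading of "a CNN with 6 layers". The framework choice, the layer types, the filter counts, the fixed frame count `N_FRAMES` and the `build_model` helper are all assumptions. Only the 13 MFCC features and the two-class (native / non-native) setup come from the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

N_MFCC = 13      # number of MFCC features, from the text
N_FRAMES = 128   # hypothetical fixed number of time frames per sample
N_CLASSES = 2    # native vs. non-native English


def build_model(n_frames=N_FRAMES, n_classes=N_CLASSES):
    """Build a small 6-layer CNN over MFCC 'images' of shape (13, frames, 1)."""
    model = keras.Sequential([
        keras.Input(shape=(N_MFCC, n_frames, 1)),
        layers.Conv2D(16, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(32, (3, 3), padding="same", activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(n_classes, activation="softmax"),
    ])
    # Integer class labels are assumed, hence the sparse loss.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


if __name__ == "__main__":
    build_model().summary()
```

To classify several accents instead of two classes, `n_classes` would be set to the number of accents.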
The classification accuracy of the solution is about 80%.