Channel: Active questions tagged feed-forward+python+neural-network - Stack Overflow

Feed-forward neural network for accent change? Don't know if I'm doing something wrong! (TensorFlow and Keras)

Good evening, I am currently developing a project that involves machine learning, where I use MFCCs to change one accent into another. After a while, I reached the conclusion that my best shot (for now, since I am a total newbie in DL) is to create a neural network that maps some coefficients to other coefficients of the same kind.

My hypothesis is simple. I think I can use a simple FFNN to map some MFCCs (Mel-Frequency Cepstral Coefficients) to other MFCCs, as in this paper from Stanford University. What they do in that project is take 20 coefficients from a signal and feed them into a 20x100x100x20 NN (sizes: 20 for the input, 100 for hidden layer 1, 100 for hidden layer 2, 20 for the output), and those MFCCs are mapped to other MFCCs that represent the same signal, but in a different accent.

I want to do the same, or something similar, but the problem is that I just learned how to use TensorFlow and Keras to configure neural networks. I am a total noob and there is something I am not doing right. Maybe FFNNs are only used for classification? I just don't understand them very well, so my neural network doesn't learn well and stagnates at a bad accuracy.

My question is: is this really possible? Can I obtain new MFCC values from the output of a neural network like this, even if they have negative values and are not bounded between 0 and 1?

For more context: the data is stored in a tensor of dimensions [2, m, n], where 2 represents the accent (Spanish & Chilean), m is the number of samples, and n is the number of MFCCs per sample (currently 20). I am using the Spanish coefficients as the inputs and the Chilean coefficients as the outputs/targets of the NN. For the NN configuration: there are 4 layers of 20, 100, 100 and 20 neurons respectively, all with the "tanh" activation function; the optimizer is Adam, the loss function is mean squared error, and the metric is "accuracy" (I have to say I just used these because I don't know which ones to really use). I am training it for 10 epochs (maybe I should use more? 1k epochs??)
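To make the setup concrete, this is roughly the model described above (a minimal sketch: variable names and the dummy data are mine, the layer sizes, activations, optimizer, loss, and metric are the ones I listed):

```python
import numpy as np
import tensorflow as tf

# 20 -> 100 -> 100 -> 20, all tanh, as in the Stanford paper's setup.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(20,)),
    tf.keras.layers.Dense(100, activation="tanh"),
    tf.keras.layers.Dense(100, activation="tanh"),
    tf.keras.layers.Dense(20, activation="tanh"),  # output: 20 target MFCCs
])

# Adam + MSE + "accuracy", exactly as described (even though accuracy is
# a classification metric and this is a regression problem).
model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])

# Dummy stand-ins for the [2, m, n] data: Spanish MFCCs in, Chilean out.
x = np.random.randn(8, 20).astype("float32")  # inputs (Spanish)
y = np.random.randn(8, 20).astype("float32")  # targets (Chilean)
pred = model.predict(x, verbose=0)
print(pred.shape)  # (8, 20)
```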

As I said, I don't know if there is something wrong; maybe my hypothesis is incorrect, but in the Stanford paper they did it like this, so I don't really know how else to do the same. I can show you my current code if necessary, but I don't think there is anything wrong with it, since the model I configured is pretty simple and very similar to others I have seen online.

If someone could help me I would be eternally thankful. Thank you all and have a good day!

