Deep Learning summary for 2017: Text and Speech Applications
Deep Learning is disrupting many industries, and yours is unlikely to be an exception. Learn about the most notable deep learning projects of 2017 and ride the wave, or risk being rolled over…
Deep Learning (DL) has long since crossed its traditional boundaries. DL projects are now being launched in domains as diverse as insurance, and researchers from Baidu boast that some of their experiments are becoming trivial for them.
That said, every business should pay close attention to possible Deep Learning applications in its industry. We list the most discussed text and speech-related DL accomplishments of 2017 to benefit both Machine Learning professionals and sharp decision-makers who want to improve their bottom line.
Text-related Deep Learning applications
One of the most important areas of DL application is working with text: translation, chatbots, text analysis and a plethora of other tasks.
From Google Translate…
A year ago, Google Translate switched to a model based on a Recurrent Neural Network. Over the year, Translate has progressed from producing an unreadable salad of words when given large blocks of text to producing almost flawless translations. The results are astonishing and keep improving!
…to Facebook negotiator chatbot
You might have heard a fairy tale about the Facebook chatbot that invented its own language. Truth be told, that DL algorithm did come up with a non-human lexicon, yet this did not stop it from accomplishing its goal. The actual goal was to negotiate a split of an inventory with an adversary (one gets the books, the other gets the hats, etc.), specifically by mastering textual conversation.
The bot was first trained as a supervised recurrent network on a huge dataset of textual transcripts of real negotiations. It was then polished further with reinforcement learning, in which two instances of the system chatted with one another. The chatbot mastered one of the real-life negotiation techniques, false interest: it showed interest in an item it did not actually want and agreed to hand it over to the other party only in exchange for the item it actually required.
Once the task was completed, the restriction to use human language was lifted, which led to the system inventing some new terms. Feel free to try it yourself and see what happens in your case!
Speech processing and generation
Another important field of DL application is speech processing. It includes the generation of speech and music, lip reading, the synchronization of lip movements with audio, etc.
Google DeepMind, the company behind AlphaGo, is currently developing an algorithm that transforms input text into raw audio. It shows extremely good results compared to previous attempts.
As of now, the main flaw of this network is its speed: 1 second of audio takes 1-2 minutes to generate. Still, the progress is astonishing. What is more, the algorithm can even create piano music! More details are available in the PDF.
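To put the quoted generation speed into perspective, here is a small back-of-the-envelope calculation. Only the figure of 1-2 minutes per second of audio comes from the text above; the function names and the 10-second clip are hypothetical and serve purely as an illustration.

```python
# Rough arithmetic for the generation speed quoted in the text.
# Only "1-2 minutes per 1 second of audio" comes from the article;
# the 10-second clip below is a made-up example.

def real_time_factor(compute_seconds: float, audio_seconds: float) -> float:
    """Seconds of computation needed per second of generated audio."""
    return compute_seconds / audio_seconds


def generation_minutes(audio_seconds: float, factor: float) -> float:
    """Minutes needed to synthesize a clip at the given real-time factor."""
    return audio_seconds * factor / 60.0


best_case = real_time_factor(60.0, 1.0)    # 1 minute per second of audio
worst_case = real_time_factor(120.0, 1.0)  # 2 minutes per second of audio

clip_seconds = 10.0  # hypothetical length of one spoken sentence
low = generation_minutes(clip_seconds, best_case)
high = generation_minutes(clip_seconds, worst_case)

print(f"Slowdown versus real time: {best_case:.0f}x to {worst_case:.0f}x")
print(f"A {clip_seconds:.0f}-second clip would take {low:.0f} to {high:.0f} minutes")
```

In other words, at the quoted speed the network runs 60 to 120 times slower than real time, which rules out live use for now.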
Lip reading from Google DeepMind and Oxford University
Yet another initiative from Google DeepMind, this time in conjunction with specialists from Oxford University, is a lip reading algorithm described in depth in their joint paper. The model was trained on a dataset of more than 100,000 sentences with video and audio, using an LSTM for audio and a CNN+LSTM for video. The two resulting state vectors are then combined to generate the output characters.
The system works with different types of input (audio, video, or audio+video), which makes the algorithm multi-channel.
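The idea of combining two state vectors into a character prediction can be sketched in a few lines of plain Python. This is a toy illustration and not the model from the paper: the character set, the vector size, the random weights and the trick of replacing a missing channel with zeros are all assumptions made for the example.

```python
import math
import random

ALPHABET = list("abcdefghijklmnopqrstuvwxyz '")  # hypothetical character set
STATE_SIZE = 4  # hypothetical size of each encoder's state vector


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_predict(audio_state, video_state, weights):
    """Concatenate the two state vectors and score every character."""
    # Assumption: a missing channel is replaced by zeros, so one function
    # covers audio-only, video-only and audio+video input.
    audio = audio_state if audio_state is not None else [0.0] * STATE_SIZE
    video = video_state if video_state is not None else [0.0] * STATE_SIZE
    fused = audio + video  # list concatenation, length 2 * STATE_SIZE
    scores = [sum(w * x for w, x in zip(row, fused)) for row in weights]
    return softmax(scores)


rng = random.Random(0)
# Random stand-ins for a trained output layer and for the encoder states.
weights = [[rng.uniform(-1, 1) for _ in range(2 * STATE_SIZE)] for _ in ALPHABET]
audio_state = [rng.uniform(-1, 1) for _ in range(STATE_SIZE)]
video_state = [rng.uniform(-1, 1) for _ in range(STATE_SIZE)]

for name, a, v in [("audio", audio_state, None),
                   ("video", None, video_state),
                   ("audio+video", audio_state, video_state)]:
    probs = fuse_and_predict(a, v, weights)
    best = ALPHABET[probs.index(max(probs))]
    print(f"{name}: most likely character {best!r}")
```

In a real system the state vectors would come from the trained audio and video encoders, and the prediction would be repeated character by character rather than made once.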
Synchronization of lip movement with the audio stream
The University of Washington processed more than 10,000 hours of HD recordings of President Obama's speeches and developed an algorithm that synchronizes lip movement with an audio stream.
This opens up immense possibilities for the gaming industry and CGI movies… yet it raises a disturbing concern: the next presidential speech might actually be computer-generated footage and not a real recording.
Deep Learning is on a roll, and new exciting projects are revealed in various domains on a regular basis. We are going to describe the advancements in machine perception, reinforcement learning and miscellaneous other applications over the course of the next couple of weeks, so stay tuned for the updates!