
Adding Dynamics to MIDI Performances for a More Natural Sound! – Neural Translation of Musical Style

Simple Title
Malik, Iman, and Carl Henrik Ek. "Neural translation of musical style." arXiv preprint arXiv:1708.03535 (2017).
Type
Paper
Year
2017
Posted at
June 6, 2015
Tags
musictheory

Overview

Even with exactly the same melody, the nuance of a piece changes dramatically depending on how dynamics are applied and how pauses are timed in performance. This research takes the dynamics of piano performances (how loudness varies) as its subject: by learning from performances across various genres (genres within piano music, that is), it transforms flat, expressionless MIDI files into more natural, musically rich performances.

Abstract

Can a machine learn to play sheet music? This thesis investigates whether it is possible for a suitable computational model to learn musical style and successfully perform using sheet music. Music captures several aspects of a musician's style. Musical style can be observed in the unique dynamics of a musician's performances and in genre categorisation. Style is difficult to define, but there is a perceivable relationship between dynamics and style. This thesis investigates whether it is possible for a machine to learn musical style through the dynamics of music. Great advancements have been made in music generation using machine learning. However, the focus of previous research has not been on capturing style. To capture musical style through dynamics, a new architecture called StyleNet is designed. The designed architecture is capable of synthesising the dynamics of digital sheet music. The Piano dataset is created for the purposes of learning style. The designed model is trained on the Piano dataset, which contains Jazz and Classical piano solo MIDIs. Different configurations and training techniques are experimented with. The model's generated performances are then assessed by a musical Turing test. The model's ability to perform in different styles is also evaluated. The research concludes that StyleNet's musical performances successfully pass the musical Turing test. This opens many doors for using such a model to assist the creative process in the music industry. To summarise, my main contributions and achievements in this project can be listed as follows:

• I designed StyleNet, a neural network architecture capable of synthesising the dynamics of sheet music.
• I implemented StyleNet using the TensorFlow library with a total of 1000 lines in Python.
• I implemented a batching system to efficiently train the StyleNet model with a total of 500 lines in Python.
• I designed the data representation format for StyleNet.
• I implemented a data preprocessing pipeline for MIDI files with a total of 2000 lines in Python.
• I experimented with different StyleNet configurations and designs.
• I created the Piano dataset, which contains a total of 649 Jazz and Classical piano solo MIDIs.
• I successfully trained a StyleNet model which passed the musical Turing test by producing performances that are indistinguishable from those of a human.
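
The abstract mentions a MIDI preprocessing pipeline. As a rough illustration only (the library choice and file name below are my assumptions, not the thesis's actual code), extracting per-note velocities from a solo-piano MIDI with pretty_midi might look like this:

```python
# Rough sketch of the kind of MIDI preprocessing the thesis describes
# (pretty_midi and the file name are assumptions, not the authors' pipeline).
import pretty_midi

pm = pretty_midi.PrettyMIDI("solo_piano.mid")  # hypothetical input file
notes = [(n.start, n.pitch, n.velocity)
         for inst in pm.instruments if not inst.is_drum
         for n in inst.notes]
# A "flat" score plays every note at the same velocity; StyleNet's task
# is to predict expressive per-note velocities from the score alone.
print(notes[:5])  # (onset time in seconds, MIDI pitch, velocity 0-127)
```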

Architecture

The model is relatively simple: an RNN (LSTM) that takes MIDI score information as input and outputs the loudness (MIDI velocity) of each corresponding note. (To exploit not only the notes preceding a given note but also those that follow it, a Bidirectional LSTM is used, which can "look ahead" into the input sequence.)

[Figure: StyleNet architecture]
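
As a minimal sketch of this idea (not the authors' implementation; the piano-roll input shape, hidden size, and loss are all assumptions), a bidirectional LSTM that regresses per-key velocities from the score could be written in TensorFlow/Keras as:

```python
# Minimal sketch (not the authors' code) of the StyleNet idea:
# a bidirectional LSTM mapping per-timestep score features (an 88-key
# piano-roll of note on/off) to per-key velocities in [0, 1].
# Layer sizes and loss are assumptions.
import tensorflow as tf

NUM_KEYS = 88   # piano keys in the roll (assumption)
HIDDEN = 128    # LSTM units (assumption)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(None, NUM_KEYS)),        # (time, keys) score input
    tf.keras.layers.Bidirectional(                 # sees past AND future notes
        tf.keras.layers.LSTM(HIDDEN, return_sequences=True)),
    tf.keras.layers.TimeDistributed(               # per-timestep velocity
        tf.keras.layers.Dense(NUM_KEYS, activation="sigmoid")),
])
model.compile(optimizer="adam", loss="mse")        # regress velocities
```

Training would pair score-only piano rolls with the human performance's velocities as targets; the bidirectional layer is what lets each prediction condition on upcoming notes as well as past ones.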

Further Thoughts

When I saw the title about "translating musical style" with AI, I expected it to generate melodic or rhythmic patterns, so the focus on dynamics alone is an interesting choice. It would be fun to see whether subtle rhythmic nuances, such as swing feel, could be learned in the same way.

Links

https://arxiv.org/abs/1708.03535