
Talking Drums: Generating drum grooves with neural networks.

Entry

A model that generates a full drum-kit pattern from a kick-drum pattern

Simple Title

Hutchings, P. (2017). Talking Drums: Generating drum grooves with neural networks.

Description

A model that takes kick-drum positions as input and generates the full rhythm pattern. It borrows the seq-to-seq approach used in language models.

Type

Paper

Year

2017

Posted at

April 30, 2021

Tags

music

Arxiv

http://arxiv.org/abs/1706.09558

Project Page

https://doi.org/10.6084/m9.figshare.4903181.v1

Hutchings, P. (2017). Talking Drums: Generating drum grooves with neural networks. http://arxiv.org/abs/1706.09558

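Overview - What's great about it?

A model that takes kick-drum positions as input and generates the full rhythm pattern. It borrows the seq-to-seq approach used in language models.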

Abstract

Presented is a method of generating a full drum kit part for a provided kick-drum sequence. A sequence to sequence neural network model used in natural language translation was adopted to encode multiple musical styles and an online survey was developed to test different techniques for sampling the output of the softmax function. The strongest results were found using a sampling technique that drew from the three most probable outputs at each subdivision of the drum pattern but the consistency of output was found to be heavily dependent on style.

Motivation

This model is the agent responsible for the rhythm part in a system that generates music through the interaction of multiple agents.

Architecture / Data

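The dataset consists of 250 rock, pop, funk, and Afro-Cuban songs, in the form of drum tablature.
Each bar is divided into 48 steps so that triplets can also be represented.
Example input: a four-on-the-floor kick pattern (figure).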
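
As a rough illustration of the 48-step grid, the sketch below lays out a four-on-the-floor kick pattern for one bar. It assumes 4/4 time, and the symbols `K` (kick) and `-` (rest) as well as the helper name `kick_grid` are placeholders chosen here, not the paper's actual character table.

```python
STEPS_PER_BAR = 48  # subdivisions per bar, as described in the notes


def kick_grid(beats_per_bar=4, steps=STEPS_PER_BAR, hit="K", rest="-"):
    """Return one bar as a list of symbols with a kick on every beat.

    The symbols and the 4/4 default are illustrative assumptions.
    """
    if steps % beats_per_bar != 0:
        raise ValueError("steps must be divisible by beats_per_bar")
    steps_per_beat = steps // beats_per_bar  # 12 steps per quarter note in 4/4
    return [hit if i % steps_per_beat == 0 else rest for i in range(steps)]


bar = kick_grid()
print("".join(bar))
```

With 12 steps per quarter note, a sixteenth note spans 3 steps and an eighth-note triplet spans 4 steps, which is why a 48-step bar can hold both straight and triplet subdivisions on the same grid.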

Each drum is represented by the character assigned to it in the paper's table.

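The input sequence is reversed (fed in from the last step first) to help the model learn temporal dependencies.
The model is an RNN-based seq2seq model with 3 layers of 128 nodes each.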
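
The input reversal mentioned above can be sketched in a few lines. The token symbols and the function name `reverse_source` are hypothetical; only the idea of feeding the source sequence to the encoder last-step-first comes from the notes.

```python
def reverse_source(tokens):
    """Return the source sequence in reverse order, leaving the original list untouched."""
    return list(reversed(tokens))


# Hypothetical kick-drum tokens for a short fragment ("K" = kick, "-" = rest).
source = ["K", "-", "-", "K", "-", "-"]
encoder_input = reverse_source(source)
print(encoder_input)
```

Only the encoder input is reversed in this sketch; the target drum sequence would stay in its natural order.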

Results

Participants for the user study were recruited from a drumming-related Facebook group.

UI of the web system used for the user test

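Three sampling methods were compared:

method 1 - greedy sampling (always pick the most probable output)
method 2 - sample probabilistically according to the model's output distribution
method 3 - sample probabilistically from among the top three outputs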
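
The three methods can be sketched as follows for a single subdivision. The probability vector is made up for illustration, the function names are hypothetical, and weighting the top three candidates by their probabilities in method 3 is an assumption; the notes only say that one of the top three is drawn.

```python
import random


def greedy(probs):
    """Method 1: pick the index with the highest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def sample_full(probs, rng=random):
    """Method 2: sample an index in proportion to the full distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def sample_top_k(probs, k=3, rng=random):
    """Method 3: sample among the k most probable indices.

    Weighting by the candidates' probabilities is an assumption of this sketch.
    """
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return rng.choices(top, weights=[probs[i] for i in top], k=1)[0]


# Made-up softmax output over five symbols for one subdivision.
probs = [0.05, 0.50, 0.20, 0.15, 0.10]
rng = random.Random(0)  # seeded so repeated runs give the same draws
print(greedy(probs), sample_full(probs, rng), sample_top_k(probs, 3, rng))
```

Restricting the draw to the top three keeps some variation while ruling out the low-probability symbols that full sampling can occasionally pick.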

In the user study, Method 3 scored best.

Generated rhythms - rock example

Method 1

Method 2

Method 3

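A second user study compared user ratings for outputs the model considered most likely against outputs it considered less likely. Outputs with a slightly lower likelihood were rated higher by users than the single most likely output.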

Predicted probability and user rating

Further Thoughts

The result of the second user study is the interesting part!

Links

A similar concept, applied to learning the relationship between bass and drums:

📄A Bassline Generation System Based on Sequence-to-Sequence Learning