Entry

Infinite Bad Guy: an automatic mix of videos uploaded to YouTube

Simple Title

Infinite Bad Guy (IYOIYO, Kyle McDonald)

Description

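Billie Eilish's "Bad Guy" is the most-covered song in YouTube history. This project analyzes and classifies the versions uploaded to YouTube and stitches them together smoothly, in time with the beat. The result is a Bad Guy jukebox that never ends.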

Type

Project

Year

2020

Posted at

May 1, 2021

Overview - What makes it impressive?

Technology/System

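In this work, machine learning is used to:

Recognize beats and estimate which part of the original song each one corresponds to
Classify the video content
Estimate visual similarity between videos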

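1. Aligning the beats

1.1 Covers that use the original track as-is

Take the cosine similarity of the spectrograms. Shift the offset a little at a time and look for the position with the best match.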
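The sliding comparison described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the team's implementation: `best_offset` is a hypothetical helper name, both spectrograms are assumed to be magnitude arrays of shape (frequency bins, frames) computed with identical settings, and the data here is synthetic.

```python
import numpy as np

def best_offset(original_spec, cover_spec):
    """Slide cover_spec along original_spec and return (offset, score).

    Hypothetical helper for illustration. Both inputs are (freq_bins, frames)
    magnitude spectrograms; cover_spec must not be longer than original_spec.
    """
    n_frames = cover_spec.shape[1]
    max_offset = original_spec.shape[1] - n_frames
    cover_vec = cover_spec.reshape(-1)
    cover_norm = np.linalg.norm(cover_vec) + 1e-12
    best, best_score = 0, -np.inf
    for offset in range(max_offset + 1):
        # Flatten the window of the original that the cover would overlap.
        window = original_spec[:, offset:offset + n_frames].reshape(-1)
        window_norm = np.linalg.norm(window) + 1e-12
        score = float(window @ cover_vec) / (window_norm * cover_norm)
        if score > best_score:
            best, best_score = offset, score
    return best, best_score

# Synthetic check: a clip cut out starting at frame 37 of a random "spectrogram".
rng = np.random.default_rng(0)
original = rng.random((64, 200))
clip = original[:, 37:87]
offset, score = best_offset(original, clip)
```

Trying every offset is the simplest option and is fine for a single song; a cross-correlation via FFT would be a faster alternative if many long tracks had to be scanned.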

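1.2 Acoustic covers and similar

The tempo fluctuates, so the simple method above cannot be used. DTW (Dynamic Time Warping) also breaks down when the cover contains looping sections → an RNN model was built instead.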
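To make the DTW limitation concrete, here is a textbook DTW sketch on toy one-dimensional sequences. It is only an illustration under assumptions (`dtw_path` is a hypothetical name and the numbers stand in for per-beat features); it is not taken from the project. DTW absorbs tempo changes because a frame may be matched repeatedly, but its path can only move forward, so a cover that jumps back and repeats a section cannot be aligned cleanly.

```python
import numpy as np

def dtw_path(x, y):
    """Classic DTW between sequences x (n, d) and y (m, d).

    Returns (total_cost, path), where path is a list of (i, j) index pairs.
    """
    n, m = len(x), len(y)
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = dist[i - 1, j - 1] + min(
                cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end to the start along the cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(cost[i - 1, j - 1], i - 1, j - 1),
                      (cost[i - 1, j], i - 1, j),
                      (cost[i, j - 1], i, j - 1)]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    return float(cost[n, m]), path[::-1]

original = np.array([0, 1, 2, 3, 4], dtype=float).reshape(-1, 1)

# Tempo fluctuation: some beats are held longer. DTW aligns this perfectly.
stretched = np.array([0, 0, 1, 2, 2, 3, 4], dtype=float).reshape(-1, 1)
stretched_cost, stretched_path = dtw_path(stretched, original)

# Looping section: the cover goes back to the start before finishing.
looped = np.array([0, 1, 2, 0, 1, 2, 3, 4], dtype=float).reshape(-1, 1)
looped_cost, _ = dtw_path(looped, original)
```

The stretched cover aligns at zero cost, while the looped one cannot, since the alignment path is forced to be monotonic.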

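The RNN input is a matrix of $n \text{ beats} \times m \text{ features}$. The output is $n \text{ beats (position in the cover)} \times k \text{ beats (position in the original)}$. For each beat of the cover, the model outputs a probability distribution over which of the $k$ beats of the original it corresponds to. The whole song has 420 beats ($k=420$).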
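The shape of that output, and how it could be read, can be illustrated with plain NumPy. This sketch does not include the RNN itself: the logits are random stand-ins, the softmax normalization and the argmax decoding are assumptions about how such an output would typically be produced and consumed, and the cover length of 16 beats is arbitrary.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

K_ORIGINAL_BEATS = 420   # k: number of beats in the original (from the text)
n_cover_beats = 16       # n: illustrative cover length, not from the source

rng = np.random.default_rng(0)
# Stand-in for the raw scores a trained model would emit, shape (n, k).
logits = rng.normal(size=(n_cover_beats, K_ORIGINAL_BEATS))
# Pretend the cover plays original beats 100..115 and the model is confident.
true_positions = np.arange(100, 100 + n_cover_beats)
logits[np.arange(n_cover_beats), true_positions] += 10.0

# One probability distribution over the k original beats per cover beat.
probs = softmax(logits, axis=1)
# Simplest decoding: take the most likely original beat for each cover beat.
predicted = probs.argmax(axis=1)
```

Because each row is an independent distribution, nothing forces the predicted positions to move forward in time, which is what lets this formulation handle covers that loop or skip sections.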

The training data was built by the team: 5,000 tracks (!!) were tagged by hand.

For audio features, the output of the pretrained model YAMNet was also tried, but the Constant-Q transform gave better results.

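2/3. Classifying video content / estimating similarity

Tags were added manually through a custom-built interface → these became the training data.

Video features come from the output of MobileNet. To compare videos of different lengths, though, the team worked out a way to combine those features into a new feature.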

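1. Extract frames at even intervals → extract features from each

2. Mean of the frame-to-frame feature differences (activity)

3. Maximum difference between consecutive frames sampled in step 1 (diversity)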

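import numpy as np

def select_evenly_spaced(features, count):
  """Assumed helper (not shown in the source): pick `count` evenly spaced rows."""
  indices = np.linspace(0, len(features) - 1, count).round().astype(int)
  return features[indices]

def feature_fingerprint(features, sample_count=9):
  """Compute more sparse fingerprint from dense video features."""
  sample_features = select_evenly_spaced(features, sample_count)
  # Activity: mean absolute change between consecutive frames.
  activity_features = np.abs(np.diff(features, axis=0)).mean(axis=0)
  # Diversity: largest absolute change between consecutive sampled frames.
  diversity_features = np.abs(np.diff(sample_features, axis=0)).max(axis=0)
  fingerprint = np.vstack(
      (sample_features, activity_features, diversity_features))
  return fingerprint.reshape(-1)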
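Since `feature_fingerprint` always returns a vector of the same length (9 sampled rows plus one activity row and one diversity row, times the feature dimension), fingerprints from videos of any duration can be compared directly. The sketch below shows one way that comparison might look. Cosine similarity is an assumption here, as these notes do not say which metric the team used; the helper names are hypothetical, and the fingerprints are random stand-ins with an illustrative feature dimension of 8.

```python
import numpy as np

def cosine_similarity_matrix(fingerprints):
    """Pairwise cosine similarity for an array of shape (videos, dims)."""
    norms = np.linalg.norm(fingerprints, axis=1, keepdims=True)
    unit = fingerprints / np.maximum(norms, 1e-12)
    return unit @ unit.T

def most_similar(fingerprints, index):
    """Index of the video whose fingerprint is closest to video `index`."""
    sims = cosine_similarity_matrix(fingerprints)[index].copy()
    sims[index] = -np.inf  # never return the query video itself
    return int(sims.argmax())

rng = np.random.default_rng(0)
feature_dim = 8                  # illustrative; real MobileNet features are larger
fingerprint_len = (9 + 2) * feature_dim
fingerprints = rng.random((5, fingerprint_len))
# Make video 3 a slightly perturbed copy of video 1 so it should be the match.
fingerprints[3] = fingerprints[1] + 0.01 * rng.random(fingerprint_len)

similarity = cosine_similarity_matrix(fingerprints)
neighbor = most_similar(fingerprints, 1)
```

With thousands of covers, the full pairwise matrix may be replaced by an approximate nearest-neighbor index, but the idea stays the same.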

Results

Infinite Bad Guy

Tens of thousands of YouTube creators have covered Billie Eilish's "Bad Guy." What if those fans could play together? One song. Thousands of covers. The world's first infinite music video.

billie.withyoutube.com

One look at the website says it all. It's fun!

Further Thoughts

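librosa's Laplacian segmentation looks like it could be used for music segmentation (splitting a track into its parts)?
First time hearing of CinemaNet, a model that estimates something like the poetic feel (?) of footage.
The detailed trial-and-error process is written up clearly, which makes this an excellent blog post. Something to emulate...
I wondered who was behind this and looked it up... it turns out to be a company newly started by none other than Kyle McDonald. No wonder it's this good.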

Links

Machine learning details from the production team

Creating Infinite Bad Guy

Infinite Bad Guy brings together thousands of YouTube covers of Bad Guy by Billie Eilish into an experience that lets the viewer jump from one cover to another, always in-time and without skipping a beat, guided by similarities and differences in each performance.

iyoiyo.medium.com

Another example: generating a song that continues forever by using the Echo Nest API to bridge between similar parts of a track

The (Retro) Eternal Jukebox

eternalbox.dev

Google AI Experiments

Infinite Bad Guy by YouTube Music, Google Creative Lab, IYOIYO | Experiments with Google

Tens of thousands of YouTube creators have covered Billie Eilish's "Bad Guy." What if those fans could play together? Machine learning keeps all the covers on the same beat and lets you jump from video to video seamlessly. With endless possible combinations, every play is unique and never the same twice.

experiments.withgoogle.com
