Sound Programming 2

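Course: Media Network Laboratory IIA (2019-)
For: Third-year students in the Media Network Course
Date and time: November 18 (Wed) - November 19 (Thu), 13:00-18:00
Location: Computer room, 1st floor, Building M
Report deadline: November 25 (Tue), 13:00
Report submission: Attach the files to an e-mail and send it to aoki@ime.ist.hokudai.ac.jp.
Contact: Naofumi Aoki (Room 6-07, 6th floor, Information Electronics Building) (Tel: 011-706-6532) (E-mail: aoki@ime.ist.hokudai.ac.jp)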

Objective

Sound is an important component of multimedia content. The aim of this lab is to deepen your understanding of sound processing through programming in Python.

1. Introduction

This lab uses Jupyter Notebook: work through the programs below by running them in a browser. To check the sound, use your own earphones or headphones. Install the environment with the following steps.

(1) Install Anaconda

Download the "Python 3.7" installer from
https://www.anaconda.com/distribution/
and run it.

(2) Install the libraries

Open Anaconda Prompt, then enter and run each of the following commands in turn.

pip install librosa

pip install --upgrade tensorflow==2.0.0

pip install keras

(3) Start Jupyter Notebook

Enter the following command and run it.

jupyter notebook

(4) Write and run the programs in a browser (Chrome recommended).

Click the "New" button and select "Python3".

When you use sound files or image files, click the "Upload" button and upload the required files to Jupyter Notebook.
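
Before starting, it may help to confirm that the Python interpreter behind Jupyter Notebook can find the packages used in this handout. The following is a minimal sketch and not part of the original procedure; the helper name `check_packages` and the package list are assumptions chosen for illustration.

```python
import importlib.util

def check_packages(names):
    """Map each top-level package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}

# Packages imported or installed in this handout (assumed list).
status = check_packages(["numpy", "scipy", "IPython", "librosa", "tensorflow", "keras"])
for name, found in status.items():
    print(name, "OK" if found else "NOT FOUND")
```

If a package is reported as "NOT FOUND", repeating the corresponding installation step in Anaconda Prompt is a reasonable first thing to try.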

2. Music Production

Run the following program and consider what elements are needed to produce music.

(1)

%matplotlib inline
import numpy as np
from scipy.io import wavfile
from IPython.display import display, Audio

(2)

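def sine_instrument(fs, f, a, duration):
    # Generate a sine wave of frequency f [Hz] lasting duration [s].
    length_of_s = int(fs * duration)
    s = np.zeros(length_of_s)
    for n in range(length_of_s):
        s[n] = np.sin(2 * np.pi * f * n / fs)

    # Apply a 10 ms linear fade-in and fade-out to avoid clicks.
    for n in range(int(fs * 0.01)):
        s[n] *= n / (fs * 0.01)
        s[length_of_s - n - 1] *= n / (fs * 0.01)

    # Normalize so that the peak amplitude equals a.
    gain = a / np.max(np.abs(s))
    s *= gain
    return s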

(3)

number_of_note = 16
# Each row of score: track, onset [s], frequency [Hz], amplitude, duration [s]
score = np.array([[1, 0, 659.26, 0.8, 1],
                  [1, 1, 587.33, 0.8, 1],
                  [1, 2, 523.25, 0.8, 1],
                  [1, 3, 493.88, 0.8, 1],
                  [1, 4, 440.00, 0.8, 1],
                  [1, 5, 392.00, 0.8, 1],
                  [1, 6, 440.00, 0.8, 1],
                  [1, 7, 493.88, 0.8, 1],
                  [2, 0, 261.63, 0.8, 1],
                  [2, 1, 196.00, 0.8, 1],
                  [2, 2, 220.00, 0.8, 1],
                  [2, 3, 164.81, 0.8, 1],
                  [2, 4, 174.61, 0.8, 1],
                  [2, 5, 130.81, 0.8, 1],
                  [2, 6, 174.61, 0.8, 1],
                  [2, 7, 196.00, 0.8, 1]])

(4)

fs = 8000
length_of_s = int(fs * 10)
s = np.zeros(length_of_s)

for i in range(number_of_note):
    track = score[i, 0]
    onset = score[i, 1]
    f = score[i, 2]
    a = score[i, 3]
    duration = score[i, 4]
    x = sine_instrument(fs, f, a, duration)
    offset = int(fs * onset)
    length_of_x = len(x)
    for n in range(length_of_x):
        s[offset + n] += x[n]

(5)

gain = 0.5 / np.max(np.abs(s))
s *= gain

(6)

wavfile.write('ex01.wav', fs, (s * 32768).astype(np.int16))

(7)

Audio('ex01.wav')
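
The sample-by-sample loops in `sine_instrument` can also be expressed with NumPy array operations. The sketch below is an assumed equivalent, not the handout's own code; the name `sine_instrument_vectorized` is hypothetical. It applies the same 10 ms linear fade-in and fade-out and the same normalization of the peak amplitude to `a`, and it assumes that the note is at least 10 ms long.

```python
import numpy as np

def sine_instrument_vectorized(fs, f, a, duration):
    # Assumed equivalent of sine_instrument(); requires duration >= 0.01 s.
    n = np.arange(int(fs * duration))
    s = np.sin(2 * np.pi * f * n / fs)

    # 10 ms linear ramp 0, 1/L, ..., (L-1)/L with L = fs * 0.01.
    fade_length = int(fs * 0.01)
    ramp = np.arange(fade_length) / (fs * 0.01)
    if fade_length > 0:
        s[:fade_length] *= ramp           # fade-in
        s[-fade_length:] *= ramp[::-1]    # fade-out, last sample scaled by 0

    # Normalize the peak amplitude to a.
    return s * (a / np.max(np.abs(s)))

x = sine_instrument_vectorized(8000, 440.0, 0.8, 1.0)
print(len(x), np.max(np.abs(x)))
```

The loop version makes each step of the signal processing explicit, while the array version tends to run much faster in Python for long signals.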

3. MIDI Parameters

Run the following program and consider the roles of MIDI parameters such as note number, velocity, and gate time.

(1)

%matplotlib inline
import numpy as np
from scipy.io import wavfile
from IPython.display import display, Audio

(2)

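def sine_instrument(fs, note_number, velocity, gate):
    # Convert the MIDI note number to a frequency [Hz] (note number 69 = 440 Hz).
    f = 440 * np.power(2, (note_number - 69) / 12)
    length_of_s = int(fs * gate)
    s = np.zeros(length_of_s)
    for n in range(length_of_s):
        s[n] = np.sin(2 * np.pi * f * n / fs)

    # Apply a 10 ms linear fade-in and fade-out to avoid clicks.
    for n in range(int(fs * 0.01)):
        s[n] *= n / (fs * 0.01)
        s[length_of_s - n - 1] *= n / (fs * 0.01)

    # Normalize so that the peak amplitude equals velocity / 127.
    gain = velocity / 127 / np.max(np.abs(s))
    s *= gain
    return s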

(3)

number_of_note = 16
# Each row of score: track, onset [ticks], note number, velocity, gate time [ticks]
score = np.array([[1,    0, 76, 96, 960],
                  [1,  960, 74, 96, 960],
                  [1, 1920, 72, 96, 960],
                  [1, 2880, 71, 96, 960],
                  [1, 3840, 69, 96, 960],
                  [1, 4800, 67, 96, 960],
                  [1, 5760, 69, 96, 960],
                  [1, 6720, 71, 96, 960],
                  [2,    0, 60, 96, 960],
                  [2,  960, 55, 96, 960],
                  [2, 1920, 57, 96, 960],
                  [2, 2880, 52, 96, 960],
                  [2, 3840, 53, 96, 960],
                  [2, 4800, 48, 96, 960],
                  [2, 5760, 53, 96, 960],
                  [2, 6720, 55, 96, 960]])

(4)

time_division = 480  # ticks per quarter note
tempo = 120          # quarter notes per minute

(5)

fs = 8000
length_of_s = int(fs * 10)
s = np.zeros(length_of_s)

for i in range(number_of_note):
    track = score[i, 0]
    onset = score[i, 1] * 60 / tempo / time_division  # ticks to seconds
    note_number = score[i, 2]
    velocity = score[i, 3]
    # Keep the gate time as a float: int() would truncate notes shorter than 1 s to 0.
    gate = score[i, 4] * 60 / tempo / time_division
    x = sine_instrument(fs, note_number, velocity, gate)
    offset = int(fs * onset)
    length_of_x = len(x)
    for n in range(length_of_x):
        s[offset + n] += x[n]

(6)

gain = 0.5 / np.max(np.abs(s))
s *= gain

(7)

wavfile.write('ex02.wav', fs, (s * 32768).astype(np.int16))

(8)

Audio('ex02.wav')
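
The two conversions used in this section can be checked by hand. In equal temperament, note number n corresponds to the frequency 440 * 2^((n - 69) / 12) Hz, and a time in ticks corresponds to ticks * 60 / tempo / time_division seconds. The sketch below restates both as small functions; the names `note_to_frequency` and `ticks_to_seconds` are illustrative and do not appear in the program above.

```python
def note_to_frequency(note_number):
    # Equal temperament with note number 69 tuned to 440 Hz.
    return 440.0 * 2.0 ** ((note_number - 69) / 12)

def ticks_to_seconds(ticks, tempo=120, time_division=480):
    # tempo: quarter notes per minute, time_division: ticks per quarter note.
    return ticks * 60 / tempo / time_division

print(round(note_to_frequency(76), 2))  # first note of track 1
print(round(note_to_frequency(60), 2))  # first note of track 2
print(ticks_to_seconds(960))            # gate time used in the score
print(ticks_to_seconds(480))            # a note half as long
```

Note numbers 76 and 60 give 659.26 Hz and 261.63 Hz, the frequencies written directly in the score of section 2, and 960 ticks correspond to 1 second at tempo 120.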

4. Synthesizing a Marimba Sound by Additive Synthesis

Run the following program and consider how a marimba sound is synthesized by additive synthesis.

(1)

%matplotlib inline
import numpy as np
from scipy.io import wavfile
from IPython.display import display, Audio

(2)

def ADSR(fs, A, D, S, R, gate, duration):
    # Convert the times in seconds to sample counts.
    A = int(fs * A)
    D = int(fs * D)
    R = int(fs * R)
    gate = int(fs * gate)
    duration = int(fs * duration)
    e = np.zeros(duration)
    # Attack: rise from 0 toward 1.
    if A != 0:
        for n in range(A):
            e[n] = 1.0 - np.exp(-5.0 * n / A)

    # Decay: fall from 1 toward the sustain level S until the gate closes.
    if D != 0:
        for n in range(A, gate):
            e[n] = S + (1.0 - S) * np.exp(-5.0 * (n - A) / D)

    else:
        for n in range(A, gate):
            e[n] = S

    # Release: fall from the last gated value toward 0.
    if R != 0:
        for n in range(gate, duration):
            e[n] = e[gate - 1] * np.exp(-5.0 * (n - gate + 1) / R)

    return e

(3)

def marimba(fs, note_number, velocity, gate):
    f0 = 440 * np.power(2, (note_number - 69) / 12)
    duration = gate + 0.5
    length_of_s = int(fs * duration)
    s = np.zeros(length_of_s)
    p = np.zeros(length_of_s)
    
    for n in range(length_of_s):
        p[n] = np.sin(2 * np.pi * f0 * n / fs)
    
    vca_A = 0.05
    vca_D = 0.8
    vca_S = 0
    vca_R = 0.8
    vca_offset = 0
    vca_depth = 1
    vca = ADSR(fs, vca_A, vca_D, vca_S, vca_R, gate, duration)
    for n in range(length_of_s):
        vca[n] = vca_offset + vca[n] * vca_depth
    for n in range(length_of_s):
        p[n] *= vca[n] * 0.3
    for n in range(length_of_s):
        s[n] += p[n]
    
    for n in range(length_of_s):
        p[n] = np.sin(2 * np.pi * f0 * 4 * n / fs)
    
    vca_A = 0.05
    vca_D = 0.2
    vca_S = 0
    vca_R = 0.2
    vca_offset = 0
    vca_depth = 1
    vca = ADSR(fs, vca_A, vca_D, vca_S, vca_R, gate, duration)
    for n in range(length_of_s):
        vca[n] = vca_offset + vca[n] * vca_depth
    for n in range(length_of_s):
        p[n] *= vca[n] * 1.0
    for n in range(length_of_s):
        s[n] += p[n]
    
    for n in range(length_of_s):
        p[n] = np.sin(2 * np.pi * f0 * 10 * n / fs)
    
    vca_A = 0.05
    vca_D = 0.1
    vca_S = 0
    vca_R = 0.1
    vca_offset = 0
    vca_depth = 1
    vca = ADSR(fs, vca_A, vca_D, vca_S, vca_R, gate, duration)
    for n in range(length_of_s):
        vca[n] = vca_offset + vca[n] * vca_depth
    for n in range(length_of_s):
        p[n] *= vca[n] * 0.05
    for n in range(length_of_s):
        s[n] += p[n]
            
    gain = velocity / 127 / np.max(np.abs(s))
    s *= gain
    
    return s

(4)

number_of_note = 16
score = np.array([[1,    0, 76, 96, 960],
                  [1,  960, 74, 96, 960],
                  [1, 1920, 72, 96, 960],
                  [1, 2880, 71, 96, 960],
                  [1, 3840, 69, 96, 960],
                  [1, 4800, 67, 96, 960],
                  [1, 5760, 69, 96, 960],
                  [1, 6720, 71, 96, 960],
                  [2,    0, 60, 96, 960],
                  [2,  960, 55, 96, 960],
                  [2, 1920, 57, 96, 960],
                  [2, 2880, 52, 96, 960],
                  [2, 3840, 53, 96, 960],
                  [2, 4800, 48, 96, 960],
                  [2, 5760, 53, 96, 960],
                  [2, 6720, 55, 96, 960]])

(5)

time_division = 480
tempo = 120

(6)

fs = 8000
length_of_s = int(fs * 10)
s = np.zeros(length_of_s)

for i in range(number_of_note):
    track = score[i, 0]
    onset = score[i, 1] * 60 / tempo / time_division  # ticks to seconds
    note_number = score[i, 2]
    velocity = score[i, 3]
    # Keep the gate time as a float: int() would truncate notes shorter than 1 s to 0.
    gate = score[i, 4] * 60 / tempo / time_division
    x = marimba(fs, note_number, velocity, gate)
    offset = int(fs * onset)
    length_of_x = len(x)
    if track == 1:
        gain = 1
    elif track == 2:
        gain = 0.1
    for n in range(length_of_x):
        s[offset + n] += x[n] * gain

(7)

gain = 0.5 / np.max(np.abs(s))
s *= gain

(8)

wavfile.write('ex03.wav', fs, (s * 32768).astype(np.int16))

(9)

Audio('ex03.wav')
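
The `marimba` function sums three sine partials, each shaped by its own envelope. The settings can be read directly from the code: frequency ratios 1, 4 and 10, relative amplitudes 0.3, 1.0 and 0.05, and decay times 0.8 s, 0.2 s and 0.1 s. The sketch below shows the same idea in a compact, table-driven form. It is a simplified stand-in rather than the handout's implementation: it uses a plain exponential decay exp(-5t/D) in place of the `ADSR` function, and the names `PARTIALS` and `additive_tone` are hypothetical.

```python
import numpy as np

# (frequency ratio, relative amplitude, decay time [s]) as read from marimba().
PARTIALS = [(1, 0.3, 0.8), (4, 1.0, 0.2), (10, 0.05, 0.1)]

def additive_tone(fs, f0, duration, partials=PARTIALS):
    t = np.arange(int(fs * duration)) / fs
    s = np.zeros(len(t))
    for ratio, amplitude, decay in partials:
        envelope = np.exp(-5.0 * t / decay)  # simplified envelope (assumption)
        s += amplitude * envelope * np.sin(2 * np.pi * f0 * ratio * t)
    return s / np.max(np.abs(s))             # normalize the peak to 1

tone = additive_tone(8000, 261.63, 1.5)
print(len(tone))
```

Changing the rows of the table is a quick way to hear how each partial contributes to the timbre. At fs = 8000, any partial above the Nyquist frequency of 4000 Hz aliases; for example, the tenth partial of note number 76 lies near 6593 Hz.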

5. Synthesizing an Organ Sound by Subtractive Synthesis

Run the following program and consider how an organ sound is synthesized by subtractive synthesis.

(1)

%matplotlib inline
import numpy as np
from scipy.io import wavfile
from IPython.display import display, Audio

(2)

def ADSR(fs, A, D, S, R, gate, duration):
    A = int(fs * A)
    D = int(fs * D)
    R = int(fs * R)
    gate = int(fs * gate)
    duration = int(fs * duration)
    e = np.zeros(duration)
    if A != 0:
        for n in range(A):
            e[n] = 1.0 - np.exp(-5.0 * n / A)

    if D != 0:
        for n in range(A, gate):
            e[n] = S + (1.0 - S) * np.exp(-5.0 * (n - A) / D)

    else:
        for n in range(A, gate):
            e[n] = S

    if R != 0:
        for n in range(gate, duration):
            e[n] = e[gate - 1] * np.exp(-5.0 * (n - gate + 1) / R)

    return e

(3)

def LPF(fs, fc, Q):
    fc /= fs
    fc = np.tan(np.pi * fc) / (2.0 * np.pi)
    a = np.zeros(3)
    b = np.zeros(3)
    a[0] = 1.0 + 2.0 * np.pi * fc / Q + 4.0 * np.pi * np.pi * fc * fc
    a[1] = (8.0 * np.pi * np.pi * fc * fc - 2.0) / a[0]
    a[2] = (1.0 - 2.0 * np.pi * fc / Q + 4.0 * np.pi * np.pi * fc * fc) / a[0]
    b[0] = 4.0 * np.pi * np.pi * fc * fc / a[0]
    b[1] = 8.0 * np.pi * np.pi * fc * fc / a[0]
    b[2] = 4.0 * np.pi * np.pi * fc * fc / a[0]
    a[0] = 1.0
    return a, b

(4)

def reed_organ(fs, note_number, velocity, gate):
    f0 = 440 * np.power(2, (note_number - 69) / 12)
    duration = gate + 0.1
    length_of_s = int(fs * duration)
    s0 = np.zeros(length_of_s)

    vco_A = 0
    vco_D = 0
    vco_S = 1
    vco_R = 0
    vco_offset = f0
    vco_depth = 0
    vco = ADSR(fs, vco_A, vco_D, vco_S, vco_R, gate, duration)
    for n in range(length_of_s):
        vco[n] = vco_offset + vco[n] * vco_depth
    
    t = 0
    for n in range(length_of_s):
        s0[n] = -2 * t + 1
        delta = vco[n] / fs
        if 0 <= t and t < delta:
            x = t / delta
            y = -x * x + x + x - 1
            s0[n] += y
        elif 1 - delta < t and t <= 1:
            x = (t - 1) / delta
            y = x * x + x + x + 1
            s0[n] += y
        
        t += delta
        if t >= 1:
            t -= 1

    vcf_A = 0
    vcf_D = 0
    vcf_S = 1
    vcf_R = 0
    vcf_offset = f0 * 2
    vcf_depth = 0
    vcf = ADSR(fs, vcf_A, vcf_D, vcf_S, vcf_R, gate, duration)
    for n in range(length_of_s):
        vcf[n] = vcf_offset + vcf[n] * vcf_depth

    s1 = np.zeros(length_of_s)
    Q = 1 / np.sqrt(2)
    for n in range(length_of_s):
        a, b = LPF(fs, vcf[n], Q)
        for m in range(0, 3):
            if n - m >= 0:
                s1[n] += b[m] * s0[n - m]

        for m in range(1, 3):
            if n - m >= 0:
                s1[n] += -a[m] * s1[n - m]

    vca_A = 0.3
    vca_D = 0
    vca_S = 1
    vca_R = 0.1
    vca_offset = 0
    vca_depth = 1
    vca = ADSR(fs, vca_A, vca_D, vca_S, vca_R, gate, duration)
    for n in range(length_of_s):
        vca[n] = vca_offset + vca[n] * vca_depth

    for n in range(length_of_s):
        s1[n] *= vca[n]

    gain = velocity / 127 / np.max(np.abs(s1))
    s1 *= gain

    return s1

(5)

def reverb(fs, s):
    length_of_s = len(s)

    d = int(fs * 0.03985)
    g = 0.871402
    r1 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r1[n] = s[n - d] + g * r1[n - d]

    d = int(fs * 0.03610)
    g = 0.882762
    r2 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r2[n] = s[n - d] + g * r2[n - d]

    d = int(fs * 0.03327)
    g = 0.891443
    r3 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r3[n] = s[n - d] + g * r3[n - d]

    d = int(fs * 0.03015)
    g = 0.901117
    r4 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r4[n] = s[n - d] + g * r4[n - d]

    r5 = np.zeros(length_of_s)
    for n in range(length_of_s):
        r5[n] = r1[n] + r2[n] + r3[n] + r4[n]

    d = int(fs * 0.005)
    g = 0.7
    r6 = np.zeros(length_of_s)
    r7 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r6[n] = r5[n - d] + g * r6[n - d]

        r7[n] = r6[n] - g * (r5[n] + g * r6[n])

    d = int(fs * 0.0017)
    g = 0.7
    r8 = np.zeros(length_of_s)
    r9 = np.zeros(length_of_s)
    for n in range(length_of_s):
        if n - d >= 0:
            r8[n] = r7[n - d] + g * r8[n - d]

        r9[n] = r8[n] - g * (r7[n] + g * r8[n])

    for n in range(length_of_s):
        s[n] += r9[n] * 0.5

    return s

(6)

number_of_note = 16
score = np.array([[1,    0, 76, 96, 960],
                  [1,  960, 74, 96, 960],
                  [1, 1920, 72, 96, 960],
                  [1, 2880, 71, 96, 960],
                  [1, 3840, 69, 96, 960],
                  [1, 4800, 67, 96, 960],
                  [1, 5760, 69, 96, 960],
                  [1, 6720, 71, 96, 960],
                  [2,    0, 60, 96, 960],
                  [2,  960, 55, 96, 960],
                  [2, 1920, 57, 96, 960],
                  [2, 2880, 52, 96, 960],
                  [2, 3840, 53, 96, 960],
                  [2, 4800, 48, 96, 960],
                  [2, 5760, 53, 96, 960],
                  [2, 6720, 55, 96, 960]])

(7)

time_division = 480
tempo = 120

(8)

fs = 8000
length_of_s = int(fs * 10)
s = np.zeros(length_of_s)

for i in range(number_of_note):
    track = score[i, 0]
    onset = score[i, 1] * 60 / tempo / time_division  # ticks to seconds
    note_number = score[i, 2]
    velocity = score[i, 3]
    # Keep the gate time as a float: int() would truncate notes shorter than 1 s to 0.
    gate = score[i, 4] * 60 / tempo / time_division
    x = reed_organ(fs, note_number, velocity, gate)
    offset = int(fs * onset)
    length_of_x = len(x)
    if track == 1:
        gain = 1
    elif track == 2:
        gain = 1
    for n in range(length_of_x):
        s[offset + n] += x[n] * gain

(9)

s = reverb(fs, s)

(10)

gain = 0.5 / np.max(np.abs(s))
s *= gain

(11)

wavfile.write('ex04.wav', fs, (s * 32768).astype(np.int16))

(12)

Audio('ex04.wav')
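
The coefficients returned by `LPF` describe a second-order low-pass filter, and its behaviour can be checked numerically. The sketch below is an illustrative check rather than part of the original program. It repeats the coefficient formulas in a compact form, with `k` standing for the quantity 2 * pi * fc after the tangent prewarping in `LPF`, and evaluates the magnitude response on the unit circle. The names `lpf_coefficients` and `magnitude_response` are hypothetical. With Q = 1/sqrt(2), the gain should be 1 at 0 Hz and about 0.707 (-3 dB) at the cutoff frequency.

```python
import cmath
import math

def lpf_coefficients(fs, fc, Q):
    k = math.tan(math.pi * fc / fs)  # same as 2*pi*fc after prewarping in LPF()
    a0 = 1.0 + k / Q + k * k
    a = [1.0, (2.0 * k * k - 2.0) / a0, (1.0 - k / Q + k * k) / a0]
    b = [k * k / a0, 2.0 * k * k / a0, k * k / a0]
    return a, b

def magnitude_response(a, b, f, fs):
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1 evaluated at frequency f
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return abs(numerator / denominator)

fs = 8000
fc = 1000
a, b = lpf_coefficients(fs, fc, 1 / math.sqrt(2))
for f in (0, 500, 1000, 2000, 3000):
    print(f, round(magnitude_response(a, b, f, fs), 3))
```

In `reed_organ` the cutoff is set to twice the fundamental frequency, so harmonics above the second are progressively attenuated, which is what softens the sawtooth wave.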

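6. Report

Work on the assignments below and write a report. Assignment 1 is required. Assignments 2 and 3 are electives: choose one of them. Assignments 4 and 5 are advanced assignments: choose one of them.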

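(Assignment 1) (Required) Write a program that generates an instrument sound, and use that sound to create music data for Canon. Submit the program and the music data by e-mail.

(Assignment 2) (Elective) Research sound synthesis methods and explain them concretely. (Search keyword hints: additive synthesis, subtractive synthesis, FM synthesis)

(Assignment 3) (Elective) Research sound effect methods and explain them concretely. (Search keyword hints: reverb, distortion, chorus)

(Assignment 4) (Advanced) Using at least two kinds of instrument sounds, create music data for a piece other than Canon.

(Assignment 5) (Advanced) Analyze a MIDI file and consider how to read music information from it.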

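7. Creative Project

1. Write a Python program that generates instrument sounds. Then use the sounds you generated to produce a piece of music with at least two tracks.

2. Prepare presentation materials for the creative project showcase. Presentations that explain concretely how the music was produced, with demos, will be rated highly. Outstanding works will receive an award.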

Last Modified: November 12 12:00 JST 2020 by Naofumi Aoki
E-mail: aoki@ime.ist.hokudai.ac.jp