🎮 Undertale LoRA v2_panns — PANNs auto-tag caption training

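v2_panns trains on the v1 hand-written captions with a suffix of the top-5 instrument tags extracted by PANNs CNN14 appended to each one. The experiment tests whether strengthening the instrument information in the training data improves both the instrument-identity distance to the original tracks (PANNs KL) and the spectral profile (4-band drift).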
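The caption augmentation described above can be sketched as follows. This is a minimal illustration, not the actual training pipeline: the function name `append_instrument_tags`, the `instruments: ...` suffix format, and the idea that the tagger scores arrive as a label-to-probability mapping are all assumptions, and the PANNs inference call itself is left out.

```python
def append_instrument_tags(caption, scores, k=5):
    """Append the k highest-scoring instrument labels to a caption.

    `scores` maps an instrument label to a probability (assumed to come
    from a tagger such as PANNs CNN14; the tagger call is omitted here).
    The "instruments: ..." suffix format is an assumption for illustration.
    """
    # Sort by descending score; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    top_labels = [label for label, _ in ranked[:k]]
    if not top_labels:
        return caption
    return f"{caption}, instruments: {', '.join(top_labels)}"


# Hypothetical scores for one training clip.
example_scores = {
    "Sampler": 0.04,
    "Synthesizer": 0.03,
    "Piano": 0.02,
    "Drum kit": 0.02,
    "Guitar": 0.01,
    "Harp": 0.005,
}
augmented = append_instrument_tags("fast chiptune battle theme", example_scores)
print(augmented)
```

Keeping the human caption first and the tags last means the v1 text stays intact and the suffix can be dropped again to fall back to the v1 data.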
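⚠ Scale = 1.0 confirmed as the default (listening test, 2026-05-06 22:11 JST)
hikari: "scale 1.0 is the best. 1.3 to 1.6 is still just about listenable, but at 2.0 it starts to sound off, and beyond that it is noise." The v2_panns evaluation is therefore also run at SCALE=1.0. The metrics (4-band drift / PANNs KL) are auxiliary signals; the final gate is the ear.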
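For orientation, the two auxiliary metrics could be computed along the following lines. This page does not define them, so everything here is an assumption: that 4-band drift is the mean absolute difference, in percentage points, between the energy shares of four frequency bands, and that PANNs KL is the KL divergence between normalized instrument-score distributions of the original and the generated clip. Band energies and tagger scores are taken as precomputed inputs.

```python
import math


def band_shares(energies):
    """Convert per-band energies into percentage shares of the total."""
    total = sum(energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    return [100.0 * e / total for e in energies]


def four_band_drift(ref_energies, gen_energies):
    """Mean absolute difference of band shares in percentage points.

    Assumed definition; the dashboard's exact formula is not stated.
    """
    if len(ref_energies) != len(gen_energies):
        raise ValueError("both inputs need the same number of bands")
    ref = band_shares(ref_energies)
    gen = band_shares(gen_energies)
    return sum(abs(r - g) for r, g in zip(ref, gen)) / len(ref)


def kl_divergence(p_scores, q_scores, eps=1e-8):
    """KL(P || Q) after normalizing both score vectors to sum to 1.

    eps keeps the logarithm finite for zero scores. Using the reference
    clip as P is an assumption.
    """
    if len(p_scores) != len(q_scores):
        raise ValueError("both inputs need the same number of classes")
    p_total = sum(p_scores) + eps * len(p_scores)
    q_total = sum(q_scores) + eps * len(q_scores)
    p = [(x + eps) / p_total for x in p_scores]
    q = [(x + eps) / q_total for x in q_scores]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical inputs: four band energies and three instrument scores.
drift = four_band_drift([40, 30, 20, 10], [30, 30, 20, 20])
kl_same = kl_divergence([0.03, 0.02, 0.01], [0.03, 0.02, 0.01])
kl_diff = kl_divergence([0.03, 0.02, 0.01], [0.01, 0.02, 0.03])
print(drift, kl_same, kl_diff)
```

Both functions return 0 for identical inputs and grow as the generated clip moves away from the reference, which is why lower values count as better.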

📊 v1 LoRA vs v2_panns LoRA: effect of the PANNs captions

prompt              4-band drift (mean)            PANNs KL (mean)
                    v1       v2       Δ            v1      v2      Δ
01_battle_papyrus   15.2pp   15.9pp   +0.7pp       0.521   0.099   -0.422
02_battle_megalo    23.4pp   21.1pp   -2.3pp       0.144   0.130   -0.014
03_area_snowy       18.7pp   17.3pp   -1.3pp       0.138   0.285   +0.148
04_atmos_memory     14.9pp   18.3pp   +3.3pp       0.115   0.492   +0.376
05_boss_asgore      15.4pp   20.9pp   +5.5pp       0.105   0.062   -0.043
06_battle_spider    12.1pp   23.2pp   +11.0pp      0.046   0.139   +0.093
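Green = v2 improved, red = v2 regressed. Both metrics measure distance from the original, so a negative Δ is an improvement. If both 4-band drift and PANNs KL are green for v2, the PANNs captions worked; if only one is, the effect is partial; if both are red, v2 regressed.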
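The reading rule above can be applied mechanically to the Δ columns. The sketch below is only an illustration: the function name `verdict` and the choice to treat a Δ of exactly 0 as "not improved" are assumptions, and the deltas are copied from the v1 vs v2 table.

```python
def verdict(delta_band_pp, delta_kl):
    """Classify one prompt from its v2 - v1 deltas (negative = improvement)."""
    improved = [delta_band_pp < 0, delta_kl < 0]
    if all(improved):
        return "worked"
    if any(improved):
        return "partial"
    return "regressed"


# (Δ 4-band drift in pp, Δ PANNs KL) per prompt, from the v1 vs v2 table.
deltas = {
    "01_battle_papyrus": (+0.7, -0.422),
    "02_battle_megalo": (-2.3, -0.014),
    "03_area_snowy": (-1.3, +0.148),
    "04_atmos_memory": (+3.3, +0.376),
    "05_boss_asgore": (+5.5, -0.043),
    "06_battle_spider": (+11.0, +0.093),
}

results = {prompt: verdict(d_band, d_kl) for prompt, (d_band, d_kl) in deltas.items()}
for prompt, label in results.items():
    print(f"{prompt}: {label}")
```

With these values only 02_battle_megalo improves on both metrics; 01_battle_papyrus, 03_area_snowy and 05_boss_asgore are partial, and 04_atmos_memory and 06_battle_spider regress on both.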

📈 Overall mean

metric         v1 LoRA   v2 LoRA
4-band drift   16.6pp    19.4pp
PANNs KL       0.178     0.201
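As a quick sanity check, the overall means can be recomputed from the per-prompt values in the v1 vs v2 table. This sketch assumes the overall figure is a plain unweighted mean over the six prompts, which the page does not state explicitly.

```python
from statistics import mean

# Per-prompt means copied from the v1 vs v2 table (prompts 01 to 06).
v1_band = [15.2, 23.4, 18.7, 14.9, 15.4, 12.1]
v2_band = [15.9, 21.1, 17.3, 18.3, 20.9, 23.2]
v1_kl = [0.521, 0.144, 0.138, 0.115, 0.105, 0.046]
v2_kl = [0.099, 0.130, 0.285, 0.492, 0.062, 0.139]

summary = {
    "v1 4-band (pp)": mean(v1_band),
    "v2 4-band (pp)": mean(v2_band),
    "v1 PANNs KL": mean(v1_kl),
    "v2 PANNs KL": mean(v2_kl),
}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values agree with the displayed ones to within rounding. The v2 4-band mean comes out at about 19.45pp from the rounded table entries, so the displayed 19.4pp presumably reflects unrounded per-prompt values.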

🔍 Effect of v2_panns itself (LoRA vs baseline)

prompt              4-band drift (mean)              PANNs KL (mean)
                    baseline   v2 LoRA   Δ           baseline   v2 LoRA   Δ
01_battle_papyrus   16.5pp     15.9pp    -0.6pp      0.145      0.099     -0.047
02_battle_megalo    20.8pp     21.1pp    +0.4pp      0.101      0.130     +0.029
03_area_snowy       15.9pp     17.3pp    +1.5pp      0.055      0.285     +0.231
04_atmos_memory     27.7pp     18.3pp    -9.5pp      0.371      0.492     +0.121
05_boss_asgore      24.6pp     20.9pp    -3.7pp      0.311      0.062     -0.249
06_battle_spider    17.2pp     23.2pp    +6.0pp      0.148      0.139     -0.009

🔍 Per-take details (v2)

01_battle_papyrus
take          4-band drift   PANNs KL   top instruments
lora take 1   19.3pp         0.169      Sampler 0.03; Synthesizer 0.02
base take 1   11.2pp         0.096      Piano 0.03; Sampler 0.03
lora take 2   8.6pp          0.043      Sampler 0.01; Guitar 0.01
base take 2   19.0pp         0.209      Drum and bass 0.04; Sampler 0.02
lora take 3   19.7pp         0.084      Orchestra 0.01; Steelpan 0.01
base take 3   19.3pp         0.131      Sampler 0.04; Drum and bass 0.02
02_battle_megalo
take          4-band drift   PANNs KL   top instruments
lora take 1   17.1pp         0.230      Drum 0.03; Drum kit 0.02
base take 1   20.7pp         0.089      Sampler 0.03; Synthesizer 0.02
lora take 2   31.3pp         0.062      Piano 0.02; Sampler 0.01
base take 2   20.5pp         0.079      Sampler 0.02; Piano 0.01
lora take 3   15.0pp         0.098      Sampler 0.04; Synthesizer 0.03
base take 3   21.1pp         0.135      Sampler 0.04; Synthesizer 0.02
03_area_snowy
take          4-band drift   PANNs KL   top instruments
lora take 1   17.6pp         0.060      Piano 0.01; Orchestra 0.01
base take 1   18.9pp         0.040      Piano 0.02; Sampler 0.01
lora take 2   17.7pp         0.036      Piano 0.01; Electric piano 0.01
base take 2   9.9pp          0.046      Piano 0.02; Sampler 0.02
lora take 3   16.7pp         0.760      Piano 0.13; Electric piano 0.13
base take 3   18.8pp         0.079      Sampler 0.03; Piano 0.02
04_atmos_memory
take          4-band drift   PANNs KL   top instruments
lora take 1   20.4pp         0.426      Drum and bass 0.06; Harp 0.02
base take 1   20.3pp         0.277      Drum and bass 0.06; Sampler 0.02
lora take 2   16.9pp         0.945      Violin, fiddle 0.06; Saxophone 0.05
base take 2   24.4pp         0.727      Saxophone 0.03; Guitar 0.03
lora take 3   17.5pp         0.104      Electric piano 0.03; Piano 0.03
base take 3   38.5pp         0.109      Harp 0.01; Guitar 0.01
05_boss_asgore
take          4-band drift   PANNs KL   top instruments
lora take 1   13.3pp         0.022      Harpsichord 0.01; Piano 0.01
base take 1   38.3pp         0.084      Guitar 0.02; Piano 0.01
lora take 2   32.0pp         0.060      Sampler 0.02; Violin, fiddle 0.01
base take 2   13.2pp         0.125      Sampler 0.03; Piano 0.03
lora take 3   17.4pp         0.103      Piano 0.02; Sampler 0.02
base take 3   22.2pp         0.723      Electric piano 0.17; Piano 0.15
06_battle_spider
take          4-band drift   PANNs KL   top instruments
lora take 1   22.1pp         0.217      Violin, fiddle 0.03; Harp 0.03
base take 1   15.3pp         0.089      Guitar 0.03; Piano 0.02
lora take 2   32.5pp         0.134      Sampler 0.03; Synthesizer 0.03
base take 2   23.0pp         0.297      Electric piano 0.09; Piano 0.09
lora take 3   14.9pp         0.065      Sampler 0.02; Synthesizer 0.02
base take 3   13.2pp         0.058      Sampler 0.01; Piano 0.01
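The per-prompt rows in the LoRA vs baseline table appear to be plain means over the three takes listed here. The sketch below checks this for 01_battle_papyrus; the helper name `prompt_means` is hypothetical, and the aggregation rule is inferred from the numbers rather than stated on the page.

```python
from statistics import mean

# (4-band drift in pp, PANNs KL) per take, from the 01_battle_papyrus rows.
takes = {
    "lora": [(19.3, 0.169), (8.6, 0.043), (19.7, 0.084)],
    "base": [(11.2, 0.096), (19.0, 0.209), (19.3, 0.131)],
}


def prompt_means(rows):
    """Average each metric over the takes of one prompt."""
    drifts, kls = zip(*rows)
    return mean(drifts), mean(kls)


lora_drift, lora_kl = prompt_means(takes["lora"])
base_drift, base_kl = prompt_means(takes["base"])

# Δ = LoRA minus baseline; negative means the LoRA is closer to the original.
delta_drift = lora_drift - base_drift
delta_kl = lora_kl - base_kl

print(f"4-band drift: base {base_drift:.1f}pp, lora {lora_drift:.1f}pp, delta {delta_drift:+.1f}pp")
print(f"PANNs KL:     base {base_kl:.3f}, lora {lora_kl:.3f}, delta {delta_kl:+.3f}")
```

For this prompt the result matches the summary row (16.5pp vs 15.9pp, Δ -0.6pp; 0.145 vs 0.099, Δ -0.047). With only three takes per condition, a single take can move the mean noticeably.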

🎧 reference

← Megalovania original (Toby Fox)
v1 dashboard (4-band only) ↗   v1.5 dashboard (dual-metric) ↗   scale sweep ↗