🎮 Undertale LoRA v2_panns — PANNs auto-tag caption training
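v2_panns trains on the v1 hand-written captions with a suffix of the top-5 instrument tags extracted by PANNs CNN14 appended. The goal is to check whether strengthening the instrument information in the training data improves both the instrument-identity distance from the original soundtrack (PANNs KL) and the spectral profile (4-band drift).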
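The caption construction described above can be sketched as follows. This is a minimal illustration, not the actual training script: the function name `add_tag_suffix`, the comma-separated suffix format, and the example probabilities are all assumptions. It takes already-computed tag probabilities as a plain dict, so it does not call PANNs itself.

```python
from __future__ import annotations


def add_tag_suffix(caption: str, tag_probs: dict[str, float], k: int = 5) -> str:
    """Append the k most probable instrument tags to a hand-written caption.

    The ", tag1, tag2, ..." suffix format is an assumption for illustration.
    """
    # Sort by probability (descending); break ties alphabetically so the
    # result is deterministic.
    top = sorted(tag_probs.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    if not top:
        return caption
    suffix = ", ".join(tag for tag, _ in top)
    return f"{caption.rstrip()}, {suffix}"


# Hypothetical example values, not taken from the training set.
example_probs = {
    "Piano": 0.31, "Sampler": 0.22, "Synthesizer": 0.18,
    "Harp": 0.05, "Guitar": 0.04, "Drum kit": 0.02,
}
print(add_tag_suffix("upbeat chiptune battle theme", example_probs))
```

With `k=5`, the lowest-probability tag in the example (`Drum kit`) is dropped and the remaining five are appended in descending order of probability.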
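⚠ Scale = 1.0 confirmed as default (2026-05-06 22:11 JST, judged by ear)
hikari: "Scale 1.0 is the best. 1.3 to 1.6 are still just about listenable, but at 2.0 it starts to sound off, and beyond that it is noise." The v2_panns evaluation is therefore also run at SCALE=1.0. The metrics (4-band drift / PANNs KL) are auxiliary signals; the final gate is the ear.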
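This page does not define the two auxiliary metrics, so the following is only one plausible reading, under stated assumptions. It treats PANNs KL as the KL divergence between two normalized instrument-probability vectors, and 4-band drift as the mean absolute difference in per-band energy share, in percentage points. The function names, the smoothing constant, and the choice of mean (rather than sum or max) are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two equally long non-negative vectors.

    Assumption: both vectors are smoothed with eps and renormalized to sum
    to 1, because multi-label tagger outputs are not a distribution by
    themselves.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    p_sum, q_sum = sum(p_s), sum(q_s)
    return sum(
        (a / p_sum) * math.log((a / p_sum) / (b / q_sum))
        for a, b in zip(p_s, q_s)
    )


def band_drift_pp(ref_energy, gen_energy):
    """Mean absolute difference of per-band energy shares, in percentage points.

    Assumption: each input holds the energy in the same frequency bands
    (four of them for a "4-band" drift), in the same order.
    """
    if len(ref_energy) != len(gen_energy):
        raise ValueError("band lists must have the same length")
    ref_total, gen_total = sum(ref_energy), sum(gen_energy)
    diffs = [
        abs(100.0 * r / ref_total - 100.0 * g / gen_total)
        for r, g in zip(ref_energy, gen_energy)
    ]
    return sum(diffs) / len(diffs)


# Hypothetical inputs: reference vs generated band energies and tag vectors.
print(band_drift_pp([40, 30, 20, 10], [30, 30, 20, 20]))
print(kl_divergence([0.5, 0.5], [0.9, 0.1]))
```

Under this reading both values are distances from the reference, so lower is better, and identical inputs give 0 for both.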
📊 v1 LoRA vs v2_panns LoRA: effect of the PANNs captions
| prompt | 4-band drift v1 (mean) | 4-band drift v2 (mean) | 4-band drift Δ | PANNs KL v1 (mean) | PANNs KL v2 (mean) | PANNs KL Δ |
| --- | --- | --- | --- | --- | --- | --- |
| 01_battle_papyrus | 15.2pp | 15.9pp | +0.7pp | 0.521 | 0.099 | -0.422 |
| 02_battle_megalo | 23.4pp | 21.1pp | -2.3pp | 0.144 | 0.130 | -0.014 |
| 03_area_snowy | 18.7pp | 17.3pp | -1.3pp | 0.138 | 0.285 | +0.148 |
| 04_atmos_memory | 14.9pp | 18.3pp | +3.3pp | 0.115 | 0.492 | +0.376 |
| 05_boss_asgore | 15.4pp | 20.9pp | +5.5pp | 0.105 | 0.062 | -0.043 |
| 06_battle_spider | 12.1pp | 23.2pp | +11.0pp | 0.046 | 0.139 | +0.093 |
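Green = v2 improved, red = v2 regressed. Both metrics are distances from the original, so a negative Δ is an improvement. If both 4-band drift and PANNs KL are green for v2, the PANNs captions worked; if only one is green, they worked partially; if both are red, v2 regressed.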
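The reading rule above can be written as a small helper. This is a sketch that assumes a negative Δ counts as "green" (lower is better for both metrics) and that a Δ of exactly zero counts as not improved. The function name `verdict` and the label strings are hypothetical; the Δ values are copied from the v1 vs v2 table.

```python
def verdict(drift_delta_pp: float, kl_delta: float) -> str:
    """Classify one prompt from its two v2 - v1 deltas."""
    # Count how many of the two metrics improved (negative delta).
    improved = int(drift_delta_pp < 0) + int(kl_delta < 0)
    return {2: "worked", 1: "partial", 0: "regressed"}[improved]


# (4-band drift Δ in pp, PANNs KL Δ) per prompt, from the table above.
deltas = {
    "01_battle_papyrus": (+0.7, -0.422),
    "02_battle_megalo": (-2.3, -0.014),
    "03_area_snowy": (-1.3, +0.148),
    "04_atmos_memory": (+3.3, +0.376),
    "05_boss_asgore": (+5.5, -0.043),
    "06_battle_spider": (+11.0, +0.093),
}

verdicts = {name: verdict(*d) for name, d in deltas.items()}
for name, v in verdicts.items():
    print(f"{name}: {v}")
```

The rule ignores the size of each Δ, so a large gain in one metric and a small loss in the other still count as "partial".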
📈 Overall mean
🔍 Effect of v2_panns itself (LoRA vs baseline)
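A minimal sketch for recomputing the overall means from the rounded per-prompt values in the v1 vs v2 table above. Because the inputs are already rounded, the results may differ slightly from means computed on the raw per-take values. The helper name `column_means` is hypothetical.

```python
from statistics import mean

# (v1 drift pp, v2 drift pp, v1 PANNs KL, v2 PANNs KL) per prompt,
# copied from the v1 vs v2 table.
rows = {
    "01_battle_papyrus": (15.2, 15.9, 0.521, 0.099),
    "02_battle_megalo": (23.4, 21.1, 0.144, 0.130),
    "03_area_snowy": (18.7, 17.3, 0.138, 0.285),
    "04_atmos_memory": (14.9, 18.3, 0.115, 0.492),
    "05_boss_asgore": (15.4, 20.9, 0.105, 0.062),
    "06_battle_spider": (12.1, 23.2, 0.046, 0.139),
}


def column_means(table):
    """Return the mean of each column across all prompts."""
    columns = zip(*table.values())
    return tuple(mean(col) for col in columns)


v1_drift, v2_drift, v1_kl, v2_kl = column_means(rows)
print(f"4-band drift: v1 {v1_drift:.2f}pp, v2 {v2_drift:.2f}pp, "
      f"delta {v2_drift - v1_drift:+.2f}pp")
print(f"PANNs KL:     v1 {v1_kl:.3f}, v2 {v2_kl:.3f}, "
      f"delta {v2_kl - v1_kl:+.3f}")
```

An unweighted mean over six prompts is sensitive to single outliers, which is one more reason to treat these numbers as auxiliary to listening.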
| prompt | 4-band drift baseline (mean) | 4-band drift v2 LoRA (mean) | 4-band drift Δ | PANNs KL baseline (mean) | PANNs KL v2 LoRA (mean) | PANNs KL Δ |
| --- | --- | --- | --- | --- | --- | --- |
| 01_battle_papyrus | 16.5pp | 15.9pp | -0.6pp | 0.145 | 0.099 | -0.047 |
| 02_battle_megalo | 20.8pp | 21.1pp | +0.4pp | 0.101 | 0.130 | +0.029 |
| 03_area_snowy | 15.9pp | 17.3pp | +1.5pp | 0.055 | 0.285 | +0.231 |
| 04_atmos_memory | 27.7pp | 18.3pp | -9.5pp | 0.371 | 0.492 | +0.121 |
| 05_boss_asgore | 24.6pp | 20.9pp | -3.7pp | 0.311 | 0.062 | -0.249 |
| 06_battle_spider | 17.2pp | 23.2pp | +6.0pp | 0.148 | 0.139 | -0.009 |
🔍 Per-take details (v2)
01_battle_papyrus
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 19.3pp | 0.169 | Sampler 0.03, Synthesizer 0.02 |
| base take 1 | 11.2pp | 0.096 | Piano 0.03, Sampler 0.03 |
| lora take 2 | 8.6pp | 0.043 | Sampler 0.01, Guitar 0.01 |
| base take 2 | 19.0pp | 0.209 | Drum and bass 0.04, Sampler 0.02 |
| lora take 3 | 19.7pp | 0.084 | Orchestra 0.01, Steelpan 0.01 |
| base take 3 | 19.3pp | 0.131 | Sampler 0.04, Drum and bass 0.02 |
02_battle_megalo
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 17.1pp | 0.230 | Drum 0.03, Drum kit 0.02 |
| base take 1 | 20.7pp | 0.089 | Sampler 0.03, Synthesizer 0.02 |
| lora take 2 | 31.3pp | 0.062 | Piano 0.02, Sampler 0.01 |
| base take 2 | 20.5pp | 0.079 | Sampler 0.02, Piano 0.01 |
| lora take 3 | 15.0pp | 0.098 | Sampler 0.04, Synthesizer 0.03 |
| base take 3 | 21.1pp | 0.135 | Sampler 0.04, Synthesizer 0.02 |
03_area_snowy
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 17.6pp | 0.060 | Piano 0.01, Orchestra 0.01 |
| base take 1 | 18.9pp | 0.040 | Piano 0.02, Sampler 0.01 |
| lora take 2 | 17.7pp | 0.036 | Piano 0.01, Electric piano 0.01 |
| base take 2 | 9.9pp | 0.046 | Piano 0.02, Sampler 0.02 |
| lora take 3 | 16.7pp | 0.760 | Piano 0.13, Electric piano 0.13 |
| base take 3 | 18.8pp | 0.079 | Sampler 0.03, Piano 0.02 |
04_atmos_memory
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 20.4pp | 0.426 | Drum and bass 0.06, Harp 0.02 |
| base take 1 | 20.3pp | 0.277 | Drum and bass 0.06, Sampler 0.02 |
| lora take 2 | 16.9pp | 0.945 | Violin, fiddle 0.06, Saxophone 0.05 |
| base take 2 | 24.4pp | 0.727 | Saxophone 0.03, Guitar 0.03 |
| lora take 3 | 17.5pp | 0.104 | Electric piano 0.03, Piano 0.03 |
| base take 3 | 38.5pp | 0.109 | Harp 0.01, Guitar 0.01 |
05_boss_asgore
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 13.3pp | 0.022 | Harpsichord 0.01, Piano 0.01 |
| base take 1 | 38.3pp | 0.084 | Guitar 0.02, Piano 0.01 |
| lora take 2 | 32.0pp | 0.060 | Sampler 0.02, Violin, fiddle 0.01 |
| base take 2 | 13.2pp | 0.125 | Sampler 0.03, Piano 0.03 |
| lora take 3 | 17.4pp | 0.103 | Piano 0.02, Sampler 0.02 |
| base take 3 | 22.2pp | 0.723 | Electric piano 0.17, Piano 0.15 |
06_battle_spider
| take | 4-band drift | PANNs KL | top instruments |
| --- | --- | --- | --- |
| lora take 1 | 22.1pp | 0.217 | Violin, fiddle 0.03, Harp 0.03 |
| base take 1 | 15.3pp | 0.089 | Guitar 0.03, Piano 0.02 |
| lora take 2 | 32.5pp | 0.134 | Sampler 0.03, Synthesizer 0.03 |
| base take 2 | 23.0pp | 0.297 | Electric piano 0.09, Piano 0.09 |
| lora take 3 | 14.9pp | 0.065 | Sampler 0.02, Synthesizer 0.02 |
| base take 3 | 13.2pp | 0.058 | Sampler 0.01, Piano 0.01 |
🎧 reference