GTZAN results
Baselines
Our baselines consist of random samples from the GTZAN training set and prototypes obtained with the APNet model.
class | dataset sample | APNet |
---|---|---|
blues | ||
classical | ||
country | ||
disco | ||
hip-hop | ||
jazz | ||
metal | ||
pop | ||
reggae | ||
rock | ||
Our models
NOTE (2024/02/02): The links for prototypes 0 and 1 were duplicated by mistake, so both sounded identical. This is now fixed.
We show the results obtained with PECMAE-3 (3 prototypes per target class). For each class, we sonify two of the prototypes.
class | PECMAE-3 (prototype 0) | PECMAE-3 (prototype 1) |
---|---|---|
blues | ||
classical | ||
country | ||
disco | ||
hip-hop | ||
jazz | ||
metal | ||
pop | ||
reggae | ||
rock | ||
Prototype-class connections
In PECMAE, prototypes are linearly connected to the classification layer.
The following plot shows the weights learned for these connections.
Certain prototypes have a slightly positive correlation with related classes.
For example, rock and metal.
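As a rough illustration of what "linearly connected" means here, the sketch below scores an input by its similarity to each prototype and feeds those similarities through a single weight matrix to obtain class logits. It is a minimal sketch under stated assumptions, not the PECMAE implementation: the similarity measure (negative squared Euclidean distance), the embedding size, the random weights, and all names (`prototypes`, `weights`, `classify`, `positive_connections`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_classes = 10          # GTZAN has 10 genre classes
protos_per_class = 3    # as in PECMAE-3
n_prototypes = n_classes * protos_per_class
embed_dim = 16          # hypothetical embedding size

# Random stand-ins for learned prototypes and learned connection weights.
prototypes = rng.normal(size=(n_prototypes, embed_dim))
weights = rng.normal(scale=0.1, size=(n_classes, n_prototypes))


def classify(embedding):
    """Map one embedding to class logits via prototype similarities."""
    # Assumption: similarity is the negative squared Euclidean distance.
    sims = -np.sum((prototypes - embedding) ** 2, axis=1)
    # Linear connection: each logit is a weighted sum of similarities.
    return weights @ sims


def positive_connections(weight_matrix, prototype_index):
    """Return the class indices with a positive weight to one prototype."""
    return np.flatnonzero(weight_matrix[:, prototype_index] > 0)


logits = classify(rng.normal(size=embed_dim))
connected = positive_connections(weights, 0)
```

In this setup, column `j` of `weights` holds the connections from prototype `j` to every class, so a prototype with positive entries for two classes contributes to both of their logits.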
10-second autoencoder
Results obtained with an autoencoder with a 10-second context.
class | PECMAE-5 (10s) |
---|---|
blues | |
classical | |
country | |
disco | |
hip-hop | |
jazz | |
metal | |
pop | |
reggae | |
rock | |