Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Empirical Studies on Capsule Network Representation and Improvements
Capsule networks are a novel approach showing promising results on smallNORB and MNIST. Here we reproduce and build upon the impressive results shown by Sara Sabour et al. We experiment on the capsule network architecture by visualizing exactly what the capsules in different layers represent and what information they store about 3D objects in an image, and we try to improve its classification results on CIFAR10 and smallNORB with various methods, including some tricks with the reconstruction loss. Further, we present a deconvolution-based reconstruction module that reduces the number of learnable parameters by 80% compared with the fully-connected module presented by Sara Sabour et al.
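The parameter savings come from replacing the decoder's large fully-connected layers with transposed convolutions. Below is a minimal sketch of the two decoders, assuming MNIST-sized (28x28) reconstructions and the masked 16-D x 10-class capsule output used by Sabour et al.; the deconvolution channel counts and kernel sizes are illustrative assumptions, not the exact configuration from the post.

```python
import torch
import torch.nn as nn

# Baseline: the fully-connected reconstruction decoder of Sabour et al.,
# which maps the masked capsule output 160 -> 512 -> 1024 -> 784 (28x28).
fc_decoder = nn.Sequential(
    nn.Linear(16 * 10, 512), nn.ReLU(inplace=True),
    nn.Linear(512, 1024), nn.ReLU(inplace=True),
    nn.Linear(1024, 28 * 28), nn.Sigmoid(),
)

# Deconvolution-based alternative: project the capsule output onto a
# small feature map, then upsample with transposed convolutions.
# Channel counts and kernel sizes here are illustrative assumptions.
class DeconvDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.project = nn.Linear(16 * 10, 16 * 7 * 7)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),  # 7x7 -> 14x14
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(8, 1, kernel_size=4, stride=2, padding=1),   # 14x14 -> 28x28
            nn.Sigmoid(),
        )

    def forward(self, capsule_out):
        x = self.project(capsule_out).view(-1, 16, 7, 7)
        return self.deconv(x)

count = lambda m: sum(p.numel() for p in m.parameters())
print(f"FC decoder:     {count(fc_decoder):,} params")       # ~1.41M
print(f"Deconv decoder: {count(DeconvDecoder()):,} params")  # ~0.13M
```

With these assumed sizes the transposed-convolution decoder carries roughly a tenth of the fully-connected decoder's weights; the savings the abstract cites come from this same substitution.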
Portfolio
Publications
LakhNES: Improving multi-instrumental music generation with cross-domain pre-training
Published in International Society for Music Information Retrieval (ISMIR), 2019
We are interested in the task of generating multi-instrumental music scores. The Transformer architecture has recently shown great promise for the task of piano score generation; here we adapt it to the multi-instrumental setting. Transformers are complex, high-dimensional language models which are capable of capturing long-term structure in sequence data, but require large amounts of data to fit. Their success on piano score generation is partially explained by the large volumes of symbolic data readily available for that domain. We leverage the recently-introduced NES-MDB dataset of four-instrument scores from an early video game sound synthesis chip (the NES), which we find to be well-suited to training with the Transformer architecture. To further improve the performance of our model, we propose a pre-training technique to leverage the information in a large collection of heterogeneous music, namely the Lakh MIDI dataset. Despite differences between the two corpora, we find that this transfer learning procedure improves both quantitative and qualitative performance for our primary task.
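The transfer recipe is simple to state: train a next-event language model on the large Lakh MIDI corpus, then continue training the same weights on NES-MDB. A minimal sketch follows; LakhNES itself uses Transformer-XL, so the vanilla causal Transformer below, along with the vocabulary size, model dimensions, and dataloader names, are all simplifying assumptions.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 631  # assumed size of the shared event vocabulary
MAX_LEN = 512     # assumed maximum event-sequence length

class MusicLM(nn.Module):
    """Causal Transformer language model over symbolic music events."""
    def __init__(self, d_model=512, n_heads=8, n_layers=6):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, d_model)
        self.pos = nn.Embedding(MAX_LEN, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB_SIZE)

    def forward(self, tokens):  # tokens: (batch, seq) integer event ids
        seq = tokens.size(1)
        pos = torch.arange(seq, device=tokens.device)
        # Causal mask so each position attends only to earlier events.
        causal = nn.Transformer.generate_square_subsequent_mask(seq).to(tokens.device)
        h = self.encoder(self.embed(tokens) + self.pos(pos), mask=causal)
        return self.head(h)

def train_lm(model, batches, lr):
    """One pass of next-event prediction over a stream of token batches."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for tokens in batches:
        logits = model(tokens[:, :-1])  # predict the next event at each step
        loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))
        opt.zero_grad(); loss.backward(); opt.step()

model = MusicLM()
# Stage 1: pre-train on the large, heterogeneous Lakh MIDI corpus.
# train_lm(model, lakh_batches, lr=2e-4)     # hypothetical dataloader
# Stage 2: fine-tune the same weights on NES-MDB at a lower learning rate.
# train_lm(model, nes_mdb_batches, lr=2e-5)  # hypothetical dataloader
```

The key design point is that both stages share one event vocabulary and one set of weights, so whatever long-range musical structure the model learns from Lakh MIDI carries over when it is fine-tuned on the much smaller NES-MDB corpus.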
Recommended citation: Donahue, C., Mao, H. H., Li, Y. E., Cottrell, G. W., and McAuley, J. J. LakhNES: Improving multi-instrumental music generation with cross-domain pre-training. In International Society for Music Information Retrieval Conference, pp. 685–692, 2019.
Download Paper