Recurrent Knowledge Distillation

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › Peer-reviewed

1 Citation (Scopus)


Knowledge distillation compacts deep networks by letting a small student network learn from a large teacher network. The accuracy of knowledge distillation recently benefited from adding residual layers. We propose to reduce the size of the student network even further by recasting multiple residual layers in the teacher network into a single recurrent student layer. We propose three variants of adding recurrent connections into the student network, and show experimentally on CIFAR-10, Scenes and MiniPlaces, that we can reduce the number of parameters at little loss in accuracy.
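The core idea — replacing a stack of distinct teacher residual layers with one weight-shared student layer applied recurrently, trained under a distillation loss — can be illustrated with a minimal numpy sketch. All function names and the specific loss form (KL divergence between temperature-softened outputs) are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def residual_step(x, W):
    # One residual layer: x + ReLU(W x)
    return x + np.maximum(0.0, W @ x)

def teacher_forward(x, weights):
    # Teacher: a stack of distinct residual layers, one weight matrix each
    for W in weights:
        x = residual_step(x, W)
    return x

def student_forward(x, W_shared, n_steps):
    # Recurrent student: a single shared layer unrolled n_steps times,
    # shrinking the parameter count by roughly a factor of n_steps
    for _ in range(n_steps):
        x = residual_step(x, W_shared)
    return x

def softmax(z, temperature=1.0):
    z = z / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    # KL divergence between softened teacher and student distributions;
    # minimized so the compact student mimics the teacher's outputs
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return float(np.sum(p * (np.log(p) - np.log(q))))
```

For a teacher with three d×d residual layers, the recurrent student stores one d×d matrix instead of three, which is the parameter reduction the abstract refers to; the three proposed variants differ in how the recurrent connection is wired into the student.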
Original language: English
Title of host publication: 2018 25th IEEE International Conference on Image Processing (ICIP)
Subtitle of host publication: Proceedings
Place of publication: Piscataway
Number of pages: 5
ISBN (Electronic): 978-1-4799-7061-2
ISBN (Print): 978-1-4799-7062-9
Publication status: Published - 2018
Event: 25th IEEE International Conference on Image Processing - Athens, Greece
Duration: 7 Oct 2018 - 10 Oct 2018
Conference number: 25


Conference: 25th IEEE International Conference on Image Processing
Abbreviated title: ICIP 2018


Keywords:

  • Knowledge distillation
  • Compacting deep representations for image classification
  • Recurrent layers


Cite this:

Pintea, S. L., Liu, Y., & van Gemert, J. (2018). Recurrent Knowledge Distillation. In 2018 25th IEEE International Conference on Image Processing (ICIP): Proceedings (pp. 3393-3397). IEEE.