Less Machine (=) More Vision: Approaches towards Practical and Efficient Machine Vision with Applications in Face Analysis

Research output: Thesis, Dissertation (TU Delft)

Abstract

Machines that interact with humans can do so better if they can also visually understand us, but they have limited resources to do so. The main topic of this dissertation is contrasting the use of resources by machine vision systems against the accuracy obtained by them. This thesis focuses on reducing the need for data, memory, and computation in real-world machine vision systems, applied to human observation and face analysis.

This dissertation tackles annotation effort by exploring how weakly-supervised object/person detectors can be improved. Findings show that prior knowledge about objects' bounds in images helps the detector learn the spatial extent of objects using only weak image-level labels. The proposed implementation enables single-shot detection, thus improving computational efficiency of this data-efficient method.

The thesis also demonstrates how prior knowledge about eye locations can be used to reduce the computational burden of gaze tracking: non-vital parts of the input image can be discarded without losing accuracy. Additionally, the thesis shows how geometrical relations that are known a priori can be exploited to project gaze onto a screen with little human annotation effort.
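
As an illustration of how a known geometrical relation can stand in for annotated data, the sketch below intersects a gaze ray with a planar screen. It is not the method used in the thesis: the function name `project_gaze_to_screen`, the camera-centred coordinate frame, and the default screen plane at z = 0 are all assumptions made for this example.

```python
def project_gaze_to_screen(eye_pos, gaze_dir,
                           plane_point=(0.0, 0.0, 0.0),
                           plane_normal=(0.0, 0.0, 1.0)):
    """Intersect a gaze ray with a planar screen.

    eye_pos, gaze_dir: 3D eye position and gaze direction (hypothetical
    camera-centred frame, arbitrary but consistent units).
    plane_point, plane_normal: any point on the screen plane and its normal
    (assumed here to be the plane z = 0).
    Returns the 3D intersection point, or None if the gaze never hits the plane.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(gaze_dir, plane_normal)
    if abs(denom) < 1e-9:
        return None  # gaze is parallel to the screen plane

    to_plane = [p - e for p, e in zip(plane_point, eye_pos)]
    t = dot(to_plane, plane_normal) / denom
    if t < 0:
        return None  # the screen lies behind the viewer

    # Point reached after travelling t units along the gaze direction.
    return tuple(e + t * d for e, d in zip(eye_pos, gaze_dir))


# Usage: an eye half a unit in front of the screen, looking slightly right and down.
print(project_gaze_to_screen((0.0, 0.0, 0.5), (0.1, -0.2, -1.0)))
```

Because the relation between eye, gaze direction, and screen is fixed by geometry, only the screen pose needs to be known or calibrated; no per-pixel gaze labels are required in this simplified setting.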

Findings of this dissertation further suggest that spatial structures in images can be exploited for improving efficiency of vision tasks. The proposed solution allows for learning detection of facial occlusions and anomalies from only a few examples. Results also indicate that this solution can be used as a loss function for unsupervised pre-training of neural networks when resources are constrained.

Lastly, this thesis showcases how prior knowledge about blood-flow physiology in faces can be applied in a camera-based vital signs estimator. Even when data is available, this hand-crafted method performs better than deep learning methods, both in terms of accuracy and efficiency. At the same time, the results also reveal the pitfalls of assumptions made in the prior knowledge when exposed to more complex tasks, such as video compression noise filtering.
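
To make the idea of a physiological prior concrete, the sketch below estimates heart rate from a one-dimensional skin-colour signal by searching only the band of plausible human heart rates for the strongest frequency. This is a generic textbook-style illustration, not the estimator proposed in the thesis; the function name `estimate_heart_rate_bpm`, the 42 to 240 bpm band, and the 1 bpm search step are assumptions chosen for this example.

```python
import math


def estimate_heart_rate_bpm(signal, fps, min_bpm=42, max_bpm=240):
    """Return the heart rate (in bpm) with the highest spectral power.

    signal: per-frame samples, e.g. the mean green value of a skin region
    (hypothetical input; any 1D sequence of numbers works).
    fps: camera frame rate in frames per second.
    min_bpm, max_bpm: assumed physiological limits used as prior knowledge.
    """
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")

    # Remove the constant skin-tone offset so only the pulsation remains.
    mean = sum(signal) / n
    centered = [s - mean for s in signal]

    best_bpm, best_power = None, -1.0
    for bpm in range(min_bpm, max_bpm + 1):
        freq_hz = bpm / 60.0
        # Correlate the signal with a sinusoid at this candidate frequency.
        re = sum(c * math.cos(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        im = sum(c * math.sin(2 * math.pi * freq_hz * i / fps)
                 for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Usage: a synthetic 10-second recording at 30 fps pulsing at 1.2 Hz.
fps = 30.0
samples = [100.0 + math.sin(2 * math.pi * 1.2 * i / fps) for i in range(300)]
print(estimate_heart_rate_bpm(samples, fps))
```

Restricting the search to a physiologically plausible band is what keeps the computation small and removes the need for training data in this toy setting. It also hints at the pitfall noted above: any disturbance that falls inside the assumed band, such as compression artefacts, can be mistaken for a pulse.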

Through its common theme of incorporating prior knowledge, this dissertation brings attention to the costs incurred by machine vision systems to achieve high accuracy.

Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Delft University of Technology
Supervisors/Advisors
  • Reinders, M.J.T., Supervisor
  • van Gemert, J.C., Advisor
Award date: 3 Oct 2022
Print ISBNs: 978-94-6366-602-2
Publication status: Published - 2022

Funding

The work in this thesis has been funded by Vicarious Perception Technologies (VicarVision).

Keywords

  • Computer Vision
  • Machine learning
  • Artificial Intelligence (AI)
  • Efficiency
  • Computational Efficiency
  • Data Efficiency
  • Face Analysis
  • Human Observation
  • Remote Photoplethysmography

Related research output

  • Efficiency in Real-time Webcam Gaze Tracking

    Gudi, A., Li, X. & van Gemert, J., 2020, Computer Vision – ECCV 2020 Workshops: Proceedings. Bartoli, A. & Fusiello, A. (eds.). 1 ed. Cham: Springer, p. 529-543, 15 p. (Lecture Notes in Computer Science (LNCS), vol. 12535; also part of the Image Processing, Computer Vision, Pattern Recognition, and Graphics sub-series (LNIP)).

    Research output: Chapter in Book/Conference proceedings/Edited volume, Conference contribution, Scientific, peer-review

    Open Access
    7 Citations (Scopus)
  • Real-Time Webcam Heart-Rate and Variability Estimation with Clean Ground Truth for Evaluation

    Gudi, A., Bittner, M. & van Gemert, J., 2020, In: Applied Sciences. 10, 23, p. 1-24, 24 p., 8630.

    Research output: Contribution to journal, Article, Scientific, peer-review

    Open Access
  • Efficient real-time camera based estimation of heart rate and its variability

    Gudi, A., Bittner, M., Lochmans, R. & van Gemert, J., 2019, Proceedings - 2019 International Conference on Computer Vision Workshop, ICCVW 2019. p. 1570-1579, 10 p., 9022193.

    Research output: Chapter in Book/Conference proceedings/Edited volume, Conference contribution, Scientific, peer-review

    Open Access
    22 Citations (Scopus)
