Memory and Communication Profiling for Accelerator-Based Platforms

Imran Ashraf, Nader Khammassi, Mottaqiallah Taouil, Koen Bertels

Research output: Contribution to journal › Article › Scientific › peer-review


The growing demand for processing power is being satisfied mainly by an increase in the number of homogeneous and heterogeneous computing cores in a system. Efficient utilization of these architectures demands analysis of the memory-access behaviour of applications and data-communication-aware mapping of applications onto these architectures. Appropriate tools are required to highlight memory-access patterns and provide detailed intra-application data-communication information to assist developers in porting existing sequential applications efficiently to these architectures. In this work, we present the design of an open-source tool which provides such a detailed profile for C/C++ applications. In contrast to prior work, our tool not only reports detailed information, but also generates this information with manageable overheads for realistic workloads. Comparison with the state of the art shows that the proposed profiler has, on average, an order of magnitude less overhead than state-of-the-art data-communication profilers for a wide range of benchmarks. The experimental results show that the profiling information generated by our tool for image processing applications assisted in achieving speed-ups of 6.14× and 2.75× on heterogeneous multi-core platforms containing an FPGA and a GPU as accelerators, respectively.
Original language: English
Pages (from-to): 934-948
Number of pages: 15
Journal: IEEE Transactions on Computers
Issue number: 7
Publication status: Published - 2018

Bibliographical note

Accepted Author Manuscript


Keywords

  • Tools
  • Computer architecture
  • Field programmable gate arrays
  • Graphics processing units
  • Instruments
  • Acceleration
  • Open source software

