Reading on digital devices has become commonplace, yet it often challenges learners' attention. In this study, we hypothesized that allowing learners to reflect on their reading phases with an empathic social robot companion might enhance their attention in e-reading. To test this hypothesis, we collected a novel dataset (SKEP) in an e-reading setting with social robot support. It contains 25 multimodal features, drawn from various sensors and logged data, that serve as direct and indirect cues of attention. Using the SKEP dataset, we compared feedback based on human-robot interaction (HRI; treatment) with feedback based on a graphical user interface (GUI; control) and derived insights for intervention design. Based on human annotation of nearly 40 hours of video data streams from 60 subjects, we developed a machine learning model to capture attention-regulation behaviors in e-reading. We used a two-stage framework to recognize learners' observable self-regulatory behaviors and to conduct attention analysis. The proposed system showed promising performance for e-reading with HRI: 72.97% accuracy in recognizing attention-regulation behaviors, 74.29% accuracy in predicting knowledge gain, 75.00% for perceived interaction experience, and 75.00% for perceived social presence. We believe our work can inspire the future design and analysis of HRI-based e-reading.