Public transport (PT) plays an increasingly important role in solving mobility challenges, especially in densely populated metropolitan areas. Further improving PT systems requires more advanced planning and operations. Fortunately, the considerable amount of data that has become available for PT systems offers an opportunity to address this challenge. However, how these data can be used effectively to achieve this goal remains an unresolved question in the scientific literature, and more research is needed to bridge this gap. To this end, this dissertation develops methods and models for translating high-volume data from various sources into novel knowledge and insights that can be used to improve PT planning and operations. First, this dissertation examines how to obtain the onboard occupancy of PT vehicles by integrating three different data sources. Second, it deals with the issue of high dimensionality in large-scale passenger flows. Third, it proposes a k-means-based method to cluster PT stops for constructing zone-to-zone origin-destination (OD) matrices. Fourth, it presents a new method for analyzing the accessibility of PT service networks based on a novel network science approach. Last, it investigates whether the passenger flow distribution in PT systems can be estimated solely from network properties.