Learning state machines from data streams: A generic strategy and an improved heuristic

Research output: Chapter in Book/Conference proceedings/Edited volumeConference contributionScientificpeer-review

49 Downloads (Pure)

Abstract

State machines models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the begining of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset.State machines models are models that simulate the behavior of discrete event systems, capable of representing systems such as software systems, network interactions, and control systems, and have been researched extensively. The nature of most learning algorithms however is the assumption that all data be available at the begining of the algorithm, and little research has been done in learning state machines from streaming data. In this paper, we want to close this gap further by presenting a generic method for learning state machines from data streams, as well as a merge heuristic that uses sketches to account for incomplete prefix trees. We implement our approach in an open-source state merging library and compare it with existing methods. We show the effectiveness of our approach with respect to run-time, memory consumption, and quality of results on a well known open dataset.

Original languageEnglish
Title of host publicationProceedings of Machine Learning Research
EditorsFrancois Coste Coste, Faissal Ouardi, Guillaume Rabusseau
Number of pages25
Volume217
Publication statusPublished - 2023
Event16th International Conference on Grammatical Inference - Rabat, Morocco
Duration: 10 Jul 202313 Jul 2023
Conference number: 16

Publication series

NameProceedings of Machine Learning Research
ISSN (Print)1938-7228

Conference

Conference16th International Conference on Grammatical Inference
Abbreviated titleICGI 2023
Country/TerritoryMorocco
CityRabat
Period10/07/2313/07/23

Fingerprint

Dive into the research topics of 'Learning state machines from data streams: A generic strategy and an improved heuristic'. Together they form a unique fingerprint.

Cite this