The use of machine learning to identify the correctness of HS Code for the customs import declarations

Hao Chen, Ben Van Rijnsoever, Marcel Molenhuis, Dennis van Dijk, Y. Tan, B.D. Rukanova

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

3 Citations (Scopus)
691 Downloads (Pure)

Abstract

As the volume of international trade grows worldwide, the number of cross-border import declarations is rising rapidly, resulting in an unprecedented scale of potentially fraudulent transactions, in particular false commodity codes (e.g., HS Codes). An incorrect HS Code creates duty risk and adversely affects revenue collection. Physical inspection by customs administrations is impractical given the substantial quantity of declarations. This paper provides an automatic approach that harnesses machine learning techniques to relieve the burden on customs targeting officers. We introduce a novel model based on an off-the-shelf embedding encoder to identify the correctness of an HS Code without any human effort. Determining whether an HS Code correctly matches the commodity description is a classification task, so labelled data is typically required. However, the lack of gold-standard labelled data sets in the customs domain limits the development of supervised approaches. Our model is developed with an unsupervised mechanism and trained on unlabelled historical declaration records, which makes it robust and readily adaptable by different customs administrations. Rather than classifying the HS Code as simply correct or incorrect, our model predicts a score indicating the degree to which the HS Code is correct. We evaluated the proposed model on a ground-truth data set provided by Dutch customs officers. The results show a promising overall accuracy of 71%.
Original language: English
Title of host publication: IEEE 8th International Conference on Data Science and Advanced Analytics (DSAA)
Publisher: IEEE
Publication status: Published - 2021
Event: IEEE 8th International Conference on Data Science and Advanced Analytics - Porto, Portugal
Duration: 6 Oct 2021 to 9 Oct 2021
Conference number: 8

Conference

Conference: IEEE 8th International Conference on Data Science and Advanced Analytics
Abbreviated title: DSAA
Country/Territory: Portugal
City: Porto
Period: 6/10/21 to 9/10/21

Bibliographical note

Green Open Access added to TU Delft Institutional Repository ‘You share, we take care!’ – Taverne project https://www.openaccess.nl/en/you-share-we-take-care Otherwise as indicated in the copyright section: the publisher is the copyright holder of this work and the author uses the Dutch legislation to make this work public.
