Beyond Six Digits: Automated Tariff Line HS Transposition Using Natural Language Processing
World Trade Organization
This paper explores the application of Natural Language Processing (NLP) techniques to automate the transposition of Harmonized System (HS) tariff lines. The approach follows a three-stage process: unique 1:1 tariff code matching (Round 1), exact description matching (Round 2), and "smart" description matching (Round 3), which pairs Artificial Intelligence (AI) and lexical similarity methods with the harmonized 6-digit concordance. In Round 3, cosine similarity is calculated from either Term Frequency-Inverse Document Frequency (TF-IDF) vectors or Sentence-BERT (SBERT) embeddings. The methods are compared in two scenarios: a straightforward case (Economy A) with standardized descriptions, and a complex case (Economy B) with more detailed technical descriptions.
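To make the description-matching rounds concrete, the following is a minimal sketch of how Round 2 and the TF-IDF variant of Round 3 could be combined. It is an illustration under stated assumptions, not the paper's implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `match_line`), the whitespace tokenizer, the smoothed IDF weighting, the dictionary layout of the 6-digit concordance, and the sample tariff lines are all hypothetical. Round 1 and the SBERT variant are omitted.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, treating commas and semicolons as spaces."""
    return text.lower().replace(",", " ").replace(";", " ").split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF (an assumption): terms found in every document keep weight 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def match_line(old_desc, old_hs6, new_lines, concordance):
    """Match one old tariff line to a new one.

    new_lines: list of (national_code, description) in the new nomenclature.
    concordance: dict mapping an old 6-digit subheading to the set of new
    6-digit subheadings it corresponds to (hypothetical layout).
    Returns (code, score, method), or None if no candidate exists.
    """
    allowed = concordance.get(old_hs6, set())
    candidates = [(c, d) for c, d in new_lines if c[:6] in allowed]
    if not candidates:
        return None
    # Round 2: exact description match after light normalization.
    for code, desc in candidates:
        if tokenize(desc) == tokenize(old_desc):
            return code, 1.0, "exact"
    # Round 3: highest TF-IDF cosine similarity among concordant candidates.
    vecs = tfidf_vectors([old_desc] + [d for _, d in candidates])
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best][0], scores[best], "tfidf"


# Hypothetical 8-digit tariff lines and concordance, for illustration only.
new_lines = [
    ("01012110", "Pure-bred breeding horses, for racing"),
    ("01012190", "Pure-bred breeding horses, other"),
    ("01012900", "Horses, other than pure-bred breeding"),
]
concordance = {"010121": {"010121"}}

print(match_line("Pure-bred breeding horses, other", "010121", new_lines, concordance))
print(match_line("Horses, pure-bred, for racing purposes", "010121", new_lines, concordance))
```

Restricting candidates to concordant 6-digit subheadings before scoring keeps the similarity search small and prevents matches across unrelated products. Replacing `tfidf_vectors` with sentence embeddings would give the SBERT variant while leaving the cosine step unchanged.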