Challenges in Technical Regulatory Text Variation Detection

Jan 1, 2025

Shriya Vaagdevi Chikati

Samuel Larkin

David Minicola

Chi-Kiu Lo


Abstract

We present a preliminary study on the feasibility of using current natural language processing techniques to detect variations between the construction codes of different jurisdictions. We formulate the task as a sentence alignment problem and evaluate how well various sentence representation models perform on it. Our results show that embeddings trained specifically for the task perform marginally better than other models, but achieving high overall accuracy remains a challenge. We also show that domain-specific fine-tuning hurts task performance. The results highlight the challenges of developing NLP applications for technical regulatory texts.
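To make the sentence alignment formulation concrete, the following is a minimal sketch, not the method used in the paper. It assumes a toy bag-of-words vector in place of a real sentence representation model, a greedy best-match strategy, and an arbitrary similarity threshold of 0.5. The function names (`embed`, `cosine`, `align`) and the example sentences are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence representation model: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(source, target, threshold=0.5):
    """Greedily pair each source sentence with its most similar target sentence.

    Returns a list of (source_index, target_index_or_None, score) tuples.
    The threshold is an assumed value, not one taken from the paper.
    """
    target_vecs = [embed(t) for t in target]
    pairs = []
    for i, sentence in enumerate(source):
        vec = embed(sentence)
        scores = [cosine(vec, tv) for tv in target_vecs]
        if not scores:
            pairs.append((i, None, 0.0))
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        matched = best if scores[best] >= threshold else None
        pairs.append((i, matched, scores[best]))
    return pairs


# Hypothetical provisions from two jurisdictions' codes (invented examples).
code_a = [
    "Guards shall be not less than 1070 mm high.",
    "Every exit door shall swing in the direction of travel.",
]
code_b = [
    "Every exit door shall swing in the direction of exit travel.",
    "Guards shall be not less than 900 mm high.",
]

for src, tgt, score in align(code_a, code_b):
    print(src, tgt, round(score, 3))
```

In this sketch, an aligned pair whose similarity falls below 1.0 is a candidate variation between the two codes, such as the differing guard heights in the invented example. Swapping `embed` for a trained sentence encoder would leave the alignment logic unchanged.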

Type

Conference paper

Publication

Proceedings of the 1st Regulatory NLP Workshop (RegNLP 2025)

Last updated on Jan 1, 2025
