Professional Project | AI/ML

Generic Rule and Location-Based Extraction

Timeline: 2018
Role: Software Developer

Overview

Designed and implemented a semantic rule engine for flexible field extraction using spatial and relational patterns.

Challenge

Customers needed custom fields beyond standard invoice data. Hand-coding each field request was not scalable, so the product required a flexible, user-configurable solution.

Solution & Approach

Created a semantic language for defining extraction rules based on spatial relations, with pattern composition for more complex extraction logic. Integrated the Aho-Corasick algorithm for fast database entity matching, which enabled searching millions of PO numbers against OCR text in milliseconds. Example pattern: "date_label <left_of> date as invoice_date".
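To make the example pattern concrete, here is a minimal sketch of how such a rule might be parsed and evaluated against OCR tokens. It is written in Python for brevity, although the project itself used C#, and it is not the production engine. The Token class, the MATCHERS and RELATIONS tables, the max_gap threshold, and the reading of "as invoice_date" as naming the right-hand match are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One OCR word with its bounding box (hypothetical layout model)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


# Hypothetical matchers: each rule name maps to a predicate over a token.
MATCHERS = {
    "date_label": lambda t: t.text.lower().rstrip(":") in {"date", "invoice date"},
    "date": lambda t: re.fullmatch(r"\d{4}-\d{2}-\d{2}", t.text) is not None,
}


def left_of(a: Token, b: Token, max_gap: float = 200.0) -> bool:
    """True when a ends before b starts and both share a text line."""
    same_line = a.y < b.y + b.h and b.y < a.y + a.h  # vertical overlap
    gap = b.x - (a.x + a.w)                          # horizontal distance
    return same_line and 0 <= gap <= max_gap


# Hypothetical relation table; more relations could be registered here.
RELATIONS = {"left_of": left_of}

# Assumed rule shape: "<left> <relation> <right> as <field>"
RULE_RE = re.compile(r"^(\w+)\s+<(\w+)>\s+(\w+)\s+as\s+(\w+)$")


def parse_rule(rule: str) -> tuple:
    """Split a rule string into (left, relation, right, field)."""
    match = RULE_RE.match(rule.strip())
    if match is None:
        raise ValueError(f"unsupported rule: {rule!r}")
    return match.groups()


def apply_rule(rule: str, tokens: list) -> dict:
    """Return {field: text} for the first token pair satisfying the rule."""
    left, relation, right, field = parse_rule(rule)
    is_left, is_right = MATCHERS[left], MATCHERS[right]
    related = RELATIONS[relation]
    for a in tokens:
        if not is_left(a):
            continue
        for b in tokens:
            if is_right(b) and related(a, b):
                return {field: b.text}
    return {}


tokens = [
    Token("2017-01-01", 100, 300, 70, 12),  # a date on a different line
    Token("Date:", 50, 100, 40, 12),
    Token("2018-03-14", 100, 100, 70, 12),
]
print(apply_rule("date_label <left_of> date as invoice_date", tokens))
```

Keeping matchers and relations in lookup tables is one way to let new fields be configured without code changes: a new rule only has to name entries that are already registered.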

Outcome & Impact

In production use for custom fields since 2018. New fields can be added as customers request them, and the rules apply across invoice types. Database entries are matched quickly, and the engine is a core feature of the Invoicetrack product.
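The fast matching of database entries comes from the Aho-Corasick algorithm, which builds one automaton from all known entries and then scans the text once, so search time grows with the text length plus the number of matches rather than with the number of entries. The following is a compact textbook sketch in Python, not the project's C# code. The class name, the sample PO numbers, and the sample text are hypothetical.

```python
from collections import deque


class AhoCorasick:
    """Multi-pattern matcher: build once, then scan any text in one pass."""

    def __init__(self, patterns):
        self.goto = [{}]   # goto[state][char] -> next state (the trie)
        self.fail = [0]    # fail[state] -> longest proper suffix state
        self.out = [[]]    # out[state] -> patterns ending at this state
        for pattern in patterns:
            self._add(pattern)
        self._build_failure_links()

    def _add(self, pattern):
        """Insert one pattern into the trie."""
        state = 0
        for ch in pattern:
            nxt = self.goto[state].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[state][ch] = nxt
            state = nxt
        self.out[state].append(pattern)

    def _build_failure_links(self):
        """Breadth-first pass so shallower states are finished first."""
        queue = deque(self.goto[0].values())  # depth-1 states fail to root
        while queue:
            state = queue.popleft()
            for ch, child in self.goto[state].items():
                f = self.fail[state]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit patterns that end at the suffix state.
                self.out[child] = self.out[child] + self.out[self.fail[child]]
                queue.append(child)

    def search(self, text):
        """Return (start_index, pattern) for every occurrence in text."""
        state = 0
        hits = []
        for i, ch in enumerate(text):
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            for pattern in self.out[state]:
                hits.append((i - len(pattern) + 1, pattern))
        return hits


# Hypothetical PO numbers standing in for the customer database.
known_pos = ["PO-1001", "PO-1002", "PO-77"]
matcher = AhoCorasick(known_pos)
ocr_text = "Order ref PO-1002 shipped with PO-77"
for start, po in matcher.search(ocr_text):
    print(start, po)
```

The automaton is built once per database snapshot and reused for every document, which is where the cost of a large entry list is paid.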

Technologies Used

Custom semantic rule language, Aho-Corasick algorithm, Pattern matching, C#, Database integration

Key Highlights

  • Core feature of product since 2018
  • Fast database entity matching (millions of records in milliseconds)
  • Flexible semantic rule language
  • Handles complex spatial relations