Designed and implemented a semantic rule engine for flexible field extraction based on spatial and relational patterns.
Customers needed custom fields beyond standard invoice data. Hand-coding each field request did not scale, so a flexible, user-configurable solution was required.
Created a semantic language for defining extraction rules with spatial relations. Implemented pattern composition for complex extraction logic. Integrated the Aho-Corasick algorithm for fast database entity matching, which enabled matching OCR text against millions of PO numbers in milliseconds. Example pattern: "date_label <left_of> date as invoice_date".
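A minimal sketch of how the example pattern might be evaluated, not the Invoicetrack implementation. It assumes OCR tokens arrive already tagged with a kind such as "date_label" or "date" and carry bounding-box coordinates. The names Token, RULE, left_of, RELATIONS and apply_rule, and the gap and tolerance thresholds, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    text: str
    kind: str     # hypothetical tag from an earlier classifier, e.g. "date_label"
    x: float      # left edge of the bounding box
    y: float      # vertical centre of the bounding box
    width: float


# Assumed rule shape: "<anchor_kind> <relation> <target_kind> as <field_name>"
RULE = re.compile(r"^(\w+)\s+<(\w+)>\s+(\w+)\s+as\s+(\w+)$")


def left_of(a, b, max_gap=200.0, y_tol=5.0):
    """True when `a` ends before `b` starts on roughly the same text line."""
    gap = b.x - (a.x + a.width)
    return 0 <= gap <= max_gap and abs(a.y - b.y) <= y_tol


# Further relations (above, below, right_of) would be registered here.
RELATIONS = {"left_of": left_of}


def apply_rule(rule, tokens):
    """Return {field_name: text} for the first token pair satisfying the rule."""
    match = RULE.match(rule.strip())
    if match is None:
        raise ValueError(f"cannot parse rule: {rule!r}")
    anchor_kind, relation, target_kind, field_name = match.groups()
    related = RELATIONS[relation]
    for anchor in tokens:
        if anchor.kind != anchor_kind:
            continue
        for target in tokens:
            if target.kind == target_kind and related(anchor, target):
                return {field_name: target.text}
    return {}


tokens = [
    Token("2018-04-13", "date", x=100, y=140, width=60),        # on a different line
    Token("Invoice date:", "date_label", x=10, y=100, width=80),
    Token("2018-03-14", "date", x=100, y=101, width=60),        # same line as the label
]
result = apply_rule("date_label <left_of> date as invoice_date", tokens)
```

Here the rule selects the date on the same line as the label and ignores the one further down the page. Keeping relations in a lookup table is one way a new relation could be added without touching the rule parser.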
Used successfully for custom fields since 2018. Extensible when customers request new fields. Applicable across invoice types. Fast matching of database entries. Core feature of the Invoicetrack product.
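The fast matching of database entries relies on Aho-Corasick, which builds one automaton from all known values and then scans the OCR text once, in time linear in the text length plus the number of matches. The following is a compact pure-Python sketch of the algorithm, not the production code. The function names and the sample PO numbers are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie, failure links and output lists for Aho-Corasick."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> patterns that end at this state
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)

    # Breadth-first pass: depth-1 states keep fail = 0, deeper ones are derived.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit patterns that end at the suffix state.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, automaton):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


# Hypothetical PO numbers; in practice the automaton would be built once
# from the database and reused for every document.
automaton = build_automaton(["PO12345", "PO123", "4500017"])
hits = find_all("Ref PO12345 / order 4500017", automaton)
```

Because the automaton is built once and reused, the per-document cost does not grow with the number of stored PO numbers, only with the length of the OCR text and the number of hits.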