Expanding Vision Capabilities with bem

We’re thrilled to be expanding our vision capabilities to support the transformation of complex documents

Jan 08, 2025

Today, our team is proud to support the growing bem customer base in transforming business processes by making sense of their unstructured data, from automatically handling procurement data in vendor emails to ingesting large-scale real estate public records.

Although we’ve supported visual documents since we launched our platform, we were most capable at processing English-language documents with clean, well-aligned layouts, such as PDF data exports or rendered web pages. Documents without those qualities have been harder to extract data from, mostly because of limits in our internal systems’ ability to process visual documents.

Why we’ve been working on vision

Ultimately, we realized we needed to upgrade our platform with state-of-the-art vision models to best serve our customers (and their customers) who require data extraction from a wider range of document types, such as:

Handwritten documents
- Handwriting within forms
- Physical notes
Photos of documents
- Receipt processing
- Warehouse labels
Complex layouts
- Trifold documents
- Multi-column articles
Non-English languages
- Customs forms with multiple languages
- International receipts

As part of this investigation into our ability to handle more challenging document types, we created an internal dataset covering a high-variance suite of documents representative of the complex needs of our customers across several industries. We used this dataset to evaluate accuracy while testing new internal solutions for visual data extraction, maximizing the performance of our vision system across a diverse array of document types.

How it works

You might be wondering how you can get access to these capabilities. The answer is that our new vision capabilities are available to all new users of bem! Here’s what the platform produces with this update, using a Japanese-language receipt as an example.

Given the following document:

We generate the following transformations before and after our vision upgrade.

Before our vision upgrade

{
  "items": [
    {
      "name": "PM 7-11 OHSA4-XR 1P",
      "amount": 460
    },
    {
      "name": "at",
      "amount": 644
    },
    {
      "name": "#5 #",
      "amount": 356
    }
  ],
  "total": 1000,
  "currencyCode": "JPY"
}

After our vision upgrade

{
  "items": [
    {
      "name": "Nissin Cup Noodle Smoky Chili",
      "amount": 184
    },
    {
      "name": "PM Marlboro HS Smooth R 1P",
      "amount": 460
    }
  ],
  "total": 644,
  "currencyCode": "JPY"
}
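
One quick way to see the difference between the two outputs is an arithmetic sanity check: do the line items add up to the reported total? The snippet below is only an illustrative sketch using the amounts from the two outputs above. The function names are hypothetical and are not part of bem’s platform or API.

```python
# Illustrative consistency check on the two outputs shown above.
# The helper names here are hypothetical, not part of any bem API.

before = {
    "items": [
        {"name": "PM 7-11 OHSA4-XR 1P", "amount": 460},
        {"name": "at", "amount": 644},
        {"name": "#5 #", "amount": 356},
    ],
    "total": 1000,
    "currencyCode": "JPY",
}

after = {
    "items": [
        {"name": "Nissin Cup Noodle Smoky Chili", "amount": 184},
        {"name": "PM Marlboro HS Smooth R 1P", "amount": 460},
    ],
    "total": 644,
    "currencyCode": "JPY",
}


def line_items_sum(receipt: dict) -> int:
    """Add up the amount of every extracted line item."""
    return sum(item["amount"] for item in receipt["items"])


def is_consistent(receipt: dict) -> bool:
    """True when the line items add up to the extracted total."""
    return line_items_sum(receipt) == receipt["total"]


for label, receipt in (("before", before), ("after", after)):
    print(label, line_items_sum(receipt), receipt["total"], is_consistent(receipt))
# before 1460 1000 False
# after 644 644 True
```

The pre-upgrade items sum to 1,460 JPY against a reported total of 1,000 JPY, while the post-upgrade items sum to exactly the reported 644 JPY. A check like this cannot prove an extraction is right, but it flags outputs that are internally inconsistent.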

If you’d like to try out bem, schedule a demo with us below.

Schedule a Demo