Expanding Vision Capabilities with bem
We’re thrilled to expand our vision capabilities to support the transformation of complex documents.
Today, our team is proud to support the growing bem customer base in transforming business processes by making sense of unstructured data, from automatically handling procurement data in vendor emails to ingesting large-scale real estate public records.
Although we’ve supported visual documents since we launched our platform, we were most capable at processing English-language documents with clean, well-aligned layouts, such as PDF data exports or rendered web pages. Documents without those qualities have been harder to extract data from, mostly because of limits in our internal systems’ ability to process visual documents.
Why we’ve been working on vision
Ultimately, we realized we needed to upgrade our platform with state-of-the-art vision models to best serve our customers (and their customers) who require data extraction from a wider range of document types, such as:
Handwritten documents: handwriting within forms, physical notes
Photos of documents: receipts, warehouse labels
Complex layouts: trifold documents, multi-column articles
Non-English languages: customs forms with multiple languages, international receipts
As part of this investigation into handling more challenging document types, we created an internal dataset covering a high-variance set of documents representative of our customers’ complex needs across several industries. We used this dataset to evaluate accuracy while testing new internal solutions for visual data extraction, so that we could tune our vision system against a diverse array of document types.
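The scoring method behind that evaluation isn’t described in this post. As a rough illustration only, one simple way to score an extraction against a hand-labeled document is exact-match accuracy over its fields, averaged across the dataset. The function names and sample values below are hypothetical and are not bem’s evaluation code.

```python
def field_accuracy(predicted: dict, expected: dict) -> float:
    """Fraction of expected top-level fields whose predicted value matches exactly."""
    if not expected:
        return 1.0
    correct = sum(
        1
        for key, value in expected.items()
        if key in predicted and predicted[key] == value
    )
    return correct / len(expected)


def dataset_accuracy(pairs: list[tuple[dict, dict]]) -> float:
    """Mean field accuracy over (predicted, expected) pairs; 0.0 for an empty dataset."""
    scores = [field_accuracy(predicted, expected) for predicted, expected in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Illustrative (made-up) labeled pairs, not real evaluation data.
pairs = [
    ({"total": 1000, "currencyCode": "JPY"}, {"total": 644, "currencyCode": "JPY"}),
    ({"total": 644, "currencyCode": "JPY"}, {"total": 644, "currencyCode": "JPY"}),
]
print(dataset_accuracy(pairs))
```

Exact match is deliberately strict. A real evaluation would likely also need to compare nested fields such as line items and tolerate harmless differences in formatting.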
How it works
You might be wondering how to get access to these capabilities. The answer: our new vision capabilities are available to all new users of bem! Here’s what they look like in practice, using a Japanese-language receipt as an example.
Given the following document:
We generate the following transformations, before and after our vision upgrade.
Before our vision upgrade
{
  "items": [
    {
      "name": "PM 7-11 OHSA4-XR 1P",
      "amount": 460
    },
    {
      "name": "at",
      "amount": 644
    },
    {
      "name": "#5 #",
      "amount": 356
    }
  ],
  "total": 1000,
  "currencyCode": "JPY"
}
After our vision upgrade
{
  "items": [
    {
      "name": "Nissin Cup Noodle Smoky Chili",
      "amount": 184
    },
    {
      "name": "PM Marlboro HS Smooth R 1P",
      "amount": 460
    }
  ],
  "total": 644,
  "currencyCode": "JPY"
}
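One quick way to see the difference between the two outputs above is to check whether the line items add up to the reported total. The snippet below is a minimal sketch that assumes the output shape shown above. The helper `items_match_total` is a hypothetical name for illustration and is not part of the bem API.

```python
def items_match_total(receipt: dict) -> bool:
    """Return True when the line-item amounts add up to the receipt total."""
    return sum(item["amount"] for item in receipt["items"]) == receipt["total"]


# The two outputs shown above, copied as Python dictionaries.
before = {
    "items": [
        {"name": "PM 7-11 OHSA4-XR 1P", "amount": 460},
        {"name": "at", "amount": 644},
        {"name": "#5 #", "amount": 356},
    ],
    "total": 1000,
    "currencyCode": "JPY",
}
after = {
    "items": [
        {"name": "Nissin Cup Noodle Smoky Chili", "amount": 184},
        {"name": "PM Marlboro HS Smooth R 1P", "amount": 460},
    ],
    "total": 644,
    "currencyCode": "JPY",
}

print(items_match_total(before))  # False: 460 + 644 + 356 = 1460, not 1000
print(items_match_total(after))   # True: 184 + 460 = 644
```

A check like this won’t catch every extraction error, but it is a cheap signal that values have been misread or assigned to the wrong field. In the earlier output, 644 appears as a line-item amount, while in the later output it is the total.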