In collaboration with Sartorius, a global biopharmaceutical and laboratory equipment supplier, ioLabs started a project in 2021 to automate the stamping of technical documents. The objective was to develop a data-driven machine-learning solution that places watermarks and stamps on technical PDFs the way a human would, eliminating the need for manual decision-making. The solution was integrated into a web platform that lets users select documents through direct BIM 360 access.
Using a Mask R-CNN model, the tool identifies document parts such as legends, headers, footers, and logos, and uses them to determine where the stamp should go. The model was trained on more than 10,000 pages of annotated technical documents. A second ML algorithm, specific to technical documents and trained on the same dataset, detects document orientation. To improve the stability and performance of the ML techniques, a traditional algorithm was developed to place the stamp around recognized objects such as headers and footers. Our active learning algorithms will also continue to improve the tool's performance as it is used.
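The rule-based placement step can be illustrated with a small sketch. It is not the production algorithm, whose details are not given here. It assumes that the detected regions arrive as axis-aligned bounding boxes and that a simple grid scan starting from the bottom-right corner is acceptable. The names `Box` and `place_stamp`, the margin, and the step size are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (origin at top-left)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        # Boxes that only touch along an edge do not count as overlapping.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


def place_stamp(page_w: float, page_h: float,
                stamp_w: float, stamp_h: float,
                occupied: Iterable[Box],
                margin: float = 10.0,
                step: float = 10.0) -> Optional[Box]:
    """Return the first free stamp position, or None if nothing fits.

    Assumed strategy: scan candidate positions on a grid, starting at the
    bottom-right corner and moving left, then up, and accept the first
    candidate that overlaps none of the detected regions.
    """
    blocked = list(occupied)
    y = page_h - margin - stamp_h
    while y >= margin:
        x = page_w - margin - stamp_w
        while x >= margin:
            candidate = Box(x, y, x + stamp_w, y + stamp_h)
            if not any(candidate.intersects(b) for b in blocked):
                return candidate
            x -= step
        y -= step
    return None


# Hypothetical detections on a 600 x 800 page: a footer and a legend.
footer = Box(0, 750, 600, 800)
legend = Box(400, 600, 600, 750)
spot = place_stamp(600, 800, 100, 50, [footer, legend])
print(spot)
```

In this example the stamp ends up directly above the footer and to the left of the legend. A real implementation would likely add preferences, for instance favouring corners or keeping a minimum distance from logos, but the overlap test against recognized objects is the core idea.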
The stamping tool is built as an API in our ioFramework and uses our monitoring tools, including the Kibana board and integration with the ioLabs Health board. It is used by the Sartorius Plot-App, which is used for document management.
Beyond changing how documents are processed, the project drove advancements in machine learning and computer vision and provided valuable assets for the companies involved.
Client
ioLabs AG (own R&D project)
Partner
-
Credits
ioLabs AG
Technology
PyTorch
Mask R-CNN
ResNet
FastAPI
Logstash and Kibana
AWS EC2 and Textract
BIM 360 Docs
RabbitMQ