Metadata-Version: 2.1
Name: sdsvkvu
Version: 0.0.1
Summary: SDSV OCR Team: Key-value understanding
Home-page: https://github.com/open-mmlab/mmocr
Author: tuanlv
Author-email: lv.tuan3@samsung.com
License: Apache License 2.0
Classifier: Development Status :: 4 - Beta
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.9
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE

SDSVKVU

***Features***
- Extract key-value pairs from documents: invoices/receipts, forms, government documents (ID cards, driver's licenses, birth certificates)
- Languages: VI + EN

***What's new***
### - Ver 0.0.1:
  - Supported inputs: image, PDF file (single or multiple pages)
  - Extract all key-value pairs and return the raw outputs
    + Weights: sdsvkvu/weights/key_value_understanding-20230716-085549_final
  - For VAT invoices: extract 14 specific fields
    + Weights: sdsvkvu/weights/key_value_understanding-20230627-164536_fi
  - For SBT invoices ("sbt" option): extract the table in the SBT invoice
    + Weights: sdsvkvu/weights/key_value_understanding-20230617-162324_sbt
### - Ver 0.0.2: Add one more option: "vtb" - Vietin Bank
  - For Vietin Bank documents ("vtb" option): extract 6 specific fields
    + Weights: sdsvkvu/weights/key_value_understanding-20230824-164236_vietin
### - Ver 0.0.3: Add default option:
  - Return all potential key-value pairs, titles, standalone keys, triplets, and tables with raw keys

## I. Setup
***Dependencies***
- Python: 3.10
- Torch: 1.11.3
- CUDA: 11.6
- transformers: 4.30.0

```
pip install -v -e .
```

## II. Inference
Run command: python test.py

```
import os

from sdsvkvu import load_engine, process_img

# Expose only physical GPU 1; inside this process it is addressed as "cuda:0".
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

if __name__ == "__main__":
    kwargs = {"device": "cuda:0"}
    # Path to the input image (despite the variable name, this is a file, not a directory)
    img_dir = "/mnt/ssd1T/tuanlv/02-KVU/sdsvkvu/visualize/test_img/RedInvoice_WaterPurfier_Feb_PVI_829_0.jpg"
    save_dir = "/mnt/ssd1T/tuanlv/02-KVU/sdsvkvu/visualize/test2/"
    engine = load_engine(kwargs)
    # option: "vat" for VAT invoice outputs, "sbt" for SBT invoice outputs,
    # "vtb" for Vietin Bank outputs; anything else returns raw outputs
    outputs = process_img(img_dir, save_dir, engine, export_all=False, option="vat")
```

# Project structure

    .
    ├── sdsvkvu
    │   ├── main.py
    │   ├── externals
    │   │   ├── __init__.py
    │   │   ├── ocr_engine
    │   │   │   ├── ...
    │   │   ├── ocr_engine_deskew
    │   │   │   ├── ...
    │   ├── model
    │   │   ├── combined_model.py
    │   │   ├── document_kvu_model.py
    │   │   ├── __init__.py
    │   │   ├── kvu_model.py
    │   │   └── relation_extractor.py
    │   ├── modules
    │   │   ├── __init__.py
    │   │   ├── predictor.py
    │   │   ├── preprocess.py
    │   │   └── run_ocr.py
    │   ├── requirements.txt
    │   ├── settings.yml
    │   ├── sources
    │   │   ├── __init__.py
    │   │   ├── kvu.py
    │   │   └── utils.py
    │   ├── utils
    │   │   ├── dictionary
    │   │   │   ├── __init__.py
    │   │   │   ├── sbt.py
    │   │   │   ├── vat.py
    │   │   │   └── vtb.py
    │   │   ├── __init__.py
    │   │   ├── post_processing.py
    │   │   ├── query
    │   │   │   ├── __init__.py
    │   │   │   ├── sbt.py
    │   │   │   ├── vat.py
    │   │   │   └── vtb.py
    │   │   └── utils.py
    │   └── weights
    │       ├── key_value_understanding-20230627-164536_fi
    │       ├── key_value_understanding-20230617-162324_sbt
    │       ├── key_value_understanding-20230716-085549_final
    │       └── key_value_understanding-20230824-164236_vietin
    ├── LICENSE
    ├── MANIFEST.in
    ├── pyproject.toml
    ├── README.md
    ├── scripts
    │   └── run.sh
    ├── setup.cfg
    ├── setup.py
    ├── test.py
    └── visualize
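
# Batch inference (sketch)

The inference example in section II handles a single file. The sketch below shows one possible way to apply the same call to every image or PDF in a folder. It is not part of the package: `collect_inputs`, `run_batch`, and `SUPPORTED_EXTENSIONS` are hypothetical names, the extension list is an assumption (the changelog only says "image, PDF file"), and `process_fn` is assumed to accept the same arguments as the `process_img` call shown in section II.

```python
from pathlib import Path
from typing import Any, Callable, Dict, List

# Assumption: the package's exact list of accepted formats is not documented.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf"}


def collect_inputs(input_dir: str) -> List[Path]:
    """Return the supported files directly inside input_dir, sorted by path."""
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )


def run_batch(
    input_dir: str,
    save_dir: str,
    engine: Any,
    process_fn: Callable[..., Any],
    option: str = "vat",
) -> Dict[str, Any]:
    """Call process_fn once per supported file and map file name -> outputs."""
    # Creating save_dir up front is a convenience of this sketch.
    Path(save_dir).mkdir(parents=True, exist_ok=True)
    results: Dict[str, Any] = {}
    for path in collect_inputs(input_dir):
        # Same call shape as the single-file example in section II.
        results[path.name] = process_fn(
            str(path), save_dir, engine, export_all=False, option=option
        )
    return results


# Intended usage (requires sdsvkvu to be installed; paths are placeholders):
#
#     from sdsvkvu import load_engine, process_img
#     engine = load_engine({"device": "cuda:0"})
#     results = run_batch("inputs/", "outputs/", engine, process_img, option="vat")
```

Passing the processing function as an argument keeps the helper independent of the package, so the file-selection logic can be checked with a stub before a GPU or model weights are available.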