
anjijava16/aws_textract_utils


aws_textract_utils

What is the AWS Textract service?

Textract is a machine learning service that automatically extracts text, handwriting, and data from scanned documents. It goes beyond simple optical character recognition (OCR) to identify and extract data from forms and tables. Today, many companies extract data from scanned documents such as PDFs, images, tables, and forms by hand, or with simple OCR software that requires manual configuration and often has to be reconfigured when the form changes. To replace these manual and expensive processes, Textract uses machine learning to read and process documents, extracting text, handwriting, tables, and other data without manual effort.
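To make the forms and tables point concrete, here is a minimal Python sketch of how form fields might be read out of a Textract response. It is not taken from this repository: the helper names `analyze_document_bytes` and `extract_form_fields` are hypothetical, and the sketch assumes `boto3` is installed with AWS credentials and a region already configured. It uses the synchronous `analyze_document` call, which is intended for images and single-page documents; multi-page PDFs generally go through the asynchronous API instead.

```python
from typing import Any, Dict


def analyze_document_bytes(document_bytes: bytes) -> Dict[str, Any]:
    """Send raw document bytes to Textract and request table and form analysis.

    Hypothetical helper: assumes boto3 is installed and AWS credentials
    and a default region are configured in the environment.
    """
    import boto3  # imported lazily so the parsing code below works without AWS

    client = boto3.client("textract")
    return client.analyze_document(
        Document={"Bytes": document_bytes},
        FeatureTypes=["TABLES", "FORMS"],
    )


def _child_text(block: Dict[str, Any], blocks_by_id: Dict[str, Any]) -> str:
    """Join the WORD blocks that a block points to through CHILD relationships."""
    words = []
    for relationship in block.get("Relationships", []):
        if relationship["Type"] != "CHILD":
            continue
        for child_id in relationship["Ids"]:
            child = blocks_by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def extract_form_fields(response: Dict[str, Any]) -> Dict[str, str]:
    """Map each form key to its value using KEY_VALUE_SET blocks.

    Selection elements (checkboxes) are ignored in this sketch.
    """
    blocks_by_id = {block["Id"]: block for block in response.get("Blocks", [])}
    fields: Dict[str, str] = {}
    for block in blocks_by_id.values():
        is_key = (
            block["BlockType"] == "KEY_VALUE_SET"
            and "KEY" in block.get("EntityTypes", [])
        )
        if not is_key:
            continue
        value_text = ""
        for relationship in block.get("Relationships", []):
            if relationship["Type"] == "VALUE":
                # A KEY block links to its VALUE block(s) by id.
                value_text = " ".join(
                    _child_text(blocks_by_id[value_id], blocks_by_id)
                    for value_id in relationship["Ids"]
                )
        fields[_child_text(block, blocks_by_id)] = value_text
    return fields
```

With real credentials, `extract_form_fields(analyze_document_bytes(data))` would return a plain dictionary of field labels and values. The parsing is kept separate from the AWS call so it can be tried on a saved response without network access.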


$ pip install -r requirements.txt

$ streamlit run app.py

Upload a .pdf file. The app calls the AWS Textract service and displays the extracted result.
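The upload flow can be sketched roughly as follows. This is not the repository's actual `app.py`, whose contents are not shown here. The function names `render_app`, `detect_text`, and `lines_from_response` are hypothetical, and the Streamlit module is passed in as a parameter so the logic can be exercised without a running Streamlit server. The sketch assumes `boto3` and `streamlit` are installed and AWS credentials are configured.

```python
from typing import Any, Callable, Dict, List


def lines_from_response(response: Dict[str, Any]) -> List[str]:
    """Collect the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


def detect_text(document_bytes: bytes) -> Dict[str, Any]:
    """Call Textract's synchronous text detection on raw document bytes.

    Hypothetical helper: assumes boto3 is installed and AWS credentials
    are configured. Multi-page PDFs may need the asynchronous API instead.
    """
    import boto3  # imported lazily so the UI logic loads without AWS set up

    client = boto3.client("textract")
    return client.detect_document_text(Document={"Bytes": document_bytes})


def render_app(
    st: Any,
    analyze: Callable[[bytes], Dict[str, Any]] = detect_text,
) -> List[str]:
    """Draw the upload widget, run the analysis, and show the extracted lines.

    `st` is the streamlit module (or any object with the same methods).
    """
    st.title("AWS Textract demo")
    uploaded = st.file_uploader("Upload a PDF", type=["pdf"])
    if uploaded is None:
        st.info("Waiting for a .pdf file.")
        return []
    response = analyze(uploaded.getvalue())
    lines = lines_from_response(response)
    st.subheader("Extracted text")
    st.text("\n".join(lines))
    return lines
```

In a script started with `streamlit run`, the wiring would be `import streamlit as st` followed by `render_app(st)`. Passing `analyze` explicitly makes it possible to substitute a canned response while developing the page layout.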

Reference use cases

  1. image

About

textract utils
