Intelligent lead generation using automated OCR and generative AI

Profectus Capital provides financing solutions for MSMEs in manufacturing and service sectors and identified a need to support institutions requiring infrastructure upgrades, which could be funded through loans. To generate targeted leads, they faced challenges with manually downloading and extracting information from ~800K PDFs from the public domain. Noventiq developed an automated solution using AWS Glue for PDF downloads and Textract for OCR to convert data into a structured dataset on Amazon Simple Storage Service (S3). A text2SQL bot powered by Amazon Bedrock and Claude 3.5 Sonnet model was deployed to enable Sales Teams to query relevant information in natural language through Amazon Athena.