Is your business still manually adding data from monthly invoices, receipts or reports to your CRM or database? Would you like to automate this process and extract data automatically from hundreds of PDFs that share the same format? pdf2Data can help!


iText 7 pdf2Data

pdf2Data allows you to automate PDF processing and easily extract data from a large volume of PDF documents in the same format. It offers a framework to recognize data inside PDF documents, based on selection rules that you define in a template.

pdf2Data is one of our commercially licensed add-ons for iText 7. You will require a commercial license for iText 7 Core and pdf2Data to use this product for commercial purposes or in a closed-source project. Request a quote to learn more about licensing and pricing for your project.


Why use iText 7 pdf2Data?

Data is an important commodity, and you may have more than you realize locked inside your PDF documents. Of course, collecting this data manually would take you a lot of time, and increase the risk of input errors as well as security issues. With pdf2Data you can automate the process of extracting data in a secure way. Continue reading for more pdf2Data benefits. 


Automate data extraction from PDF invoices and documents

Extract and process data from large numbers of PDFs by defining the information that is important to you in a template and extracting it automatically from your Java or .NET code.

Define which specific parts of data you want to extract

Quickly define the information you want to extract in a template with the pdf2Data template editor, such as the address field that is always in the top-right corner of your PDF invoices.

Integrate into your existing document processes

pdf2Data uses open standards to facilitate integration, which makes integrating it into existing workflows easy and fast. It includes SDKs for Java and .NET as well as a command line interface.

Key features

Core capabilities of iText 7 pdf2Data

With pdf2Data, you define the areas, fonts, patterns, or tables of interest in a template that is then used for all PDFs created in the same format, such as an invoice or an intake document. You mark these areas of interest with selectors. Each selector identifies the important information in a different way, and selectors can be used alone or in combination to meet your needs.
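To make the idea of combining selectors more concrete, here is a minimal conceptual sketch in plain Java. It does not use the pdf2Data or iText API: the `Fragment` class, the `area` and `pattern` helpers, and the sample coordinates are all hypothetical and only illustrate how an area rule and a pattern rule could each select content on their own, or be combined to narrow the result.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Conceptual illustration only: none of these names come from pdf2Data.
public class SelectorSketch {

    /** A piece of text and the page position where it starts (hypothetical model). */
    static final class Fragment {
        final String text;
        final double x;
        final double y;

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Selects fragments whose position lies inside a rectangle on the page. */
    static Predicate<Fragment> area(double left, double bottom, double right, double top) {
        return f -> f.x >= left && f.x <= right && f.y >= bottom && f.y <= top;
    }

    /** Selects fragments whose whole text matches a regular expression. */
    static Predicate<Fragment> pattern(String regex) {
        Pattern p = Pattern.compile(regex);
        return f -> p.matcher(f.text).matches();
    }

    /** Applies a selector to every fragment and returns the matching text. */
    static List<String> select(List<Fragment> page, Predicate<Fragment> selector) {
        List<String> out = new ArrayList<>();
        for (Fragment f : page) {
            if (selector.test(f)) {
                out.add(f.text);
            }
        }
        return out;
    }

    /** Made-up content of one A4 invoice page (595 x 842 points, origin at bottom left). */
    static List<Fragment> samplePage() {
        List<Fragment> page = new ArrayList<>();
        page.add(new Fragment("ACME Corp, 1 Main Street", 400, 780)); // address, top right
        page.add(new Fragment("INV-2024-001", 420, 740));             // invoice number, top right
        page.add(new Fragment("INV-2023-999", 50, 300));              // older invoice cited in the body
        page.add(new Fragment("Total: 120.00", 400, 100));
        return page;
    }

    public static void main(String[] args) {
        List<Fragment> page = samplePage();
        Predicate<Fragment> topRight = area(350, 700, 595, 842);
        Predicate<Fragment> invoiceNo = pattern("INV-\\d{4}-\\d{3}");

        System.out.println(select(page, topRight));                // area rule alone
        System.out.println(select(page, invoiceNo));               // pattern rule alone
        System.out.println(select(page, topRight.and(invoiceNo))); // both rules combined
    }
}
```

In this sketch the area rule alone returns both top-right fragments, the pattern rule alone returns both invoice numbers, and only the combination isolates the invoice number that belongs to this document.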

Extract data from PDF documents

Leverage iText 7 Core content extraction for high-fidelity recognition of text and images.

Intuitive extraction configuration

This add-on has comprehensive out-of-the-box functionality, with the flexibility to extend and customize it. It focuses on easy integration and open standards.

Use templates to streamline extraction

Define areas of interest and selection rules to get exactly the content you need.

Integrate in your PDF and/or data workflow

Data is output in a structured, reusable format for further processing, with access to the page coordinates of the extracted content.
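As a rough illustration of what a structured result with page coordinates can look like to downstream code, the sketch below models one extracted value together with its page number and bounding box and serializes it to JSON. The `ExtractedField` class, its field names, and the JSON layout are assumptions made for this example; they are not the actual pdf2Data output format.

```java
import java.util.Locale;

// Conceptual illustration only: this is not the pdf2Data result format.
public class ExtractedFieldSketch {

    /** One extracted value plus where it was found on the page (hypothetical structure). */
    static final class ExtractedField {
        final String name;
        final String value;
        final int page;
        final double x, y, width, height; // bounding box in page units

        ExtractedField(String name, String value, int page,
                       double x, double y, double width, double height) {
            this.name = name;
            this.value = value;
            this.page = page;
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        /** Serializes the field as a single JSON object. */
        String toJson() {
            return String.format(Locale.ROOT,
                    "{\"name\":\"%s\",\"value\":\"%s\",\"page\":%d,\"box\":[%.1f,%.1f,%.1f,%.1f]}",
                    escape(name), escape(value), page, x, y, width, height);
        }
    }

    /** Escapes backslashes and double quotes only; control characters are not handled. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        ExtractedField field =
                new ExtractedField("invoiceNumber", "INV-2024-001", 1, 420, 740, 90, 12);
        System.out.println(field.toJson());
    }
}
```

Keeping the coordinates next to each value lets a later step, such as a review screen or an audit log, point back to the exact spot on the page the value came from.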

Happy customers

iText is a breeze! Using a proven and tested PDF technology helped us to focus on what we do best — building a high quality mobile app.

We chose the iText library because it was the only solution that allowed easy integration into our open standards architecture.

With iText we have the peace of mind that we are delivering a solid solution to our client.


Still have questions? 

We're happy to answer your questions. Reach out to us and we'll get back to you shortly.
