Class PdfFormProcessor


  • public final class PdfFormProcessor
    extends Object
    Utility for processing a PDF document as a form. Can be used, for example, to convert a PDF document to a web form. The process method is generic: it iterates over the content of a PDF document that is relevant for forms and delegates to the given IPdfFormVisitor when it encounters such content. The IPdfFormVisitor can then implement the specific business logic needed to produce, for example, an HTML or XML representation of the form.

    In general, content relevant for forms includes form fields and annotation widgets (e.g. AcroForms); plain text, which may contain labels for form elements; and images, which may contain necessary illustrations or logos.
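
The visitor-and-accumulator idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's API: the interface `FormContentVisitor`, its methods `visitField`, `visitText` and `result`, the `Item` class and the `walk` driver are all hypothetical stand-ins, because the actual methods of IPdfFormVisitor are not shown in this section.

```java
import java.util.List;

public class VisitorSketch {
    /** Hypothetical stand-in for a form visitor; NOT the real IPdfFormVisitor. */
    interface FormContentVisitor<T> {
        void visitField(String name);
        void visitText(String text);
        T result();
    }

    /** Hypothetical content item: either a form field or a piece of plain text. */
    static final class Item {
        final boolean isField;
        final String value;

        Item(boolean isField, String value) {
            this.isField = isField;
            this.value = value;
        }
    }

    /** Hypothetical driver: walks the content in order and delegates each item to the visitor. */
    static <T> T walk(List<Item> items, FormContentVisitor<T> visitor) {
        for (Item item : items) {
            if (item.isField) {
                visitor.visitField(item.value);
            } else {
                visitor.visitText(item.value);
            }
        }
        // The value accumulated by the visitor becomes the overall result.
        return visitor.result();
    }

    /** Example visitor that accumulates a very small HTML form as its state. */
    static final class HtmlVisitor implements FormContentVisitor<String> {
        private final StringBuilder html = new StringBuilder("<form>");

        @Override
        public void visitField(String name) {
            html.append("<input name=\"").append(name).append("\">");
        }

        @Override
        public void visitText(String text) {
            html.append("<label>").append(text).append("</label>");
        }

        @Override
        public String result() {
            return html + "</form>";
        }
    }

    public static void main(String[] args) {
        List<Item> items = List.of(new Item(false, "Name"), new Item(true, "name"));
        System.out.println(walk(items, new HtmlVisitor()));
    }
}
```

The driver knows nothing about HTML; swapping in a different visitor (for example one that builds XML) changes the output format without touching the traversal, which is the separation the class description relies on.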

    Since:
    7.2.0
    Author:
    XIMA MEDIA GmbH
    • Method Detail

      • process

        public static <T> T process(Path pdf,
                                    String password,
                                    IPdfFormVisitor<T> visitor)
                             throws UnsupportedXfaFormException,
                                    org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException,
                                    IOException
        Processes a PDF document for form-relevant content and invokes the given visitor with that content. The visitor usually keeps some state; this method returns the visitor's final state or result. See the class docs for more details.
        Type Parameters:
        T - Type of the result accumulated by the visitor.
        Parameters:
        pdf - PDF document to process.
        password - Password for the document if it is encrypted; may be null when the document is not encrypted.
        visitor - Visitor that receives each piece of form-relevant content of the PDF, such as form fields, texts and labels, as well as images and illustrations.
        Returns:
        The values accumulated by the given form visitor.
        Throws:
        org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException - When the given password is invalid.
        IOException - When the document could not be read.
        UnsupportedXfaFormException - When the PDF document contains an XFA form.
      • process

        public static <T> T process(byte[] pdf,
                                    String password,
                                    IPdfFormVisitor<T> visitor)
                             throws UnsupportedXfaFormException,
                                    org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException,
                                    IOException
        Processes a PDF document for form-relevant content and invokes the given visitor with that content. The visitor usually keeps some state; this method returns the visitor's final state or result. See the class docs for more details.
        Type Parameters:
        T - Type of the result accumulated by the visitor.
        Parameters:
        pdf - PDF document to process.
        password - Password for the document if it is encrypted; may be null when the document is not encrypted.
        visitor - Visitor that receives each piece of form-relevant content of the PDF, such as form fields, texts and labels, as well as images and illustrations.
        Returns:
        The values accumulated by the given form visitor.
        Throws:
        org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException - When the given password is invalid.
        IOException - When the document could not be read.
        UnsupportedXfaFormException - When the PDF document contains an XFA form.
      • process

        public static <T> T process(File pdf,
                                    String password,
                                    IPdfFormVisitor<T> visitor)
                             throws UnsupportedXfaFormException,
                                    org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException,
                                    IOException
        Processes a PDF document for form-relevant content and invokes the given visitor with that content. The visitor usually keeps some state; this method returns the visitor's final state or result. See the class docs for more details.
        Type Parameters:
        T - Type of the result accumulated by the visitor.
        Parameters:
        pdf - PDF document to process.
        password - Password for the document if it is encrypted; may be null when the document is not encrypted.
        visitor - Visitor that receives each piece of form-relevant content of the PDF, such as form fields, texts and labels, as well as images and illustrations.
        Returns:
        The values accumulated by the given form visitor.
        Throws:
        org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException - When the given password is invalid.
        IOException - When the document could not be read.
        UnsupportedXfaFormException - When the PDF document contains an XFA form.
      • process

        public static <T> T process(InputStream pdf,
                                    String password,
                                    IPdfFormVisitor<T> visitor)
                             throws UnsupportedXfaFormException,
                                    org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException,
                                    IOException
        Processes a PDF document for form-relevant content and invokes the given visitor with that content. The visitor usually keeps some state; this method returns the visitor's final state or result. See the class docs for more details.
        Type Parameters:
        T - Type of the result accumulated by the visitor.
        Parameters:
        pdf - PDF document to process.
        password - Password for the document if it is encrypted; may be null when the document is not encrypted.
        visitor - Visitor that receives each piece of form-relevant content of the PDF, such as form fields, texts and labels, as well as images and illustrations.
        Returns:
        The values accumulated by the given form visitor.
        Throws:
        org.apache.pdfbox.pdmodel.encryption.InvalidPasswordException - When the given password is invalid.
        IOException - When the document could not be read.
        UnsupportedXfaFormException - When the PDF document contains an XFA form.
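
All four overloads declare the same three exceptions, so callers will typically handle them in one place. The sketch below shows one possible catch ordering. It is self-contained and therefore uses hypothetical stand-ins: `StubInvalidPasswordException` mirrors the fact that PDFBox's InvalidPasswordException extends IOException, `StubUnsupportedXfaFormException` is assumed to be a checked exception unrelated to IOException (its real hierarchy is not stated here), and `stubProcess` with its failure rules is invented purely to trigger each branch.

```java
import java.io.IOException;

public class ProcessErrorHandlingSketch {
    /** Stand-in for PDFBox's InvalidPasswordException, which is a subclass of IOException. */
    static class StubInvalidPasswordException extends IOException {
        StubInvalidPasswordException(String message) {
            super(message);
        }
    }

    /** Stand-in for UnsupportedXfaFormException; ASSUMED to be a checked, non-IO exception. */
    static class StubUnsupportedXfaFormException extends Exception {
        StubUnsupportedXfaFormException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a process overload; the failure rules are made up for the demo. */
    static String stubProcess(byte[] pdf, String password)
            throws StubUnsupportedXfaFormException, IOException {
        if (pdf.length == 0) {
            throw new IOException("document could not be read");
        }
        if (pdf[0] == 'X') {
            throw new StubUnsupportedXfaFormException("document contains an XFA form");
        }
        if (!"secret".equals(password)) {
            throw new StubInvalidPasswordException("given password is invalid");
        }
        return "ok";
    }

    /** Maps each failure mode to a short description. */
    static String describe(byte[] pdf, String password) {
        try {
            return stubProcess(pdf, password);
        } catch (StubInvalidPasswordException e) {
            // Must come before IOException: the compiler rejects a subclass catch placed after its superclass.
            return "invalid password";
        } catch (StubUnsupportedXfaFormException e) {
            return "XFA not supported";
        } catch (IOException e) {
            return "unreadable";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(new byte[] {'%'}, "wrong"));
    }
}
```

Catching the password exception separately lets a caller prompt for a new password and retry, while the general IOException branch covers documents that cannot be read at all.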