Listing / Extracting images from PDF files

This operation lists and extracts the images in a PDF file. By default, every image is listed together with its metadata, its object index in the PDF file, and a URL for a thumbnail-sized copy. Two optional filters are available: 1) selecting only the images on a specific page; and 2) extracting a single image given its object index in the PDF file (see below for an example). When a single image is extracted, the resulting asset URL redirects to an original-sized copy of the image instead of a thumbnail.

The resulting assets are stored in the S3 cloud storage service, and their URLs are listed in the output. These URLs remain publicly available for one day.
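The two optional filters correspond to the optional page and imgPath inputs of the request body documented in this section. A minimal sketch of a payload builder follows; the function name build_extract_image_job and its parameters are hypothetical and not part of the service, and converting the page number to a string simply mirrors the sample request.

```python
import json


def build_extract_image_job(pdf_url, page=None, img_path=None):
    """Build the JSON body for an ExtractImage job (hypothetical helper).

    `page` and `img_path` are the two optional filters; a filter that is
    not given is left out of the payload entirely.
    """
    inputs = {"pdf": pdf_url}
    if page is not None:
        # The sample request passes the page number as a string.
        inputs["page"] = str(page)
    if img_path is not None:
        # Object index of a single image, e.g. taken from a previous listing.
        inputs["imgPath"] = img_path
    return {"workflow": [{"task": "ExtractImage", "inputs": inputs}]}


if __name__ == "__main__":
    # List only the images on page 1 of the sample PDF.
    body = build_extract_image_job("http://127.0.0.1:9999/wallart.pdf", page=1)
    print(json.dumps(body, indent=4))
```

Calling the helper with only a PDF URL yields a body that lists all images; adding img_path requests a single image.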


  • Endpoint: POST http://mu3.hp-mu.com/jobs/sync
  • Authorization: BASIC bXVscHA6eGhDdGNxNW9U
  • Accept: application/json
  • Content-type: application/json
  • Input schema sample (page and imgPath are optional; imgPath is a string encoding a path in the PDF object index):
{
    "workflow": [
        {
            "task": "ExtractImage",
            "inputs": {
                "pdf": "http://127.0.0.1:9999/wallart.pdf",
                "page": "1",
                "imgPath": "<p{1}/d{Resources}/d{XObject}/d{Im1}>"
            }
        }
    ]
}
  • Output schema sample (e.g., for listing all images in a one-page PDF):
{
    "id": "54f8addc17d013e858b01470",
    "status": "success",
    "workflow": [
        {
            "task": "ExtractImage",
            "inputs": {
                "pdf": "http://127.0.0.1:9999/wallart.pdf",
                "page": "1",
                "imgPath": "<p{1}/d{Resources}/d{XObject}/d{Im1}>"
            },
            "result": {
                "images": [
                    {
                        "colorSpace": "RGB",
                        "format": "JPEG",
                        "height": 573,
                        "id": "<p{1}/d{Resources}/d{XObject}/d{Im1}>",
                        "page": 1,
                        "thumbUrl": "https://mu3.s3.amazonaws.com/123.jpg",
                        "width": 971
                    },
                    {
                        "colorSpace": "RGB",
                        "format": "JPEG",
                        "height": 505,
                        "id": "<p{1}/d{Resources}/d{XObject}/d{Im2}>",
                        "page": 1,
                        "thumbUrl": "https://mu3.s3.amazonaws.com/789.jpg",
                        "width": 730
                    }
                ]
            }
        }
    ]
}
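As a rough illustration of how a client might call the endpoint and read the output, here is a sketch using only the Python standard library. The helper names make_request, run_job and summarize_images are hypothetical. The sketch assumes the standard HTTP Accept header name, that the caller supplies the full Authorization header value, and that any status other than "success" (the only value shown in the sample) means the job failed. It also assumes every image entry carries the fields shown in the sample output.

```python
import json
import urllib.request

ENDPOINT = "http://mu3.hp-mu.com/jobs/sync"


def make_request(body, authorization):
    """Wrap a job body in a POST request (hypothetical helper).

    `authorization` is the complete Authorization header value.
    """
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": authorization,
            "Accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_job(body, authorization):
    """Send the job synchronously and return the decoded JSON response."""
    with urllib.request.urlopen(make_request(body, authorization)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_images(response):
    """Return (id, page, width, height, thumbUrl) for each listed image."""
    # Assumption: anything other than "success" is treated as a failure.
    if response.get("status") != "success":
        raise RuntimeError("job did not succeed: %r" % response.get("status"))
    rows = []
    for step in response.get("workflow", []):
        for img in step.get("result", {}).get("images", []):
            rows.append(
                (img["id"], img["page"], img["width"], img["height"], img["thumbUrl"])
            )
    return rows
```

The id of a listed image can be passed back as imgPath in a follow-up request to extract that single image. Because the asset URLs are public for only one day, a client that needs the images later should download them promptly.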