Frequently Asked Question:
Check DPI of image in PDF and adjust resolution if it is greater than 150 DPI
I need to check all the images in a PDF to see if they have a horizontal and vertical resolution of 150 DPI or less. If the resolution is greater than 150 DPI, then I need to change it to 50 DPI. I'm assuming that I would need to extract the image first, check its resolution, adjust it if necessary and then either replace the original image with the adjusted one or overlay the adjusted image on top of it.
Do you have any tips on how I could do this?
There are two possible ways you could get the necessary image information from your PDF using Quick PDF Library.
The software API is split into two sections: the standard functions and the Direct Access functions.
The standard functions read the entire PDF into memory and analyse the content, while the Direct Access functions open the source data (in memory or directly on disk) and read only the information required, without loading the entire document into memory.
Here are the details of the two methods of investigating the image properties:
=== Option 1: Direct Access with drawn images ===
Functionality is available in the Direct Access section of the library that will allow you to enumerate all the images that have been drawn onto a particular page. The graphics commands that describe the page layout are run through our rendering system and for every "draw image" command the details are stored in a list.
For each image found certain information is available, including:
- Horizontal and vertical co-ordinates of the four corners of the image
- Image width and height in pixels
Using these values you could calculate the true DPI of the image (i.e. the number of pixels per inch at the size the image is drawn onto the page). This might be different to the DPI meta information stored in the image data, if any.
For example, consider a 200 x 200 pixel JPEG image that has been marked as 600 DPI in the image metadata. If this image is drawn into an area 1 inch wide by 2 inches tall, the actual horizontal DPI would be 200 (200 pixels / 1 inch) while the vertical DPI would be 100 (200 pixels / 2 inches).
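The calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Quick PDF Library code: the function name and parameters are hypothetical, and it assumes the corner co-ordinates have already been read from the library and are expressed in PDF points (1/72 inch). If your document or library settings use a different unit, adjust the constant accordingly. Measuring the edge lengths from the corners, rather than subtracting x and y values directly, keeps the result correct for rotated images.

```python
import math

# Assumption: corner co-ordinates are in PDF points, 72 points per inch.
POINTS_PER_INCH = 72.0


def effective_dpi(width_px, height_px, bottom_left, bottom_right, top_left):
    """Return (horizontal_dpi, vertical_dpi) for an image as drawn on the page.

    width_px, height_px: image size in pixels.
    bottom_left, bottom_right, top_left: (x, y) corner co-ordinates of the
    drawn image. All names here are hypothetical, chosen for illustration.
    """
    # Edge lengths measured along the image's own axes (works when rotated).
    drawn_width_in = math.dist(bottom_left, bottom_right) / POINTS_PER_INCH
    drawn_height_in = math.dist(bottom_left, top_left) / POINTS_PER_INCH
    if drawn_width_in == 0 or drawn_height_in == 0:
        raise ValueError("image is drawn with zero width or height")
    return width_px / drawn_width_in, height_px / drawn_height_in


# The 200 x 200 pixel image drawn 1 inch wide by 2 inches tall:
h_dpi, v_dpi = effective_dpi(200, 200, (0, 0), (72, 0), (0, 144))
print(h_dpi, v_dpi)
```

With the values from the example, the horizontal result is 200 DPI and the vertical result is 100 DPI, regardless of the 600 DPI stored in the image metadata.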
This is all accomplished using the "direct access" part of the library.
In particular you would use the following functions:
DAOpenFile
DAFindPage
DAGetPageImageList
DAGetImageIntProperty
DAGetImageDblProperty
In a PDF document, image data can be stored once and then drawn onto many different pages in multiple locations. The Direct Access image list returns a separate set of image data for each drawn image, even if the source image data is the same.
Note: Because the information returned is based on the graphics representation of the page it is not possible (with the current version of QPL) to replace the image data using the Direct Access functions.
=== Option 2: Standard functions, resource searching ===
In the standard part of the library there is a different set of functions for locating images. Instead of looking at the pages themselves, the routine looks at the stored resources in the entire document.
Because the page description is not examined, this routine looks only at the raw image data and therefore cannot give any details about the actual location or size of the image on the page.
For each image found in the document the following properties can be returned:
- Image width and height in pixels
- Image horizontal and vertical resolution as stored in the raw image metadata if this information is available
The functions you would use for this are:
LoadFromFile
FindImages
GetImageID
SelectImage
ImageWidth
ImageHeight
ImageResolutionUnits
ImageHorizontalResolution
ImageVerticalResolution
You would have to run a test on your particular PDF document to determine whether all, some or none of the embedded image data contains resolution metadata.
If the images are missing the resolution marker, it might still be possible to calculate a rough estimate of the DPI from the width and height in pixels alone, provided you know in advance that the embedded images all have the same DPI and roughly what size they occupy on the page.
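As a rough sketch of that fallback, the Python below prefers the stored resolution when one is present and otherwise estimates it from the pixel count and an assumed drawn size. The function names and the `assumed_inches` parameter are hypothetical; the metadata value is assumed to have been read from the library already and converted to pixels per inch.

```python
from typing import Optional


def estimate_dpi(pixels: int, assumed_inches: float) -> float:
    """Estimate DPI along one axis from a pixel count and an assumed drawn size."""
    if assumed_inches <= 0:
        raise ValueError("assumed size must be positive")
    return pixels / assumed_inches


def resolve_dpi(metadata_dpi: Optional[float], pixels: int,
                assumed_inches: float) -> float:
    """Use the stored resolution if available, otherwise fall back to an estimate.

    metadata_dpi: resolution from the image metadata, already in pixels per
    inch, or None/0 if the image carries no resolution marker (assumption).
    """
    if metadata_dpi is not None and metadata_dpi > 0:
        return float(metadata_dpi)
    return estimate_dpi(pixels, assumed_inches)


# An image 1200 pixels wide with no metadata, assumed to be drawn 4 inches wide:
print(resolve_dpi(None, 1200, 4.0))
```

The estimate is only as good as the assumed size, so it is best treated as a screening value for deciding which images to inspect more closely.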
If everything works out and you are able to read/calculate (or estimate) the DPI values for the images there is a function in the standard section that will let you replace the image data.
The first step would be to add a new image to the document and once this has been done the reference to the original image can be replaced with a reference to the new image.
This cannot be done for the entire document in one step; it must be done for each page, one at a time. Note also that the original image data will remain inside the PDF. It is not removed, so there will be no saving in terms of file size.
The functions you would use to do the image replacement are:
AddImageFromFile
SelectPage
ReplaceImage
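Before a replacement image can be added, it has to be created at the lower resolution. Assuming the resampling is done with a separate image tool outside the library, the sketch below shows one way to work out the pixel dimensions of the new image. The function names, the limit and the target values are all illustrative; the logic scales each axis down only when its DPI exceeds the limit.

```python
def scaled_pixels(pixels: int, dpi: float, limit_dpi: float,
                  target_dpi: float) -> int:
    """Return the pixel count for one axis after reducing it to target_dpi.

    The axis is left unchanged when its DPI is already within limit_dpi.
    """
    if dpi <= limit_dpi:
        return pixels
    # Same drawn size, fewer pixels: new_pixels / pixels == target_dpi / dpi.
    return max(1, round(pixels * target_dpi / dpi))


def downsampled_size(width_px: int, height_px: int, h_dpi: float, v_dpi: float,
                     limit_dpi: float, target_dpi: float) -> tuple:
    """Return the (width, height) in pixels that the replacement image should have."""
    return (scaled_pixels(width_px, h_dpi, limit_dpi, target_dpi),
            scaled_pixels(height_px, v_dpi, limit_dpi, target_dpi))


# A 1200 x 600 pixel image measured at 300 DPI, limit 150 DPI, target 150 DPI:
print(downsampled_size(1200, 600, 300, 300, 150, 150))
```

If the returned size equals the original size, the image is already within the limit and no replacement is needed for it.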