Ubuntu - Repair corrupted or broken PDF


If you use Linux day to day, you may have come across web sites whose broken PDF generation software produces files that open-source viewers like Evince cannot display properly.

The latest site where I faced this problem was Easyjet. I was supposed to print my e-ticket, but all the important data was totally unreadable. Here is what Evince displayed:

ubuntu-pdf-broken

While googling for a reader able to handle these broken PDF files, I realised that this problem is quite common and that tools like gs (Ghostscript) or mutool (MuPDF) may be able to repair them.

This article explains how to prepare your Linux desktop to repair corrupted PDF files (like Easyjet e-tickets). It also explains how to integrate this tool as a custom action available from your favorite file manager (Nautilus & PCManFM) with a simple right click on the PDF file.

It has been tested on Ubuntu 16.04 LTS and Ubuntu Gnome 16.04 LTS, but it should be applicable to any distribution using Nautilus or PCManFM.

Thanks to this setup, you'll be able to repair your corrupted PDF files, which should then display properly in Evince.

ubuntu-pdf-repaired

1. Main Principles

A PDF repair tool should be usable in two ways:

  • as a classic standalone application where you select the file through a dialog box
  • from a file manager custom action, with a right click on the PDF file

Evince, ePDFView, Xpdf and KPDF all share the same PDF rendering engine. So the main idea behind a PDF repair tool is to find a robust PDF converter that uses a different rendering engine.

This is where gs and mutool come into play:

  • gs uses the Ghostscript rendering engine, which is well known for handling files that give other renderers trouble.
  • mutool includes a PDF structure repair function.
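
The two repair methods above can be sketched as one small shell function. This is a minimal sketch, not the final script: the function name repair_pdf is hypothetical, while the gs and mutool command lines are the ones used by the main script in this article.

```shell
#!/usr/bin/env bash
# Sketch only: repair_pdf is a hypothetical helper name.
# $1 = method (ghostscript or mutool), $2 = input PDF, $3 = output PDF

repair_pdf() {
    local method="$1" input="$2" output="$3"

    case "$method" in
        ghostscript)
            # re-render the document through the Ghostscript pdfwrite device
            gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite \
               -sOutputFile="$output" "$input"
            ;;
        mutool)
            # let MuPDF rewrite the PDF structure
            mutool clean "$input" "$output"
            ;;
        *)
            echo "Unknown repair method: $method" >&2
            return 2
            ;;
    esac
}
```

For instance, `repair_pdf mutool broken.pdf fixed.pdf` would write the repaired copy to fixed.pdf and leave broken.pdf untouched.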

As the repair is not guaranteed to succeed, the tool in charge of it should never replace the original file. It should keep the original and generate a new, repaired file in the same folder.

As two different tools are available, the repair script will generate two different files:

  • myfile (GhostScript repaired).pdf
  • myfile (MuPDF repaired).pdf
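
The naming rule above can be sketched as follows. This is only an illustration under stated assumptions: repaired_name is a hypothetical helper, and the label is passed in by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: repaired_name is a hypothetical helper name.
# $1 = path of the original PDF, $2 = label of the tool (e.g. GhostScript, MuPDF)
# Prints the path of the repaired copy, located in the same folder as the original.

repaired_name() {
    local original="$1" label="$2"
    local dir base

    dir=$(dirname "$original")
    base=$(basename "$original")

    # strip the extension, then append the label and a fresh .pdf extension
    echo "${dir}/${base%.*} (${label} repaired).pdf"
}

repaired_name "/home/me/myfile.pdf" "MuPDF"
# → /home/me/myfile (MuPDF repaired).pdf
```

Since the label is always appended, the result can never be the same path as the original file.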

As most modern file managers can handle files directly on a remote share (through ftp, smb, ssh, ...), the repair script should handle these remote files transparently, whether it is given a URI or a local path.
To do so, we will use gvfs tools to pull the file locally and to push it back to the remote share once it has been repaired.

Finally, as the repair job may take some time on big PDF files, a notification should inform you when it is over and your newly repaired file is available.
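
The pull, repair and push sequence could look like the sketch below. The names round_trip and COPY_TOOL are hypothetical and not part of the article's script. COPY_TOOL defaults to gio copy, which accepts both local paths and URIs, and the repair command is assumed to take its input and output files as its last two arguments, as mutool clean does.

```shell
#!/usr/bin/env bash
# Sketch only: round_trip and COPY_TOOL are hypothetical names.
# $1 = source path or URI, $2 = destination path or URI,
# remaining arguments = repair command, called as: <command> <input> <output>

round_trip() {
    local source="$1" target="$2"
    shift 2
    local copy_tool="${COPY_TOOL:-gio copy}"
    local tmp status

    # work in a private temporary directory
    tmp=$(mktemp -d -t "pdf-repair-XXXXXXXX") || return 1

    # pull, repair, push; stop at the first failing step
    ${copy_tool} "$source" "$tmp/original.pdf" \
        && "$@" "$tmp/original.pdf" "$tmp/repaired.pdf" \
        && ${copy_tool} "$tmp/repaired.pdf" "$target"
    status=$?

    # always clean up, then report the status of the sequence
    rm -r "$tmp"
    return $status
}
```

A call such as `round_trip "smb://server/share/ticket.pdf" "smb://server/share/ticket-repaired.pdf" mutool clean` (with a made-up share name) would never write to the original file, only to the new target.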

If you are not interested in step-by-step explanations and just want to install the PDF repair script, you can jump to Complete Installation Procedure.

2. Needed packages

The first step is to install all the tools used by the PDF repair script:

  • gvfs-copy (gio copy on recent releases) to copy remote files
  • notify-send to display desktop notifications
  • gs and mutool to handle the repair work
  • urlencode to convert URI filenames for notification display

Some of these tools should be installed by default.
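
To see which of these tools are already present, a quick check along these lines may help. It is a minimal sketch: missing_tools is a hypothetical helper, and the tool list passed to it is only an example taken from the list above.

```shell
#!/usr/bin/env bash
# Sketch only: missing_tools is a hypothetical helper name.
# Prints, one per line, every command given as argument that is not found in PATH.

missing_tools() {
    local tool
    for tool in "$@"
    do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# example: list the tools that still need to be installed
missing_tools gio notify-send gs mutool urlencode
```

An empty output means every tool in the list was found.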

3. Main Script

It's now time to install the main script in charge of the repair job and to declare it as a desktop application.

/usr/local/bin/pdf-repair
#!/usr/bin/env bash
# ---------------------------------------------------
# Repair broken PDF files using gs or mutool
#
# Depends on :
#   * ghostscript
#   * mupdf-tools
#
# Parameter :
#   $1 - URI of original PDF
#
# Revision history :
#   08/11/2014, V1.0 - Creation by N. Bernaerts
#   20/11/2014, V1.1 - Add file selection dialog box
#   24/01/2015, V1.2 - Check tools availability
#   24/11/2017, V2.0 - Add MuTool repair method (thank to Willie Wildgrube idea)
#   01/05/2020, V2.1 - Add method selection with --method
#                      Adaptation for Ubuntu 20.04 LTS 
#   07/05/2020, V2.2 - Multiple files management 
# ---------------------------------------------------

# variable
ERROR=""
METHOD="ghostscript"

# if no argument, display help
if [ $# -eq 0 ] 
then
    echo "Tool to repair PDF documents"
    echo "Parameters are :"
    echo "  --method <method>   Repair method (ghostscript or mutool)"
    echo "  <file1> <file2>     Files to repair"
    exit 1
fi

# iterate thru parameters
while test ${#} -gt 0
do
    case $1 in
        --method) shift; METHOD="$1"; shift; ;;
        *) ARR_FILE+=( "$1" ); shift; ;;
    esac
done

# --------------------------
# check tools availability
# --------------------------

command -v gs >/dev/null 2>&1 || { zenity --error --text="Please install gs [ghostscript]"; exit 1; }
command -v mutool >/dev/null 2>&1 || { zenity --error --text="Please install mutool [mupdf-tools]"; exit 1; }

# generate temporary directory
TMP_DIR=$(mktemp -t -d "pdf-repair-XXXXXXXX")
TMP_ORIGINAL="${TMP_DIR}/original.pdf"
TMP_REPAIRED="${TMP_DIR}/repaired.pdf"
pushd "${TMP_DIR}"

# check at least one file is provided
NBR_FILE=${#ARR_FILE[@]}
[ "${ERROR}" = "" -a ${NBR_FILE} -eq 0 ] && ERROR="No file selected"

# check repair method
[ "${ERROR}" = "" -a "${METHOD}" != "ghostscript" -a "${METHOD}" != "mutool" ] && ERROR="Unknown repair tool"

# --------------------
#   PDF repair
# --------------------

# loop through PDF files
if [ "${ERROR}" = "" ] 
then
    (
    INDEX=0
    for ORIGINAL_URI in "${ARR_FILE[@]}"
    do
        # increment file index
        INDEX=$((INDEX+1))

        # generate filenames
        ORIGINAL_DIR=$(dirname "${ORIGINAL_URI}")
        ORIGINAL_FILE=$(basename "${ORIGINAL_URI}")

        # copy input file to temporary folder
        echo "# ${INDEX} / ${NBR_FILE} - Copy of original PDF document ..."
        gio copy "${ORIGINAL_URI}" "${TMP_ORIGINAL}"
    
        # repair PDF
        # (if/else is used so that mutool is not run as a fallback when gs fails)
        echo "# ${INDEX} / ${NBR_FILE} - Repair original PDF using ${METHOD} ..."
        if [ "${METHOD}" = "ghostscript" ]
        then
            gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile="${TMP_REPAIRED}" "${TMP_ORIGINAL}"
        else
            mutool clean "${TMP_ORIGINAL}" "${TMP_REPAIRED}"
        fi

        # place repaired file next to the original (file name without its extension + "-repaired.pdf")
        echo "# ${INDEX} / ${NBR_FILE} - Copy of repaired PDF ..."
        gio copy "${TMP_REPAIRED}" "${ORIGINAL_DIR}/${ORIGINAL_FILE%.*}-repaired.pdf"
    done
    
    ) | zenity --width=500 --height=25 --progress --pulsate --auto-close --title "Repair PDF" --window-icon="/usr/share/icons/pdf-repair.png"
fi

# -------------------
#   End of operation
# -------------------

# display error message
[ "${ERROR}" != "" ] && zenity --error --width=600 --text="${ERROR}"

# remove temporary directory
popd
rm -r "${TMP_DIR}"

/usr/share/applications/pdf-repair.desktop

Both files are available from my GitHub account.
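
The content of pdf-repair.desktop is not listed here. As a minimal sketch of what such a desktop entry could look like, the helper below writes one into a directory of your choice. Everything in it is an assumption rather than the file from the repository: write_desktop_entry is a hypothetical name, the keys are standard Desktop Entry keys, and the name and icon path are taken from elsewhere in this article.

```shell
#!/usr/bin/env bash
# Sketch only: write_desktop_entry is a hypothetical helper name and the
# entry below is an assumed example, not the file published on GitHub.
# $1 = directory where pdf-repair.desktop should be written

write_desktop_entry() {
    local target_dir="$1"

    mkdir -p "$target_dir" || return 1

    cat > "${target_dir}/pdf-repair.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Repair corrupted PDF
Comment=Repair a broken PDF with Ghostscript or MuPDF
Exec=pdf-repair %F
Icon=/usr/share/icons/pdf-repair.png
Terminal=false
Categories=Utility;
MimeType=application/pdf;
EOF
}
```

Writing to a scratch directory first, for example `write_desktop_entry /tmp/pdf-repair-test`, makes it easy to review the entry before copying it to /usr/share/applications.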

4. File Manager Integration

To get a full desktop integration, this repair tool should be available from a custom action in your file manager context menu.

This context menu should be displayed for any file having a PDF mimetype.

With the latest Extension for Menus and Actions of the freedesktop.org Desktop Entry Specification (DES-EMA), this integration has become quite easy.

You just need to declare the new custom action in a .desktop file placed under $HOME/.local/share/file-manager/actions.

~/.local/share/file-manager/actions/pdf-repair-action.desktop
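
The content of pdf-repair-action.desktop is not listed here. The sketch below shows what a DES-EMA action restricted to the PDF mimetype might look like. It is an assumed example, not the file from the repository: write_action_entry and the profile id pdf_repair are hypothetical names, and %U is used on the assumption that the script should receive the URIs of all selected files.

```shell
#!/usr/bin/env bash
# Sketch only: write_action_entry and the profile id pdf_repair are
# hypothetical, and the action below is an assumed example.
# $1 = target directory (defaults to the per-user actions folder)

write_action_entry() {
    local target_dir="${1:-$HOME/.local/share/file-manager/actions}"

    mkdir -p "$target_dir" || return 1

    cat > "${target_dir}/pdf-repair-action.desktop" <<'EOF'
[Desktop Entry]
Type=Action
Name=Repair broken PDF
Icon=/usr/share/icons/pdf-repair.png
Profiles=pdf_repair;

[X-Action-Profile pdf_repair]
Name=Default profile
MimeTypes=application/pdf;
Exec=pdf-repair %U
EOF
}
```

The MimeTypes key is what limits the menu entry to PDF files, and the Profiles key must list the id used in the X-Action-Profile group name.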

5. Nautilus specific

If you are using the Nautilus file manager, you need one extra step to get this right click menu.

Nautilus implements the DES-EMA specification through an extra nautilus-actions package, which needs to be installed:

Terminal
# sudo apt install nautilus-actions

Once it is installed, launch the Nautilus Actions application and, in its preferences, unselect "Create a root Nautilus-Actions menu":

ubuntu-nautilus-action-preferences

As Nautilus does not display menu icons by default, you also need to enable this feature.

Terminal
# gsettings set org.gnome.desktop.interface menus-have-icons true

6. Complete Installation Procedure

A complete installation script is available from my GitHub account.

This script handles package installation and downloads the icon and scripts.

You just need to download and run it:

Terminal
# wget https://raw.githubusercontent.com/NicolasBernaerts/ubuntu-scripts/master/pdf/pdf-repair-install.sh
# chmod +x pdf-repair-install.sh
# ./pdf-repair-install.sh

After following all these steps, you should get :

  • a new Repair corrupted PDF application
  • a new Repair broken PDF right click menu on PDF files

ubuntu-pdf-repair-menu

Hope it helps.


This article is published "as is", without any warranty that it will work for your specific need.
If you think this article needs some complement, or simply if you think it saved you lots of time & trouble,
just let me know. Cheers!
