Generating Document Previews Using unoconv

Tags: unoconv, Drupal, PHP

Drupal’s file upload widget is flexible and allows various types of file uploads.

In one of our recent projects, the client wanted to generate previews of Excel spreadsheets, PowerPoint presentations, and similar documents so that end users could preview files before downloading them.

After some research, we found the unoconv library, which allows users to convert documents to PDF or other required formats.

What Is unoconv?

As per unoconv documentation on GitHub: “Universal Office Converter (unoconv) is a command line tool to convert any document format that LibreOffice can import to any document format that LibreOffice can export. It makes use of the LibreOffice’s UNO bindings for non-interactive conversion of documents.” unoconv uses LibreOffice’s underlying power to convert documents.

Implementing the document preview feature involved two steps:

  1. Install and configure the unoconv library.
  2. Create a simple module that uses the PHP library to convert documents.

Install And Configure unoconv Library

At Axelerant, we primarily use pre-configured development environments for development, particularly DrupalVM or Lando based development environments. For this project, we used DrupalVM with Ubuntu 14.04.

Here are the instructions to set up unoconv on Ubuntu:

Make sure you are using the latest version of LibreOffice. Use these commands to add the LibreOffice PPA and install it:

sudo add-apt-repository ppa:libreoffice/ppa
sudo apt-get update
sudo apt-get install libreoffice

  1. Go to the /opt directory.
    cd /opt
  2. Download the unoconv library from GitHub.
    sudo wget
  3. Modify the Python unoconv file by changing 'python' in the first line to 'python3'
    sudo nano /opt/unoconv
    For example: #!/usr/bin/env python3
    (This step might not be needed in different Linux distributions; it is needed specifically for Ubuntu though. You can refer to the detailed guide here.)
  4. Make unoconv executable.
    sudo chmod ugo+x /opt/unoconv
  5. Change ownership so Apache (running as www-data) can write to its home directory.
    sudo chown www-data /var/www
  6. Add a symlink so that unoconv is available on the PATH.
    sudo ln -s /opt/unoconv /usr/bin/
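
Step 3 can also be scripted instead of editing the file by hand. The following is a minimal sketch, not part of the original setup: use_python3_shebang is a hypothetical helper name, and the sketch assumes the first line of the script is a python shebang.

```shell
#!/bin/sh
# Hypothetical helper: point a script's shebang at python3.
# Safe to run repeatedly; an existing python3 shebang is left untouched.
use_python3_shebang() {
  script="$1"
  first_line=$(head -n 1 "$script") || return 1
  case "$first_line" in
    '#!'*python3*)
      return 0 ;;
    '#!'*python*)
      tmp=$(mktemp) || return 1
      # Replace everything from the last "python" onward with "python3",
      # then copy the rest of the file unchanged.
      {
        printf '%s\n' "${first_line%python*}python3"
        tail -n +2 "$script"
      } > "$tmp" || return 1
      # cat (rather than mv) keeps the original file's owner and mode.
      cat "$tmp" > "$script" && rm -f "$tmp"
      ;;
    *)
      echo "no python shebang found in $script" >&2
      return 1 ;;
  esac
}
```

Once the function is loaded in a root shell, calling use_python3_shebang /opt/unoconv applies the change described in step 3.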

Once unoconv is set up, run unoconv --listener & to start the unoconv listener. If you skip this step, documents won't be converted. You may need to run it again after you halt and restart the Vagrant VM. To avoid starting the listener by hand each time, you can set the command to run on system start, or run it in the background as a daemon process using a tool such as daemonize.
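
One way to avoid starting a second listener by accident is to check for a running one first. This is a rough sketch under stated assumptions: ensure_listener and listener_running are hypothetical helper names, the default binary path matches the symlink created above, and pgrep is assumed to be installed.

```shell
#!/bin/sh
# Path to the unoconv script; an assumption, adjust to your installation.
UNOCONV_BIN="${UNOCONV_BIN:-/usr/bin/unoconv}"

# Returns 0 if a process whose command line contains
# "unoconv --listener" is already running.
listener_running() {
  # The [u] bracket keeps the pattern from matching a shell
  # whose own command line contains this pattern text.
  pgrep -f '[u]noconv --listener' >/dev/null 2>&1
}

# Starts the listener in the background unless one is already running.
ensure_listener() {
  if listener_running; then
    echo "unoconv listener already running"
  else
    nohup "$UNOCONV_BIN" --listener >/dev/null 2>&1 &
    echo "unoconv listener started"
  fi
}
```

Calling ensure_listener from a script that runs at boot (for example an @reboot crontab entry, on systems whose cron supports it) is one option for starting the listener on system start.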

Create A Simple Module That Uses PHP Library

We used the PHP-Unoconv library, which provides a PHP interface to unoconv and its document conversion facility.

We created a Drupal module that provides a simple service to make calls to the PHP-Unoconv library.

Add the unoconv_service module to your project by running the following command:

composer require "drupal/unoconv_service"

Since “php-unoconv” is declared as a dependency in the module’s composer.json, Composer will fetch the “php-unoconv” library as well. Enable the module as normal.

Visit "/admin/config/unoconv_service/unoconvconfig" for configurations related to timeout, path of unoconv binary, etc. These are essential settings for the module to work. Change settings as per your installation path and the appropriate timeout.

Now, we will create a new service in the Drupal module and use it to convert documents.

To provide a simple wrapper, I have created a service which provides the code for transcoding and generating previews.

You can see it here.

In the function above, we are initializing unoconv and saving the converted output to the destination file.

The “transcode()” function takes four arguments, in order:

  • Path of input/source file
  • Output format
  • Path of output file
  • Page range

For page range, you can give a value like 1-10, which generates a preview of pages 1 to 10 only.
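
To try a page range before wiring it into Drupal, the same conversion can be run with the unoconv command line directly. The sketch below is illustrative only: make_preview is a hypothetical helper name, the default range of 1-10 is just the example value used above, and it is an assumption that the PHP library passes the page range to unoconv in this same form.

```shell
#!/bin/sh
# Name or path of the unoconv command; an assumption, adjust as needed.
UNOCONV_BIN="${UNOCONV_BIN:-unoconv}"

# make_preview SOURCE DESTINATION [PAGE_RANGE]
# Converts SOURCE to a PDF at DESTINATION, limited to PAGE_RANGE.
make_preview() {
  src="$1"
  dst="$2"
  pages="${3:-1-10}"
  # unoconv needs a real file on disk, so fail early if it is missing.
  [ -f "$src" ] || { echo "source not found: $src" >&2; return 1; }
  # -f sets the output format, -e passes an export filter option,
  # -o sets the output file.
  "$UNOCONV_BIN" -f pdf -e "PageRange=$pages" -o "$dst" "$src"
}
```

For example, make_preview report.xlsx report-preview.pdf 1-3 would produce a PDF of the first three pages, provided the listener is running.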

You might need to update the “unoconv.binaries” path depending on your installation. The wrapper generates a new file and returns the generated file entity. We can call this function from any presave hook to generate the preview file.

To avoid the load of generating the preview at run time, I store the generated file in a field on the media entity. That way, we won’t have to transcode the same file every time.

In the function below, “unoconv_example_media_presave()”, we are generating a preview with the following steps:

  • Get value from file field.
  • Generate preview file. It will be a file entity.
  • Save/attach that file to the media entity’s field.

In the $source_path variable in the function above, we are giving the actual path on the system. That helps to resolve issues when we are using a private file system. Moreover, unoconv needs a physical file location to transcode documents.

In this way, we can generate previews for uploaded media files. Again, the hook_ENTITY_TYPE_presave() hook is used here only to show how this works with media entities.

We can also use this for a variety of use cases:

  • If you want to build a paid document sharing site, you can generate a preview document for free users and the full document for subscribers.
  • It also helps us to build preview documents for file types like Excel or PowerPoint, so users can check the preview before downloading files.


Posted by Mohit Aghera, Back-end Developer

Offline, if he's not wandering around the city with family or friends, you can find him in his home studio painting away.