The problem
Convert .doc, .odt, and .docx files to .pdf, or any other combination (e.g. .docx to .odt), under PHP.
To solve this problem we’ll install the LibreOffice command line tools and Unoconv, then build a PHP class.
Installing
Unoconv is a Python tool that uses the LibreOffice libraries (pyuno).
Installing LibreOffice command line tools
On a server you don’t need a full install of LibreOffice,
just the command line tools and converters found in the core packages.
For Ubuntu/Debian:
apt-get install openjdk-6-jdk libreoffice-core libreoffice-common libreoffice-writer python-uno
Important:
Installing libreoffice-writer gives you support for converting TEXT documents.
To convert other formats (spreadsheets, presentations, images, etc.), install the related LibreOffice package.
For images, consider lighter and well-known tools such as ImageMagick.
For converting PDF to text, consider pdftotext, which you can find in the poppler-utils package.
Installing Unoconv
As root:
cd /tmp
git clone https://github.com/dagwieers/unoconv
cd unoconv/
make install
cd ../
rm -rf unoconv/
unoconv --listener &
This starts LibreOffice/OpenOffice as a service listening on a local port;
you can check that it is running with ps aux | grep soffice.
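The same check can be sketched from PHP, which is handy before queuing conversions. The function names below are hypothetical and not part of unoconv; matching on the string "soffice" is an assumption taken from the ps check above.

```php
<?php

/**
 * Returns true when any line of `ps aux`-style output mentions soffice.
 * Hypothetical helper, not part of unoconv.
 */
function psOutputHasSoffice(string $psOutput): bool
{
    foreach (preg_split('/\R/', $psOutput) as $line) {
        // Ignore a "grep soffice" line, in case the text came from a piped grep
        if (strpos($line, 'soffice') !== false && strpos($line, 'grep') === false) {
            return true;
        }
    }

    return false;
}

/**
 * Runs `ps aux` and reports whether a soffice process appears to be alive.
 */
function isListenerRunning(): bool
{
    $output = shell_exec('ps aux');

    return is_string($output) && psOutputHasSoffice($output);
}
```

Calling isListenerRunning() before a conversion lets a script fail early with a clear message instead of waiting on a listener that is not there.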
Some warnings:
- Although you can convert with just a LibreOffice/OpenOffice install, using the
service together with unoconv is better for bulk conversions, because you
reuse an instance that stays in memory.
- A unoconv package is already in the Debian repositories, but it’s an old version.
Showing supported formats
unoconv --show
Creating a Daemon
To daemonize unoconv (better for server mode), create a file /etc/init.d/unoconvd with the following content:
( Source )
#!/bin/sh
### BEGIN INIT INFO
# Provides: unoconvd
# Required-Start: $network
# Required-Stop: $network
# Default-Start: 2 3 5
# Default-Stop:
# Description: unoconvd - Converting documents to PDF by unoconv
### END INIT INFO
case "$1" in
    start)
        /usr/bin/unoconv --listener &
        ;;
    stop)
        killall soffice.bin
        ;;
    restart)
        killall soffice.bin
        sleep 1
        /usr/bin/unoconv --listener &
        ;;
esac
Then adjust the permissions, enable the script at boot, and start the daemon:
chmod 755 /etc/init.d/unoconvd
update-rc.d unoconvd defaults
service unoconvd start
Basic use
It doesn’t matter whether you’ve started unoconv manually or as a daemon; you can
convert files as below:
unoconv --format pdf --output /OUTPUT_DIR/ file.odt
That converts file.odt to file.pdf in the given output directory.
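To find the converted file afterwards from PHP, the naming rule just described (same base name, new extension, inside the output directory) can be sketched as a small helper. The function name is hypothetical, and it assumes the format name equals the file extension, which holds for pdf, odt, docx and txt but may not hold for every format.

```php
<?php

/**
 * Guesses where the converted file will be written: same base name,
 * new extension, inside the output directory. Hypothetical helper that
 * assumes the format name is also the file extension.
 */
function expectedOutputPath(string $originFilePath, string $outputDirPath, string $toFormat): string
{
    // PATHINFO_FILENAME drops both the directory and the last extension
    $baseName = pathinfo($originFilePath, PATHINFO_FILENAME);

    return rtrim($outputDirPath, '/') . '/' . $baseName . '.' . $toFormat;
}
```

For example, expectedOutputPath('file.odt', '/OUTPUT_DIR/', 'pdf') gives /OUTPUT_DIR/file.pdf, which can then be passed to is_file() to confirm the conversion produced something.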
PHP Class
A simple PHP wrapper could look like this:
<?php

namespace Unoconv;

/**
 * Unoconv class wrapper
 *
 * @author Rafael Goulart <rafaelgou@gmail.com>
 * @see http://tech.rgou.net/
 */
class Unoconv
{
    /**
     * Basic converter method
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     * @param string $toFormat       Format to export to
     *
     * @return int Exit status of the unoconv command (0 on success)
     */
    public static function convert($originFilePath, $outputDirPath, $toFormat)
    {
        // Escape each argument so spaces and shell metacharacters in paths are safe
        $command = sprintf(
            'unoconv --format %s --output %s %s',
            escapeshellarg($toFormat),
            escapeshellarg($outputDirPath),
            escapeshellarg($originFilePath)
        );
        // system() writes the command's exit status into its second argument
        system($command, $exitCode);

        return $exitCode;
    }

    /**
     * Convert to PDF
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     */
    public static function convertToPdf($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'pdf');
    }

    /**
     * Convert to TXT
     *
     * @param string $originFilePath Origin file path
     * @param string $outputDirPath  Output directory path
     */
    public static function convertToTxt($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'txt');
    }
}
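If you would rather have failures raise exceptions than inspect a return value, a variant could look like the sketch below. It is not part of the class above: the function names are hypothetical, it uses the same unoconv flags, and it assumes a non-zero exit status means the conversion failed.

```php
<?php

/**
 * Builds the unoconv command line with every argument shell-escaped.
 * Hypothetical helper; the flags are the ones used by the class above.
 */
function buildUnoconvCommand(string $originFilePath, string $outputDirPath, string $toFormat): string
{
    return sprintf(
        'unoconv --format %s --output %s %s',
        escapeshellarg($toFormat),
        escapeshellarg($outputDirPath),
        escapeshellarg($originFilePath)
    );
}

/**
 * Runs the conversion and throws when the input is missing or when
 * unoconv exits with a non-zero status (assumed to mean failure).
 */
function convertOrFail(string $originFilePath, string $outputDirPath, string $toFormat): void
{
    if (!is_file($originFilePath)) {
        throw new InvalidArgumentException("Input file not found: $originFilePath");
    }

    // "2>&1" folds stderr into the captured lines so the error message is useful
    $command = buildUnoconvCommand($originFilePath, $outputDirPath, $toFormat) . ' 2>&1';
    exec($command, $lines, $exitCode);

    if ($exitCode !== 0) {
        throw new RuntimeException("unoconv failed ($exitCode): " . implode("\n", $lines));
    }
}
```

Keeping the command builder separate from the code that runs it makes the escaping easy to check without unoconv installed.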
Sample use:
<?php

/**
 * Sample use of Unoconv class
 *
 */
require 'Unoconv.php';

use Unoconv\Unoconv;

// Converting to PDF
$originFilePath = 'test.odt';
$outputDirPath = './';
Unoconv::convertToPdf($originFilePath, $outputDirPath);

// Converting to DOCX
$originFilePath = 'test.odt';
$outputDirPath = './';
Unoconv::convert($originFilePath, $outputDirPath, 'docx');
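Since the listener is meant for bulk work, a whole directory can be converted in one loop. The sketch below is self-contained and does not use the class above; the function names and the extension list are assumptions chosen for illustration.

```php
<?php

/**
 * Returns the files in $dir whose extension (case-insensitive) is in
 * $extensions, in the alphabetical order scandir() yields by default.
 * Hypothetical helper.
 */
function findConvertibleFiles(string $dir, array $extensions): array
{
    $found = [];
    foreach (scandir($dir) ?: [] as $entry) {
        $path = rtrim($dir, '/') . '/' . $entry;
        $ext = strtolower(pathinfo($entry, PATHINFO_EXTENSION));
        if (is_file($path) && in_array($ext, $extensions, true)) {
            $found[] = $path;
        }
    }

    return $found;
}

/**
 * Converts every .doc, .docx and .odt file in $dir to PDF and returns
 * an array mapping each input path to the unoconv exit status.
 */
function convertDirectoryToPdf(string $dir, string $outputDirPath): array
{
    $results = [];
    foreach (findConvertibleFiles($dir, ['doc', 'docx', 'odt']) as $path) {
        $command = sprintf(
            'unoconv --format pdf --output %s %s',
            escapeshellarg($outputDirPath),
            escapeshellarg($path)
        );
        $lines = []; // exec() appends to this array, so reset it for each file
        exec($command, $lines, $exitCode);
        $results[$path] = $exitCode;
    }

    return $results;
}
```

Any entry in the returned array with a non-zero status points at a file worth retrying or inspecting by hand.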