Introduction
************

About ``ulif.openoffice``
=========================

``ulif.openoffice`` is a Python package to support document
conversions using LibreOffice/OpenOffice.org (LO).

It provides components that talk to a running LO server in order to
convert office-type documents such as ``.doc`` or ``.odt`` to HTML or
PDF (more formats may follow). With ``ulif.openoffice`` you can trigger
such conversions from the command line, programmatically from Python,
or via HTTP.

Furthermore, it provides a caching server that stores every converted
document and delivers the stored copy when the same document is
requested again. Depending on your needs, this can speed things up by
a factor of 10 or more.
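
The caching idea can be illustrated with a minimal sketch. It is not
the cache manager shipped with ``ulif.openoffice``: the names
``ConversionCache`` and ``convert_cached`` and the key scheme (a
SHA-256 digest over the source bytes and the conversion options) are
assumptions made for this example only.

.. code-block:: python

    import hashlib
    import os
    import shutil


    class ConversionCache(object):
        """Hypothetical on-disk cache for converted documents."""

        def __init__(self, cache_dir):
            self.cache_dir = cache_dir
            os.makedirs(cache_dir, exist_ok=True)

        def key_for(self, src_path, options):
            # The key depends on the source content *and* the options,
            # so one file converted to PDF and to HTML gets two entries.
            digest = hashlib.sha256()
            with open(src_path, "rb") as fd:
                digest.update(fd.read())
            for name in sorted(options):
                entry = "%s=%s;" % (name, options[name])
                digest.update(entry.encode("utf-8"))
            return digest.hexdigest()

        def get(self, src_path, options):
            # Return the path of the cached result, or None on a miss.
            path = os.path.join(
                self.cache_dir, self.key_for(src_path, options))
            return path if os.path.exists(path) else None

        def store(self, src_path, options, result_path):
            # Copy a freshly converted document into the cache.
            path = os.path.join(
                self.cache_dir, self.key_for(src_path, options))
            shutil.copyfile(result_path, path)
            return path


    def convert_cached(cache, src_path, options, convert):
        """Call `convert` only if no cached result exists yet."""
        cached = cache.get(src_path, options)
        if cached is not None:
            return cached
        result_path = convert(src_path, options)
        return cache.store(src_path, options, result_path)

Here ``convert`` stands for any callable that takes a source path and
an options dict and returns the path of the converted file. A second
request for the same document with the same options never reaches that
callable, which is where the speed-up comes from.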

Finally, a daemon (``oooctl``) is included that starts the LO server
in the background and restarts it if it crashes.
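
The restart behaviour can be pictured with a small supervisor loop.
This is a sketch of the general technique, not the implementation of
``oooctl``: the function ``supervise`` and its parameters are
hypothetical, and the command to run is left to the caller.

.. code-block:: python

    import subprocess
    import time


    def supervise(cmd, max_restarts=3, delay=0.1):
        """Run `cmd` and restart it whenever it exits with an error.

        Returns the number of times the process was started.
        """
        starts = 0
        while starts <= max_restarts:
            starts += 1
            proc = subprocess.Popen(cmd)
            returncode = proc.wait()  # block until the process ends
            if returncode == 0:
                break  # clean shutdown: do not restart
            time.sleep(delay)  # short pause before the next start
        return starts

A real daemon would additionally detach from the terminal and write a
PID file; the sketch only shows the monitor-and-restart cycle.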


Sources
=======

``ulif.openoffice`` is hosted on:

  http://pypi.python.org/pypi/ulif.openoffice

where you can get the latest released versions.

Development can be tracked on GitHub:

  https://github.com/ulif/ulif.openoffice

The documentation can be browsed on:

  https://ulif-openoffice.readthedocs.io/en/latest/


Requirements
============

``ulif.openoffice`` requires the `unoconv`_ executable to do the actual
conversions. Current Debian-based distributions normally provide
`unoconv`_ as an installable package.
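
To check that `unoconv`_ is usable, it can be called directly. The
following sketch only illustrates such a call from Python and is not
how ``ulif.openoffice`` invokes it internally; the helper names are
hypothetical. It assumes ``unoconv`` is on your ``PATH`` and uses its
``-f`` option to select the output format.

.. code-block:: python

    import shutil
    import subprocess


    def unoconv_command(src_path, fmt="pdf"):
        # Build the argument list; `-f` selects the output format.
        return ["unoconv", "-f", fmt, src_path]


    def convert(src_path, fmt="pdf"):
        # Fail early with a clear message if unoconv is not installed.
        if shutil.which("unoconv") is None:
            raise RuntimeError("unoconv executable not found in PATH")
        subprocess.check_call(unoconv_command(src_path, fmt))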

``ulif.openoffice`` is tested on Debian-based systems, most notably
Ubuntu. It will probably fail miserably on Windows and there are no
plans to change that.

The package is designed for server-based deployments. While the
LO server is running, you cannot use the office suite on your desktop
(at least at the time of writing). This is a limitation of LO
itself.


Overview
========

``ulif.openoffice`` mainly provides six components, four of which
merely act as 'frontends' for the core functionality: a command-line
client, a RESTful WSGI_ application, a WSGI_ based XMLRPC application,
and the respective API calls for use from Python programmes.

* In addition to plain LibreOffice conversions, we provide a set of
  filters to modify office documents on the fly. We call these filters
  ``document processors``. They can unzip incoming docs, zip results,
  extract CSS stylesheets from generated HTML into separate files,
  clean up generated HTML and much more. For each conversion you can
  specify which filters to apply and in what order.

  You can even register your own document processors and they will
  appear in the frontends (command-line client, WSGI app, API calls).

* An ``oooctl`` server that runs in the background, starts a local
  LO server and monitors its status. If the LO server process dies, it
  is restarted by ``oooctl``.

* An ``oooclient`` command-line tool to trigger conversions.

  ``oooclient`` also supports a cache manager that stores converted
  documents and delivers the stored version if a document has been
  converted before.

* A DocumentConverter WSGI_ application that acts as a REST_
  server. You can send it documents via HTTP and get the converted
  documents back.

  The DocumentConverter also supports a cache manager that stores
  converted documents and delivers the stored version if a document
  has been converted before.

* A WSGIXMLRPC application that is also a WSGI_ application but
  provides XMLRPC services. You can use it, for instance, via the
  standard Python ``xmlrpclib`` library (``xmlrpc.client`` in Python 3).

* A Python API to perform all conversions from your own Python
  programmes.
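
The document processors from the first item are easiest to understand
as a chain of filters applied in a chosen order. The following sketch
shows that idea only. The registry, the ``register`` decorator and the
two toy processors ``tidy`` and ``wrap`` are hypothetical and do not
reflect the real processor API of ``ulif.openoffice``.

.. code-block:: python

    # Hypothetical registry mapping processor names to callables.
    PROCESSORS = {}


    def register(name):
        """Decorator that adds a processor to the registry."""
        def decorator(func):
            PROCESSORS[name] = func
            return func
        return decorator


    @register("tidy")
    def tidy(text):
        # Toy processor: strip surrounding whitespace.
        return text.strip()


    @register("wrap")
    def wrap(text):
        # Toy processor: wrap the text in a body element.
        return "<body>%s</body>" % text


    def run_pipeline(text, names):
        # Apply the named processors strictly in the requested order.
        for name in names:
            text = PROCESSORS[name](text)
        return text

The order matters: ``run_pipeline("  hi  ", ["tidy", "wrap"])`` gives
``"<body>hi</body>"``, while the reverse order keeps the inner
whitespace because ``tidy`` then only sees the already wrapped text.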


The components play together roughly as shown in the following figure:

  .. figure:: overview.png

     Fig. 1: Overview of ulif.openoffice components

The black arrows show the path of a source document (in .doc format)
to the LibreOffice server and the path of the converted document
(PDF) back to the caller.

Use of the client API, the ``oooctl`` server and the cache is optional.

The LibreOffice server can run on a remote machine.

.. _unoconv: http://dag.wieers.com/home-made/unoconv/
.. _WSGI: http://www.wsgi.org/
.. _REST: http://en.wikipedia.org/wiki/Representational_state_transfer