To support a wide range of document formats, Okular was designed in a modular way. It consists of the following components: the shell, the part, the document class and the Generators.
The shell is the application that the user starts as a standalone program and that embeds the part. The part contains all GUI elements of Okular, for example the content list, the bookmark manager, the menus and the graphical view of the document class. The document class is an abstract representation of the document content. It contains information about every page of the document, such as its size and orientation.
But somehow the document class must retrieve this information from the various types of documents. This is the task of the Generators. Generators are plugins that are loaded at runtime and that know the internal structure of the different document types. They extract the needed information from the documents, convert the data into a common format and pass it to the document class.
Currently Generators for the following document types are available:
- Portable Document Format (PDF)
- Device Independent Format (DVI)
- DjVu Format
- Comic Books
- Images (JPEG, PNG, GIF, and many more)
- TIFF Image Format
- FictionBook Format
- Plucker Format
- OpenDocument Text Format
- Microsoft's CHM Format
- Microsoft's XML Document Format
Now the question is: how can these various formats be represented in a unified way? Okular provides features like rotation, text search and extraction, zooming and many more, so how do these features match the different capabilities of the formats?
Let's start with the lowest common denominator of all document formats:
- they have pages (one or more) of a given size
- pages can be represented as pictures
So the first thing every Generator must support is to return the number of pages of a document. Furthermore it must be able to return the picture of a page at a requested size.
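The two basic duties described above can be sketched as a small abstract interface. This is only an illustration: `PageSource`, `BlankPageSource`, `Image`, `pageCount` and `renderPage` are hypothetical names chosen for this sketch and are not Okular's actual Generator API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical picture type: one grey value per pixel, stored row by row.
struct Image {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;
};

// Hypothetical interface with the two operations every generator must offer:
// report the number of pages and return a page picture at a requested size.
class PageSource {
public:
    virtual ~PageSource() = default;
    virtual int pageCount() const = 0;
    virtual Image renderPage(int pageIndex, int width, int height) const = 0;
};

// Toy implementation for a "document" that consists of blank white pages.
class BlankPageSource : public PageSource {
public:
    explicit BlankPageSource(int pages) : pages_(pages) { assert(pages >= 0); }

    int pageCount() const override { return pages_; }

    Image renderPage(int pageIndex, int width, int height) const override {
        assert(pageIndex >= 0 && pageIndex < pages_);
        assert(width > 0 && height > 0);
        Image img;
        img.width = width;
        img.height = height;
        // Fill the whole picture with white (255).
        img.pixels.assign(static_cast<std::size_t>(width) * height, 255);
        return img;
    }

private:
    int pages_;
};
```

A caller in the role of the document class would only ever talk to the abstract `PageSource`, so the concrete format stays hidden behind the interface.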
For vector-based document formats (e.g. PDF or PostScript), the Generators can render the page directly at the requested size. For static document formats (e.g. images), the Generator must scale the content to the requested size. So when you zoom a page in Okular, the Generators are simply asked to return the page at the zoomed size.
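For a static format, returning the page at the zoomed size could look roughly like the following sketch. It assumes a plain greyscale raster and uses nearest-neighbour scaling, which is just one simple technique and not necessarily what any Okular Generator does; `Raster`, `zoomedLength` and `scaleNearest` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical raster type: one grey value per pixel, stored row by row.
struct Raster {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Length of a page side at a given zoom factor (1.0 means 100%).
int zoomedLength(int baseLength, double zoom) {
    assert(baseLength > 0 && zoom > 0.0);
    return static_cast<int>(std::lround(baseLength * zoom));
}

// Nearest-neighbour scaling: each target pixel copies the source pixel
// that lies at the same relative position.
Raster scaleNearest(const Raster& src, int dstWidth, int dstHeight) {
    assert(src.width > 0 && src.height > 0);
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);
    assert(dstWidth > 0 && dstHeight > 0);

    Raster dst{dstWidth, dstHeight,
               std::vector<std::uint8_t>(static_cast<std::size_t>(dstWidth) * dstHeight)};
    for (int y = 0; y < dstHeight; ++y) {
        const int sy = y * src.height / dstHeight;  // source row
        for (int x = 0; x < dstWidth; ++x) {
            const int sx = x * src.width / dstWidth;  // source column
            dst.pixels[static_cast<std::size_t>(y) * dstWidth + x] =
                src.pixels[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
    return dst;
}
```

A vector-based format would skip the scaling step and draw its content directly at `zoomedLength(width, zoom)` by `zoomedLength(height, zoom)` pixels.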
When the document class has retrieved the page pictures from the Generators, it can do further image manipulation on them, for example rotating them or applying fancy effects.
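As an illustration of such a manipulation, here is a minimal sketch of rotating a page picture by 90 degrees clockwise. The `Bitmap` type and the function `rotate90Clockwise` are hypothetical and only stand in for whatever image class is really used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bitmap type: one grey value per pixel, stored row by row.
struct Bitmap {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
};

// Rotate a bitmap by 90 degrees clockwise. Width and height swap, and the
// source pixel (x, y) ends up in column (height - 1 - y), row x.
Bitmap rotate90Clockwise(const Bitmap& src) {
    assert(src.width > 0 && src.height > 0);
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);

    Bitmap dst{src.height, src.width, std::vector<std::uint8_t>(src.pixels.size())};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            const int dx = src.height - 1 - y;
            const int dy = x;
            dst.pixels[static_cast<std::size_t>(dy) * dst.width + dx] =
                src.pixels[static_cast<std::size_t>(y) * src.width + x];
        }
    }
    return dst;
}
```

Because the rotation works on the finished picture, it is independent of the document format that produced it.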
Some document formats, however, support more functionality than just representing a page as an image. PDF, PostScript, DVI and DjVu, for example, contain a machine-readable representation of the included text. For those document formats, Okular provides additional features like text search, text extraction and text selection.
How is that supported by the Generators?
To access the text from the documents, the Generators must extract it somehow and make it available to the document class. However, for the text selection feature the document class must also know where the extracted text is located on the page. For a zoom factor of 100% the absolute position of the text in the document can be used, but for larger or smaller zoom factors the position must be recalculated. To make this calculation as easy as possible, the Generators return an abstract representation (Okular::TextPage) of the text which includes every character together with its normalized position. Normalized means that the width and height of the page are each mapped to the range 0 to 1, so a character in the middle of the page is at x=0.5 and y=0.5.
So when you want to know where this character is located on a page that is zoomed to 300%, you just multiply its x position by 3 * page width and its y position by 3 * page height, and you get the absolute position for this zoom level.
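The conversion from a normalized position to an absolute one can be written down in a few lines. The following is a minimal sketch; `NormPoint`, `AbsPoint` and `toAbsolute` are hypothetical names for illustration and are not Okular's own classes, and the page size in the usage example is an arbitrary assumption.

```cpp
#include <cassert>

// Hypothetical position with both coordinates in the range 0 to 1.
struct NormPoint {
    double x;
    double y;
};

// Hypothetical absolute position in the units of the page size.
struct AbsPoint {
    double x;
    double y;
};

// Scale a normalized position to the page size at the given zoom factor
// (1.0 means 100%, 3.0 means 300%).
AbsPoint toAbsolute(NormPoint p, double pageWidth, double pageHeight, double zoom) {
    assert(p.x >= 0.0 && p.x <= 1.0 && p.y >= 0.0 && p.y <= 1.0);
    assert(pageWidth > 0.0 && pageHeight > 0.0 && zoom > 0.0);
    return AbsPoint{p.x * zoom * pageWidth, p.y * zoom * pageHeight};
}
```

For a page that is 600 wide and 800 high at 100%, `toAbsolute({0.5, 0.5}, 600.0, 800.0, 3.0)` yields the point (900, 1200), which is the middle of the page at 300%.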
This abstract text representation also allows an easy rotation of the coordinates, so that text selection is available on rotated pages as well.
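Rotating normalized coordinates needs no knowledge of the page size, as the following sketch shows. It assumes that the origin is the top-left corner of the page with y growing downwards, and that rotation happens in steps of 90 degrees clockwise; the names `NormPos`, `Rotation` and `rotate` are hypothetical.

```cpp
#include <cassert>

// Hypothetical position with both coordinates in the range 0 to 1.
// Assumption: origin at the top-left corner, y grows downwards.
struct NormPos {
    double x;
    double y;
};

// Page rotation in clockwise quarter turns.
enum class Rotation { None, Clockwise90, Clockwise180, Clockwise270 };

// Map a position on the unrotated page to its position on the rotated page.
NormPos rotate(NormPos p, Rotation r) {
    assert(p.x >= 0.0 && p.x <= 1.0 && p.y >= 0.0 && p.y <= 1.0);
    switch (r) {
    case Rotation::Clockwise90:
        return NormPos{1.0 - p.y, p.x};        // top edge becomes right edge
    case Rotation::Clockwise180:
        return NormPos{1.0 - p.x, 1.0 - p.y};  // point mirrored through the centre
    case Rotation::Clockwise270:
        return NormPos{p.y, 1.0 - p.x};        // top edge becomes left edge
    case Rotation::None:
        break;
    }
    return p;
}
```

The rotated position can then be scaled to absolute coordinates like any other normalized position, using the width and height of the rotated page.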
Most documents have additional meta information:
- Name of the author
- Date of creation
- Version number
- Table of contents
This information can be retrieved by the Generator as well and will be shown by Okular.