Okular::TextPage

Okular::TextPage Class Reference

#include <textpage.h>

Public Types

enum  TextAreaInclusionBehaviour { AnyPixelTextAreaInclusionBehaviour , CentralPixelTextAreaInclusionBehaviour }
 

Public Member Functions

 TextPage ()
 
 TextPage (const TextEntity::List &words)
 
 ~TextPage ()
 
void append (const QString &text, const NormalizedRect &area)
 
RegularAreaRect * findText (int searchID, const QString &query, SearchDirection direction, Qt::CaseSensitivity caseSensitivity, const RegularAreaRect *area)
 
QString text (const RegularAreaRect *area, TextAreaInclusionBehaviour b) const
 
QString text (const RegularAreaRect *area=nullptr) const
 
std::unique_ptr< RegularAreaRect > textArea (const TextSelection &selection) const
 
std::unique_ptr< RegularAreaRect > wordAt (const NormalizedPoint &p) const
 
TextEntity::List words (const RegularAreaRect *area, TextAreaInclusionBehaviour b) const
 

Detailed Description

Represents the textual information of a Page.

Makes search and text selection possible.

A Generator with text support should add a TextPage to every Page. For every piece of text, a TextEntity is added, holding the string representation and the bounding box.

Ideally, every TextEntity describes only one glyph. A "glyph" is one character in the graphical representation, but the textual representation may consist of multiple characters (like diacritic modifiers).

When the TextPage is added to the Page, the TextEntity objects are restructured to optimize text selection.
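The one-entity-per-glyph idea can be sketched without Okular itself. The block below is a minimal, self-contained model: Rect, Entity, PageText and appendAccentedE are hypothetical stand-ins invented for illustration, not Okular's NormalizedRect, TextEntity or TextPage. It only shows that a single glyph may carry a string of more than one code point.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
struct Rect {
    double left, top, right, bottom;
};

// Stand-in for one text entity: a string plus its bounding box.
struct Entity {
    std::u32string text;
    Rect box;
};

// Toy model of a text page that only collects entities in append order.
class PageText
{
public:
    void append(std::u32string text, const Rect &area)
    {
        assert(!text.empty()); // an entity without text is meaningless here
        m_entities.push_back(Entity{std::move(text), area});
    }

    const std::vector<Entity> &entities() const
    {
        return m_entities;
    }

private:
    std::vector<Entity> m_entities;
};

// Adds one glyph spelled with two code points:
// 'e' followed by U+0301 (combining acute accent).
inline void appendAccentedE(PageText &page, const Rect &box)
{
    page.append(U"e\u0301", box);
}
```

After one call to appendAccentedE(), the model holds a single entity whose text has length 2, which mirrors the "one glyph, several characters" case described above.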

See also
TextEntity

Definition at line 101 of file textpage.h.

Member Enumeration Documentation

◆ TextAreaInclusionBehaviour

Defines which characters are included in the result of text().

Since
0.10 (KDE 4.4)
Enumerator
AnyPixelTextAreaInclusionBehaviour 

A character is included in the text() result if any pixel of its bounding box is in the given area.

CentralPixelTextAreaInclusionBehaviour 

A character is included in the text() result if the central pixel of its bounding box is in the given area.

Definition at line 113 of file textpage.h.
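The difference between the two enumerators can be illustrated with plain geometry. This is a sketch under stated assumptions: Rect, Inclusion and isIncluded are hypothetical names, rectangles use normalized coordinates, and "any pixel" is approximated as "the two rectangles overlap". It is not taken from Okular's implementation.

```cpp
#include <cassert>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
struct Rect {
    double left, top, right, bottom;
};

// Hypothetical mirror of the two inclusion behaviours.
enum class Inclusion { AnyPixel, CentralPixel };

// True if the two rectangles overlap in a region of non-zero size.
bool intersects(const Rect &a, const Rect &b)
{
    return a.left < b.right && b.left < a.right && a.top < b.bottom && b.top < a.bottom;
}

// True if the point (x, y) lies inside r, borders included.
bool contains(const Rect &r, double x, double y)
{
    return x >= r.left && x <= r.right && y >= r.top && y <= r.bottom;
}

// Decides whether a glyph's bounding box counts as "in" the area.
bool isIncluded(const Rect &glyph, const Rect &area, Inclusion mode)
{
    assert(glyph.left <= glyph.right && glyph.top <= glyph.bottom);
    if (mode == Inclusion::AnyPixel) {
        return intersects(glyph, area);
    }
    const double cx = (glyph.left + glyph.right) / 2.0;
    const double cy = (glyph.top + glyph.bottom) / 2.0;
    return contains(area, cx, cy);
}
```

A glyph that only clips the edge of the area is accepted in AnyPixel mode but rejected in CentralPixel mode, which is why the central-pixel rule tends to give tighter selections.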

Constructor & Destructor Documentation

◆ TextPage() [1/2]

TextPage::TextPage ( )

Creates a new text page.

Definition at line 169 of file textpage.cpp.

◆ TextPage() [2/2]

TextPage::TextPage ( const TextEntity::List & words)
explicit

Creates a new text page with the given words.

Definition at line 174 of file textpage.cpp.

◆ ~TextPage()

TextPage::~TextPage ( )

Destroys the text page.

Definition at line 180 of file textpage.cpp.

Member Function Documentation

◆ append()

void TextPage::append ( const QString & text,
const NormalizedRect & area )

Appends the given text with the given area as new TextEntity to the page.

Definition at line 185 of file textpage.cpp.

◆ findText()

RegularAreaRect * TextPage::findText ( int searchID,
const QString & query,
SearchDirection direction,
Qt::CaseSensitivity caseSensitivity,
const RegularAreaRect * area )

Returns the bounding rect of the text that matches the following criteria, or nullptr if the search is not successful.

Parameters
searchID         A unique ID for this search.
query            The search text.
direction        The direction of the search (SearchDirection).
caseSensitivity  If Qt::CaseSensitive, the search is case sensitive; otherwise the search is case insensitive.
area             If null, the search starts at the beginning of the page; otherwise it starts to the right of or below the coordinates of the given rect.

Definition at line 549 of file textpage.cpp.
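To make "the bounding rect of the text which matches" concrete, here is a simplified, self-contained model of a search. Everything in it is an assumption made for illustration: Rect, Glyph and findFirst are hypothetical names, each glyph holds exactly one char, and the search only runs forward from the start of the page. The real findText() also takes a search ID, a direction and a starting area, none of which are modelled here.

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
struct Rect {
    double left, top, right, bottom;
};

// One character with its bounding box (simplification: one char per glyph).
struct Glyph {
    char ch;
    Rect box;
};

// Returns the union of the boxes of the first run of glyphs that spells
// `query`, or std::nullopt if there is no match.
std::optional<Rect> findFirst(const std::vector<Glyph> &page, const std::string &query, bool caseSensitive)
{
    if (query.empty() || query.size() > page.size()) {
        return std::nullopt;
    }
    auto same = [caseSensitive](char a, char b) {
        if (caseSensitive) {
            return a == b;
        }
        return std::tolower(static_cast<unsigned char>(a)) == std::tolower(static_cast<unsigned char>(b));
    };
    for (std::size_t i = 0; i + query.size() <= page.size(); ++i) {
        bool match = true;
        for (std::size_t j = 0; j < query.size(); ++j) {
            if (!same(page[i + j].ch, query[j])) {
                match = false;
                break;
            }
        }
        if (!match) {
            continue;
        }
        // Grow the hit rectangle so that it covers every matched glyph.
        Rect hit = page[i].box;
        for (std::size_t j = 1; j < query.size(); ++j) {
            const Rect &b = page[i + j].box;
            if (b.left < hit.left) hit.left = b.left;
            if (b.top < hit.top) hit.top = b.top;
            if (b.right > hit.right) hit.right = b.right;
            if (b.bottom > hit.bottom) hit.bottom = b.bottom;
        }
        return hit;
    }
    return std::nullopt;
}
```

The empty optional plays the role of the null return value described above; a caller would check it before using the rectangle.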

◆ text() [1/2]

QString TextPage::text ( const RegularAreaRect * area,
TextAreaInclusionBehaviour b ) const

Text extraction function.

Looks for text in the given area.

Returns
  • If area points to a valid null area, a null string.
  • If area is nullptr, the whole page text as a single string.
  • Otherwise, the text which is included by area, as a single string.
Since
0.10 (KDE 4.4)

Definition at line 876 of file textpage.cpp.
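The three return cases can be modelled in a few lines. This is a sketch, not Okular code: Rect, Glyph, Area and extractText are hypothetical stand-ins, an Area is just a list of rectangles, an empty list stands for a "null area", and inclusion uses the any-overlap rule. std::string has no null state, so an empty string stands in for the null string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
struct Rect {
    double left, top, right, bottom;
};

// A piece of text with its bounding box.
struct Glyph {
    std::string text;
    Rect box;
};

// Stand-in for RegularAreaRect: a list of rectangles. Empty means "null area".
using Area = std::vector<Rect>;

// True if the two rectangles overlap in a region of non-zero size.
bool overlaps(const Rect &a, const Rect &b)
{
    assert(a.left <= a.right && a.top <= a.bottom);
    return a.left < b.right && b.left < a.right && a.top < b.bottom && b.top < a.bottom;
}

std::string extractText(const std::vector<Glyph> &page, const Area *area)
{
    std::string result;
    if (area && area->empty()) {
        return result; // null area: nothing to extract
    }
    for (const Glyph &g : page) {
        if (!area) {
            result += g.text; // no area given: take the whole page
            continue;
        }
        for (const Rect &r : *area) {
            if (overlaps(g.box, r)) {
                result += g.text;
                break; // count each glyph once, even if several rects hit it
            }
        }
    }
    return result;
}
```

Passing nullptr yields the text of every glyph in page order, while an area covering only part of the page yields just the glyphs it touches.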

◆ text() [2/2]

QString TextPage::text ( const RegularAreaRect * area = nullptr) const

Text extraction function.

Looks for text in the given area.

Returns
  • If area points to a valid null area, a null string.
  • If area is nullptr, the whole page text as a single string.
  • Otherwise, the text which is included by area, as a single string.

Uses AnyPixelTextAreaInclusionBehaviour.

Definition at line 871 of file textpage.cpp.

◆ textArea()

std::unique_ptr< RegularAreaRect > TextPage::textArea ( const TextSelection & selection) const

Returns the rectangular area of the given selection.

It works like this: there are two cursors, and all the text between them must be selected. The coordinates are normalised: the top-left corner is (0,0) and the bottom-right corner is (1,1). For the cursors start (sx,sy) and end (ex,ey), we first look for the text rectangles under those points. If there is none under a cursor, we search for the first rectangle to its right on the same baseline; if none is found, we search for the first rectangle with a baseline below the cursor. This gives the best-matching rectangle for each cursor: (rx,ry)x(tx,ty) for start and (ux,uy)x(vx,vy) for end. The selection is then built from three rectangles:

  1. (rx,ry)x(1,ty)
  2. (0,ty)x(1,uy)
  3. (0,uy)x(vx,vy)
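The three rectangles listed above can be computed directly from the start and end rectangles. The following is a minimal sketch with hypothetical names (Rect, selectionRects); it assumes the end rectangle lies on a later line than the start rectangle and leaves out the single-line case, which the real implementation would have to handle.

```cpp
#include <array>
#include <cassert>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
// start is (rx,ry)x(tx,ty) and end is (ux,uy)x(vx,vy) in the notation above.
struct Rect {
    double left, top, right, bottom;
};

std::array<Rect, 3> selectionRects(const Rect &start, const Rect &end)
{
    // Assumption of this sketch: the end rectangle is on a later line.
    assert(start.bottom <= end.top);
    return {{
        {start.left, start.top, 1.0, start.bottom}, // 1. (rx,ry)x(1,ty): rest of the first line
        {0.0, start.bottom, 1.0, end.top},          // 2. (0,ty)x(1,uy): full-width lines in between
        {0.0, end.top, end.right, end.bottom},      // 3. (0,uy)x(vx,vy): last line up to the end
    }};
}
```

Splitting the selection this way keeps every piece axis-aligned, so the result can be stored as a plain list of rectangles.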

To find the closest rectangle to cursor (cx,cy) we search for a rectangle that either contains the cursor or that has a left border >= cx and bottom border >= cy.
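The closest-rectangle rule reads naturally as a predicate over a list of rectangles. In this sketch, Rect and closestRect are hypothetical names, the rectangles are assumed to be stored in reading order, and the first rectangle that satisfies the rule is returned. How the real code iterates and breaks ties is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Stand-in for a normalized rectangle; hypothetical, not Okular's NormalizedRect.
struct Rect {
    double left, top, right, bottom;
};

// True if the point (x, y) lies inside r, borders included.
bool contains(const Rect &r, double x, double y)
{
    return x >= r.left && x <= r.right && y >= r.top && y <= r.bottom;
}

// Returns the index of the first rectangle that either contains the cursor
// or has left >= cx and bottom >= cy; std::nullopt if there is none.
std::optional<std::size_t> closestRect(const std::vector<Rect> &rects, double cx, double cy)
{
    assert(cx >= 0.0 && cx <= 1.0 && cy >= 0.0 && cy <= 1.0); // normalized cursor
    for (std::size_t i = 0; i < rects.size(); ++i) {
        const Rect &r = rects[i];
        if (contains(r, cx, cy) || (r.left >= cx && r.bottom >= cy)) {
            return i;
        }
    }
    return std::nullopt;
}
```

With normalized coordinates, y grows downwards, so "bottom >= cy" selects rectangles that end at or below the cursor.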

We now determine the TinyTextEntity for the startRectangle and the TinyTextEntity for the endRectangle. We have four cases:

Case 1(a): both the start point and the end point are outside the bounding rectangle and on the same side, so the rectangle they span lies outside the bounding rectangle (the two do not intersect).

Case 1(b): both the start point and the end point are outside the bounding rectangle but on different sides, and so is their rectangle.

Case 2(a): find the rectangle that contains the start point and the end point and has some TextEntity.

Case 2(b): if 2(a) fails (that is, startPoint and endPoint are both unchanged), we check whether there is any TextEntity within the rectangle spanned by startPoint and endPoint.

Case 3: we may now have two types of selection:

  1. the start point is at the top left of start_end and the end point is at the bottom right
  2. the start point is at the bottom left of start_end and the end point is at the top right

Also, since 2(b) has passed, it, itEnd, or both may be unchanged, but we know that there is text between them, so we need to search for the most suitable text position for start and end.

Case 3(a): for selection 1, we search for the nearest rectangle containing some TinyTextEntity to the right of or below the start point. For selection 2, we search to the right and above.

Case 3(b): for the end point, we search above or to the left of it for selection 1. Otherwise, we search to the left and below.

Note that after start and end have been swapped, start is always to the left of end, but start is not necessarily above end.
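The swap and the two selection types of Case 3 can be expressed in a few lines. This is a hedged sketch with hypothetical names (Point, Diagonal, orderByX, classify); it assumes normalized coordinates in which y grows downwards, and it covers only the ordering step, not the search for text positions.

```cpp
#include <cassert>
#include <utility>

// Stand-in for a normalized point; hypothetical, not Okular's NormalizedPoint.
struct Point {
    double x, y;
};

// The two selection types of Case 3, named after the diagonal they follow.
enum class Diagonal { TopLeftToBottomRight, BottomLeftToTopRight };

// Ensures that start is never to the right of end.
void orderByX(Point &start, Point &end)
{
    if (end.x < start.x) {
        std::swap(start, end);
    }
}

// Classifies an x-ordered pair. y grows downwards, so a start point
// with the smaller y is the upper one (selection type 1).
Diagonal classify(const Point &start, const Point &end)
{
    assert(start.x <= end.x); // call orderByX() first
    return start.y <= end.y ? Diagonal::TopLeftToBottomRight : Diagonal::BottomLeftToTopRight;
}
```

Ordering by x alone leaves the vertical relation open, which is why both diagonals still need to be handled after the swap.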

Definition at line 277 of file textpage.cpp.

◆ wordAt()

std::unique_ptr< RegularAreaRect > TextPage::wordAt ( const NormalizedPoint & p) const

Returns the area and text of the word at the given point. Note that ownership of the returned area belongs to the caller.

Since
0.15 (KDE 4.9)

Definition at line 1695 of file textpage.cpp.

◆ words()

TextEntity::List TextPage::words ( const RegularAreaRect * area,
TextAreaInclusionBehaviour b ) const

Text entity extraction function.

Similar to text() but returns the words including their bounding rectangles. Note that ownership of the contents of the returned list belongs to the caller.

Since
0.14 (KDE 4.8)

Definition at line 1667 of file textpage.cpp.


The documentation for this class was generated from the following files:
  • textpage.h
  • textpage.cpp

This file is part of the KDE documentation.
Documentation copyright © 1996-2025 The KDE developers.
Generated on Fri Jan 3 2025 11:58:07 by doxygen 1.12.0 written by Dimitri van Heesch, © 1997-2006