Okular::TextPage
#include <textpage.h>
Public Types | |
enum | TextAreaInclusionBehaviour { AnyPixelTextAreaInclusionBehaviour , CentralPixelTextAreaInclusionBehaviour } |
Public Member Functions | |
TextPage () | |
TextPage (const TextEntity::List &words) | |
~TextPage () | |
void | append (const QString &text, const NormalizedRect &area) |
RegularAreaRect * | findText (int searchID, const QString &query, SearchDirection direction, Qt::CaseSensitivity caseSensitivity, const RegularAreaRect *area) |
QString | text (const RegularAreaRect *area, TextAreaInclusionBehaviour b) const |
QString | text (const RegularAreaRect *area=nullptr) const |
std::unique_ptr< RegularAreaRect > | textArea (const TextSelection &selection) const |
std::unique_ptr< RegularAreaRect > | wordAt (const NormalizedPoint &p) const |
TextEntity::List | words (const RegularAreaRect *area, TextAreaInclusionBehaviour b) const |
Detailed Description
Represents the textual information of a Page.
Makes search and text selection possible.
A Generator with text support should add a TextPage to every Page. For every piece of text, a TextEntity is added, holding the string representation and the bounding box.
Ideally, every TextEntity describes only one glyph. A "glyph" is one character in the graphical representation, but the textual representation may consist of multiple characters (like diacritic modifiers).
When the TextPage is added to the Page, the TextEntity objects are restructured to optimize text selection.
- See also
- TextEntity
Definition at line 101 of file textpage.h.
Member Enumeration Documentation
◆ TextAreaInclusionBehaviour
Defines how characters are selected for inclusion in the text() result.
- Since
- 0.10 (KDE 4.4)
Enumerator | |
---|---|
AnyPixelTextAreaInclusionBehaviour | A character is included in the text() result if any pixel of its bounding box is in the given area. |
CentralPixelTextAreaInclusionBehaviour | A character is included in the text() result if the central pixel of its bounding box is in the given area. |
Definition at line 113 of file textpage.h.
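The difference between the two enumerators can be illustrated with a small self-contained sketch. It does not use the Okular types: `Rect`, `Inclusion`, `intersects`, `contains` and `isIncluded` are hypothetical stand-ins, and the real implementation may treat borders differently.

```cpp
#include <cassert>

// Hypothetical stand-in, not an Okular type: a rectangle in normalised
// page coordinates, (0,0) top-left to (1,1) bottom-right.
struct Rect {
    double left, top, right, bottom;
};

// Hypothetical mirror of the two inclusion behaviours.
enum class Inclusion { AnyPixel, CentralPixel };

// True if the two rectangles overlap in a region of non-zero size.
bool intersects(const Rect &a, const Rect &b) {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// True if the point (x, y) lies inside r, borders included.
bool contains(const Rect &r, double x, double y) {
    return x >= r.left && x <= r.right && y >= r.top && y <= r.bottom;
}

// Decides whether a glyph box counts as part of the area under each mode.
bool isIncluded(const Rect &glyph, const Rect &area, Inclusion mode) {
    assert(glyph.left <= glyph.right && glyph.top <= glyph.bottom);
    if (mode == Inclusion::AnyPixel) {
        return intersects(glyph, area);
    }
    const double cx = (glyph.left + glyph.right) / 2.0;
    const double cy = (glyph.top + glyph.bottom) / 2.0;
    return contains(area, cx, cy);
}
```

A glyph that only partly overlaps the area is included under `AnyPixel` but excluded under `CentralPixel` whenever its centre falls outside the area.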
Constructor & Destructor Documentation
◆ TextPage() [1/2]
TextPage::TextPage | ( | ) |
Creates a new text page.
Definition at line 169 of file textpage.cpp.
◆ TextPage() [2/2]
explicit TextPage::TextPage | ( | const TextEntity::List & | words | )
Creates a new text page with the given words.
Definition at line 174 of file textpage.cpp.
◆ ~TextPage()
TextPage::~TextPage | ( | ) |
Destroys the text page.
Definition at line 180 of file textpage.cpp.
Member Function Documentation
◆ append()
void TextPage::append | ( | const QString & | text, |
const NormalizedRect & | area ) |
Appends the given text with the given area as a new TextEntity to the page.
Definition at line 185 of file textpage.cpp.
◆ findText()
RegularAreaRect * TextPage::findText | ( | int | searchID, |
const QString & | query, | ||
SearchDirection | direction, | ||
Qt::CaseSensitivity | caseSensitivity, | ||
const RegularAreaRect * | area ) |
Returns the bounding rect of the text that matches the following criteria, or nullptr if the search is unsuccessful.
- Parameters
Parameter | Description |
---|---|
searchID | A unique id for this search. |
query | The search text. |
direction | The direction of the search (SearchDirection). |
caseSensitivity | If Qt::CaseSensitive, the search is case sensitive; otherwise the search is case insensitive. |
area | If null, the search starts at the beginning of the page; otherwise it starts right of/below the coordinates of the given rect. |
Definition at line 549 of file textpage.cpp.
◆ text() [1/2]
QString TextPage::text | ( | const RegularAreaRect * | area, |
TextAreaInclusionBehaviour | b ) const |
Text extraction function.
Looks for text in the given area.
- Returns
  - If area points to a valid null area, a null string.
  - If area is nullptr, the whole page text as a single string.
  - Otherwise, the text which is included by area, as a single string.
- Since
- 0.10 (KDE 4.4)
Definition at line 876 of file textpage.cpp.
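The three return cases can be sketched with stand-in types. Everything here is hypothetical and independent of Okular: `Glyph`, `Area` (a plain vector of rectangles standing in for RegularAreaRect) and `pageText` are illustration names. The sketch assumes that a "valid null area" means an area containing no rectangles, and it uses the any-pixel inclusion rule.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins, not Okular types.
struct Rect {
    double left, top, right, bottom;
};
struct Glyph {
    std::string text;
    Rect box;
};
using Area = std::vector<Rect>; // stand-in for RegularAreaRect

// True if the two rectangles overlap in a region of non-zero size.
bool intersects(const Rect &a, const Rect &b) {
    return a.left < b.right && b.left < a.right &&
           a.top < b.bottom && b.top < a.bottom;
}

// Mirrors the documented contract:
//  - area == nullptr : whole page text
//  - empty area      : empty string (assumed meaning of "valid null area")
//  - otherwise       : text of every glyph overlapping the area
std::string pageText(const std::vector<Glyph> &glyphs, const Area *area) {
    std::string out;
    if (area != nullptr && area->empty()) {
        return out;
    }
    for (const Glyph &g : glyphs) {
        assert(g.box.left <= g.box.right && g.box.top <= g.box.bottom);
        bool take = (area == nullptr);
        if (!take) {
            for (const Rect &r : *area) {
                if (intersects(g.box, r)) {
                    take = true;
                    break;
                }
            }
        }
        if (take) {
            out += g.text;
        }
    }
    return out;
}
```

Passing `nullptr` and passing an empty area are deliberately different: the first means "no restriction", the second "nothing selected".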
◆ text() [2/2]
QString TextPage::text | ( | const RegularAreaRect * | area = nullptr | ) | const |
Text extraction function.
Looks for text in the given area.
- Returns
  - If area points to a valid null area, a null string.
  - If area is nullptr, the whole page text as a single string.
  - Otherwise, the text which is included by area, as a single string.
Uses AnyPixelTextAreaInclusionBehaviour.
Definition at line 871 of file textpage.cpp.
◆ textArea()
std::unique_ptr< RegularAreaRect > TextPage::textArea | ( | const TextSelection & | selection | ) | const |
Returns the rectangular area of the given selection.
It works like this: there are two cursors, and all the text between them must be selected. The coordinates are normalised: the top-left corner is (0,0) and the bottom-right corner is (1,1). For cursors start (sx,sy) and end (ex,ey), we first look for text rectangles under those points. If there is none, we search for the first rectangle to the right of the point on the same baseline; if none is found, we search for the first rectangle with a baseline below the cursor. This gives the best rectangle for each cursor: (rx,ry)x(tx,ty) for start and (ux,uy)x(vx,vy) for end. The selection is then built from three rectangles:
- (rx,ry)x(1,ty)
- (0,ty)x(1,uy)
- (0,uy)x(vx,vy)
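The three rectangles listed above can be computed directly from the two best-match rectangles. The following is a minimal sketch with a hypothetical `Rect` type and a hypothetical helper `selectionRects`, not the Okular implementation; it assumes the start rectangle lies on a line entirely above the end rectangle.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in: rectangle in normalised page coordinates.
struct Rect {
    double left, top, right, bottom;
};

// start is (rx,ry)x(tx,ty), end is (ux,uy)x(vx,vy).
// Returns the partial first line, the full lines in between,
// and the partial last line, in that order.
std::array<Rect, 3> selectionRects(const Rect &start, const Rect &end) {
    // Assumption of this sketch: start and end are on different lines.
    assert(start.bottom <= end.top);
    return {{
        {start.left, start.top, 1.0, start.bottom}, // (rx,ry)x(1,ty)
        {0.0, start.bottom, 1.0, end.top},          // (0,ty)x(1,uy)
        {0.0, end.top, end.right, end.bottom},      // (0,uy)x(vx,vy)
    }};
}
```

The first rectangle runs from the start glyph to the right page edge, the second spans the full page width between the two lines, and the third runs from the left page edge to the end glyph.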
To find the closest rectangle to cursor (cx,cy) we search for a rectangle that either contains the cursor or that has a left border >= cx and bottom border >= cy.
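The closest-rectangle rule can be written as a simple predicate. This is only a sketch under stated assumptions: `Rect`, `isCandidate` and `firstCandidate` are hypothetical names, and the rectangles are assumed to be already sorted in reading order, so the first match is taken as the closest one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in: rectangle in normalised page coordinates.
struct Rect {
    double left, top, right, bottom;
};

// A rectangle qualifies if it contains the cursor, or if its left border
// is >= cx and its bottom border is >= cy.
bool isCandidate(const Rect &r, double cx, double cy) {
    const bool containsCursor =
        cx >= r.left && cx <= r.right && cy >= r.top && cy <= r.bottom;
    return containsCursor || (r.left >= cx && r.bottom >= cy);
}

// Returns the index of the first qualifying rectangle, or -1 if none does.
// Assumes rects is sorted in reading order.
int firstCandidate(const std::vector<Rect> &rects, double cx, double cy) {
    assert(cx >= 0.0 && cx <= 1.0 && cy >= 0.0 && cy <= 1.0);
    for (std::size_t i = 0; i < rects.size(); ++i) {
        if (isCandidate(rects[i], cx, cy)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

A cursor placed in the gap between two words on the same line therefore resolves to the word on its right.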
We will now find out the TinyTextEntity for the startRectangle and TinyTextEntity for the endRectangle. We have four cases:
Case 1(a): both the start point and the end point are outside the bounding rectangle and on the same side, so the rectangle spanned by the start and end points lies outside the bounding rect (they do not intersect).
Case 1(b): both the start point and the end point are outside the bounding rect but on different sides, and so is their rectangle.
Case 2(a): find the rectangle that contains the start and end points and has some TextEntity.
Case 2(b): if 2(a) fails (that is, startPoint and endPoint are both unchanged), we check whether there is any TextEntity within the rect spanned by startPoint and endPoint.
Case 3: we may now have two types of selection.
- Selection 01: startpoint is left-top of start_end and endpoint is right-bottom
- Selection 02: startpoint is left-bottom of start_end and endpoint is top-right
Also, since 2(b) has passed, it, itEnd, or both may be unchanged, yet we know there is text between them, so we need to search for the most suitable text position for start and end.
Case 3(a): for selection 01, we search for the nearest rectangle containing some TinyTextEntity to the right of or below the startPoint. For selection 02, we search to the right and above.
Case 3(b): for the endpoint, we look for the point above or to the left of the endpoint for selection 01. Otherwise, the search goes left and down.
Note that after start and end have been swapped, start is always to the left of end, but start is not necessarily above end.
Definition at line 277 of file textpage.cpp.
◆ wordAt()
std::unique_ptr< RegularAreaRect > TextPage::wordAt | ( | const NormalizedPoint & | p | ) | const |
Returns the area and text of the word at the given point. Note that ownership of the returned area belongs to the caller.
- Since
- 0.15 (KDE 4.9)
Definition at line 1695 of file textpage.cpp.
◆ words()
TextEntity::List TextPage::words | ( | const RegularAreaRect * | area, |
TextAreaInclusionBehaviour | b ) const |
Text entity extraction function.
Similar to text() but returns the words including their bounding rectangles. Note that ownership of the contents of the returned list belongs to the caller.
- Since
- 0.14 (KDE 4.8)
Definition at line 1667 of file textpage.cpp.
The documentation for this class was generated from the following files:
- textpage.h
- textpage.cpp
Documentation copyright © 1996-2025 The KDE developers.
Generated on Fri Jan 3 2025 11:58:07 by doxygen 1.12.0 written by Dimitri van Heesch, © 1997-2006