KItinerary::ScriptExtractor
#include <scriptextractor.h>
Public Member Functions
bool canHandle(const ExtractorDocumentNode &node) const override
ExtractorResult extract(const ExtractorDocumentNode &node, const ExtractorEngine *engine) const override
const std::vector<ExtractorFilter> & filters() const
QString mimeType() const
QString name() const override
QString scriptFileName() const
QString scriptFunction() const
Public Member Functions inherited from KItinerary::AbstractExtractor
Detailed Description
A single unstructured data extraction rule set.
These rules are loaded from JSON meta-data files in a compiled-in qrc file, or from $XDG_DATA_DIRS/kitinerary/extractors.
Meta Data Format
The meta-data files either contain a single JSON object or an array of JSON objects with the following content:
mimeType: The MIME type of the extractor, text if not specified.
filter: An array of filters that are used to select this extractor for a given input file.
script: A JavaScript file to execute.
function: The entry point in the above mentioned script, main if not specified.
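The defaults and the "single object or array" rule above can be illustrated with a small sketch. This is not KItinerary's loading code: normalizeMetaData is a hypothetical helper, and treating a missing filter as an empty array is an assumption made only for this illustration.

```javascript
// Hypothetical helper, not part of KItinerary: applies the documented defaults
// to parsed meta-data, which may be a single object or an array of objects.
function normalizeMetaData(parsed) {
    const entries = Array.isArray(parsed) ? parsed : [parsed];
    return entries.map((entry) => ({
        mimeType: entry.mimeType ?? "text",  // documented default
        filter: entry.filter ?? [],           // assumption: no filters when omitted
        script: entry.script,                 // no default is documented
        function: entry.function ?? "main",   // documented default entry point
    }));
}

// A single object is treated like a one-element array.
const single = normalizeMetaData({ script: "example.js" });

// An array keeps one extractor per element.
const several = normalizeMetaData([
    { mimeType: "application/pdf", script: "a.js", function: "parsePdf" },
    { mimeType: "text/html", script: "b.js" },
]);
```

Here single[0] ends up with mimeType text and function main, while the first element of several keeps its explicit parsePdf entry point.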
The following extractor types are supported:
text/plain: plain text, the argument to the script function is a single string.
text/html: HTML documents, the argument to the script function is a KItinerary::HtmlDocument instance.
application/pdf: PDF documents, the argument to the script function is a KItinerary::PdfDocument instance.
application/vnd.apple.pkpass: Apple Wallet passes, the argument to the script function is a KPkPass::Pass instance.
internal/event: iCalendar events, the argument to the script function is a KCalendarCore::Event instance.
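For the text/plain case, where the script function receives a single string, an entry point might look roughly like the sketch below. The input layout, the null return for "nothing found" and the shape of the returned object are assumptions for illustration; the other extractor types receive object instances whose script-side API is not shown here.

```javascript
// Hypothetical entry point for a text/plain extractor: the single argument is
// the document text as a string.
function main(text) {
    // Assumed input layout: a line such as "Booking reference: ABC123".
    const match = /Booking reference:\s*([A-Z0-9]+)/.exec(text);
    if (!match) {
        return null; // assumption: signal "nothing found" with null
    }
    // Assumed result shape: a schema.org-style reservation object.
    return {
        "@type": "Reservation",
        reservationNumber: match[1],
    };
}

// Stand-alone usage with made-up sample text.
const sample = "Dear customer,\nBooking reference: ABC123\nThank you.";
const result = main(sample);
```

Keeping the entry point a pure function of its argument makes it easy to exercise outside the application with sample input.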
Filter definitions have the following fields:
mimeType: The MIME type of the document part this filter can match against.
field: The name of the field to match against. This can be a field id in an Apple Wallet pass, a MIME message header name, a property on a JSON-LD object or an iCal calendar or event. For plain text or binary content, this is ignored.
match: A regular expression that is matched against the specified value (see QRegularExpression).
scope: Specifies how the filter should be applied relative to the document node that is being extracted. One of Current, Parent, Children, Ancestors, Descendants (Current is the default).
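How a scope could select the nodes a filter is tested against can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the node model (mimeType, fields, parent, children) is hypothetical, the scopes are assumed to exclude the current node itself, JavaScript's RegExp stands in for QRegularExpression, and matching against plain text or binary content (where field is ignored) is left out.

```javascript
// Hypothetical node model: { mimeType, fields, parent, children }.
function makeNode(mimeType, fields, parent = null) {
    const node = { mimeType, fields, parent, children: [] };
    if (parent) parent.children.push(node);
    return node;
}

// Does a single node satisfy the filter's mimeType, field and match?
function matchesNode(filter, node) {
    if (filter.mimeType !== node.mimeType) return false;
    const value = node.fields[filter.field];
    return typeof value === "string" && new RegExp(filter.match).test(value);
}

// Collect the nodes a scope refers to, relative to the node being extracted.
function nodesInScope(scope, node) {
    switch (scope) {
    case "Parent":
        return node.parent ? [node.parent] : [];
    case "Children":
        return node.children;
    case "Ancestors": {
        const out = [];
        for (let p = node.parent; p; p = p.parent) out.push(p);
        return out;
    }
    case "Descendants":
        return node.children.flatMap((c) => [c, ...nodesInScope("Descendants", c)]);
    default: // "Current"; this sketch also treats unknown values as Current
        return [node];
    }
}

// A filter matches if any node in its scope matches; Current is the default.
function filterMatches(filter, node) {
    const scope = filter.scope ?? "Current";
    return nodesInScope(scope, node).some((n) => matchesNode(filter, n));
}

// Example tree: an email with a PDF attachment (made-up data).
const email = makeNode("message/rfc822", { From: "booking@example.org" });
const pdf = makeNode("application/pdf", {}, email);

const senderFilter = {
    mimeType: "message/rfc822",
    field: "From",
    match: "@example[.]org$",
    scope: "Ancestors",
};
const pdfSelected = filterMatches(senderFilter, pdf);     // email is an ancestor of pdf
const emailSelected = filterMatches(senderFilter, email); // email has no ancestors
```

With these assumptions, a sender-based filter with scope Ancestors selects the extractor for the PDF attachment but not for the email node itself.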
Example:
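A hypothetical meta-data file, shown as JSON text inside a JavaScript string so that it can be parsed and checked. All file names, function names, match patterns and the choice of message/rfc822 as the filter MIME type are made up for illustration and do not come from a shipped extractor.

```javascript
// Hypothetical meta-data file content; every name and pattern is invented.
const metaDataText = `
[
    {
        "mimeType": "application/pdf",
        "filter": [
            { "mimeType": "message/rfc822", "field": "From", "match": "@example[.]org$", "scope": "Ancestors" }
        ],
        "script": "example.js",
        "function": "parsePdf"
    },
    {
        "mimeType": "text/plain",
        "filter": [
            { "mimeType": "text/plain", "match": "Example Airlines" }
        ],
        "script": "example.js"
    }
]`;

// The text is valid JSON: an array with one object per extractor.
const extractors = JSON.parse(metaDataText);
```

The second entry omits function and scope, so the documented defaults (main and Current) would apply; it also omits field, which is ignored for plain text content.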
Development
For development it's convenient to symlink the extractors source folder to $XDG_DATA_DIRS/kitinerary/extractors, so you can re-run a changed extractor script without recompiling or restarting the application.
Definition at line 76 of file scriptextractor.h.
Constructor & Destructor Documentation
◆ ScriptExtractor()
explicit
Definition at line 37 of file scriptextractor.cpp.
Member Function Documentation
◆ canHandle()
bool ScriptExtractor::canHandle(const ExtractorDocumentNode &node) const (override, virtual)
Fast check whether this extractor is applicable for node.
Implements KItinerary::AbstractExtractor.
Definition at line 159 of file scriptextractor.cpp.
◆ extract()
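ExtractorResult ScriptExtractor::extract(const ExtractorDocumentNode &node, const ExtractorEngine *engine) const (override, virtual)
Extract data from node.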
Implements KItinerary::AbstractExtractor.
Definition at line 175 of file scriptextractor.cpp.
◆ filters()
const std::vector<ExtractorFilter> & ScriptExtractor::filters() const
Returns the filters deciding whether this extractor should be applied.
Definition at line 144 of file scriptextractor.cpp.
◆ mimeType()
QString ScriptExtractor::mimeType() const
The MIME type this script extractor supports.
Definition at line 109 of file scriptextractor.cpp.
◆ name()
QString ScriptExtractor::name() const (override, virtual)
Identifier for this extractor.
Mainly used for diagnostics and tooling.
Implements KItinerary::AbstractExtractor.
Definition at line 104 of file scriptextractor.cpp.
◆ scriptFileName()
QString ScriptExtractor::scriptFileName() const
The JS script containing the code of the extractor.
Definition at line 119 of file scriptextractor.cpp.
◆ scriptFunction()
QString ScriptExtractor::scriptFunction() const
The JS function entry point for this extractor, main if empty.
Definition at line 129 of file scriptextractor.cpp.
The documentation for this class was generated from the following files:
scriptextractor.h
scriptextractor.cpp
Documentation copyright © 1996-2024 The KDE developers.
Generated on Mon Nov 18 2024 12:09:59 by doxygen 1.12.0 written by Dimitri van Heesch, © 1997-2006