Nepomuk-Core

plaintextextractor.cpp
/*
   <one line to give the library's name and an idea of what it does.>
   Copyright (C) 2012 Vishesh Handa <me@vhanda.in>

   This library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   This library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with this library; if not, write to the Free Software
   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/


#include "plaintextextractor.h"

#include "nie.h"
#include "nfo.h"

#include <QtCore/QFile>
#include <QtCore/QRegExp>
#include <QtCore/QTextStream>
#include <KDebug>

using namespace Nepomuk2::Vocabulary;

namespace Nepomuk2 {

PlainTextExtractor::PlainTextExtractor(QObject* parent, const QVariantList&)
    : ExtractorPlugin(parent)
{

}

// Accept any "text/*" mimetype, plus mimetypes whose subtype is exactly "xml".
bool PlainTextExtractor::shouldExtract(const QUrl& url, const QString& mimeType)
{
    Q_UNUSED( url );
    return mimeType.startsWith( QLatin1String("text/") ) || mimeType.endsWith( QLatin1String("/xml") );
}

SimpleResourceGraph PlainTextExtractor::extract(const QUrl& resUri, const QUrl& fileUrl, const QString& mimeType)
{
    Q_UNUSED( mimeType );

    QFile file( fileUrl.toLocalFile() );

    // FIXME: make a size filter or something configurable
    // Files larger than 5 MiB are skipped; an empty graph is returned.
    if ( file.size() > 5*1024*1024 ) {
        return SimpleResourceGraph();
    }

    // An unreadable file also yields an empty graph.
    if( !file.open( QIODevice::ReadOnly | QIODevice::Text ) ) {
        return SimpleResourceGraph();
    }

    QTextStream ts( &file );
    QString contents = ts.readAll();

    // Basic statistics: length of the decoded text, number of newline
    // characters, and number of matches of the word pattern.
    int characters = contents.length();
    int lines = contents.count( QChar('\n') );
    int words = contents.count( QRegExp("\\b\\w+\\b") );

    // Describe the file as a plain text document carrying its content and counts.
    SimpleResource fileRes( resUri );
    fileRes.addType( NFO::PlainTextDocument() );
    fileRes.addProperty( NIE::plainTextContent(), contents );
    fileRes.addProperty( NFO::wordCount(), words );
    fileRes.addProperty( NFO::lineCount(), lines );
    fileRes.addProperty( NFO::characterCount(), characters );

    return SimpleResourceGraph() << fileRes;
}

}

NEPOMUK_EXPORT_EXTRACTOR( Nepomuk2::PlainTextExtractor, "nepomukplaintextextractor" )
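
The three counts stored by extract() can be tried out without Qt. The following is only a rough sketch in standard C++, not the Nepomuk implementation: TextStats, computeTextStats and checkTextStats are hypothetical names, characters are counted as bytes rather than as QString code units, and a "word" is assumed to be a maximal run of ASCII letters, digits, or underscores, which only approximates the QRegExp pattern used above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper type; not part of Nepomuk.
struct TextStats {
    std::size_t characters;  // number of bytes in the text
    std::size_t lines;       // number of '\n' characters
    std::size_t words;       // number of runs of [A-Za-z0-9_]
};

// Single pass over the text: count newlines and the starts of word runs.
TextStats computeTextStats(const std::string& text)
{
    TextStats stats{text.size(), 0, 0};
    bool inWord = false;
    for (unsigned char c : text) {
        if (c == '\n') {
            ++stats.lines;
        }
        const bool wordChar = std::isalnum(c) != 0 || c == '_';
        if (wordChar && !inWord) {
            ++stats.words;  // a new word run starts here
        }
        inWord = wordChar;
    }
    return stats;
}

// Small self-check illustrating the counting rules.
void checkTextStats()
{
    const TextStats s = computeTextStats("hello world\nfoo_bar 42\n");
    assert(s.characters == 23);
    assert(s.lines == 2);
    assert(s.words == 4);
}
```

Because the line count is the number of newline characters, a file whose last line has no trailing newline reports one line fewer than an editor would show; the same holds for the lines value computed in extract().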
Nepomuk2::PlainTextExtractor::PlainTextExtractor
PlainTextExtractor(QObject *parent, const QVariantList &)
Definition: plaintextextractor.cpp:33
Nepomuk2::PlainTextExtractor
Definition: plaintextextractor.h:28
Nepomuk2::ExtractorPlugin
The ExtractorPlugin is the base class for all file metadata extractors.
Definition: extractorplugin.h:60
Nepomuk2::SimpleResource
Represents a snapshot of one Nepomuk resource.
Definition: simpleresource.h:46
QObject
Nepomuk2::PlainTextExtractor::shouldExtract
virtual bool shouldExtract(const QUrl &url, const QString &mimeType)
By default this returns true if mimetype is in the list of mimetypes provided by the plugin...
Definition: plaintextextractor.cpp:39
Nepomuk2::PlainTextExtractor::extract
virtual SimpleResourceGraph extract(const QUrl &resUri, const QUrl &fileUrl, const QString &mimeType)
The main function of the plugin that is responsible for extracting the data from the file url and ret...
Definition: plaintextextractor.cpp:45
Nepomuk2::SimpleResource::addProperty
void addProperty(const QUrl &property, const QVariant &value)
Add a property.
Definition: simpleresource.cpp:206
Nepomuk2::SimpleResourceGraph
Definition: simpleresourcegraph.h:48
plaintextextractor.h
NEPOMUK_EXPORT_EXTRACTOR
#define NEPOMUK_EXPORT_EXTRACTOR(classname, libname)
Export a Nepomuk file extractor.
Definition: extractorplugin.h:163
Nepomuk2::SimpleResource::addType
void addType(const QUrl &type)
A convenience method which adds a property of type rdf:type.
Definition: simpleresource.cpp:257
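
The mimetype test in PlainTextExtractor::shouldExtract (plaintextextractor.cpp:39) can be mirrored in standard C++ to see which mimetypes it accepts. This is a minimal sketch under that assumption; looksLikeText and checkLooksLikeText are hypothetical names that do not exist in Nepomuk.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the check in shouldExtract():
// true for "text/..." and for mimetypes ending in "/xml".
bool looksLikeText(const std::string& mimeType)
{
    const std::string prefix = "text/";
    const std::string suffix = "/xml";

    const bool hasPrefix =
        mimeType.compare(0, prefix.size(), prefix) == 0;
    const bool hasSuffix =
        mimeType.size() >= suffix.size() &&
        mimeType.compare(mimeType.size() - suffix.size(),
                         suffix.size(), suffix) == 0;

    return hasPrefix || hasSuffix;
}

// Small self-check showing accepted and rejected mimetypes.
void checkLooksLikeText()
{
    assert(looksLikeText("text/plain"));
    assert(looksLikeText("application/xml"));
    assert(!looksLikeText("application/xhtml+xml"));  // ends in "+xml", not "/xml"
    assert(!looksLikeText("image/png"));
}
```

Note that the suffix test is literal: mimetypes with a "+xml" structured suffix, such as application/xhtml+xml, are not matched by a check of this form.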
This file is part of the KDE documentation.
Documentation copyright © 1996-2014 The KDE developers.
Generated on Tue Oct 14 2014 22:48:08 by doxygen 1.8.7 written by Dimitri van Heesch, © 1997-2006

KDE® and the K Desktop Environment® logo are registered trademarks of KDE e.V.