kviewshell
DataPool Class Reference
Thread-safe data storage.
#include <DataPool.h>
Public Member Functions | |
void | clear_stream (const bool release=true) |
void | load_file (void) |
void | stop (bool only_blocked=false) |
Adding data. | |
void | add_data (const void *buffer, int offset, int size) |
void | add_data (const void *buffer, int size) |
void | set_eof (void) |
Trigger callbacks. | |
Trigger callbacks are special callbacks that are called when all data for a given range of offsets has become available. Since reading unavailable data may block the calling thread, which may be undesirable, trigger callbacks are a convenient way to signal that data is available.
You can add a trigger callback in two ways:
1. By specifying a range. This is the most general case.
2. By providing just one threshold. In this case the range is assumed to start at offset 0 and to last for threshold+1 bytes.
void | add_trigger (int thresh, void(*callback)(void *), void *cl_data) |
void | add_trigger (int start, int length, void(*callback)(void *), void *cl_data) |
void | del_trigger (void(*callback)(void *), void *cl_data) |
Accessing data. | |
int | get_data (void *buffer, int offset, int size) |
GP< ByteStream > | get_stream (void) |
State querying functions. | |
int | get_length (void) const |
int | get_size (void) const |
bool | has_data (int start, int length) |
bool | is_connected (void) const |
bool | is_eof (void) const |
DataPool.h | |
The files DataPool.h and DataPool.cpp implement the classes DataPool and DataRange, which the DjVu decoder uses to access data. The main goal of class DataPool is to provide concurrent access to the same data from many threads, with the possibility of adding data from yet another thread. This is especially important for the Netscape plugin, where data is not immediately available but decoding should start as soon as possible. In this situation it is vital to provide transparent access to the data from many threads, possibly blocking readers that try to access information that has not been received yet. When the data is local, though, it can be accessed directly using the standard I/O mechanism. To provide a uniform interface for decoding routines, DataPool supports file mode as well.
bool | simple_compare (DataPool &pool) const |
Static Public Member Functions | |
static void | close_all (void) |
static void | load_file (const GURL &url) |
Static Public Attributes | |
static const char * | Stop = ERR_MSG("STOP") |
Protected Member Functions | |
DataPool (void) | |
Initialization | |
static GP< DataPool > | create (const GURL &url, int start=0, int length=-1) |
static GP< DataPool > | create (const GP< DataPool > &master_pool, int start=0, int length=-1) |
static GP< DataPool > | create (const GP< ByteStream > &str) |
static GP< DataPool > | create (void) |
void | connect (const GURL &url, int start=0, int length=-1) |
void | connect (const GP< DataPool > &master_pool, int start=0, int length=-1) |
virtual | ~DataPool () |
Detailed Description
Thread-safe data storage. The purpose of DataPool is to provide a uniform interface for accessing data from decoding routines running in a multi-threaded environment. Depending on the mode of operation, it may contain the actual data, may be connected to another DataPool, or may be mapped to a file. Regardless of the mode, the class returns data in a thread-safe way, blocking reading threads if the data of interest is not available. This blocking is especially useful in a networking environment (the plugin), where a running decoding thread wants to start decoding as soon as a single byte is available, blocking if necessary.
Access to data in a DataPool may be direct (using the get_data() function) or sequential (see the get_stream() function).
If the DataPool is not connected to anything, that is, it contains some real data, this data can be added to it by means of the two add_data() functions. One of them adds data sequentially, maintaining the offset of the last block of data added by it. The other can store data anywhere. It is therefore important to realize that there may be "white spots" (gaps) in the data storage.
There is also a way to test whether data is available for a given data range (see has_data()). In addition to this mechanism, there are so-called trigger callbacks, which are called when all data for a given data range is available.
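The "white spots" idea can be illustrated with a small, self-contained sketch. This is not DjVuLibre code: the class BlockMap and its methods are hypothetical names chosen for illustration. It only records which byte ranges have been added, which is roughly the question has_data() has to answer for a pool that is not connected to anything.

```cpp
#include <algorithm>
#include <iterator>
#include <map>

// Hypothetical helper (not part of DataPool): remembers which byte
// ranges have been stored, so gaps ("white spots") can be detected.
class BlockMap {
public:
    // Mark bytes [offset, offset+size) as available, merging the new
    // range with any recorded range that touches or overlaps it.
    void add(int offset, int size) {
        if (size <= 0) return;
        int begin = offset;
        int end = offset + size;
        auto it = ranges_.upper_bound(begin);      // first range starting after begin
        if (it != ranges_.begin()) {
            auto prev = std::prev(it);
            if (prev->second >= begin) it = prev;  // previous range reaches begin
        }
        while (it != ranges_.end() && it->first <= end) {
            begin = std::min(begin, it->first);
            end = std::max(end, it->second);
            it = ranges_.erase(it);
        }
        ranges_[begin] = end;
    }

    // True when every byte of [start, start+length) has been added.
    // Negative lengths ("up to the end of the pool") are not modelled
    // here; the sketch simply returns false for length <= 0.
    bool has(int start, int length) const {
        if (length <= 0) return false;
        auto it = ranges_.upper_bound(start);
        if (it == ranges_.begin()) return false;
        --it;
        return it->second >= start + length;
    }

private:
    std::map<int, int> ranges_;  // begin -> end (exclusive), disjoint and sorted
};
```

With this model, adding bytes 0..9 and 20..24 leaves a gap at 10..19, so a query for the range 0..24 fails until that gap has been filled by a later call.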
Let us consider all modes of operation in detail:
1. Not connected DataPool. In this mode the DataPool contains some real data. As mentioned above, data may be added by means of the two add_data() functions, which operate independently of each other and allow data to be added either sequentially or directly at any place in the data storage. It is important to call set_eof() after all data has been added.
Functions like get_data() or get_stream() can be used to obtain direct or sequential access to the data. As long as is_eof() is FALSE, DataPool blocks every reader that tries to read unavailable data until that data really becomes available. As soon as is_eof() is TRUE, any attempt to read non-existing data reads 0 bytes.
Because DataPool was designed to store DjVu files, which use the IFF format, it becomes possible to predict the size of the DataPool as soon as the first 32 bytes have been added. This is invaluable for estimating download progress. See get_length() for details. If this estimate fails (which means that the stored data is not in IFF format), get_length() returns -1.
Triggers may be added and removed by means of the add_trigger() and del_trigger() functions. add_trigger() takes a data range. As soon as all data in that data range is available, the trigger callback is called.
All trigger callbacks are called when the EOF condition has been set.
2. DataPool connected to another DataPool. In this slave mode you can map a given DataPool to any range of offsets inside another DataPool. You can connect the slave DataPool even if there is no data in the master DataPool. Any get_data() request is forwarded to the master DataPool, which is responsible for blocking readers that try to access unavailable data.
The use of the add_data() functions is prohibited for connected DataPools.
The offsets range used to map a slave DataPool can be fully specified (both the start offset and the length are positive numbers) or partially specified (the length is negative). In the latter case the slave DataPool is assumed to extend up to the end of the master DataPool.
Triggers may be used with slave DataPools as well as with master ones.
Calling stop() on a slave stops only that slave (and any other slave connected to it), but not the master.
set_eof() is meaningless for slaves. They obtain the ByteStream::EndOfFile status from their master.
Depending on the offsets range passed to the constructor, get_length() returns different values. If the length passed to the constructor was positive, get_length() always returns it. Otherwise the value returned is either -1, if the master's length is still unknown (it has not managed to parse the IFF data yet), or the master's length minus the slave's start offset (masters_length - slave_start).
3. DataPool connected to a file. This mode is quite similar to the case where the DataPool is connected to another DataPool. Here, too, the DataPool stores no data inside. It just forwards all get_data() requests to the underlying source (a file in this case). These requests therefore never block the reader, but they may return 0 if there is no data available at the requested offset.
The use of the add_data() functions is meaningless and is prohibited.
is_eof() always returns TRUE. Thus set_eof() is meaningless and does nothing.
get_length() always returns the file size.
Calling stop() stops this DataPool and any slave connected to it.
Trigger callbacks passed through add_trigger() are called immediately.
This mode is useful for reading and decoding DjVu files without reading and storing them in memory in full.
Definition at line 225 of file DataPool.h.
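The blocking behaviour of a not connected pool can be sketched with a mutex and a condition variable. The class below is a toy stand-in written for this description, not the DjVuLibre implementation: ToyPool is a hypothetical name, it supports sequential appends only, and it ignores triggers, stop() and IFF length prediction.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstring>
#include <mutex>
#include <vector>

// Toy model of a not connected pool (illustrative only).
class ToyPool {
public:
    // Append a block and wake any reader waiting for it.
    void add_data(const void* buffer, int size) {
        if (size <= 0) return;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            const char* p = static_cast<const char*>(buffer);
            data_.insert(data_.end(), p, p + size);
        }
        cond_.notify_all();
    }

    // Declare that no more data will arrive; blocked readers wake up.
    void set_eof() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            eof_ = true;
        }
        cond_.notify_all();
    }

    // Blocks until at least one byte exists at `offset` or EOF is set.
    // Returns the number of bytes copied, which is 0 once EOF is set
    // and nothing is stored at `offset`.
    int get_data(void* buffer, int offset, int size) {
        if (offset < 0 || size <= 0) return 0;
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [&] {
            return eof_ || offset < static_cast<int>(data_.size());
        });
        int available = static_cast<int>(data_.size()) - offset;
        if (available <= 0) return 0;
        int n = std::min(size, available);
        std::memcpy(buffer, data_.data() + offset, n);
        return n;
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::vector<char> data_;
    bool eof_ = false;
};
```

In a real program the reader would call get_data() from its own thread while another thread feeds add_data() and finally set_eof(). If both ran on one thread, a read of missing data would wait forever.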
Constructor & Destructor Documentation
DataPool::DataPool (void) [protected]
Definition at line 741 of file DataPool.cpp.
DataPool::~DataPool (void) [virtual]
Definition at line 830 of file DataPool.cpp.
Member Function Documentation
void DataPool::add_data (const void *buffer, int offset, int size)
Stores the specified block of data at the specified offset.
Like the sequential add_data(), this function can unblock readers waiting for data and fire trigger callbacks. The difference is that this function can store data anywhere.
Note: After all the data has been added, it is necessary to call set_eof() to tell the DataPool that nothing else is expected.
Note: This function may not be called if the DataPool has been connected to something.
Parameters:
  buffer  Data to store.
  offset  Where to store the data.
  size    Length of the buffer.
Definition at line 1022 of file DataPool.cpp.
void DataPool::add_data (const void *buffer, int size)
Appends the new block of data to the DataPool.
There are two add_data() functions available. This one adds data sequentially: it keeps track of the position of the last byte it has stored and always appends the next block after this position. The other add_data() can store data anywhere.
The function unblocks readers waiting for data if that data arrives with this block. It may also fire trigger callbacks that were added by means of add_trigger().
Note: After all the data has been added, it is necessary to call set_eof() to tell the DataPool that nothing else is expected.
Note: This function may not be called if the DataPool has been connected to something.
Parameters:
  buffer  Data to append.
  size    Length of the buffer.
Definition at line 1011 of file DataPool.cpp.
void DataPool::add_trigger (int thresh, void(*callback)(void *), void *cl_data)
Associates the specified trigger callback with the specified threshold.
This function is a simplified version of the range-based add_trigger(). The callback is called when data is available for every offset from 0 to thresh, if thresh is positive, or when the EOF condition has been set otherwise.
Definition at line 1506 of file DataPool.cpp.
void DataPool::add_trigger (int start, int length, void(*callback)(void *), void *cl_data)
Associates the specified trigger callback with the given data range.
Note: The callback may be called immediately if all data for the given range is already available or EOF is TRUE.
Parameters:
  start     The beginning of the range for which all data should be available.
  length    If length is not negative, the callback is called when data is available for every offset from start to start+length-1. If length is negative, the callback is called after the EOF condition has been set.
  callback  Function to call.
  cl_data   Argument to pass to the callback when it is called.
Definition at line 1515 of file DataPool.cpp.
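The relationship between the two add_trigger() forms can be sketched as follows. This is a simplified model written for this description, not the library's code: TriggerModel is a hypothetical name, availability is tracked per byte for brevity, and treating a threshold of 0 as "one byte" is an assumption based on the threshold+1 rule stated for the trigger group.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of trigger bookkeeping (illustrative only).
class TriggerModel {
public:
    using Callback = void (*)(void*);

    // Range form: fire once every byte of [start, start+length) is
    // present. A negative length means "fire when EOF is set".
    void add_trigger(int start, int length, Callback cb, void* cl_data) {
        Trigger t{start, length, cb, cl_data};
        if (ready(t)) cb(cl_data);   // may fire immediately
        else pending_.push_back(t);
    }

    // Threshold form: bytes 0..thresh inclusive, i.e. thresh+1 bytes.
    void add_trigger(int thresh, Callback cb, void* cl_data) {
        if (thresh >= 0) add_trigger(0, thresh + 1, cb, cl_data);
        else add_trigger(0, -1, cb, cl_data);
    }

    // Record that bytes [offset, offset+size) arrived, then fire
    // every pending trigger whose range is now complete.
    void add_data(int offset, int size) {
        if (offset < 0 || size <= 0) return;
        std::size_t end = static_cast<std::size_t>(offset) + size;
        if (present_.size() < end) present_.resize(end, false);
        for (std::size_t i = offset; i < end; ++i) present_[i] = true;
        fire_ready();
    }

    // Setting EOF fires all remaining triggers.
    void set_eof() {
        eof_ = true;
        fire_ready();
    }

private:
    struct Trigger { int start; int length; Callback cb; void* cl_data; };

    bool ready(const Trigger& t) const {
        if (eof_) return true;
        if (t.length < 0 || t.start < 0) return false;  // waits for EOF
        for (int i = t.start; i < t.start + t.length; ++i)
            if (i >= static_cast<int>(present_.size()) || !present_[i])
                return false;
        return true;
    }

    void fire_ready() {
        std::vector<Trigger> to_fire, still_pending;
        for (const Trigger& t : pending_)
            (ready(t) ? to_fire : still_pending).push_back(t);
        pending_.swap(still_pending);
        for (const Trigger& t : to_fire) t.cb(t.cl_data);
    }

    std::vector<Trigger> pending_;
    std::vector<bool> present_;
    bool eof_ = false;
};
```

Each trigger fires at most once in this model, and a trigger registered after EOF fires immediately, which matches the note above about immediate invocation.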
void DataPool::clear_stream (const bool release=true)
Definition at line 812 of file DataPool.cpp.
void DataPool::close_all (void) [static]
void DataPool::connect (const GURL &url, int start=0, int length=-1)
Connects the DataPool to the specified offsets range of the named url.
Parameters:
  url     Name of the file to connect to.
  start   Beginning of the offsets range that the DataPool is mapped into.
  length  Length of the offsets range. If negative, the range is assumed to extend up to the end of the file.
Definition at line 899 of file DataPool.cpp.
void DataPool::connect (const GP< DataPool > &master_pool, int start=0, int length=-1)
Switches the DataPool to slave mode and connects it to the specified offsets range of the master DataPool.
Parameters:
  master_pool  Master DataPool providing data for this slave.
  start        Beginning of the offsets range that the slave is mapped into.
  length       Length of the offsets range. If negative, the range is assumed to extend up to the end of the master DataPool.
Definition at line 863 of file DataPool.cpp.
GP< DataPool > DataPool::create (const GURL &url, int start=0, int length=-1) [static]
Initializes the DataPool in slave mode and connects it to the specified offsets range of the specified file.
It is equivalent to calling the default constructor and then connect().
Parameters:
  url     Name of the file to connect to.
  start   Beginning of the offsets range that the DataPool is mapped into.
  length  Length of the offsets range. If negative, the range is assumed to extend up to the end of the file.
Definition at line 795 of file DataPool.cpp.
GP< DataPool > DataPool::create (const GP< DataPool > &master_pool, int start=0, int length=-1) [static]
Initializes the DataPool in slave mode and connects it to the specified offsets range of the specified master DataPool.
It is equivalent to calling the default constructor and then connect().
Parameters:
  master_pool  Master DataPool providing data for this slave.
  start        Beginning of the offsets range that the slave is mapped into.
  length       Length of the offsets range. If negative, the range is assumed to extend up to the end of the master DataPool.
Definition at line 782 of file DataPool.cpp.
GP< DataPool > DataPool::create (const GP< ByteStream > &str) [static]
Creates and initializes the DataPool with data from the stream str.
The function reads the stream's contents and adds them to the pool using add_data(). Afterwards it calls set_eof(), and no other data may be added to the pool.
Definition at line 760 of file DataPool.cpp.
GP< DataPool > DataPool::create (void) [static]
Default creator.
Prepares the DataPool for accepting data added through the add_data() functions. Use the connect() functions if you want to map this DataPool to another DataPool or to a file.
Definition at line 744 of file DataPool.cpp.
void DataPool::del_trigger (void(*callback)(void *), void *cl_data)
Use this function to unregister callbacks that are no longer needed.
Note: It is important to do this when the client is about to be destroyed.
Definition at line 1556 of file DataPool.cpp.
int DataPool::get_data (void *buffer, int offset, int size)
Attempts to return a block of data of the given size at the given offset.
1. If the DataPool is connected to another DataPool or to a file, the request is simply forwarded to it.
2. If the DataPool is not connected to anything and some of the requested data is in the internal buffer, the function copies the available data to buffer and returns immediately.
If no data is available and is_eof() returns FALSE, the reader (and its thread) is blocked until the data actually arrives. Note that because the reader is blocked, it should run in a separate thread so that other threads have a chance to call add_data(). If no data is available but is_eof() is TRUE, the behavior is different and depends on the DataPool's estimate of the file size:
  - If the DataPool learns from the IFF structure of the data that its size should be greater than it really is, any attempt to read non-existing data within the range of valid offsets results in a ByteStream::EndOfFile exception. This indicates that there was an error in adding data: the requested data is supposed to be there but has not actually been added.
  - If the DataPool's expectations about the data size coincide with reality, any attempt to read data beyond the legal range of offsets returns zero bytes.
Parameters:
  buffer  Buffer to be filled with data.
  offset  Offset in the DataPool to read data at.
  size    Size of the buffer.
Returns:
  The number of bytes actually read.
Exceptions:
  STOP  The stream has been stopped.
  EOF   The requested data is not there and will not be added, although it should have been.
Definition at line 1098 of file DataPool.cpp.
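The two outcomes of reading missing data after EOF can be written down as a small decision function. This is only a sketch of the rule as described above. The function read_missing_after_eof and the exception type EndOfFileError are hypothetical stand-ins, and returning 0 for an offset beyond the predicted length of a truncated pool is an assumption, since the description does not cover that case explicitly.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the ByteStream::EndOfFile error.
struct EndOfFileError : std::runtime_error {
    EndOfFileError() : std::runtime_error("EOF") {}
};

// Outcome of reading at `offset` when EOF is set and no data is stored
// there (the caller guarantees offset >= actual_size).
//   expected_length: size predicted from the IFF header, or -1 if unknown
//   actual_size:     number of bytes really stored
// Returns 0 (zero bytes read) or throws EndOfFileError.
int read_missing_after_eof(int offset, int expected_length, int actual_size) {
    if (expected_length > actual_size && offset < expected_length) {
        // The data should exist according to the header but was never added.
        throw EndOfFileError();
    }
    // The offset lies past the legal range: an ordinary end of data.
    return 0;
}
```

For example, with a predicted length of 100 bytes but only 50 stored, a read at offset 60 throws, while with 100 bytes stored a read at offset 100 simply yields zero bytes.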
int DataPool::get_length (void) const
Returns the length of data in the DataPool.
The value returned depends on the mode of operation:
  - If the DataPool is not connected to anything, the length returned is either calculated by interpreting the IFF structure of the stored data (if successful) or by calculating the real size of the data after set_eof() has been called. Otherwise it is -1.
  - If the DataPool is connected to a file, the length is calculated from the length passed to connect() and the file size.
  - If the DataPool is connected to a master DataPool, the length is calculated from the value returned by the master's get_length() function and the length passed to connect().
Definition at line 967 of file DataPool.cpp.
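For the master/slave case, the rule given in the detailed description can be sketched as a pure function. The name slave_length is hypothetical, and treating a length of exactly 0 as "fully specified" is an assumption, since the text only speaks of positive and negative lengths.

```cpp
// Sketch of the length a slave would report (illustrative only).
//   master_length: value reported by the master, or -1 while unknown
//   start:         start offset of the slave inside the master
//   length:        length passed when connecting; negative means
//                  "up to the end of the master"
int slave_length(int master_length, int start, int length) {
    if (length >= 0) return length;       // fully specified range
    if (master_length < 0) return -1;     // master has not parsed IFF data yet
    return master_length - start;         // extends to the end of the master
}
```

A slave connected at offset 100 with length -1 therefore reports -1 until the master knows its own length, and 900 once the master reports 1000.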
int DataPool::get_size (void) const [inline]
Returns the number of bytes of data available in this DataPool.
Unlike get_length(), this function does not try to interpret the IFF structure and predict the file length. It just returns the number of bytes of data really available inside the DataPool, if it contains data, or inside its range, if it is connected to another DataPool or to a file.
Definition at line 476 of file DataPool.h.
GP< ByteStream > DataPool::get_stream (void)
Returns a ByteStream for accessing the contents of the DataPool sequentially.
Reading from the returned stream amounts to calling get_data(). Thus, everything said about get_data() applies to the stream as well.
Definition at line 1795 of file DataPool.cpp.
bool DataPool::has_data (int start, int length)
Returns TRUE if all data is available for offsets from start to start+length-1.
If length is negative, the range is assumed to extend up to the end of the DataPool. This function works for both connected and not connected DataPools. Once it has returned TRUE for some offsets range, you can be sure that a subsequent get_data() request for that range will not block.
Definition at line 1087 of file DataPool.cpp.
bool DataPool::is_connected (void) const [inline]
Returns TRUE if this DataPool is connected to another DataPool or to a file.
Definition at line 613 of file DataPool.h.
bool DataPool::is_eof (void) const [inline]
Definition at line 451 of file DataPool.h.
void DataPool::load_file (const GURL &url) [static]
Makes every DataPool in the program that is connected to a file load the file contents into main memory and close the file.
This feature is important when you want to do something with the file, such as removing or overwriting it, without affecting the rest of the program.
Definition at line 1445 of file DataPool.cpp.
void DataPool::load_file (void)
Loads data from the file into memory.
This function is only useful for DataPools that get their data from a file. It descends the DataPool hierarchy until it reaches either a file-connected DataPool or a DataPool containing the real data. In the latter case it does nothing; in the former case it makes the DataPool read all data from the file into memory and stop using the file.
This may be useful when you want to overwrite the file and leave existing DataPools with valid data.
Definition at line 1401 of file DataPool.cpp.
void DataPool::set_eof (void)
Tells the DataPool that all data has been added and nothing else is anticipated.
When EOF is true, a reader attempting to read non-existing data is not blocked. It either reads zero bytes or gets a ByteStream::EndOfFile exception (see get_data()). Calling this function also activates all registered trigger callbacks.
Note: This function is meaningless and does nothing when the DataPool is connected to another DataPool or to a file.
Definition at line 1324 of file DataPool.cpp.
bool DataPool::simple_compare (DataPool &pool) const [inline]
Useful in comparing data pools.
Returns true if both pools are derived from the same URL or bytestream.
Definition at line 603 of file DataPool.h.
void DataPool::stop (bool only_blocked=false)
Tells the DataPool to stop serving readers.
If the only_blocked flag is TRUE, only those requests that would not block are processed. Any attempt to get non-existing data results in a STOP exception (instead of blocking until the data is available).
If the only_blocked flag is FALSE, any further attempt to read from this DataPool (as well as from any DataPool connected to this one) results in a STOP exception.
Definition at line 1347 of file DataPool.cpp.
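The effect of the only_blocked flag can be summarized as a small decision function. This is a sketch of the rule as described, not the library's code: StopState, StopError and check_read are hypothetical names introduced for illustration.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the STOP exception.
struct StopError : std::runtime_error {
    StopError() : std::runtime_error("STOP") {}
};

// How the pool was stopped, if at all.
enum class StopState {
    Running,             // stop() has not been called
    StoppedBlockedOnly,  // stop(true): refuse only reads that would block
    StoppedAll           // stop(false): refuse every read
};

// Throws StopError when a read must be refused. `would_block` says
// whether the requested data is currently unavailable.
void check_read(StopState state, bool would_block) {
    if (state == StopState::StoppedAll) throw StopError();
    if (state == StopState::StoppedBlockedOnly && would_block) throw StopError();
    // Otherwise the read proceeds (and may wait if the pool is running).
}
```

Under this model, reads of data that is already present keep working after stop(true), while every read fails after stop(false).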
Member Data Documentation
const char * DataPool::Stop = ERR_MSG("STOP") [static]
Definition at line 598 of file DataPool.h.
The documentation for this class was generated from the following files: