• Time series

Interfaces

Time series in OGEMA are represented by a set of data points and an InterpolationMode, which specifies how to interpret the discrete set of points. Possible interpolation modes are LINEAR, NEAREST, STEPS and NONE. The basic time series interface is ReadOnlyTimeSeries, which defines methods for accessing the data points of a time series but does not allow points to be added or modified. Methods for adding and modifying points are provided by the derived interface TimeSeries.

Data points are represented by objects of type SampledValue, which consist of a timestamp, a value container (see Value), and a Quality. The latter is an enum with two values, GOOD and BAD. Points with BAD quality are used to define gaps in the domain of the function.
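The effect of the interpolation mode can be illustrated with a small standalone model. It is only a sketch and does not use the OGEMA classes: the types InterpolationSketch and Point and the method valueAt are hypothetical. The semantics assumed for each mode (STEPS holds the previous value, NEAREST picks the closer neighbour with ties going to the earlier point, LINEAR is undefined outside the first and last point, NONE is defined only at the points themselves) are the conventional reading of the names and may differ in detail from the framework.

```java
import java.util.Map;
import java.util.TreeMap;

// Standalone model for illustration only; none of these types are OGEMA classes.
class InterpolationSketch {
    enum Mode { LINEAR, NEAREST, STEPS, NONE }
    enum Quality { GOOD, BAD }

    // Simplified data point: timestamp, float value and quality.
    static final class Point {
        final long timestamp;
        final float value;
        final Quality quality;

        Point(long timestamp, float value, Quality quality) {
            this.timestamp = timestamp;
            this.value = value;
            this.quality = quality;
        }
    }

    private final TreeMap<Long, Float> points = new TreeMap<>();
    private final Mode mode;

    InterpolationSketch(Mode mode) { this.mode = mode; }

    void add(long timestamp, float value) { points.put(timestamp, value); }

    // Evaluates the series at time t according to the interpolation mode.
    // Where the function is undefined, a point with BAD quality is returned.
    Point valueAt(long t) {
        Float exact = points.get(t);
        if (exact != null) return new Point(t, exact, Quality.GOOD);
        Map.Entry<Long, Float> prev = points.lowerEntry(t);
        Map.Entry<Long, Float> next = points.higherEntry(t);
        if (mode == Mode.STEPS && prev != null) {
            // Piecewise constant: hold the previous value until the next point.
            return new Point(t, prev.getValue(), Quality.GOOD);
        }
        if (mode == Mode.NEAREST && (prev != null || next != null)) {
            // Pick the closer neighbour; ties go to the earlier point (assumption).
            boolean usePrev = next == null
                    || (prev != null && t - prev.getKey() <= next.getKey() - t);
            return new Point(t, (usePrev ? prev : next).getValue(), Quality.GOOD);
        }
        if (mode == Mode.LINEAR && prev != null && next != null) {
            float fraction = (float) (t - prev.getKey()) / (next.getKey() - prev.getKey());
            float v = prev.getValue() + fraction * (next.getValue() - prev.getValue());
            return new Point(t, v, Quality.GOOD);
        }
        // Mode NONE, or no suitable neighbours: the function is undefined at t.
        return new Point(t, Float.NaN, Quality.BAD);
    }

    public static void main(String[] args) {
        for (Mode m : Mode.values()) {
            InterpolationSketch s = new InterpolationSketch(m);
            s.add(0L, 0f);
            s.add(10L, 10f);
            Point p = s.valueAt(4L);
            System.out.println(m + " at t=4: " + p.value + " (" + p.quality + ")");
        }
    }
}
```

The same two defining points yield a different function for each mode, and only mode NONE reports BAD quality between them.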

Implementations

The time series interfaces are used in three different contexts in the OGEMA framework:

  • Log data: value resources can be configured for logging (TODO link), and the resulting time series object is of type RecordedData, which extends ReadOnlyTimeSeries. RecordedData always has interpolation mode NONE.
  • Schedules: Schedules are persistent time series, i.e. they also implement the Resource interface (see Working with OGEMA Resources). Schedules are created just like ordinary resources, and they extend TimeSeries.
  • MemoryTimeSeries: a set of TimeSeries implementations that can be used for working with time series that need not be stored persistently:
    • TreeTimeSeries: the default implementation, used internally to represent schedules
    • FloatTreeTimeSeries: an extension of TreeTimeSeries, which provides a few additional methods specific to float-valued time series, such as adding two time series or multiplying a time series by a scalar or by another float time series.
    • ArrayTimeSeries: an alternative implementation

Bulk access

To retrieve the set of points defining the time series, use the bulk read methods of ReadOnlyTimeSeries.

The method #getValue(long timestamp), on the other hand, takes the interpolation mode into account, and hence may return a point with good quality even if the defining set does not contain a point for the specified timestamp.

When writing many values, it is recommended to use #addValues(Collection) instead of calling #addValue(long,Value) repeatedly.
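The difference between reading the defining set and evaluating the time series at a timestamp can be sketched with a standalone model. It does not use the OGEMA classes; the names BulkAccessSketch, addAll, storedTimestamps and valueAt are hypothetical, linear interpolation is assumed, and the half-open range [start, end) is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Standalone model for illustration only; these are not the OGEMA classes.
class BulkAccessSketch {
    private final TreeMap<Long, Float> points = new TreeMap<>();

    // Bulk write: all new points are handed over in a single call.
    void addAll(Map<Long, Float> newPoints) { points.putAll(newPoints); }

    // Bulk read: timestamps of the stored points in [start, end).
    // Interpolation plays no role here.
    List<Long> storedTimestamps(long start, long end) {
        return new ArrayList<>(points.subMap(start, end).keySet());
    }

    // Point evaluation with linear interpolation (the mode assumed in this sketch).
    // Returns null where the function is undefined.
    Float valueAt(long t) {
        Map.Entry<Long, Float> prev = points.floorEntry(t);
        Map.Entry<Long, Float> next = points.ceilingEntry(t);
        if (prev == null || next == null) return null;
        if (prev.getKey().equals(next.getKey())) return prev.getValue();
        float fraction = (float) (t - prev.getKey()) / (next.getKey() - prev.getKey());
        return prev.getValue() + fraction * (next.getValue() - prev.getValue());
    }

    public static void main(String[] args) {
        BulkAccessSketch series = new BulkAccessSketch();
        Map<Long, Float> batch = new TreeMap<>();
        batch.put(0L, 0f);
        batch.put(10L, 10f);
        series.addAll(batch);
        System.out.println("stored in [1, 10): " + series.storedTimestamps(1L, 10L));
        System.out.println("value at t=5: " + series.valueAt(5L));
    }
}
```

In this model the range query between the two points finds nothing, while the evaluation at a timestamp in the same range still yields a value.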

Iteration

There are in general two ways to iterate through the points of a time series: either use an iterator (see ReadOnlyTimeSeries#iterator() and #iterator(long start, long end)), or use the methods #getNextValue(long timestamp) and #getPreviousValue(long timestamp). #getPreviousValue allows iteration from end to start, whereas all other methods proceed chronologically. For larger data sets it may be more efficient to use an iterator than repeated calls to #getNextValue. For instance, log data is stored internally in a database that uses one file per day. Repeated use of #getNextValue leads to a lot of file access and hence slows down the application. The iterator preloads all data points for a single day when the first point of that day is requested, which reduces file access to a minimum.
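The file access argument can be made concrete with a toy model of a store that keeps one file per day. This is a sketch under stated assumptions, not the OGEMA log database: the class DayFileSketch, the counter fileLoads and the method nextAtOrAfter (a stand-in for a #getNextValue-style lookup) are hypothetical, and a "file load" is simply counted instead of performed.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.TreeMap;
import java.util.TreeSet;

// Standalone model of a day-partitioned log store; not the OGEMA implementation.
class DayFileSketch implements Iterable<Long> {
    static final long DAY = 24L * 60 * 60 * 1000;

    // One "file" per day: day index -> timestamps recorded on that day.
    private final TreeMap<Long, TreeSet<Long>> files = new TreeMap<>();
    int fileLoads = 0; // counts simulated file accesses

    void add(long timestamp) {
        files.computeIfAbsent(timestamp / DAY, d -> new TreeSet<>()).add(timestamp);
    }

    private TreeSet<Long> load(long day) {
        fileLoads++;
        return files.get(day);
    }

    // Lookup of the first timestamp >= t, or null if there is none.
    // Nothing is cached, so every call opens at least one day file again.
    Long nextAtOrAfter(long t) {
        for (Long day : files.tailMap(t / DAY, true).keySet()) {
            Long hit = load(day).ceiling(t);
            if (hit != null) return hit;
        }
        return null;
    }

    // Iterator that loads each day file once and then serves points from memory.
    @Override
    public Iterator<Long> iterator() {
        final Iterator<Long> days = files.keySet().iterator();
        return new Iterator<Long>() {
            private Iterator<Long> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                while (!current.hasNext() && days.hasNext()) {
                    current = load(days.next()).iterator();
                }
                return current.hasNext();
            }

            @Override
            public Long next() {
                if (!hasNext()) throw new NoSuchElementException();
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        DayFileSketch log = new DayFileSketch();
        for (long day = 0; day < 3; day++) {
            for (long hour = 0; hour < 4; hour++) {
                log.add(day * DAY + hour * 3_600_000L);
            }
        }
        int viaLookup = 0;
        for (Long t = log.nextAtOrAfter(0L); t != null; t = log.nextAtOrAfter(t + 1)) {
            viaLookup++;
        }
        System.out.println(viaLookup + " points via repeated lookup, file loads: " + log.fileLoads);

        log.fileLoads = 0;
        int viaIterator = 0;
        for (Long t : log) viaIterator++;
        System.out.println(viaIterator + " points via iterator, file loads: " + log.fileLoads);
    }
}
```

In this model the iterator touches each day file once, whereas the repeated lookup touches a file at least once per point.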

The iterators provided by schedules and log data do not throw ConcurrentModificationException; changes to the time series during iteration may or may not be reflected in the iterator. The iterator of a MemoryTimeSeries, on the other hand, is fail-fast and requires synchronization if concurrent modification can occur.
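The two iterator behaviours can be observed with plain JDK collections as stand-ins. This is only an analogy and says nothing about how the OGEMA classes are implemented: java.util.TreeMap has fail-fast iterators, while java.util.concurrent.ConcurrentSkipListMap has weakly consistent iterators that never throw ConcurrentModificationException. The class IteratorBehaviourSketch and its methods are hypothetical.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Uses plain JDK collections as stand-ins; the OGEMA classes are not involved.
class IteratorBehaviourSketch {

    // Iterates over the timestamps and adds a new point after reading the first one.
    // Returns true if the iterator reported the modification by throwing.
    static boolean throwsOnModification(NavigableMap<Long, Float> series) {
        series.put(0L, 1f);
        series.put(10L, 2f);
        Iterator<Long> it = series.keySet().iterator();
        try {
            it.next();
            series.put(20L, 3f); // structural modification during iteration
            while (it.hasNext()) {
                it.next();
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // With a fail-fast series, iteration is safe only if every writer
    // synchronizes on the same lock as the iterating reader.
    static float sumUnderLock(TreeMap<Long, Float> series) {
        synchronized (series) {
            float sum = 0f;
            for (float v : series.values()) {
                sum += v;
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        System.out.println("fail-fast throws: "
                + throwsOnModification(new TreeMap<>()));
        System.out.println("weakly consistent throws: "
                + throwsOnModification(new ConcurrentSkipListMap<>()));
    }
}
```

Holding a common lock around both the iteration and every write, as in sumUnderLock, is one way to provide the synchronization that a fail-fast iterator needs.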
