Khiva

This is the documentation of the Khiva library.

Khiva [1] is an open source C++ library that focuses on providing efficient algorithms to perform analytics over time series data. It can be used to extract insights from one or a group of time series. The large number of available methods allows us to understand the nature of each time series. Based on those analytics, the user can reduce dimensionality, find recurrent motifs or discords, understand the seasonality or trend of a given time series, forecast, and detect anomalies.

It is a novel project that aims to provide a means for time series analytics at scale. Our vision is that this kind of analytics can be exploited in a wide range of use cases across several industries, such as finance, energy, e-health, IoT, application performance monitoring (APM), and the music industry.

This is just the beginning, so stay tuned: more features are coming.

shapelets-khiva@googlegroups.com is the place for discussions and questions about the Khiva library. We use the GitHub Issue Tracker to manage bug reports and feature requests.

You can jump right into the package by reading our Getting Started section.

Getting Started

Getting the source code

You can download the latest stable released version, or you can get the latest source code version by cloning our git repository:

git clone https://github.com/Shapelets/Khiva

Dependencies

Khiva relies on a number of open source libraries and tools which are required to get it running.

Tools:

Note

All versions of the Khiva library require a fully C++11-compliant compiler.

Libraries:

Linux

We will use Ubuntu 16.04 LTS as our example Linux distribution.

Once we have installed all Khiva dependencies, we are ready to build and install Khiva. First, go to the directory where the source code is stored.

mkdir build
cd build
cmake ..
make install

It will install the library in /usr/local/lib and /usr/local/include folders.

If ArrayFire is not installed in the system default directories, you must also add the ArrayFire lib folder to the LD_LIBRARY_PATH environment variable.

export LD_LIBRARY_PATH="/pathToArrayfire/arrayfire/lib:$LD_LIBRARY_PATH"

Mac OS

Once we have installed all Khiva dependencies, we are ready to build and install Khiva. First, go to the directory where the source code is stored:

mkdir build
cd build
cmake ..
make install

It will install the library in /usr/local/lib and /usr/local/include folders.

Windows

First, we need to ensure that the Graphviz, Dot and Doxygen binaries are included in the PATH environment variable. Once we have installed all Khiva dependencies, we are ready to build and install Khiva. Go to the directory where the source code is stored and proceed as follows:

mkdir build
cd build
cmake ..
make install

It will install the library in C:/Program Files/Khiva/v0/lib and C:/Program Files/Khiva/v0/include folders.

Khiva API

This is the list of namespaces that comprise the Khiva library.

Namespace Array

namespace khiva::array

Functions

af::array khiva::array::createArray(void *data, unsigned ndims, dim_t *dims, const int type)

Creates an af::array.

Return
af::array Containing the data.
Parameters
  • data: Data used to create the af::array.
  • ndims: Number of dimensions of data.
  • dims: Cardinality of dimensions of data.
  • type: Data type.

void khiva::array::getData(af::array array, void *data)

Retrieves the data from the device to the host.

Parameters
  • array: The Array that contains the data to be retrieved.
  • data: Pointer to a preallocated block of memory in the host.

af::dim4 khiva::array::getDims(af::array array)

Returns the dimensions from a given array.

Return
af::dim4 The dimensions.
Parameters
  • array: Array from which to get the dimensions.

void khiva::array::print(af::array array)

Prints the content of an array.

Parameters
  • array: The array to be printed.

void khiva::array::deleteArray(af_array array)

Decreases the reference count for the given array.

Parameters
  • array: The Array to be deleted.

int khiva::array::getType(af::array array)

Gets the type of the array.

Return
int Value of the Dtype enumeration.
Parameters
  • array: The array to obtain the type from.

template <typename T>
std::vector<int> khiva::array::getRowsWithMaximals(khiva::array::Array<T> a)

Gets the indices of all rows containing a maximal.

Return
std::vector<int> with the indices of the rows with maximals.
Parameters
  • a: The input array.

template <typename T>
std::vector<int> khiva::array::getIndexMaxColums(std::vector<T> r)

Gets the indices of the columns with maximals.

Return
std::vector<int> with the indices of the columns with maximals.
Parameters
  • r: The input row.

template <class T>
class khiva::array::Array
#include <khiva/array.h>

Array class. This class provides functionality to manage arrays on the host side.

Public Functions

khiva::array::Array::Array()

Default Constructor of Array class.

khiva::array::Array::Array(af::array in)

Constructor of Array class which receives an af::array.

Parameters
  • in: The input af::array.

khiva::array::Array::~Array()

Default destructor of Array class.

void khiva::array::Array::setNumX(int val)

Sets the cardinality of the first dimension.

Parameters
  • val: The value to be set.

void khiva::array::Array::setNumY(int val)

Sets the cardinality of the second dimension.

Parameters
  • val: The value to be set.

void khiva::array::Array::setNumW(int val)

Sets the cardinality of the third dimension.

Parameters
  • val: The value to be set.

void khiva::array::Array::setNumZ(int val)

Sets the cardinality of the fourth dimension.

Parameters
  • val: The value to be set.

void khiva::array::Array::setData(T *pd)

Sets the data to be stored in the Array.

Parameters
  • pd: The data to be stored.

int khiva::array::Array::getNumX()

Gets the cardinality of the first dimension.

Return
int the Cardinality of the first dimension.

int khiva::array::Array::getNumY()

Gets the cardinality of the second dimension.

Return
int the Cardinality of the second dimension.

int khiva::array::Array::getNumW()

Gets the cardinality of the third dimension.

Return
int the Cardinality of the third dimension.

int khiva::array::Array::getNumZ()

Gets the cardinality of the fourth dimension.

Return
int the Cardinality of the fourth dimension.

int khiva::array::Array::getNumElements()

Gets the number of elements in data.

Return
int The number of elements.

std::vector<T> khiva::array::Array::getRow(int idx)

Gets the row number given by idx.

Return
std::vector Containing the selected row.
Parameters
  • idx: The row number to be extracted.

std::vector<T> khiva::array::Array::getColumn(int idx)

Gets the column number given by idx.

Return
std::vector Containing the selected column.
Parameters
  • idx: The column number to be extracted.

T khiva::array::Array::getElement(int row, int column)

Gets the element given by row and column.

Return
T The element to be extracted.
Parameters
  • row: The row number.
  • column: The column number.

T *khiva::array::Array::getData()

Gets a pointer to the data stored in the array.

Return
T* Pointer to the data.

bool khiva::array::Array::isEmpty()

Checks whether the Array is empty or not.

Return
True if the Array is empty, false otherwise.

void khiva::array::Array::print()

Prints the content of the array.

Namespace Dimensionality

namespace khiva::dimensionality

Typedefs

typedef std::pair<float, float> khiva::dimensionality::Point
typedef std::pair<int, int> khiva::dimensionality::Segment

Functions

std::vector<Point> khiva::dimensionality::PAA(std::vector<Point> points, int bins)

Piecewise Aggregate Approximation (PAA) approximates a time series \(X\) of length \(n\) by a vector \(\bar{X}=(\bar{x}_{1},\ldots,\bar{x}_{M})\) of arbitrary length \(M \leq n\), where each \(\bar{x}_{i}\) is calculated as follows:

\[ \bar{x}_{i} = \frac{M}{n} \sum_{j=(n/M)(i-1)+1}^{(n/M)i} x_{j}. \]
This simply means that, in order to reduce the dimensionality from \(n\) to \(M\), we first divide the original time series into \(M\) equally sized frames and then compute the mean value of each frame. The sequence assembled from the mean values is the PAA approximation (i.e., transform) of the original time series.

Return
result A vector of Points with the reduced dimensionality.
Parameters
  • points: Set of points.
  • bins: Sets the total number of divisions.
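
As a rough illustration of the frame-averaging step described above, the following minimal sketch works on a plain std::vector<double> of values instead of Khiva's Point type. The name paaSketch is hypothetical and this is not Khiva's implementation; it assumes that bins is positive and divides the series length evenly.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Khiva): PAA over plain values.
// Assumes bins > 0 and that bins divides xs.size() evenly.
std::vector<double> paaSketch(const std::vector<double>& xs, std::size_t bins) {
    const std::size_t frame = xs.size() / bins;  // n / M samples per frame
    std::vector<double> out(bins, 0.0);
    for (std::size_t i = 0; i < bins; ++i) {
        double sum = 0.0;
        for (std::size_t j = i * frame; j < (i + 1) * frame; ++j) {
            sum += xs[j];
        }
        out[i] = sum / static_cast<double>(frame);  // mean of frame i
    }
    return out;
}
```

For instance, reducing the six values 1, 2, 3, 4, 5, 6 to three bins averages each consecutive pair.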

af::array khiva::dimensionality::PAA(af::array a, int bins)

Piecewise Aggregate Approximation (PAA) approximates a time series \(X\) of length \(n\) by a vector \(\bar{X}=(\bar{x}_{1},\ldots,\bar{x}_{M})\) of arbitrary length \(M \leq n\), where each \(\bar{x}_{i}\) is calculated as follows:

\[ \bar{x}_{i} = \frac{M}{n} \sum_{j=(n/M)(i-1)+1}^{(n/M)i} x_{j}. \]
This simply means that, in order to reduce the dimensionality from \(n\) to \(M\), we first divide the original time series into \(M\) equally sized frames and then compute the mean value of each frame. The sequence assembled from the mean values is the PAA approximation (i.e., transform) of the original time series.

Return
af::array An array of points with the reduced dimensionality.
Parameters
  • a: Set of points.
  • bins: Sets the total number of divisions.

af::array khiva::dimensionality::PIP(af::array ts, int numberIPs)

Calculates the Perceptually Important Points (PIP) of the time series.

[1] Fu TC, Chung FL, Luk R, and Ng CM. Representing financial time series based on data point importance. Engineering Applications of Artificial Intelligence, 21(2):277-300, 2008.

Return
af::array Array with the numberIPs most Perceptually Important Points.
Parameters
  • ts: Expects an input array whose dimension zero is the length of the time series.
  • numberIPs: The number of points to be returned.

std::vector<Point> khiva::dimensionality::PLABottomUp(std::vector<Point> ts, float maxError)

Applies the Piecewise Linear Approximation (PLA BottomUP) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
std::vector Vector with the reduced number of points.
Parameters
  • ts: Expects an input vector containing the set of points to be reduced.
  • maxError: The maximum approximation error allowed.

af::array khiva::dimensionality::PLABottomUp(af::array ts, float maxError)

Applies the Piecewise Linear Approximation (PLA BottomUP) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
af::array with the reduced number of points.
Parameters
  • ts: Expects an af::array containing the set of points to be reduced. The first column holds the first component of the points and the second column holds the second component.
  • maxError: The maximum approximation error allowed.

std::vector<Point> khiva::dimensionality::PLASlidingWindow(std::vector<Point> ts, float maxError)

Applies the Piecewise Linear Approximation (PLA Sliding Window) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
std::vector Vector with the reduced number of points.
Parameters
  • ts: Expects an input vector containing the set of points to be reduced.
  • maxError: The maximum approximation error allowed.

af::array khiva::dimensionality::PLASlidingWindow(af::array ts, float maxError)

Applies the Piecewise Linear Approximation (PLA Sliding Window) to the time series.

[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.

Return
af::array with the reduced number of points.
Parameters
  • ts: Expects an af::array containing the set of points to be reduced. The first column holds the first component of the points and the second column holds the second component.
  • maxError: The maximum approximation error allowed.

std::vector<Point> khiva::dimensionality::ramerDouglasPeucker(std::vector<Point> pointList, double epsilon)

The Ramer–Douglas–Peucker algorithm (RDP) is an algorithm for reducing the number of points in a curve that is approximated by a series of points. It reduces a set of points depending on the perpendicular distance of the points and epsilon; the greater epsilon is, the more points are deleted.

[1] Urs Ramer, “An iterative procedure for the polygonal approximation of plane curves”, Computer Graphics and Image Processing, 1(3), 244–256 (1972) doi:10.1016/S0146-664X(72)80017-0.

[2] David Douglas & Thomas Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature”, The Canadian Cartographer 10(2), 112–122 (1973) doi:10.3138/FM57-6770-U75U-7727

Return
std::vector<khiva::dimensionality::Point> with the selected points.
Parameters
  • pointList: Set of input points.
  • epsilon: It acts as the threshold value to decide which points should be considered meaningful or not.
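
The recursive idea behind RDP can be sketched as follows. This is a minimal, self-contained illustration and not Khiva's implementation: the names rdpSketch and perpendicularDistance are hypothetical, the Point alias here uses double rather than float, and epsilon is assumed to be non-negative.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y); illustrative alias only

// Perpendicular distance from p to the line through a and b.
double perpendicularDistance(const Point& p, const Point& a, const Point& b) {
    const double dx = b.first - a.first;
    const double dy = b.second - a.second;
    const double len = std::hypot(dx, dy);
    if (len == 0.0) return std::hypot(p.first - a.first, p.second - a.second);
    return std::fabs(dy * p.first - dx * p.second + b.first * a.second - b.second * a.first) / len;
}

// Hypothetical sketch of recursive RDP; assumes epsilon >= 0.
std::vector<Point> rdpSketch(const std::vector<Point>& pts, double epsilon) {
    if (pts.size() < 3) return pts;
    // Find the interior point farthest from the chord first -> last.
    double maxDist = 0.0;
    std::size_t index = 0;
    for (std::size_t i = 1; i + 1 < pts.size(); ++i) {
        const double d = perpendicularDistance(pts[i], pts.front(), pts.back());
        if (d > maxDist) {
            maxDist = d;
            index = i;
        }
    }
    // Every interior point is within epsilon: keep only the endpoints.
    if (index == 0 || maxDist <= epsilon) return {pts.front(), pts.back()};
    // Otherwise split at the farthest point and simplify both halves.
    const auto mid = pts.begin() + static_cast<std::ptrdiff_t>(index);
    std::vector<Point> left = rdpSketch(std::vector<Point>(pts.begin(), mid + 1), epsilon);
    const std::vector<Point> right = rdpSketch(std::vector<Point>(mid, pts.end()), epsilon);
    left.insert(left.end(), right.begin() + 1, right.end());  // skip the duplicated split point
    return left;
}
```

With this sketch, a run of collinear points collapses to its two endpoints, while a point that deviates from the chord by more than epsilon is preserved.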

af::array khiva::dimensionality::ramerDouglasPeucker(af::array pointList, double epsilon)

The Ramer–Douglas–Peucker algorithm (RDP) is an algorithm for reducing the number of points in a curve that is approximated by a series of points. It reduces a set of points depending on the perpendicular distance of the points and epsilon; the greater epsilon is, the more points are deleted.

[1] Urs Ramer, “An iterative procedure for the polygonal approximation of plane curves”, Computer Graphics and Image Processing, 1(3), 244–256 (1972) doi:10.1016/S0146-664X(72)80017-0.

[2] David Douglas & Thomas Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature”, The Canadian Cartographer 10(2), 112–122 (1973) doi:10.3138/FM57-6770-U75U-7727

Return
af::array with the selected points.
Parameters
  • pointList: Set of input points.
  • epsilon: It acts as the threshold value to decide which points should be considered meaningful or not.

af::array khiva::dimensionality::SAX(af::array a, int alphabetSize)

Symbolic Aggregate approXimation (SAX). It transforms a numeric time series into a time series of symbols with the same size. The algorithm was proposed by Lin et al. [1] and extends the PAA-based approach, inheriting the original algorithm's simplicity and low computational complexity while providing satisfactory sensitivity and selectivity in range query processing. Moreover, the use of a symbolic representation opens the door to the existing wealth of data structures and string-manipulation algorithms in computer science, such as hashing, regular expressions, pattern matching, suffix trees, and grammatical inference.

[1] Lin, J., Keogh, E., Lonardi, S. & Chiu, B. (2003) A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. In proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13.

Return
result An array of symbols.
Parameters
  • a: Array with the input time series.
  • alphabetSize: Number of elements in the alphabet.

std::vector<Point> khiva::dimensionality::visvalingam(std::vector<Point> pointList, int numPoints)

Reduces a set of points by applying the Visvalingam method (minimum triangle area) until the number of points is reduced to numPoints.

[1] M. Visvalingam and J. D. Whyatt, Line generalisation by repeated elimination of points, The Cartographic Journal, 1993.

Return
std::vector<khiva::dimensionality::Point> where the number of points has been reduced to numPoints.
Parameters
  • pointList: Expects an input vector of points.
  • numPoints: Sets the number of points returned after the execution of the method.
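
The minimum-triangle-area elimination can be illustrated with the simple quadratic-time sketch below. It is not Khiva's implementation: visvalingamSketch and triangleArea are hypothetical names, the Point alias uses double, and the two endpoints are assumed to be always kept.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // (x, y); illustrative alias only

// Area of the triangle formed by three points.
double triangleArea(const Point& a, const Point& b, const Point& c) {
    return 0.5 * std::fabs((b.first - a.first) * (c.second - a.second) -
                           (c.first - a.first) * (b.second - a.second));
}

// Hypothetical sketch: repeatedly drop the interior point whose triangle
// with its two neighbours has the smallest area. Endpoints are never removed.
std::vector<Point> visvalingamSketch(std::vector<Point> pts, std::size_t numPoints) {
    while (pts.size() > numPoints && pts.size() > 2) {
        std::size_t best = 1;
        double bestArea = std::numeric_limits<double>::max();
        for (std::size_t i = 1; i + 1 < pts.size(); ++i) {
            const double area = triangleArea(pts[i - 1], pts[i], pts[i + 1]);
            if (area < bestArea) {
                bestArea = area;
                best = i;
            }
        }
        pts.erase(pts.begin() + static_cast<std::ptrdiff_t>(best));
    }
    return pts;
}
```

A production implementation would typically keep the areas in a priority queue instead of rescanning all points on every removal; the rescan is used here only to keep the sketch short.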

af::array khiva::dimensionality::visvalingam(af::array pointList, int numPoints)

Reduces a set of points by applying the Visvalingam method (minimum triangle area) until the number of points is reduced to numPoints.

[1] M. Visvalingam and J. D. Whyatt, Line generalisation by repeated elimination of points, The Cartographic Journal, 1993.

Return
af::array where the number of points has been reduced to numPoints.
Parameters
  • pointList: Expects an input array of points.
  • numPoints: Sets the number of points returned after the execution of the method.

Namespace Distances

namespace khiva::distances

Functions

double khiva::distances::dtw(std::vector<double> a, std::vector<double> b)

Calculates the Dynamic Time Warping Distance.

Return
double The resulting distance between a and b.
Parameters
  • a: The input time series of reference.
  • b: The input query.
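
For readers unfamiliar with Dynamic Time Warping, the following minimal sketch shows the classic dynamic-programming formulation. It is an illustration under stated assumptions and not Khiva's implementation: dtwSketch is a hypothetical name, and the local cost is assumed to be the absolute difference between samples, which may differ from the cost Khiva uses.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch of O(n*m) DTW with |a_i - b_j| as the local cost.
double dtwSketch(const std::vector<double>& a, const std::vector<double>& b) {
    const std::size_t n = a.size();
    const std::size_t m = b.size();
    const double inf = std::numeric_limits<double>::infinity();
    // cost[i][j] = best accumulated cost aligning a[0..i) with b[0..j).
    std::vector<std::vector<double>> cost(n + 1, std::vector<double>(m + 1, inf));
    cost[0][0] = 0.0;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const double d = std::fabs(a[i - 1] - b[j - 1]);
            cost[i][j] = d + std::min({cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]});
        }
    }
    return cost[n][m];
}
```

Because warping lets one sample match several, the series 1, 2, 3 and 1, 2, 2, 3 are at distance zero under this sketch even though their lengths differ.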

af::array khiva::distances::dtw(af::array tss)

Calculates the Dynamic Time Warping Distance.

Return
af::array An upper triangular matrix where each position corresponds to the distance between two time series. Diagonal elements will be zero. For example: Position row 0 column 1 records the distance between time series 0 and time series 1.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::distances::euclidean(af::array tss)

Calculates Euclidean distances between time series.

Return
af::array An upper triangular matrix where each position corresponds to the distance between two time series. Diagonal elements will be zero. For example: Position row 0 column 1 records the distance between time series 0 and time series 1.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
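
The layout of the returned upper triangular matrix can be illustrated with plain nested vectors. This is a sketch and not Khiva's implementation: pairwiseEuclideanSketch and the Matrix alias are hypothetical, and, unlike the af::array input, each inner vector tss[k] here holds one complete time series.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // illustrative alias only

// Hypothetical sketch: dist[i][j] (i < j) is the Euclidean distance between
// series i and series j; the diagonal and lower triangle stay at zero.
// Assumes all series have the same length.
Matrix pairwiseEuclideanSketch(const Matrix& tss) {
    const std::size_t n = tss.size();
    Matrix dist(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double sum = 0.0;
            for (std::size_t t = 0; t < tss[i].size(); ++t) {
                const double diff = tss[i][t] - tss[j][t];
                sum += diff * diff;
            }
            dist[i][j] = std::sqrt(sum);
        }
    }
    return dist;
}
```

The same loop structure applies to the other pairwise distances in this namespace; only the per-pair computation changes.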

af::array khiva::distances::hamming(af::array tss)

Calculates Hamming distances between time series.

Return
af::array An upper triangular matrix where each position corresponds to the distance between two time series. Diagonal elements will be zero. For example: Position row 0 column 1 records the distance between time series 0 and time series 1.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::distances::manhattan(af::array tss)

Calculates Manhattan distances between time series.

Return
af::array An upper triangular matrix where each position corresponds to the distance between two time series. Diagonal elements will be zero. For example: Position row 0 column 1 records the distance between time series 0 and time series 1.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::distances::squaredEuclidean(af::array tss)

Calculates the squared Euclidean distance between time series, that is, the Euclidean distance without the final square root.

Return
af::array An upper triangular matrix where each position corresponds to the distance between two time series. Diagonal elements will be zero. For example: Position row 0 column 1 records the distance between time series 0 and time series 1.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

Namespace Features

namespace khiva::features

Functions

af::array khiva::features::absEnergy(af::array base)

Calculates the absolute energy of the time series, which is the sum over the squared values:

\[ E = \sum_{i=1,\ldots, n} x_i^2. \]

Return
af::array An array with the same dimensions as base, whose values (time series in dimension 0) contain the sum of the squared values in the time series.
Parameters
  • base: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
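
For a single time series, the sum of squares above reduces to a one-line accumulation. The sketch below uses a plain std::vector<double>; absEnergySketch is a hypothetical name and not part of Khiva.

```cpp
#include <vector>

// Hypothetical sketch: absolute energy of one series, E = sum of x_i^2.
double absEnergySketch(const std::vector<double>& xs) {
    double energy = 0.0;
    for (double x : xs) {
        energy += x * x;
    }
    return energy;
}
```

For the series 1, 2, 3 this gives 1 + 4 + 9 = 14.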

af::array khiva::features::absoluteSumOfChanges(af::array tss)

Calculates the sum over the absolute value of consecutive changes in the time series:

\[ \sum_{i=1, \ldots, n-1} \mid x_{i+1}- x_i \mid. \]

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the sum of the absolute values of consecutive changes in the time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
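
A minimal single-series sketch of this feature follows. The name absoluteSumOfChangesSketch is hypothetical and the code is only an illustration of the sum of absolute consecutive differences, not Khiva's implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: sum of |x_{i+1} - x_i| over one series.
double absoluteSumOfChangesSketch(const std::vector<double>& xs) {
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        sum += std::fabs(xs[i + 1] - xs[i]);
    }
    return sum;
}
```

For the series 1, 3, 2, 5 the consecutive changes are 2, -1 and 3, so the result is 6.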

af::array khiva::features::aggregatedAutocorrelation(af::array tss, af::array (*aggregationFunction)(const af::array&, const bool, const dim_t))

Calculates the value of an aggregation function f_agg (e.g. var or mean) of the autocorrelation (compare to http://en.wikipedia.org/wiki/Autocorrelation#Estimation), taken over all possible lags (1 to the length of x).

\[ \frac{1}{n-1} \sum_{l=1,\ldots, n} \frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu), \]
where \(n\) is the length of the time series \(X_i\), \(\sigma^2\) its variance and \(\mu\) its mean.

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the aggregated correlation for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • aggregationFunction: The function to summarise all autocorrelation with different lags.

af::array khiva::features::aggregatedAutocorrelation(af::array tss, af::array (*aggregationFunction)(const af::array&, const int))

Calculates the value of an aggregation function f_agg (e.g. var or mean) of the autocorrelation (compare to http://en.wikipedia.org/wiki/Autocorrelation#Estimation), taken over all possible lags (1 to the length of x).

\[ \frac{1}{n-1} \sum_{l=1,\ldots, n} \frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu), \]
where \(n\) is the length of the time series \(X_i\), \(\sigma^2\) its variance and \(\mu\) its mean.

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the aggregated correlation for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • aggregationFunction: The function to summarise all autocorrelation with different lags.

af::array khiva::features::aggregatedAutocorrelation(af::array tss, af::array (*aggregationFunction)(const af::array&, const dim_t))

Calculates the value of an aggregation function f_agg (e.g. var or mean) of the autocorrelation (compare to http://en.wikipedia.org/wiki/Autocorrelation#Estimation), taken over all possible lags (1 to the length of x).

\[ \frac{1}{n-1} \sum_{l=1,\ldots, n} \frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu), \]
where \(n\) is the length of the time series \(X_i\), \(\sigma^2\) its variance and \(\mu\) its mean.

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the aggregated correlation for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • aggregationFunction: The function to summarise all autocorrelation with different lags.

void khiva::features::aggregatedLinearTrend(af::array t, long chunkSize, af::array (*aggregationFunction)(const af::array&, const int), af::array &slope, af::array &intercept, af::array &rvalue, af::array &pvalue, af::array &stderrest)

Calculates a linear least-squares regression for values of the time series that were aggregated over chunks versus the sequence from 0 up to the number of chunks minus one.

Parameters
  • t: The time series to calculate the features of.
  • chunkSize: The chunkSize used to aggregate the data.
  • aggregationFunction: Function to be used in the aggregation.
  • slope: Slope of the regression line.
  • intercept: Intercept of the regression line.
  • rvalue: Correlation coefficient.
  • pvalue: Two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero, using Wald Test with t-distribution of the test statistic.
  • stderrest: Standard error of the estimated gradient.

void khiva::features::aggregatedLinearTrend(af::array t, long chunkSize, af::array (*aggregationFunction)(const af::array&, const dim_t), af::array &slope, af::array &intercept, af::array &rvalue, af::array &pvalue, af::array &stderrest)

Calculates a linear least-squares regression for values of the time series that were aggregated over chunks versus the sequence from 0 up to the number of chunks minus one.

Parameters
  • t: The time series to calculate the features of.
  • chunkSize: The chunkSize used to aggregate the data.
  • aggregationFunction: Function to be used in the aggregation.
  • slope: Slope of the regression line.
  • intercept: Intercept of the regression line.
  • rvalue: Correlation coefficient.
  • pvalue: Two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero, using Wald Test with t-distribution of the test statistic.
  • stderrest: Standard error of the estimated gradient.

af::array khiva::features::approximateEntropy(af::array tss, int m, float r)

Calculates a vectorized Approximate entropy algorithm (https://en.wikipedia.org/wiki/Approximate_entropy). For short time series, this method is highly dependent on the parameters, but should be stable for N > 2000, see:

[1] Yentes et al., The Appropriate Use of Approximate Entropy and Sample Entropy with Short Data Sets, (2012). Other shortcomings and alternatives discussed in: Richman & Moorman, Physiological time-series analysis using approximate entropy and sample entropy, (2000).

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the vectorized Approximate entropy for all the input time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • m: Length of compared run of data.
  • r: Filtering level, must be positive.

af::array khiva::features::autoCorrelation(af::array tss, long maxLag, bool unbiased = false)

Calculates the autocorrelation of the specified lag for the given time series, according to the formula [1].

\[ \frac{1}{(n-l)\sigma^{2}} \sum_{t=1}^{n-l}(X_{t}-\mu )(X_{t+l}-\mu), \]
where \(n\) is the length of the time series \(X_i\), \(\sigma^2\) its variance and \(\mu\) its mean, \(l\) denotes the lag.

[1] https://en.wikipedia.org/wiki/Autocorrelation#Estimation

Return
af::array The autocorrelation value for the given time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • maxLag: The maximum lag to compute.
  • unbiased: Determines whether it divides by (n - lag) (if true), or n (if false).

af::array khiva::features::autoCovariance(af::array xss, bool unbiased = false)

Calculates the auto-covariance of the given time series.

Return
af::array The auto-covariance value for the given time series.
Parameters
  • xss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • unbiased: Determines whether it divides by n - lag (if true) or n (if false).

af::array khiva::features::binnedEntropy(af::array tss, int max_bins)

Calculates the binned entropy for the given time series and number of bins. It calculates the value of:

\[ \sum_{k=0}^{min(max\_bins, len(x))} p_k log(p_k) \cdot \mathbf{1}_{(p_k > 0)}, \]
where \(p_k\) is the percentage of samples in bin \(k\).

Return
af::array The binned entropy value for the given time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • max_bins: The number of bins.

af::array khiva::features::c3(af::array tss, long lag)

This function calculates the value of:

\[ \frac{1}{n-2lag} \sum_{i=0}^{n-2lag} x_{i + 2 \cdot lag}^2 \cdot x_{i + lag} \cdot x_{i}, \]
which is:
\[ \mathbb{E}[L^2(X)^2 \cdot L(X) \cdot X], \]
where \(\mathbb{E}\) is the mean and \(L\) is the lag operator. It was proposed in [1] as a measure of non linearity in the time series.

[1] Schreiber, T. and Schmitz, A., Discrimination power of measures for nonlinearity in a time series, PHYSICAL REVIEW E, VOLUME 55, NUMBER 5, (1997).

Return
af::array The non-linearity value for the given time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • lag: The lag.
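
A single-series sketch of the c3 statistic is shown below. The name c3Sketch is hypothetical and this is not Khiva's implementation; the sketch averages over the n - 2·lag index positions for which all three samples exist, and returns 0 when the series is too short for the requested lag (an assumption made here for illustration).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: mean of x_{i+2*lag}^2 * x_{i+lag} * x_i over valid i.
double c3Sketch(const std::vector<double>& xs, std::size_t lag) {
    const std::size_t n = xs.size();
    if (2 * lag >= n) return 0.0;  // not enough samples for this lag
    const std::size_t terms = n - 2 * lag;
    double sum = 0.0;
    for (std::size_t i = 0; i < terms; ++i) {
        sum += xs[i + 2 * lag] * xs[i + 2 * lag] * xs[i + lag] * xs[i];
    }
    return sum / static_cast<double>(terms);
}
```

For the series 1, 2, 3, 4 with lag 1, the two terms are 9·2·1 = 18 and 16·3·2 = 96, giving a mean of 57.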

af::array khiva::features::cidCe(af::array tss, bool zNormalize = false)

Calculates an estimate of the time series complexity [1]. It calculates the value of:

\[ \sqrt{ \sum_{i=0}^{n-2} ( x_{i} - x_{i+1})^2 }. \]

[1] Batista, Gustavo EAPA, et al (2014). CID: an efficient complexity-invariant distance for time series. Data Mining and Knowledge Discovery 28.3 (2014): 634-669.

Return
af::array The complexity value for the given time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • zNormalize: Controls whether the time series should be z-normalized or not.
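
The complexity estimate, the square root of the sum of squared consecutive differences, can be sketched for one series as follows. cidCeSketch is a hypothetical name, the code is not Khiva's implementation, and the optional z-normalization step is deliberately left out.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch: complexity estimate of one series,
// without the optional z-normalization.
double cidCeSketch(const std::vector<double>& xs) {
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < xs.size(); ++i) {
        const double diff = xs[i] - xs[i + 1];
        sum += diff * diff;
    }
    return std::sqrt(sum);
}
```

For the series 0, 3, 7 the consecutive differences are 3 and 4, so the estimate is 5.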

af::array khiva::features::countAboveMean(af::array tss)

Calculates the number of values in the time series that are higher than the mean.

Return
af::array The number of values in the time series that are higher than the mean.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
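
As a small illustration for a single series, the count can be computed in two passes: one for the mean and one for the comparison. countAboveMeanSketch is a hypothetical name and not part of Khiva; values equal to the mean are not counted, matching the strict "higher than" wording.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: number of values strictly greater than the mean.
std::size_t countAboveMeanSketch(const std::vector<double>& xs) {
    if (xs.empty()) return 0;
    double mean = 0.0;
    for (double x : xs) mean += x;
    mean /= static_cast<double>(xs.size());

    std::size_t count = 0;
    for (double x : xs) {
        if (x > mean) ++count;
    }
    return count;
}
```

For the series 1, 2, 3, 10 the mean is 4, so only the last value is counted.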

af::array khiva::features::countBelowMean(af::array tss)

Calculates the number of values in the time series that are lower than the mean.

Return
af::array The number of values in the time series that are lower than the mean.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::crossCovariance(af::array xss, af::array yss, bool unbiased = true)

Calculates the cross-covariance of the given time series.

Return
af::array The cross-covariance value for the given time series.
Parameters
  • xss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • yss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • unbiased: Determines whether it divides by n - lag (if true) or n (if false).

af::array khiva::features::crossCorrelation(af::array xss, af::array yss, bool unbiased = true)

Calculates the cross-correlation of the given time series.

Return
af::array The cross-correlation value for the given time series.
Parameters
  • xss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • yss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • unbiased: Determines whether it divides by n - lag (if true) or n (if false).

af::array khiva::features::cwtCoefficients(af::array tss, af::array widths, int coeff, int w)

Calculates a continuous wavelet transform (CWT) for the Ricker wavelet, also known as the “Mexican hat wavelet”, which is defined by:

\[ \frac{2}{\sqrt{3a} \pi^{\frac{1}{4}}} \left(1 - \frac{x^2}{a^2}\right) \exp\left(-\frac{x^2}{2a^2}\right), \]
where \(a\) is the width parameter of the wavelet function. This feature calculator takes three different parameters: widths, coeff and w. It takes all the different width arrays and calculates the CWT once for each of them. Then the values of the coefficient coeff for the width w are returned.

Return
af::array Result of calculated coefficients.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • widths: Array that contains all different widths.
  • coeff: Coefficient of interest.
  • w: Width of interest.

af::array khiva::features::energyRatioByChunks(af::array tss, long numSegments, long segmentFocus)

Calculates the sum of squares of chunk i out of N chunks, expressed as a ratio of the sum of squares over the whole series. Here N is numSegments and i is segmentFocus, which should be lower than the number of segments.

Return
af::array The energy ratio by chunk of the time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • numSegments: The number of segments to divide the series into.
  • segmentFocus: The segment number (starting at zero) to return a feature on.
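
The ratio described above can be sketched for a single series in plain C++. This is an illustration only, not Khiva's implementation: the helper name energyRatioOfChunk is hypothetical, and the rule that every chunk has length ceil(n / numSegments) with a possibly shorter last chunk is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: energy (sum of squares) of one chunk divided by the
// energy of the whole series. Assumed chunk length: ceil(n / numSegments).
double energyRatioOfChunk(const std::vector<double>& x, std::size_t numSegments,
                          std::size_t segmentFocus) {
    assert(numSegments > 0 && segmentFocus < numSegments);
    const std::size_t n = x.size();
    const std::size_t chunkLength = (n + numSegments - 1) / numSegments;
    const std::size_t begin = std::min(segmentFocus * chunkLength, n);
    const std::size_t end = std::min(begin + chunkLength, n);

    double total = 0.0;
    for (double v : x) total += v * v;
    double chunk = 0.0;
    for (std::size_t i = begin; i < end; ++i) chunk += x[i] * x[i];
    return chunk / total;  // not a number for an all-zero series
}
```

For the series 1, 2, 3, 4 split into two segments, the total energy is 30, so segment 0 yields 5/30 and segment 1 yields 25/30.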

af::array khiva::features::fftAggregated(af::array tss)

Calculates the spectral centroid (mean), variance, skew, and kurtosis of the absolute Fourier transform spectrum.

Return
af::array The spectral centroid (mean), variance, skew, and kurtosis of the absolute Fourier transform spectrum.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

void khiva::features::fftCoefficient(af::array tss, long coefficient, af::array &real, af::array &imag, af::array &abs, af::array &angle)

Calculates the Fourier coefficients of the one-dimensional discrete Fourier transform for real input by using the fast Fourier transform algorithm:

\[ A_k = \sum_{m=0}^{n-1} a_m \exp \left \{ -2 \pi i \frac{m k}{n} \right \}, \qquad k = 0, \ldots , n-1. \]

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • coefficient: The coefficient to extract from the FFT.
  • real: The real part of the coefficient.
  • imag: The imaginary part of the coefficient.
  • abs: The absolute value of the coefficient.
  • angle: The angle of the coefficient.
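
To make the four outputs concrete, the sketch below evaluates a single coefficient A_k directly from the definition above instead of running an FFT. It is a plain C++ illustration for one series; the helper name dftCoefficient is hypothetical and the code is not Khiva's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: A_k = sum_m a_m * exp(-2*pi*i*m*k/n), evaluated naively.
std::complex<double> dftCoefficient(const std::vector<double>& a, std::size_t k) {
    assert(!a.empty() && k < a.size());
    const double pi = std::acos(-1.0);
    const double n = static_cast<double>(a.size());
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t m = 0; m < a.size(); ++m) {
        const double phase = -2.0 * pi * static_cast<double>(m * k) / n;
        sum += a[m] * std::complex<double>(std::cos(phase), std::sin(phase));
    }
    return sum;
}
```

Given c = dftCoefficient(a, k), the real and imaginary parts are c.real() and c.imag(), the absolute value is std::abs(c), and std::arg(c) gives the angle in radians. Which angular unit Khiva reports is not stated in this section.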

af::array khiva::features::firstLocationOfMaximum(af::array tss)

Calculates the first relative location of the maximal value for each time series.

Return
af::array The first location of the maximum value, relative to the length of the time series, for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
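
A relative location can be read as an index divided by the series length. The sketch below shows this for one series in plain C++, assuming a zero-based index; the helper name is hypothetical and the code is not Khiva's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical helper: zero-based index of the first maximum divided by the length.
double firstRelativeLocationOfMaximum(const std::vector<double>& x) {
    assert(!x.empty());
    // std::max_element returns the first of several equal maxima.
    const auto it = std::max_element(x.begin(), x.end());
    return static_cast<double>(std::distance(x.begin(), it)) / static_cast<double>(x.size());
}
```

For the series 1, 5, 2, 5 the first maximum sits at index 1 of 4 values, which gives 0.25.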

af::array khiva::features::firstLocationOfMinimum(af::array tss)

Calculates the first location of the minimal value of each time series. The position is calculated relative to the length of the series.

Return
af::array The first relative location of the minimal value of each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::friedrichCoefficients(af::array tss, int m, float r)

Calculates the coefficients of the polynomial \(h(x)\), which has been fitted to the deterministic dynamics of the Langevin model:

\[ \dot{x}(t) = h(x(t)) + R \mathcal{N}(0,1) \]
as described by [1]. For short time series this method is highly dependent on the parameters.

[1] Friedrich et al., Extracting model equations from experimental data, Physics Letters A 271, p. 217-222, (2000).

Return
af::array The coefficients for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • m: Order of the polynomial to fit for estimating the fixed points of the dynamics.
  • r: Number of quantiles to use for averaging.

af::array khiva::features::hasDuplicates(af::array tss)

Checks whether the input time series contain duplicated elements.

Return
af::array Array containing true if the time series contains duplicated elements and false otherwise.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::hasDuplicateMax(af::array tss)

Checks whether the maximum value within each time series is duplicated.

Return
af::array Array containing true if the maximum value of the time series is duplicated and false otherwise.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::hasDuplicateMin(af::array tss)

Checks whether the minimum value within each time series is duplicated.

Return
af::array Array containing true if the minimum of the time series is duplicated and false otherwise.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::indexMassQuantile(af::array tss, float q)

Calculates the relative index \(i\) where \(q\%\) of the mass of the time series within tss lies to the left of \(i\). For example, for \(q = 50\%\) this feature calculator will return the mass center of the time series.

Return
af::array The relative indices i where q% of the mass of the time series lies to the left of i.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • q: The quantile limit.
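
One way to read the definition above is through the cumulative sum of absolute values. The sketch below does this for one series in plain C++. It rests on several assumptions that this section does not confirm: q is passed as a fraction in (0, 1], mass means absolute value, and the reported index is one-based before dividing by the length. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: first position at which the cumulative absolute mass
// reaches the fraction q of the total mass, reported as (index + 1) / length.
double indexMassQuantileSketch(const std::vector<double>& x, double q) {
    assert(!x.empty() && q > 0.0 && q <= 1.0);
    double total = 0.0;
    for (double v : x) total += std::fabs(v);
    double cumulative = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        cumulative += std::fabs(x[i]);
        if (cumulative >= q * total) {
            return static_cast<double>(i + 1) / static_cast<double>(x.size());
        }
    }
    return 1.0;  // only reachable through floating-point rounding
}
```

For a constant series and q = 0.5 the result is 0.5, the mass center mentioned above; for a series whose mass is concentrated in the last value the result is 1.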

af::array khiva::features::kurtosis(af::array tss)

Returns the kurtosis of tss (calculated with the adjusted Fisher-Pearson standardized moment coefficient G2).

Return
af::array The kurtosis of tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::largeStandardDeviation(af::array tss, float r)

Checks if the time series within tss have a large standard deviation, that is, whether:

\[ \mathrm{std}(x) > r \cdot (\max(x) - \min(x)). \]

Return
af::array Array containing True for those time series in tss that have a large standard deviation.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • r: The threshold value.
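
The inequality above can be checked for a single series as follows. This is a plain C++ sketch, not Khiva's implementation; the helper name is hypothetical, and the use of the population standard deviation (dividing by n) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: true when std(x) > r * (max(x) - min(x)).
// Assumption: population standard deviation (divides by n).
bool hasLargeStandardDeviation(const std::vector<double>& x, double r) {
    assert(!x.empty());
    const double n = static_cast<double>(x.size());
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= n;
    double variance = 0.0;
    for (double v : x) variance += (v - mean) * (v - mean);
    variance /= n;
    const auto range = std::minmax_element(x.begin(), x.end());
    return std::sqrt(variance) > r * (*range.second - *range.first);
}
```

For the series 1, 2, 3, 4, 5 the standard deviation is about 1.41 and the range is 4, so the check passes for r = 0.3 and fails for r = 0.4.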

af::array khiva::features::lastLocationOfMaximum(af::array tss)

Calculates the last location of the maximum value of each time series. The position is calculated relative to the length of the series.

Return
af::array The last relative location of the maximum value of each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::lastLocationOfMinimum(af::array tss)

Calculates the last location of the minimum value of each time series. The position is calculated relative to the length of the series.

Return
af::array The last relative location of the minimum value of each series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::length(af::array tss)

Returns the length of the input time series.

Return
af::array The length of tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

void khiva::features::linearTrend(af::array tss, af::array &pvalue, af::array &rvalue, af::array &intercept, af::array &slope, af::array &stder)

Calculates a linear least-squares regression of the values of the time series versus the sequence from 0 to the length of the time series minus one.

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • pvalue: The p-values for all time series.
  • rvalue: The r-values for all time series.
  • intercept: The intercept values for all time series.
  • slope: The slope for all time series.
  • stder: The stderr values for all time series.
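
The regression described above, with the time index 0, 1, ..., n - 1 as the independent variable, can be sketched for one series in plain C++. The sketch covers only the slope, the intercept and the r-value; the p-value and the standard error are left out. The struct and function names are hypothetical and the code is not Khiva's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for the sketch.
struct LinearTrend {
    double slope;
    double intercept;
    double rvalue;
};

// Ordinary least squares of y[t] against t = 0, 1, ..., n - 1.
LinearTrend fitLinearTrend(const std::vector<double>& y) {
    assert(y.size() >= 2);
    const double n = static_cast<double>(y.size());
    const double meanT = (n - 1.0) / 2.0;
    double meanY = 0.0;
    for (double v : y) meanY += v;
    meanY /= n;

    double sTT = 0.0, sTY = 0.0, sYY = 0.0;
    for (std::size_t t = 0; t < y.size(); ++t) {
        const double dt = static_cast<double>(t) - meanT;
        const double dy = y[t] - meanY;
        sTT += dt * dt;
        sTY += dt * dy;
        sYY += dy * dy;
    }
    const double slope = sTY / sTT;
    const double intercept = meanY - slope * meanT;
    // A constant series has no defined correlation; 0 is returned by convention here.
    const double rvalue = sYY > 0.0 ? sTY / std::sqrt(sTT * sYY) : 0.0;
    return {slope, intercept, rvalue};
}
```

For the series 1, 3, 5, 7 the fit is exact: slope 2, intercept 1 and r-value 1.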

af::array khiva::features::localMaximals(af::array tss)

Calculates all local maxima for the time series in tss.

Return
af::array The calculated local maxima for each time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::longestStrikeAboveMean(af::array tss)

Calculates the length of the longest consecutive subsequence in tss that is above the mean of tss.

Return
af::array The length of the longest consecutive subsequence in the input time series that is above the mean.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
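
The longest run above the mean can be found in a single pass once the mean is known. The following plain C++ sketch handles one series; the helper name is hypothetical and the code is not Khiva's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: length of the longest run of consecutive values
// that are strictly greater than the mean of the series.
std::size_t longestRunAboveMean(const std::vector<double>& x) {
    assert(!x.empty());
    const double mean = std::accumulate(x.begin(), x.end(), 0.0) / static_cast<double>(x.size());
    std::size_t best = 0;
    std::size_t current = 0;
    for (double v : x) {
        current = (v > mean) ? current + 1 : 0;  // a value at or below the mean ends the run
        best = std::max(best, current);
    }
    return best;
}
```

Replacing the comparison with v < mean gives the corresponding run below the mean.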

af::array khiva::features::longestStrikeBelowMean(af::array tss)

Calculates the length of the longest consecutive subsequence in tss that is below the mean of tss.

Return
af::array The length of the longest consecutive subsequence in the input time series that is below the mean.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::maxLangevinFixedPoint(af::array tss, int m, float r)

Calculates the largest fixed point of dynamics \(\max_x \{h(x)=0\}\), estimated from the polynomial \(h(x)\), which has been fitted to the deterministic dynamics of the Langevin model:

\[ \dot{x}(t) = h(x(t)) + R \mathcal{N}(0,1) \]
as described by [1].

[1] Friedrich et al., Extracting model equations from experimental data, Physics Letters A 271, p. 217-222, (2000).

Return
af::array Largest fixed point of deterministic dynamics.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series. NOTE: the time series should be sorted.
  • m: Order of the polynomial to fit for estimating the fixed points of the dynamics.
  • r: Number of quantiles to use for averaging.

af::array khiva::features::maximum(af::array tss)

Calculates the maximum value for each time series within tss.

Return
af::array The maximum value of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::mean(af::array tss)

Calculates the mean value for each time series within tss.

Return
af::array The mean value of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::meanAbsoluteChange(af::array tss)

Calculates the mean over the absolute differences between subsequent time series values in tss.

\[ \frac{1}{n} \sum_{i=1,\ldots, n-1} | x_{i+1} - x_{i}|. \]

Return
af::array The mean over the absolute differences between subsequent time series values.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
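
The formula above translates directly into a loop over neighbouring values. The plain C++ sketch below follows the formula as written, so the sum of the n - 1 absolute differences is divided by n; the helper name is hypothetical and the code is not Khiva's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: (1 / n) * sum over i of |x[i + 1] - x[i]|,
// following the documented formula (division by n, not by n - 1).
double meanAbsoluteChangeSketch(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        sum += std::fabs(x[i + 1] - x[i]);
    }
    return sum / static_cast<double>(x.size());
}
```

For the series 1, 4, 2, 2 the absolute differences are 3, 2 and 0, so the result is 5 / 4 = 1.25. Averaging the differences themselves would divide by 3 instead.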

af::array khiva::features::meanChange(af::array tss)

Calculates the mean over the differences between subsequent time series values in tss.

\[ \frac{1}{n} \sum_{i=1,\ldots, n-1} (x_{i+1} - x_{i}). \]

Return
af::array The mean over the differences between subsequent time series values.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::meanSecondDerivativeCentral(af::array tss)

Calculates the mean value of a central approximation of the second derivative for each time series in tss.

\[ \frac{1}{n} \sum_{i=1,\ldots, n-1} \frac{1}{2} (x_{i+2} - 2 \cdot x_{i+1} + x_i). \]

Return
af::array The mean value of a central approximation of the second derivative for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::median(af::array tss)

Calculates the median value for each time series within tss.

Return
af::array The median value of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::minimum(af::array tss)

Calculates the minimum value for each time series within tss.

Return
af::array The minimum value of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::numberCrossingM(af::array tss, int m)

Calculates the number of m-crossings. An m-crossing is defined as two sequential values where the first value is lower than m and the next is greater, or vice versa. If you set m to zero, you will get the number of zero crossings.

Return
af::array The number of m-crossings of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • m: The m value.
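
The definition of an m-crossing can be applied pair by pair. The plain C++ sketch below uses the strict reading of that definition, so a value exactly equal to m never takes part in a crossing; whether Khiva treats equality the same way is not stated here. The helper name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts pairs of sequential values that lie on
// opposite sides of m (strictly below and strictly above).
std::size_t countCrossings(const std::vector<double>& x, double m) {
    assert(!x.empty());
    std::size_t crossings = 0;
    for (std::size_t i = 0; i + 1 < x.size(); ++i) {
        const bool upward = x[i] < m && x[i + 1] > m;
        const bool downward = x[i] > m && x[i + 1] < m;
        if (upward || downward) ++crossings;
    }
    return crossings;
}
```

The series -1, 1, -1, 1 has three zero crossings, while 1, 2, 3 has none.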

af::array khiva::features::numberCwtPeaks(af::array tss, int maxW)

This feature calculator searches for different peaks. To do so, the time series is smoothed by a Ricker wavelet for widths ranging from 1 to maxW. This feature calculator returns the number of peaks that occur at enough width scales and with a sufficiently high signal-to-noise ratio (SNR).

Return
af::array The number of peaks for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • maxW: The maximum width to consider.

af::array khiva::features::numberPeaks(af::array tss, int n)

Calculates the number of peaks of at least support \(n\) in the time series \(tss\). A peak of support \(n\) is defined as a subsequence of \(tss\) where a value occurs that is bigger than its \(n\) neighbours to the left and to the right.

[1] Bioinformatics (2006) 22 (17): 2059-2065. doi: 10.1093/bioinformatics/btl355, http://bioinformatics.oxfordjournals.org/content/22/17/2059.long

Return
af::array The number of peaks of at least support \(n\).
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • n: The support of the peak.
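
The notion of support can be made concrete with a direct check of the n neighbours on each side. This plain C++ sketch handles one series and only considers positions that have n neighbours on both sides; the helper name is hypothetical and the code is not Khiva's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: counts values that are strictly greater than each of
// their n neighbours to the left and each of their n neighbours to the right.
std::size_t countPeaks(const std::vector<double>& x, std::size_t n) {
    assert(n > 0);
    std::size_t peaks = 0;
    for (std::size_t i = n; i + n < x.size(); ++i) {
        bool isPeak = true;
        for (std::size_t k = 1; k <= n && isPeak; ++k) {
            isPeak = x[i] > x[i - k] && x[i] > x[i + k];
        }
        if (isPeak) ++peaks;
    }
    return peaks;
}
```

In the series 0, 1, 0, 2, 0, 0, 3, 0 there are three peaks of support 1, but only the value 2 is a peak of support 2, because the value 3 has a single neighbour to its right.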

af::array khiva::features::partialAutocorrelation(af::array tss, af::array lags)

Calculates the value of the partial autocorrelation function at the given lag. The lag \(k\) partial autocorrelation of a time series \(\lbrace x_t, t = 1 \ldots T \rbrace\) equals the partial correlation of \(x_t\) and \(x_{t-k}\), adjusted for the intermediate variables \(\lbrace x_{t-1}, \ldots, x_{t-k+1} \rbrace\) ([1]). Following [2], it can be defined as:

\[ \alpha_k = \frac{ Cov(x_t, x_{t-k} | x_{t-1}, \ldots, x_{t-k+1})} {\sqrt{ Var(x_t | x_{t-1}, \ldots, x_{t-k+1}) Var(x_{t-k} | x_{t-1}, \ldots, x_{t-k+1} )}} \]
with (a) \(x_t = f(x_{t-1}, \ldots, x_{t-k+1})\) and (b) \( x_{t-k} = f(x_{t-1}, \ldots, x_{t-k+1})\) being AR(k-1) models that can be fitted by OLS. Be aware that in (a), the regression is done on past values to predict \( x_t \), whereas in (b), future values are used to calculate the past value \(x_{t-k}\). It is said in [1] that, for an AR(p), the partial autocorrelations \( \alpha_k \) will be nonzero for \( k \leq p \) and zero for \( k > p \). With this property, it is used to determine the lag of an AR process.

[1] Box, G. E., Jenkins, G. M., Reinsel, G. C., & Ljung, G. M. (2015). Time series analysis: forecasting and control. John Wiley & Sons.

[2] https://onlinecourses.science.psu.edu/stat510/node/62

Return
af::array The partial autocorrelation for each time series for the given lag.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • lags: Indicates the lags to be calculated.

af::array khiva::features::percentageOfReoccurringDatapointsToAllDatapoints(af::array tss, bool isSorted = false)

Calculates the percentage of unique values that are present in the time series more than once.

\[ \frac{len(\textit{different values occurring more than once})}{len(\textit{different values})} \]
This means the percentage is normalized to the number of unique values, in contrast to the percentageOfReoccurringValuesToAllValues.

Return
af::array The percentage of unique data points that are present in the time series more than once.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • isSorted: Indicates if the input time series is sorted or not. Defaults to false.

af::array khiva::features::percentageOfReoccurringValuesToAllValues(af::array tss, bool isSorted = false)

Calculates the percentage of values that are present in the time series more than once.

\[ \frac{\textit{number of data points occurring more than once}}{\textit{number of all data points}} \]
This means the percentage is normalized to the total number of data points, in contrast to the percentageOfReoccurringDatapointsToAllDatapoints.

Return
af::array The percentage of values that are present in the time series more than once.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • isSorted: Indicates if the input time series is sorted or not. Defaults to false.
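
The two normalizations that appear in the formulas of these two functions, by the number of distinct values and by the number of all data points, can be compared on one series. The plain C++ sketch below computes both ratios from a table of value counts. The names are hypothetical, the code is not Khiva's implementation, and reading "data points occurring more than once" as every data point whose value is repeated is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical result type holding both ratios.
struct ReoccurrenceRatios {
    double distinctValues;  // repeated distinct values / all distinct values
    double dataPoints;      // data points with a repeated value / all data points
};

ReoccurrenceRatios reoccurrenceRatios(const std::vector<double>& x) {
    assert(!x.empty());
    std::map<double, std::size_t> counts;  // value -> number of occurrences
    for (double v : x) ++counts[v];

    std::size_t repeatedDistinct = 0;
    std::size_t repeatedPoints = 0;
    for (const auto& entry : counts) {
        if (entry.second > 1) {
            ++repeatedDistinct;
            repeatedPoints += entry.second;
        }
    }
    return {static_cast<double>(repeatedDistinct) / static_cast<double>(counts.size()),
            static_cast<double>(repeatedPoints) / static_cast<double>(x.size())};
}
```

For the series 1, 1, 2, 3, 3, 3 there are three distinct values, two of which repeat, so the first ratio is 2/3; five of the six data points carry a repeated value, so the second ratio is 5/6.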

af::array khiva::features::quantile(af::array tss, af::array q, float precision = 100000000)

Returns values at the given quantile.

Return
af::array Values at the given quantile.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • q: Percentile(s) at which to extract score(s). One or many.
  • precision: Number of decimals expected.

af::array khiva::features::rangeCount(af::array tss, float min, float max)

Counts observed values within the interval [min, max).

Return
af::array The number of values within the given range, for each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • min: Value that sets the lower limit.
  • max: Value that sets the upper limit.

af::array khiva::features::ratioBeyondRSigma(af::array tss, float r)

Calculates the ratio of values that are more than \(r*std(x)\) (so \(r\) sigma) away from the mean of \(x\).

Return
af::array The ratio of values that are more than \(r*std(x)\) (so \(r\) sigma) away from the mean of \(x\).
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • r: The number of standard deviations that a value must be away from the mean to be counted.
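
The ratio can be computed for a single series as shown in the following plain C++ sketch. The helper name is hypothetical, the code is not Khiva's implementation, and the population standard deviation (dividing by n) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of values with |x - mean(x)| > r * std(x).
// Assumption: population standard deviation (divides by n).
double ratioBeyondRSigmaSketch(const std::vector<double>& x, double r) {
    assert(!x.empty());
    const double n = static_cast<double>(x.size());
    double mean = 0.0;
    for (double v : x) mean += v;
    mean /= n;
    double variance = 0.0;
    for (double v : x) variance += (v - mean) * (v - mean);
    variance /= n;
    const double threshold = r * std::sqrt(variance);

    std::size_t beyond = 0;
    for (double v : x) {
        if (std::fabs(v - mean) > threshold) ++beyond;
    }
    return static_cast<double>(beyond) / n;
}
```

For the series 0, 0, 0, 0, 10 the mean is 2 and the standard deviation is 4, so only the value 10 lies more than one sigma away and the ratio for r = 1 is 0.2.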

af::array khiva::features::ratioValueNumberToTimeSeriesLength(af::array tss)

Calculates a factor which is 1 if all values in the time series occur only once, and below one if this is not the case. In principle, it just returns:

\[ \frac{\textit{number\_unique\_values}}{\textit{number\_values}} \]

Return
af::array The ratio of unique values with respect to the total number of values.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::sampleEntropy(af::array tss)

Calculates a vectorized sample entropy algorithm. For short time-series this method is highly dependent on the parameters, but should be stable for N > 2000, see:

[1] Yentes et al., The Appropriate Use of Approximate Entropy and Sample Entropy with Short Data Sets, (2012).

[2] Richman & Moorman, Physiological time-series analysis using approximate entropy and sample entropy, (2000).

[3] https://en.wikipedia.org/wiki/Sample_entropy

[4] https://www.ncbi.nlm.nih.gov/pubmed/10843903?dopt=Abstract

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) contain the vectorized sample entropy for all the input time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::skewness(af::array tss)

Calculates the sample skewness of tss (calculated with the adjusted Fisher-Pearson standardized moment coefficient G1).

Return
af::array Containing the skewness of each time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::spktWelchDensity(af::array tss, int coeff)

Estimates the cross power spectral density of the time series tss at different frequencies. To do so, the time series is first shifted from the time domain to the frequency domain. Welch’s method computes an estimate of the power spectral density by dividing the data into overlapping segments, computing a modified periodogram for each segment and averaging the periodograms.

[1] P. Welch, “The use of the fast Fourier transform for the estimation of power spectra: A method based on time averaging over short, modified periodograms”, IEEE Trans. Audio Electroacoust. vol. 15, pp. 70-73, 1967.

[2] M.S. Bartlett, “Periodogram Analysis and Continuous Spectra”, Biometrika, vol. 37, pp. 1-16, 1950.

[3] Rabiner, Lawrence R., and B. Gold. “Theory and Application of Digital Signal Processing” Prentice-Hall, pp. 414-419, 1975.

Return
af::array Containing the power spectrum of the different frequencies for each time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • coeff: The coefficient to be returned.

af::array khiva::features::standardDeviation(af::array tss)

Calculates the standard deviation of each time series within tss.

Return
af::array The standard deviation of each time series within tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::sumOfReoccurringDatapoints(af::array tss, bool isSorted = false)

Calculates the sum of all data points that are present in the time series more than once.

Return
af::array The sum of all data points that are present in the time series more than once.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • isSorted: Indicates if the input time series is sorted or not. Defaults to false.

af::array khiva::features::sumOfReoccurringValues(af::array tss, bool isSorted = false)

Calculates the sum of all values that are present in the time series more than once.

Return
af::array The sum of all values that are present in the time series more than once.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • isSorted: Indicates if the input time series is sorted or not. Defaults to false.

af::array khiva::features::sumValues(af::array tss)

Calculates the sum over the time series tss.

Return
af::array An array containing the sum of values in each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::symmetryLooking(af::array tss, float r)

Calculates if the distribution of tss looks symmetric. This is the case if

\[ | mean(tss)-median(tss)| < r * (max(tss)-min(tss)). \]

Return
af::array Denoting if the input time series look symmetric.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • r: The percentage of the range to compare with.
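
The symmetry criterion can be evaluated for one series after sorting a copy of it, which yields the median, the minimum and the maximum at once. This is a plain C++ sketch with a hypothetical helper name, not Khiva's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: true when |mean - median| < r * (max - min).
// The series is taken by value so that it can be sorted locally.
bool looksSymmetric(std::vector<double> x, double r) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    const double mean = std::accumulate(x.begin(), x.end(), 0.0) / static_cast<double>(n);
    // For an even length the median is the average of the two middle values.
    const double median = (n % 2 == 1) ? x[n / 2] : 0.5 * (x[n / 2 - 1] + x[n / 2]);
    return std::fabs(mean - median) < r * (x.back() - x.front());
}
```

For the series 1, 2, 3, 4, 100 the mean is 22, the median is 3 and the range is 99, so the series does not look symmetric for r = 0.1 but does for r = 0.5.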

af::array khiva::features::timeReversalAsymmetryStatistic(af::array tss, int lag)

This function calculates the value of:

\[ \frac{1}{n-2lag} \sum_{i=0}^{n-2lag} x_{i + 2 \cdot lag}^2 \cdot x_{i + lag} - x_{i + lag} \cdot x_{i}^2, \]
which is:
\[ \mathbb{E}[L^2(X)^2 \cdot L(X) - L(X) \cdot X^2], \]
where \( \mathbb{E} \) is the mean and \( L \) is the lag operator. It was proposed in [1] as a promising feature to extract from time series.

[1] Fulcher, B.D., Jones, N.S. (2014). Highly comparative feature-based time-series classification. Knowledge and Data Engineering, IEEE Transactions on 26, 3026–3037.

Return
af::array Containing the time reversal asymmetry statistic value in each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • lag: The lag to be computed.
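
The statistic can be sketched for one series in plain C++. The sketch sums n - 2·lag terms, with i running from 0 to n - 2·lag - 1, so that every index stays inside the series and the divisor equals the number of terms; this choice of summation range is an assumption. The helper name is hypothetical and the code is not Khiva's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: mean of x[i + 2*lag]^2 * x[i + lag] - x[i + lag] * x[i]^2
// over the n - 2*lag positions i for which all three indices are valid.
double timeReversalAsymmetry(const std::vector<double>& x, std::size_t lag) {
    assert(x.size() > 2 * lag);
    const std::size_t terms = x.size() - 2 * lag;
    double sum = 0.0;
    for (std::size_t i = 0; i < terms; ++i) {
        const double ahead = x[i + 2 * lag];
        const double middle = x[i + lag];
        const double behind = x[i];
        sum += ahead * ahead * middle - middle * behind * behind;
    }
    return sum / static_cast<double>(terms);
}
```

For the series 1, 2, 3, 4 and lag 1 the two terms are 16 and 36, which gives 26.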

af::array khiva::features::valueCount(af::array tss, float v)

Counts occurrences of the value v in the time series tss.

Return
af::array Containing the count of the given value in each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • v: The value to be counted.

af::array khiva::features::variance(af::array tss)

Computes the variance for the time series tss.

Return
af::array An array containing the variance in each time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::features::varianceLargerThanStandardDeviation(af::array tss)

Calculates if the variance of tss is greater than the standard deviation. In other words, if the variance of tss is larger than 1.

Return
af::array Denoting if the variance of tss is greater than the standard deviation.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

Namespace Library

namespace khiva::library

Typedefs

typedef khiva_backend khiva::library::Backend

Enums

enum khiva::library::khiva_backend

Values:

khiva::library::KHIVA_BACKEND_DEFAULT = af::Backend::AF_BACKEND_DEFAULT

Default backend order: OpenCL -> CUDA -> CPU.

khiva::library::KHIVA_BACKEND_CPU = af::Backend::AF_BACKEND_CPU

CPU, a.k.a. sequential algorithms.

khiva::library::KHIVA_BACKEND_CUDA = af::Backend::AF_BACKEND_CUDA

CUDA Compute Backend.

khiva::library::KHIVA_BACKEND_OPENCL = af::Backend::AF_BACKEND_OPENCL

OpenCL Compute Backend.

Functions

void khiva::library::backendInfo()

Get the backend info.

void khiva::library::setBackend(khiva::library::Backend be)

Set the backend.

Parameters
  • be: The desired backend.

khiva::library::Backend khiva::library::getBackend()

Get the active backend.

Return
khiva::library::Backend The active backend.

int khiva::library::getBackends()

Get the available backends.

Return
int The available backends.

void khiva::library::setDevice(int device)

Set the device.

Parameters
  • device: The desired device.

int khiva::library::getDevice()

Get the active device.

Return
int The active device.

int khiva::library::getDeviceCount()

Get the device count.

Return
int The device count.

Namespace LinAlg

namespace khiva::linalg

Functions

af::array khiva::linalg::lls(af::array A, af::array b)

Calculates the minimum norm least squares solution \(x\) \((\left\lVert A \cdot x - b \right\rVert^2)\) to \(A \cdot x = b\). This function uses the singular value decomposition function of ArrayFire. The actual formula that this function computes is \(x = V \cdot D^\dagger \cdot U^T \cdot b\), where \(U\) and \(V\) are orthogonal matrices and \(D^\dagger\) contains the inverse values of the singular values contained in \(D\) if they are not zero, and zero otherwise.

Return
af::array Contains the solution to the linear equation problem minimizing the 2-norm.
Parameters
  • A: Coefficient matrix containing the coefficients of the linear equation problem to solve.
  • b: Vector with the measured values.
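
As a small worked example of the formula \(x = V \cdot D^\dagger \cdot U^T \cdot b\), consider a rank-deficient system whose matrix is already diagonal, so that \(U = V = I\). The matrix and the vector are chosen purely for illustration:

\[ A = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \quad b = \begin{pmatrix} 4 \\ 3 \end{pmatrix}, \quad D^\dagger = \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & 0 \end{pmatrix}, \quad x = V \cdot D^\dagger \cdot U^T \cdot b = \begin{pmatrix} 2 \\ 0 \end{pmatrix}. \]
Every vector of the form \((2, s)^T\) leaves the same residual \(A \cdot x - b = (0, -3)^T\), and the choice \(s = 0\), which results from replacing the zero singular value by zero in \(D^\dagger\), is the one with the smallest norm.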

Namespace Matrix

namespace khiva::matrix

Functions

af::array khiva::matrix::slidingDotProduct(af::array q, af::array t)

Calculates the sliding dot product of the time series ‘q’ against ‘t’.

Return
af::array An array with as many elements as ‘t’ in the first dimension and as many elements as the last dimension of ‘q’ in the last dimension.
Parameters
  • q: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.
  • t: Array with the second time series in the first dimension.

void khiva::matrix::meanStdev(af::array t, af::array &a, long m, af::array &mean, af::array &stdev)

Calculates the moving average and standard deviation of the time series ‘t’.

Parameters
  • t: Input time series. Multiple time series.
  • a: Auxiliary array to be used in the function calculateDistanceProfile. Use the overloaded method without this parameter.
  • m: Window size.
  • mean: Output array containing the moving average.
  • stdev: Output array containing the moving standard deviation.

void khiva::matrix::meanStdev(af::array t, long m, af::array &mean, af::array &stdev)

Calculates the moving average and standard deviation of the time series ‘t’.

Parameters
  • t: Input time series. Multiple time series.
  • m: Window size.
  • mean: Output array containing the moving average.
  • stdev: Output array containing the moving standard deviation.
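
The moving statistics can be sketched for a single series in plain C++. This sketch recomputes each window from scratch for clarity and emits one value per window that fits entirely inside the series, that is n - m + 1 values. The helper name, the output length and the use of the population standard deviation are assumptions, and the code is not Khiva's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: moving average and moving (population) standard
// deviation of t over windows of m consecutive values.
void movingMeanStdev(const std::vector<double>& t, std::size_t m,
                     std::vector<double>& mean, std::vector<double>& stdev) {
    assert(m > 0 && m <= t.size());
    mean.clear();
    stdev.clear();
    for (std::size_t start = 0; start + m <= t.size(); ++start) {
        double sum = 0.0;
        double sumSquares = 0.0;
        for (std::size_t i = start; i < start + m; ++i) {
            sum += t[i];
            sumSquares += t[i] * t[i];
        }
        const double mu = sum / static_cast<double>(m);
        // Clamp at zero to absorb small negative values caused by rounding.
        const double variance = std::max(0.0, sumSquares / static_cast<double>(m) - mu * mu);
        mean.push_back(mu);
        stdev.push_back(std::sqrt(variance));
    }
}
```

For the series 1, 2, 3, 4 and a window of size 2 the moving averages are 1.5, 2.5 and 3.5, and each window has a standard deviation of 0.5.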

af::array khiva::matrix::generateMask(long m, long batchSize, long batchStart, long tsLength, long nTimeSeries = 1)

Generates a band matrix of size batchSize x tsLength with the offset batchStart.

Return
af::array With the resulting band.
Parameters
  • m: Subsequence length used to generate a band of m/2 at each side.
  • batchSize: Size of the first dimension.
  • batchStart: Offset of the band matrix.
  • tsLength: Size of the second dimension of the matrix.
  • nTimeSeries: Number of time series to generate the mask for.

void khiva::matrix::calculateDistanceProfile(af::array qt, af::array a, af::array sum_q, af::array sum_q2, af::array mean_t, af::array sigma_t, af::array mask, af::array &distance, af::array &index)

Calculates the distance between ‘q’ and the time series ‘t’ that produced the sliding dot product ‘qt’. Multiple queries can be computed simultaneously in the last dimension of ‘q’.

Parameters
  • qt: The sliding dot product of ‘q’ and ‘t’.
  • a: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.
  • sum_q: Sum of the values contained in ‘q’.
  • sum_q2: Sum of the squared values contained in ‘q’.
  • mean_t: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.
  • sigma_t: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.
  • mask: Mask band matrix to filter the trivial match of a subsequence with itself.
  • distance: Resulting minimal distance.
  • index: Position where the minimum is occurring.

void khiva::matrix::calculateDistanceProfile(af::array qt, af::array a, af::array sum_q, af::array sum_q2, af::array mean_t, af::array sigma_t, af::array &distance, af::array &index)

Calculates the distance between ‘q’ and the time series ‘t’, which produced the sliding dot product ‘qt’. Multiple queries can be computed simultaneously in the last dimension of ‘q’.

Parameters
  • qt: The sliding dot product of ‘q’ and ‘t’.
  • a: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.
  • sum_q: Sum of the values contained in ‘q’.
  • sum_q2: Sum of the squared values contained in ‘q’.
  • mean_t: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.
  • sigma_t: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.
  • distance: Resulting minimal distance.
  • index: Position where the minimum is occurring.
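To show why a sliding dot product plus moving statistics is enough to obtain a distance, the sketch below uses the relation commonly given in the matrix profile literature for the z-normalized Euclidean distance between a query of length m and one window of ‘t’. It is an assumption that this is the quantity computed here; the exact role of the auxiliary array a, sum_q and sum_q2 in Khiva is not described in this documentation. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// z-normalized Euclidean distance between a query q and one window of t,
// derived from their dot product qt and the population mean and standard
// deviation of each. Assumes both standard deviations are non-zero.
double znormDistanceFromDotProduct(double qt, std::size_t m,
                                   double meanQ, double sigmaQ,
                                   double meanT, double sigmaT) {
    const double len = static_cast<double>(m);
    // Pearson correlation between q and the window.
    const double correlation = (qt - len * meanQ * meanT) / (len * sigmaQ * sigmaT);
    // Clamp at zero to absorb floating-point rounding.
    return std::sqrt(std::max(0.0, 2.0 * len * (1.0 - correlation)));
}
```

A window that is a scaled copy of the query has correlation 1 and therefore distance 0; a perfectly anti-correlated window has distance sqrt(4m).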

void khiva::matrix::mass(af::array q, af::array t, af::array a, af::array mean_t, af::array sigma_t, af::array mask, af::array &distance, af::array &index)

Calculates the Mueen distance (MASS, Mueen's Algorithm for Similarity Search) between ‘q’ and ‘t’.

Parameters
  • q: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.
  • t: Array with the second time series in the first dimension.
  • a: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.
  • mean_t: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.
  • sigma_t: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.
  • mask: Specifies the elements that should not be considered in the computation.
  • distance: Resulting minimal distance.
  • index: Position where the minimum is occurring.

void khiva::matrix::mass(af::array q, af::array t, af::array a, af::array mean_t, af::array sigma_t, af::array &distance, af::array &index)

Calculates the Mueen distance (MASS, Mueen's Algorithm for Similarity Search) between ‘q’ and ‘t’.

Parameters
  • q: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.
  • t: Array with the second time series in the first dimension.
  • a: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.
  • mean_t: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.
  • sigma_t: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.
  • distance: Resulting minimal distance.
  • index: Position where the minimum is occurring.

void khiva::matrix::stomp(af::array ta, af::array tb, long m, af::array &profile, af::array &index)

STOMP algorithm to calculate the matrix profile between ‘ta’ and ‘tb’ using a subsequence length of ‘m’.

Parameters
  • ta: Query time series.
  • tb: Reference time series.
  • m: Subsequence length.
  • profile: The matrix profile, which reflects the distance to the closest element of the subsequence from ‘ta’ in ‘tb’.
  • index: The matrix profile index, which points to where the aforementioned minimum is located.

void khiva::matrix::stomp(af::array t, long m, af::array &profile, af::array &index)

STOMP algorithm to calculate the matrix profile between ‘t’ and itself using a subsequence length of ‘m’. This method filters the trivial matches.

Parameters
  • t: Query and reference time series.
  • m: Subsequence length.
  • profile: The matrix profile, which reflects the distance to the closest element of the subsequence from ‘t’ in a different location of itself.
  • index: The matrix profile index, which points to where the aforementioned minimum is located.
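To make the meaning of the two outputs concrete, the following brute-force sketch computes a self-join matrix profile in plain C++. It is not the STOMP algorithm and not Khiva code: it compares every pair of z-normalized subsequences directly, and it assumes an exclusion zone of m/2 positions on each side for trivial matches. All names (NaiveProfile, znormWindow, naiveSelfJoin) are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type, not part of Khiva.
struct NaiveProfile {
    std::vector<double> profile;     // distance to the nearest non-trivial match
    std::vector<std::size_t> index;  // start position of that match
};

// z-normalizes the window t[start, start + m); constant windows become zeros.
std::vector<double> znormWindow(const std::vector<double>& t, std::size_t start,
                                std::size_t m) {
    double mean = 0.0;
    for (std::size_t k = 0; k < m; ++k) mean += t[start + k];
    mean /= static_cast<double>(m);
    double var = 0.0;
    for (std::size_t k = 0; k < m; ++k)
        var += (t[start + k] - mean) * (t[start + k] - mean);
    const double sd = std::sqrt(var / static_cast<double>(m));
    std::vector<double> w(m, 0.0);
    if (sd > 0.0)
        for (std::size_t k = 0; k < m; ++k) w[k] = (t[start + k] - mean) / sd;
    return w;
}

// Brute-force self-join: O(n^2 * m), for illustration only.
NaiveProfile naiveSelfJoin(const std::vector<double>& t, std::size_t m) {
    NaiveProfile mp;
    if (m == 0 || m > t.size()) return mp;
    const std::size_t n = t.size() - m + 1;
    const std::size_t exclusion = m / 2;  // assumed trivial-match zone per side
    std::vector<std::vector<double>> w;
    for (std::size_t i = 0; i < n; ++i) w.push_back(znormWindow(t, i, m));
    mp.profile.assign(n, std::numeric_limits<double>::infinity());
    mp.index.assign(n, 0);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            const std::size_t gap = i > j ? i - j : j - i;
            if (gap <= exclusion) continue;  // skip itself and close neighbours
            double d2 = 0.0;
            for (std::size_t k = 0; k < m; ++k)
                d2 += (w[i][k] - w[j][k]) * (w[i][k] - w[j][k]);
            const double d = std::sqrt(d2);
            if (d < mp.profile[i]) {
                mp.profile[i] = d;
                mp.index[i] = j;
            }
        }
    }
    return mp;
}
```

Low values in profile point to repeated shapes (motif candidates) and high values to subsequences without a close match (discord candidates).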

void khiva::matrix::findBestNMotifs(af::array profile, af::array index, long n, af::array &motifs, af::array &motifsIndices, af::array &subsequenceIndices)

This function extracts the best N motifs from a previously calculated matrix profile.

Parameters
  • profile: The matrix profile containing the minimum distance of each subsequence.
  • index: The matrix profile index containing where each minimum occurs.
  • n: Number of motifs to extract.
  • motifs: The distance of the best N motifs.
  • motifsIndices: The indices of the best N motifs.
  • subsequenceIndices: The indices of the query sequences that produced the minimum reported in the motifs output array.

void khiva::matrix::findBestNDiscords(af::array profile, af::array index, long n, af::array &discords, af::array &discordsIndices, af::array &subsequenceIndices)

This function extracts the best N discords from a previously calculated matrix profile.

Parameters
  • profile: The matrix profile containing the minimum distance of each subsequence.
  • index: The matrix profile index containing where each minimum occurs.
  • n: Number of discords to extract.
  • discords: The distance of the best N discords.
  • discordsIndices: The indices of the best N discords.
  • subsequenceIndices: The indices of the query sequences that produced the discords reported in the discords output array.

Namespace Normalization

namespace khiva::normalization

Functions

af::array khiva::normalization::decimalScalingNorm(af::array tss)

Normalizes the given time series according to its maximum value and adjusts each value within the range (-1, 1).

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) have been normalized by dividing each number by 10^j, where j is the number of integer digits of the max number in the time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
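The decimal scaling rule can be sketched for a single series in plain C++. This is an illustration only, not Khiva's code: it assumes that "the max number" means the maximum absolute value, and the function name decimalScaling is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Divides every value by 10^j, where j is the number of integer digits of
// the largest absolute value (j = 0 when that value is below 1).
std::vector<double> decimalScaling(const std::vector<double>& ts) {
    double maxAbs = 0.0;
    for (double x : ts) maxAbs = std::max(maxAbs, std::fabs(x));
    if (!std::isfinite(maxAbs)) return ts;  // leave non-finite input untouched
    // Smallest power of ten strictly greater than maxAbs.
    double scale = 1.0;
    while (scale <= maxAbs) scale *= 10.0;
    std::vector<double> out;
    out.reserve(ts.size());
    for (double x : ts) out.push_back(x / scale);
    return out;
}
```

For the series {-25, 4, 300} the largest absolute value has three integer digits, so every value is divided by 1000.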

void khiva::normalization::decimalScalingNormInPlace(af::array &tss)

Same as decimalScalingNorm, but it performs the operation in place, without allocating further memory.

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::normalization::maxMinNorm(af::array tss, double high = 1.0, double low = 0.0, double epsilon = 0.00000001)

Normalizes the given time series according to its minimum and maximum value and adjusts each value within the range [low, high].

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) have been normalized by maximum and minimum values, and scaled as per high and low parameters.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • high: Maximum final value (Defaults to 1.0).
  • low: Minimum final value (Defaults to 0.0).
  • epsilon: Safeguard for constant (or near constant) time series as the operation implies a unit scale operation between min and max values in the tss.

void khiva::normalization::maxMinNormInPlace(af::array &tss, double high = 1.0, double low = 0.0, double epsilon = 0.00000001)

Same as maxMinNorm, but it performs the operation in place, without allocating further memory.

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • high: Maximum final value (Defaults to 1.0).
  • low: Minimum final value (Defaults to 0.0).
  • epsilon: Safeguard for constant (or near constant) time series as the operation implies a unit scale operation between min and max values in the tss.

af::array khiva::normalization::meanNorm(af::array tss)

Normalizes the given time series according to its maximum-minimum value and its mean. It follows the formula:

\[ \acute{x} = \frac{x - mean(x)}{max(x) - min(x)}. \]

Return
af::array An array with the same dimensions as tss, whose values (time series in dimension 0) have been normalized by subtracting the mean from each number and dividing the result by \( max(x) - min(x) \) of the time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

void khiva::normalization::meanNormInPlace(af::array &tss)

Normalizes the given time series in place according to its maximum-minimum value and its mean. It follows the formula:

\[ \acute{x} = \frac{x - mean(x)}{max(x) - min(x)}. \]

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::normalization::znorm(af::array tss, double epsilon = 0.00000001)

Calculates a new set of time series with zero mean and standard deviation one.

Return
af::array An array with the same dimensions as tss, where the time series have been adjusted to zero mean and unit standard deviation.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • epsilon: Minimum standard deviation to consider. It acts as a gatekeeper for those time series that may be constant or near constant.
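The z-normalization of a single series can be sketched in plain C++ as shown below. This is not Khiva's implementation. Two assumptions are made: the population standard deviation is used, and a series whose standard deviation is below epsilon is only centred (divided by 1), which is one plausible reading of the "gatekeeper" role of epsilon. The function name zNormalize is hypothetical.

```cpp
#include <cmath>
#include <vector>

// Subtracts the mean and divides by the population standard deviation.
// Near-constant series are only centred (assumed handling of epsilon).
std::vector<double> zNormalize(const std::vector<double>& ts, double epsilon = 1e-8) {
    std::vector<double> out;
    if (ts.empty()) return out;
    const double n = static_cast<double>(ts.size());
    double mean = 0.0;
    for (double x : ts) mean += x;
    mean /= n;
    double var = 0.0;
    for (double x : ts) var += (x - mean) * (x - mean);
    const double sd = std::sqrt(var / n);
    const double divisor = sd < epsilon ? 1.0 : sd;
    out.reserve(ts.size());
    for (double x : ts) out.push_back((x - mean) / divisor);
    return out;
}
```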

void khiva::normalization::znormInPlace(af::array &tss, double epsilon = 0.00000001)

Adjusts the time series in the given input and performs z-norm in place (without allocating further memory).

Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • epsilon: Minimum standard deviation to consider. It acts as a gatekeeper for those time series that may be constant or near constant.

Namespace Polynomial

namespace khiva::polynomial

Functions

af::array khiva::polynomial::polyfit(af::array x, af::array y, int deg)

Least squares polynomial fit. Fit a polynomial \(p(x) = p[0] * x^{deg} + ... + p[deg]\) of degree \(deg\) to points \((x, y)\). Returns a vector of coefficients \(p\) that minimizes the squared error.

Return
af::array Polynomial coefficients, highest power first.
Parameters
  • x: x-coordinates of the M sample points \((x[i], y[i])\).
  • y: y-coordinates of the sample points.
  • deg: Degree of the fitting polynomial.

af::array khiva::polynomial::roots(af::array pp)

Calculates the roots of a polynomial with coefficients given in \(p\) (the parameter pp). The values in the rank-1 array \(p\) are coefficients of a polynomial. If the length of \(p\) is \(n+1\) then the polynomial is described by:

\[ p[0] * x^n + p[1] * x^{n-1} + ... + p[n-1] * x + p[n]. \]

Return
af::array Containing the roots of the polynomial.
Parameters
  • pp: Array of polynomial coefficients.
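Both polyfit and roots use the "highest power first" coefficient order. The sketch below, in plain C++, evaluates a polynomial stored in that order with Horner's rule, which can be handy for checking fitted coefficients or computed roots. The function name evaluatePolynomial is hypothetical and not part of Khiva.

```cpp
#include <vector>

// Evaluates p[0] * x^n + p[1] * x^(n-1) + ... + p[n] with Horner's rule,
// where p holds the coefficients with the highest power first.
double evaluatePolynomial(const std::vector<double>& p, double x) {
    double acc = 0.0;
    for (double coefficient : p) acc = acc * x + coefficient;
    return acc;
}
```

For example, p = {1, -3, 2} represents x^2 - 3x + 2, which evaluates to zero at its roots x = 1 and x = 2.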

Namespace Regression

namespace khiva::regression

Functions

void khiva::regression::linear(af::array xss, af::array yss, af::array &slope, af::array &intercept, af::array &rvalue, af::array &pvalue, af::array &stderrest)

Calculate a linear least-squares regression for two sets of measurements. Both arrays should have the same length.

Parameters
  • xss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • yss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • slope: Slope of the regression line.
  • intercept: Intercept of the regression line.
  • rvalue: Correlation coefficient.
  • pvalue: Two-sided p-value for a hypothesis test whose null hypothesis is that the slope is zero, using Wald Test with t-distribution of the test statistic.
  • stderrest: Standard error of the estimated gradient.
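For reference, the first three outputs can be sketched for one pair of series in plain C++ using the textbook least-squares formulas. This is not Khiva's implementation, it omits pvalue and stderrest, and the names LinearFit and fitLine are hypothetical. It assumes at least two points and a non-constant x.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type, not part of Khiva.
struct LinearFit {
    double slope;
    double intercept;
    double rvalue;
};

// Ordinary least-squares line through the points (x[i], y[i]).
// rvalue is NaN when y is constant, since the correlation is undefined.
LinearFit fitLine(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = std::min(x.size(), y.size());
    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= static_cast<double>(n);
    meanY /= static_cast<double>(n);
    double sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sxx += (x[i] - meanX) * (x[i] - meanX);
        syy += (y[i] - meanY) * (y[i] - meanY);
        sxy += (x[i] - meanX) * (y[i] - meanY);
    }
    LinearFit fit;
    fit.slope = sxy / sxx;
    fit.intercept = meanY - fit.slope * meanX;
    fit.rvalue = sxy / std::sqrt(sxx * syy);
    return fit;
}
```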

Namespace Regularization

namespace khiva::regularization

Functions

af::array khiva::regularization::groupBy(af::array in, af::array (*aggregationFunction)(const af::array&, bool, const dim_t), int nColumnsKey = 1, int nColumnsValue = 1)

Group by operation in the input array using nColumnsKey columns as group keys and nColumnsValue columns as values. The data is expected to be sorted. The aggregation function determines the operation to aggregate the values.

Return
af::array Array with the values of the group keys aggregated using the aggregationFunction.
Parameters
  • in: Input array containing the keys and values to operate with.
  • aggregationFunction: This param determines the operation aggregating the values.
  • nColumnsKey: Number of columns conforming the key.
  • nColumnsValue: Number of columns conforming the value (they are expected to be consecutive to the column keys).

af::array khiva::regularization::groupBy(af::array in, af::array (*aggregationFunction)(const af::array&, const int), int nColumnsKey = 1, int nColumnsValue = 1)

Group by operation in the input array using nColumnsKey columns as group keys and nColumnsValue columns as values. The data is expected to be sorted. The aggregation function determines the operation to aggregate the values.

Return
af::array Array with the values of the group keys aggregated using the aggregationFunction.
Parameters
  • in: Input array containing the keys and values to operate with.
  • aggregationFunction: This param determines the operation aggregating the values.
  • nColumnsKey: Number of columns conforming the key.
  • nColumnsValue: Number of columns conforming the value (they are expected to be consecutive to the column keys).

af::array khiva::regularization::groupBy(af::array in, af::array (*aggregationFunction)(const af::array&, const dim_t), int nColumnsKey = 1, int nColumnsValue = 1)

Group by operation in the input array using nColumnsKey columns as group keys and nColumnsValue columns as values. The data is expected to be sorted. The aggregation function determines the operation to aggregate the values.

Return
af::array Array with the values of the group keys aggregated using the aggregationFunction.
Parameters
  • in: Input array containing the keys and values to operate with.
  • aggregationFunction: This param determines the operation aggregating the values.
  • nColumnsKey: Number of columns conforming the key.
  • nColumnsValue: Number of columns conforming the value (they are expected to be consecutive to the column keys).
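The idea of grouping sorted data and aggregating each group can be sketched in plain C++ with one key column and one value column. This is an illustration only and not Khiva's implementation: it uses std::vector instead of af::array, and the names meanOf and groupBySorted are hypothetical. As in the description above, the keys are expected to be sorted, so equal keys are contiguous.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Example aggregation: arithmetic mean of a non-empty group.
double meanOf(const std::vector<double>& group) {
    double sum = 0.0;
    for (double v : group) sum += v;
    return sum / static_cast<double>(group.size());
}

// Aggregates the values of each run of equal keys with the given function.
std::vector<double> groupBySorted(const std::vector<int>& keys,
                                  const std::vector<double>& values,
                                  double (*aggregate)(const std::vector<double>&)) {
    std::vector<double> out;
    const std::size_t n = std::min(keys.size(), values.size());
    std::size_t start = 0;
    while (start < n) {
        std::size_t end = start;
        std::vector<double> group;
        while (end < n && keys[end] == keys[start]) group.push_back(values[end++]);
        out.push_back(aggregate(group));
        start = end;
    }
    return out;
}
```

With keys {0, 0, 1, 2, 2, 2} and values {1, 3, 10, 2, 4, 6}, aggregating with meanOf yields one mean per key.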

Namespace Statistics

namespace khiva::statistics

Functions

af::array khiva::statistics::covariance(af::array tss, bool unbiased = true)

Returns the covariance matrix of the time series contained in tss.

Return
af::array The covariance matrix of the time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • unbiased: Determines whether it divides by n - 1 (if true) or n (if false).

af::array khiva::statistics::kurtosis(af::array tss)

Returns the kurtosis of tss (calculated with the adjusted Fisher-Pearson standardized moment coefficient G2).

Return
af::array The kurtosis of tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.

af::array khiva::statistics::moment(af::array tss, int k)

Returns the kth moment of the given time series.

Return
af::array The kth moment of the given time series.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • k: The specific moment to be calculated.

af::array khiva::statistics::ljungBox(af::array tss, long lags)

The Ljung–Box test checks that data within the time series are independently distributed (i.e. the correlations in the population from which the sample is taken are 0, so that any observed correlations in the data result from randomness of the sampling process). Data are not independently distributed if they exhibit serial correlation.

The test statistic is:

\[ Q = n\left(n+2\right)\sum_{k=1}^h\frac{\hat{\rho}^2_k}{n-k} \]

where \(n\) is the sample size, \(\hat{\rho}_k\) is the sample autocorrelation at lag \(k\), and \(h\) is the number of lags being tested. Under \( H_0 \) the statistic \(Q\) follows a \(\chi^2_{(h)}\) distribution. For significance level \(\alpha\), the critical region for rejection of the hypothesis of randomness is:

\[ Q > \chi_{1-\alpha,h}^2 \]

where \( \chi_{1-\alpha,h}^2 \) is the \((1-\alpha)\)-quantile of the chi-squared distribution with \(h\) degrees of freedom.

[1] G. M. Ljung and G. E. P. Box (1978). On a measure of lack of fit in time series models. Biometrika, Volume 65, Issue 2, Pages 297–303.

Return
af::array The Ljung–Box test statistic.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
  • lags: Number of lags being tested.

af::array khiva::statistics::quantile(af::array tss, af::array q, float precision = 100000000)

Returns values at the given quantile.

Return
af::array Values at the given quantile.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series. NOTE: the time series should be sorted.
  • q: Percentile(s) at which to extract score(s). One or many.
  • precision: Number of decimals expected.

af::array khiva::statistics::quantilesCut(af::array tss, float quantiles, float precision = 0.00000001)

Discretizes the time series into equal-sized buckets based on sample quantiles.

Return
af::array Matrix with the categories, one category per row, the start of the category in the first column and the end in the second column.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series. NOTE: the time series should be sorted.
  • quantiles: Number of quantiles to extract. From 0 to 1, step 1/quantiles.
  • precision: Number of decimals expected.

af::array khiva::statistics::sampleStdev(af::array tss)

Estimates standard deviation based on a sample. The standard deviation is calculated using the “n-1” method.

Return
af::array The sample standard deviation.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
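The "n-1" method refers to dividing the sum of squared deviations by n - 1 instead of n. A minimal sketch for a single series in plain C++ is shown below; it is not Khiva's implementation, and the function name sampleStandardDeviation is hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sample standard deviation: sqrt(sum((x - mean)^2) / (n - 1)).
// Returns 0 when fewer than two values are available.
double sampleStandardDeviation(const std::vector<double>& ts) {
    const std::size_t n = ts.size();
    if (n < 2) return 0.0;
    double mean = 0.0;
    for (double x : ts) mean += x;
    mean /= static_cast<double>(n);
    double squaredDeviations = 0.0;
    for (double x : ts) squaredDeviations += (x - mean) * (x - mean);
    return std::sqrt(squaredDeviations / static_cast<double>(n - 1));
}
```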

af::array khiva::statistics::skewness(af::array tss)

Calculates the sample skewness of tss (calculated with the adjusted Fisher-Pearson standardized moment coefficient G1).

Return
af::array Array containing the skewness of each time series in tss.
Parameters
  • tss: Expects an input array whose dimension zero is the length of the time series (all the same) and dimension one indicates the number of time series.
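The adjusted Fisher-Pearson coefficient G1 is usually defined as \( G_1 = \frac{\sqrt{n(n-1)}}{n-2} \cdot \frac{m_3}{m_2^{3/2}} \), where \(m_2\) and \(m_3\) are the biased second and third central moments. The sketch below computes it for a single series in plain C++. It is not Khiva's implementation, the function name adjustedSkewness is hypothetical, and returning 0 for fewer than three values or for a constant series is a choice made only for this sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Adjusted Fisher-Pearson standardized moment coefficient G1.
double adjustedSkewness(const std::vector<double>& ts) {
    const std::size_t count = ts.size();
    if (count < 3) return 0.0;
    const double n = static_cast<double>(count);
    double mean = 0.0;
    for (double x : ts) mean += x;
    mean /= n;
    double m2 = 0.0, m3 = 0.0;
    for (double x : ts) {
        const double d = x - mean;
        m2 += d * d;
        m3 += d * d * d;
    }
    m2 /= n;  // biased second central moment
    m3 /= n;  // biased third central moment
    if (m2 == 0.0) return 0.0;
    const double g1 = m3 / std::pow(m2, 1.5);
    return std::sqrt(n * (n - 1.0)) / (n - 2.0) * g1;
}
```

A symmetric series such as {1, 2, 3} has zero skewness, while a series with a long right tail such as {1, 2, 3, 10} has a positive value.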

Bindings

We have developed bindings to enable the execution of Khiva from the following languages. To use them, you should first install the Khiva library on your machine, as explained in Getting Started.

Python

In order to install the khiva-python binding of the library, you would need to fetch the latest version of the code from:

git clone https://github.com/shapelets/khiva-python.git

After cloning the repository, you can install khiva-python by executing the next commands:

cd /path_to_khiva-python
python3 setup.py install

If the installation is successful, you are ready to start playing with the library.

Java

In order to install the khiva-java binding of the library, you would need to fetch the latest version of the code from:

git clone https://github.com/shapelets/khiva-java.git

Once you have downloaded the code, you have to move to the source code directory and execute the following commands:

cd path_to_java_khiva_dir
mvn install
mvn javadoc:javadoc

If all steps finished as expected, you should be able to use Khiva from your Java projects.

R

In order to install the khiva-r binding of the library, you would need to fetch the latest version of the code from:

git clone https://github.com/shapelets/khiva-r.git

After downloading the code, you would need to open an R console and execute the following commands, to set the work directory and install the Khiva binding:

setwd("<project-root-dir>/")
devtools::install()

Once the installation of the binding has been carried out, you can make the library available by executing:

library(khiva)

If all previous steps were successful, you will be ready to start working with the library.

MATLAB

In order to install the khiva-matlab binding of the library, you would need to fetch the latest version of the code from:

git clone https://github.com/shapelets/khiva-matlab.git

Once the code is available, we just have to add the path to the khiva-matlab/+khiva folder to the MATLAB path. Thus, the user will be able to import and call our library.

AUTHORS

Core Development Team

Contributions

Cite Us

If you use Khiva Library for a publication, please cite it as:

@misc{khiva-library,
   author = "David Cuesta and Justo Ruiz and Oscar Torreno and Antonio Vilches",
   title = "Khiva Library",
   howpublished = "\url{https://shapelets.io/khiva}"
}

Footnotes

[1] Time series analysis comprises methods for analyzing time series data in order to extract meaningful statistics and other characteristics of the data (source: Wikipedia).