Namespace Dimensionality
-
namespace
dimensionality
Typedefs
-
using
khiva::dimensionality::Point = std::pair<float, float>
-
using
khiva::dimensionality::Segment = std::pair<int, int>
Functions
-
std::vector<Point>
PAA
(const std::vector<Point> &points, int bins)
Piecewise Aggregate Approximation (PAA) approximates a time series \(X\) of length \(n\) by a vector \(\bar{X}=(\bar{x}_{1},\ldots,\bar{x}_{M})\) of arbitrary length \(M \leq n\), where each \(\bar{x}_{i}\) is calculated as follows:
\[ \bar{x}_{i} = \frac{M}{n} \sum_{j=\frac{n}{M}(i-1)+1}^{\frac{n}{M}i} x_{j}. \]
In other words, to reduce the dimensionality from \(n\) to \(M\), the original time series is first divided into \(M\) equally sized frames, and the mean value of each frame is then computed. The sequence assembled from the mean values is the PAA approximation (i.e., transform) of the original time series.
- Return
- result A vector of Points with the reduced dimensionality.
- Parameters
points
: Set of points.
bins
: Sets the total number of divisions.
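The formula above can be illustrated with a minimal sketch. It is not Khiva's implementation: the function name `paaSketch` is hypothetical, it works on a plain vector of values rather than on `Point` pairs, and it assumes that the series length \(n\) is an exact multiple of the number of frames \(M\).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative PAA over a plain vector of values (hypothetical helper, not
// part of Khiva). Assumes x.size() is an exact multiple of M.
std::vector<float> paaSketch(const std::vector<float> &x, std::size_t M) {
    assert(M > 0 && x.size() % M == 0);
    const std::size_t frame = x.size() / M;  // n / M samples per frame
    std::vector<float> reduced;
    reduced.reserve(M);
    for (std::size_t i = 0; i < M; ++i) {
        float sum = 0.0f;
        for (std::size_t j = i * frame; j < (i + 1) * frame; ++j) {
            sum += x[j];
        }
        // Dividing by the frame length is the (M / n) factor of the formula.
        reduced.push_back(sum / static_cast<float>(frame));
    }
    return reduced;
}
```

For example, reducing the six values 1, 3, 5, 7, 2, 4 to three frames yields the frame means 2, 6 and 3.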
-
af::array
PAA
(const af::array &a, int bins)
Piecewise Aggregate Approximation (PAA) approximates a time series \(X\) of length \(n\) by a vector \(\bar{X}=(\bar{x}_{1},\ldots,\bar{x}_{M})\) of arbitrary length \(M \leq n\), where each \(\bar{x}_{i}\) is calculated as follows:
\[ \bar{x}_{i} = \frac{M}{n} \sum_{j=\frac{n}{M}(i-1)+1}^{\frac{n}{M}i} x_{j}. \]
In other words, to reduce the dimensionality from \(n\) to \(M\), the original time series is first divided into \(M\) equally sized frames, and the mean value of each frame is then computed. The sequence assembled from the mean values is the PAA approximation (i.e., transform) of the original time series.
- Return
- af::array An array of points with the reduced dimensionality.
- Parameters
a
: Set of points.
bins
: Sets the total number of divisions.
-
af::array
PIP
(const af::array &ts, int numberIPs)
Calculates the Perceptually Important Points (PIP) of the time series.
[1] Fu TC, Chung FL, Luk R, and Ng CM. Representing financial time series based on data point importance. Engineering Applications of Artificial Intelligence, 21(2):277-300, 2008.
- Return
- af::array Array with the numberIPs most perceptually important points.
- Parameters
ts
: Expects an input array whose dimension zero is the length of the time series.
numberIPs
: The number of points to be returned.
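As a rough illustration of how perceptually important points can be selected, the sketch below starts from the first and last points and repeatedly adds the point that lies farthest from the line joining its two neighbouring selected points. It is not Khiva's implementation: the name `pipSketch` is hypothetical, vertical distance is assumed as the importance measure, and the x values are assumed to be strictly increasing.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <set>
#include <utility>
#include <vector>

using Point = std::pair<float, float>;

// Vertical distance from p to the straight line through a and b.
// Assumes a.first != b.first.
float pipVerticalDistance(const Point &a, const Point &b, const Point &p) {
    const float slope = (b.second - a.second) / (b.first - a.first);
    return std::fabs(a.second + slope * (p.first - a.first) - p.second);
}

// Hypothetical helper, not part of Khiva: greedy PIP selection.
std::vector<Point> pipSketch(const std::vector<Point> &ts, std::size_t numberIPs) {
    assert(ts.size() >= 2 && numberIPs >= 2 && numberIPs <= ts.size());
    // Indices of the points selected so far, kept in time order.
    std::set<std::size_t> selected = {0, ts.size() - 1};
    while (selected.size() < numberIPs) {
        float best = -1.0f;
        std::size_t bestIdx = 0;
        // Scan the unselected points between each pair of adjacent selected points.
        for (auto it = selected.begin(); std::next(it) != selected.end(); ++it) {
            const std::size_t lo = *it;
            const std::size_t hi = *std::next(it);
            for (std::size_t k = lo + 1; k < hi; ++k) {
                const float d = pipVerticalDistance(ts[lo], ts[hi], ts[k]);
                if (d > best) {
                    best = d;
                    bestIdx = k;
                }
            }
        }
        selected.insert(bestIdx);
    }
    std::vector<Point> result;
    for (std::size_t idx : selected) {
        result.push_back(ts[idx]);
    }
    return result;
}
```

With the points (0,0), (1,0), (2,5), (3,0), (4,0) and three requested points, the sketch keeps the two endpoints and the peak at (2,5).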
-
std::vector<Point>
PLABottomUp
(const std::vector<Point> &ts, float maxError)
Applies the Piecewise Linear Approximation (PLA Bottom-Up) to the time series.
[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.
- Return
- std::vector Vector with the reduced number of points.
- Parameters
ts
: Expects an input vector containing the set of points to be reduced.
maxError
: The maximum approximation error allowed.
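A bottom-up segmentation can be sketched as follows: start from the finest segmentation, in which every pair of consecutive points forms a segment, and repeatedly merge the two adjacent segments whose merge is cheapest, for as long as that cost stays below `maxError`. This is not Khiva's implementation. The names `plaBottomUpSketch` and `maxVerticalError` are hypothetical, the error of a segment is assumed to be the largest vertical gap between the skipped points and the straight line joining the segment's endpoints, and the x values are assumed to be strictly increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <utility>
#include <vector>

using Point = std::pair<float, float>;

// Largest vertical gap between the points strictly inside (first, last) and
// the straight line joining ts[first] and ts[last].
float maxVerticalError(const std::vector<Point> &ts, std::size_t first, std::size_t last) {
    const Point &a = ts[first];
    const Point &b = ts[last];
    const float slope = (b.second - a.second) / (b.first - a.first);
    float worst = 0.0f;
    for (std::size_t k = first + 1; k < last; ++k) {
        const float onLine = a.second + slope * (ts[k].first - a.first);
        worst = std::max(worst, std::fabs(onLine - ts[k].second));
    }
    return worst;
}

// Hypothetical helper, not part of Khiva: naive bottom-up segmentation.
std::vector<Point> plaBottomUpSketch(const std::vector<Point> &ts, float maxError) {
    assert(maxError >= 0.0f);
    // Indices of the breakpoints that are still kept; initially all points.
    std::vector<std::size_t> keep(ts.size());
    std::iota(keep.begin(), keep.end(), std::size_t{0});
    while (keep.size() > 2) {
        float bestCost = std::numeric_limits<float>::max();
        std::size_t bestPos = 0;
        // Removing keep[pos] merges the segments on either side of it.
        for (std::size_t pos = 1; pos + 1 < keep.size(); ++pos) {
            const float cost = maxVerticalError(ts, keep[pos - 1], keep[pos + 1]);
            if (cost < bestCost) {
                bestCost = cost;
                bestPos = pos;
            }
        }
        if (bestCost >= maxError) {
            break;  // every remaining merge would exceed the allowed error
        }
        keep.erase(keep.begin() + static_cast<std::ptrdiff_t>(bestPos));
    }
    std::vector<Point> result;
    for (std::size_t idx : keep) {
        result.push_back(ts[idx]);
    }
    return result;
}
```

The sketch recomputes every merge cost on each pass, which keeps it short but makes it much slower than an implementation that caches and updates the costs.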
-
af::array
PLABottomUp
(const af::array &ts, float maxError)
Applies the Piecewise Linear Approximation (PLA Bottom-Up) to the time series.
[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.
- Return
- af::array with the reduced number of points.
- Parameters
ts
: Expects an af::array containing the set of points to be reduced, with the first component of the points in the first column and the second component in the second column.
maxError
: The maximum approximation error allowed.
-
std::vector<Point>
PLASlidingWindow
(const std::vector<Point> &ts, float maxError)
Applies the Piecewise Linear Approximation (PLA Sliding Window) to the time series.
[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.
- Return
- std::vector Vector with the reduced number of points.
- Parameters
ts
: Expects an input vector containing the set of points to be reduced.
maxError
: The maximum approximation error allowed.
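The sliding-window variant can be sketched in a similar way: a segment is anchored at a point and extended one point at a time until the next extension would reach `maxError`, at which point the segment is closed and a new one starts at its last point. This is not Khiva's implementation. The names `plaSlidingWindowSketch` and `windowError` are hypothetical, the segment error is assumed to be the largest vertical gap to the line joining the segment's endpoints, and the x values are assumed to be strictly increasing.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<float, float>;

// Largest vertical gap between the points strictly inside (first, last) and
// the straight line joining ts[first] and ts[last].
float windowError(const std::vector<Point> &ts, std::size_t first, std::size_t last) {
    const Point &a = ts[first];
    const Point &b = ts[last];
    const float slope = (b.second - a.second) / (b.first - a.first);
    float worst = 0.0f;
    for (std::size_t k = first + 1; k < last; ++k) {
        const float onLine = a.second + slope * (ts[k].first - a.first);
        worst = std::max(worst, std::fabs(onLine - ts[k].second));
    }
    return worst;
}

// Hypothetical helper, not part of Khiva: sliding-window segmentation.
std::vector<Point> plaSlidingWindowSketch(const std::vector<Point> &ts, float maxError) {
    assert(maxError >= 0.0f);
    std::vector<Point> result;
    if (ts.empty()) {
        return result;
    }
    std::size_t anchor = 0;
    result.push_back(ts[anchor]);
    while (anchor + 1 < ts.size()) {
        std::size_t end = anchor + 1;
        // Grow the window while the longer segment still stays below maxError.
        while (end + 1 < ts.size() && windowError(ts, anchor, end + 1) < maxError) {
            ++end;
        }
        result.push_back(ts[end]);
        anchor = end;  // the next segment starts where this one ended
    }
    return result;
}
```

Unlike the bottom-up approach, this sketch makes a single left-to-right pass, so it can be applied to data that arrives as a stream.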
-
af::array
PLASlidingWindow
(const af::array &ts, float maxError)
Applies the Piecewise Linear Approximation (PLA Sliding Window) to the time series.
[1] Zhu Y, Wu D, Li Sh (2007). A Piecewise Linear Representation Method of Time Series Based on Feature Points. Knowledge-Based Intelligent Information and Engineering Systems 4693:1066-1072.
- Return
- af::array with the reduced number of points.
- Parameters
ts
: Expects an af::array containing the set of points to be reduced, with the first component of the points in the first column and the second component in the second column.
maxError
: The maximum approximation error allowed.
-
std::vector<Point>
ramerDouglasPeucker
(const std::vector<Point> &pointList, double epsilon)
The Ramer–Douglas–Peucker algorithm (RDP) is an algorithm for reducing the number of points in a curve that is approximated by a series of points. It reduces a set of points depending on the perpendicular distance of the points and epsilon; the greater epsilon is, the more points are deleted.
[1] Urs Ramer, “An iterative procedure for the polygonal approximation of plane curves”, Computer Graphics and Image Processing, 1(3), 244–256 (1972) doi:10.1016/S0146-664X(72)80017-0.
[2] David Douglas & Thomas Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature”, The Canadian Cartographer 10(2), 112–122 (1973) doi:10.3138/FM57-6770-U75U-7727
- Return
- std::vector<khiva::dimensionality::Point> with the selected points.
- Parameters
pointList
: Set of input points.
epsilon
: Threshold value used to decide which points are considered meaningful.
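The recursion behind RDP can be sketched as follows: find the point farthest from the line joining the first and last points of the current range; if its perpendicular distance exceeds epsilon, keep it and recurse on both halves, otherwise drop every point in between. This is a textbook-style sketch rather than Khiva's implementation, and the names `rdpSketch`, `rdpRange` and `perpendicularDistance` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<float, float>;

// Perpendicular distance from p to the line through a and b.
double perpendicularDistance(const Point &p, const Point &a, const Point &b) {
    const double dx = b.first - a.first;
    const double dy = b.second - a.second;
    const double len = std::hypot(dx, dy);
    if (len == 0.0) {
        // a and b coincide: fall back to the point-to-point distance.
        return std::hypot(p.first - a.first, p.second - a.second);
    }
    return std::fabs(dy * (p.first - a.first) - dx * (p.second - a.second)) / len;
}

// Simplifies pts[first..last] and appends every kept point except pts[first].
void rdpRange(const std::vector<Point> &pts, std::size_t first, std::size_t last,
              double epsilon, std::vector<Point> &out) {
    double dmax = 0.0;
    std::size_t index = first;
    for (std::size_t i = first + 1; i < last; ++i) {
        const double d = perpendicularDistance(pts[i], pts[first], pts[last]);
        if (d > dmax) {
            dmax = d;
            index = i;
        }
    }
    if (dmax > epsilon) {
        // The farthest point is meaningful: keep it and simplify both halves.
        rdpRange(pts, first, index, epsilon, out);
        rdpRange(pts, index, last, epsilon, out);
    } else {
        // All interior points are within epsilon: keep only the end point.
        out.push_back(pts[last]);
    }
}

// Hypothetical helper, not part of Khiva.
std::vector<Point> rdpSketch(const std::vector<Point> &pts, double epsilon) {
    assert(epsilon >= 0.0);
    if (pts.size() < 3) {
        return pts;
    }
    std::vector<Point> out{pts.front()};
    rdpRange(pts, 0, pts.size() - 1, epsilon, out);
    return out;
}
```

With the points (0,0), (1,0), (2,3), (3,0), (4,0), an epsilon of 1.0 keeps only the endpoints and the peak at (2,3), whereas an epsilon of 0.5 keeps all five points.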
-
af::array
ramerDouglasPeucker
(const af::array &pointList, double epsilon)
The Ramer–Douglas–Peucker algorithm (RDP) is an algorithm for reducing the number of points in a curve that is approximated by a series of points. It reduces a set of points depending on the perpendicular distance of the points and epsilon; the greater epsilon is, the more points are deleted.
[1] Urs Ramer, “An iterative procedure for the polygonal approximation of plane curves”, Computer Graphics and Image Processing, 1(3), 244–256 (1972) doi:10.1016/S0146-664X(72)80017-0.
[2] David Douglas & Thomas Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature”, The Canadian Cartographer 10(2), 112–122 (1973) doi:10.3138/FM57-6770-U75U-7727
- Return
- af::array with the selected points.
- Parameters
pointList
: Set of input points.
epsilon
: Threshold value used to decide which points are considered meaningful.
-
af::array
SAX
(const af::array &a, int alphabetSize)
Symbolic Aggregate approXimation (SAX). It transforms a numeric time series into a time series of symbols with the same size. The algorithm was proposed by Lin et al. [1] and extends the PAA-based approach, inheriting the original algorithm's simplicity and low computational complexity while providing satisfactory sensitivity and selectivity in range query processing. Moreover, the symbolic representation opens the door to the existing wealth of data structures and string-manipulation algorithms in computer science, such as hashing, regular expressions, pattern matching, suffix trees, and grammatical inference.
[1] Lin, J., Keogh, E., Lonardi, S. & Chiu, B. (2003) A Symbolic Representation of Time Series, with Implications for Streaming Algorithms. In proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery. San Diego, CA. June 13.
- Return
- result An array of symbols.
- Parameters
a
: Array with the input time series.
alphabetSize
: Number of elements in the alphabet.
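The symbol-assignment step of SAX can be sketched as follows: the series is z-normalised and each value is mapped to the index of the interval it falls into, where the intervals are delimited by breakpoints that split the standard normal distribution into equiprobable regions. This is not Khiva's implementation. The name `saxSketch` is hypothetical, symbols are returned as integer indices (0 for the lowest region), and only alphabet sizes 2 to 5 are supported, using the two-decimal breakpoints tabulated by Lin et al.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Khiva: maps each value to a symbol index.
std::vector<int> saxSketch(const std::vector<float> &x, int alphabetSize) {
    assert(!x.empty() && alphabetSize >= 2 && alphabetSize <= 5);
    // Breakpoints for alphabet sizes 2, 3, 4 and 5 (rounded to two decimals).
    static const std::vector<std::vector<double>> kBreakpoints = {
        {0.0},
        {-0.43, 0.43},
        {-0.67, 0.0, 0.67},
        {-0.84, -0.25, 0.25, 0.84}};
    const std::vector<double> &cuts = kBreakpoints[static_cast<std::size_t>(alphabetSize - 2)];

    // Mean and (population) standard deviation for the z-normalisation.
    double mean = 0.0;
    for (float v : x) {
        mean += v;
    }
    mean /= static_cast<double>(x.size());
    double variance = 0.0;
    for (float v : x) {
        variance += (v - mean) * (v - mean);
    }
    const double stddev = std::sqrt(variance / static_cast<double>(x.size()));

    std::vector<int> symbols;
    symbols.reserve(x.size());
    for (float v : x) {
        // A constant series has no spread; map every value to z = 0.
        const double z = stddev > 0.0 ? (v - mean) / stddev : 0.0;
        // The symbol is the number of breakpoints that are <= z.
        const auto it = std::upper_bound(cuts.begin(), cuts.end(), z);
        symbols.push_back(static_cast<int>(it - cuts.begin()));
    }
    return symbols;
}
```

For the series -2, -1, 0, 1, 2 and an alphabet of size 3, the sketch produces the symbol indices 0, 0, 1, 2, 2.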
-
std::vector<Point>
visvalingam
(const std::vector<Point> &pointList, int64_t numPoints, int64_t scale = 1000000000)
Reduces a set of points by applying the Visvalingam method (minimum triangle area) until the number of points is reduced to numPoints.
[1] M. Visvalingam and J. D. Whyatt, Line generalisation by repeated elimination of points, The Cartographic Journal, 1993.
- Return
- std::vector<khiva::dimensionality::Point> where the number of points has been reduced to numPoints.
- Parameters
pointList
: Expects an input vector of points.
numPoints
: Sets the number of points returned after the execution of the method.
scale
: Sets the precision used to compute the triangle areas; the larger the value, the more accurate the result.
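The minimum-triangle-area rule can be sketched as follows: each interior point forms a triangle with its two neighbours, and the point whose triangle has the smallest area is removed repeatedly until only `numPoints` remain. This is not Khiva's implementation. The names `visvalingamSketch` and `triangleArea` are hypothetical, areas are computed in double precision (so the `scale` parameter has no counterpart here), and the minimum is found with a linear scan rather than a priority queue.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<float, float>;

// Area of the triangle spanned by a, b and c.
double triangleArea(const Point &a, const Point &b, const Point &c) {
    const double cross =
        (static_cast<double>(b.first) - a.first) * (static_cast<double>(c.second) - a.second) -
        (static_cast<double>(c.first) - a.first) * (static_cast<double>(b.second) - a.second);
    return 0.5 * std::fabs(cross);
}

// Hypothetical helper, not part of Khiva. Takes the points by value so the
// working copy can be shrunk in place.
std::vector<Point> visvalingamSketch(std::vector<Point> pts, std::size_t numPoints) {
    assert(numPoints >= 2);
    while (pts.size() > numPoints) {
        // pts.size() > numPoints >= 2, so at least one interior point exists.
        std::size_t minIdx = 1;
        double minArea = triangleArea(pts[0], pts[1], pts[2]);
        for (std::size_t i = 2; i + 1 < pts.size(); ++i) {
            const double area = triangleArea(pts[i - 1], pts[i], pts[i + 1]);
            if (area < minArea) {
                minArea = area;
                minIdx = i;
            }
        }
        // Drop the least significant point; its neighbours' areas change on
        // the next pass because they are recomputed from scratch.
        pts.erase(pts.begin() + static_cast<std::ptrdiff_t>(minIdx));
    }
    return pts;
}
```

With the points (0,0), (1,0.1), (2,4), (3,0.2), (4,0) and `numPoints` set to 3, the sketch first removes (3,0.2), then (1,0.1), leaving the endpoints and the peak at (2,4).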
-
af::array
visvalingam
(const af::array &pointList, int numPoints)
Reduces a set of points by applying the Visvalingam method (minimum triangle area) until the number of points is reduced to numPoints.
[1] M. Visvalingam and J. D. Whyatt, Line generalisation by repeated elimination of points, The Cartographic Journal, 1993.
- Return
- af::array where the number of points has been reduced to numPoints.
- Parameters
pointList
: Expects an input array formed by two columns, where the first column is interpreted as the x coordinate of a point and the second column as the y coordinate.
numPoints
: Sets the number of points returned after the execution of the method.
-
using