Namespace Matrix¶
-
namespace
matrix
¶ Functions
-
void
findBestNOccurrences
(const af::array &q, const af::array &t, long n, af::array &distances, af::array &indexes)¶ Calculates the N best matches of several queries in several time series.
The result has the following structure:
- 1st dimension corresponds to the nth best match.
- 2nd dimension corresponds to the number of queries.
- 3rd dimension corresponds to the number of time series.
For example, the distance in the position (1, 2, 3) corresponds to the second best distance of the third query in the fourth time series. The index in the position (1, 2, 3) is the is the index of the subsequence which leads to the second best distance of the third query in the fourth time series.
- Parameters
q
: Array whose first dimension is the length of the query time series and the second dimension is the number of queries.t
: Array whose first dimension is the length of the time series and the second dimension is the number of time series.n
: Number of matches to return.distances
: Resulting distances.indexes
: Resulting indexes.
-
void
mass
(const af::array &q, const af::array &t, af::array &distances)¶ Mueen’s Algorithm for Similarity Search.
The result has the following structure:
- 1st dimension corresponds to the index of the subsequence in the time series.
- 2nd dimension corresponds to the number of queries.
- 3rd dimension corresponds to the number of time series.
For example, the distance in the position (1, 2, 3) correspond to the distance of the third query to the fourth time series for the second subsequence in the time series.
[1] Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Diego Furtado Silva, Abdullah Mueen, Eamonn Keogh (2016). Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets. IEEE ICDM 2016.
- Parameters
q
: Array whose first dimension is the length of the query time series and the second dimension is the number of queries.t
: Array whose first dimension is the length of the time series and the second dimension is the number of time series.distances
: Resulting distances.
-
void
findBestNMotifs
(const af::array &profile, const af::array &index, long m, long n, af::array &motifs, af::array &motifsIndices, af::array &subsequenceIndices, bool selfJoin = false)¶ This function extracts the best N motifs from a previously calculated matrix profile.
- Parameters
profile
: The matrix profile containing the minimum distance of each subsequence.index
: The matrix profile index containing where each minimum occurs.m
: Subsequence length value used to calculate the input matrix profile.n
: Number of motifs to extract.motifs
: The distance of the best N motifs.motifsIndices
: The indices of the best N motifs.subsequenceIndices
: The indices of the query sequences that produced the minimum reported in the motifs output array.selfJoin
: Indicates whether the input profile comes from a self join operation or not. It determines whether the mirror similar region is included in the output or not.
-
void
findBestNDiscords
(const af::array &profile, const af::array &index, long m, long n, af::array &discords, af::array &discordsIndices, af::array &subsequenceIndices, bool selfJoin = false)¶ This function extracts the best N discords from a previously calculated matrix profile.
- Parameters
profile
: The matrix profile containing the minimum distance of each subsequence.index
: The matrix profile index containing where each minimum occurs.m
: Subsequence length value used to calculate the input matrix profile.n
: Number of discords to extract.discords
: The distance of the best N discords.discordsIndices
: The indices of the best N discords.subsequenceIndices
: The indices of the query sequences that produced the discords reported in the discords output array.selfJoin
: Indicates whether the input profile comes from a self join operation or not. It determines whether the mirror similar region is included in the output or not.
-
void
stomp
(const af::array &ta, const af::array &tb, long m, af::array &profile, af::array &index)¶ STOMP algorithm to calculate the matrix profile between ‘ta’ and ‘tb’ using a subsequence length of ‘m’.
[1] Yan Zhu, Zachary Zimmerman, Nader Shakibay Senobari, Chin-Chia Michael Yeh, Gareth Funning, Abdullah Mueen, Philip Brisk and Eamonn Keogh (2016). Matrix Profile II: Exploiting a Novel Algorithm and GPUs to break the one Hundred Million Barrier for Time Series Motifs and Joins. IEEE ICDM 2016.
- Parameters
ta
: Query time series.tb
: Reference time series.m
: Subsequence length.profile
: The matrix profile, which reflects the distance to the closer element of the subsequence from ‘ta’ in ‘tb’.index
: The matrix profile index, which points to where the aforementioned minimum is located.
-
void
stomp
(const af::array &t, long m, af::array &profile, af::array &index)¶ STOMP algorithm to calculate the matrix profile between ‘t’ and itself using a subsequence length of ‘m’. This method filters the trivial matches.
[1] Yan Zhu, Zachary Zimmerman, Nader Shakibay Senobari, Chin-Chia Michael Yeh, Gareth Funning, Abdullah Mueen, Philip Brisk and Eamonn Keogh (2016). Matrix Profile II: Exploiting a Novel Algorithm and GPUs to break the one Hundred Million Barrier for Time Series Motifs and Joins. IEEE ICDM 2016.
- Parameters
t
: Query and reference time series.m
: Subsequence length.profile
: The matrix profile, which reflects the distance to the closer element of the subsequence from ‘t’ in a different location of itself.index
: The matrix profile index, which points to where the aforementioned minimum is located.
-
void
matrixProfile
(const af::array &tss, long m, af::array &profile, af::array &index)¶ Calculates the matrix profile between ‘t’ and itself using a subsequence length of ‘m’. This method filters the trivial matches.
[1] Yan Zhu, Zachary Zimmerman, Nader Shakibay Senobari, Chin-Chia Michael Yeh, Gareth Funning, Abdullah Mueen, Philip Brisk and Eamonn Keogh (2016). Matrix Profile II: Exploiting a Novel Algorithm and GPUs to break the one Hundred Million Barrier for Time Series Motifs and Joins. IEEE ICDM 2016.
- Parameters
tss
: Query time series.m
: Subsequence length.profile
: The matrix profile, which reflects the distance to the closer element of the subsequence from ‘ta’ in ‘tb’.index
: The matrix profile index, which points to where the aforementioned minimum is located.
-
void
matrixProfile
(const af::array &ta, const af::array &tb, long m, af::array &profile, af::array &index)¶ Calculates the matrix profile between ‘ta’ and ‘tb’ using a subsequence length of ‘m’.
[1] Yan Zhu, Zachary Zimmerman, Nader Shakibay Senobari, Chin-Chia Michael Yeh, Gareth Funning, Abdullah Mueen, Philip Brisk and Eamonn Keogh (2016). Matrix Profile II: Exploiting a Novel Algorithm and GPUs to break the one Hundred Million Barrier for Time Series Motifs and Joins. IEEE ICDM 2016.
- Parameters
ta
: Query and reference time series.tb
: Query and reference time series.m
: Subsequence length.profile
: The matrix profile, which reflects the distance to the closer element of the subsequence from ‘t’ in a different location of itself.index
: The matrix profile index, which points to where the aforementioned minimum is located.
-
void
matrixProfileLR
(const af::array &tss, long m, af::array &profileLeft, af::array &indexLeft, af::array &profileRight, af::array &indexRight)¶ Calculates the matrix profile to the left and to the right between ‘t’ and using a subsequence length of ‘m’.
[1] Yan Zhu, Makoto Imamura, Daniel Nikovski, and Eamonn Keogh. Matrix Profile VII: Time Series Chains: A New Primitive for Time Series Data Mining. IEEE ICDM 2017
Notice that when there is no match the subsequence index is the length of tss.
- Parameters
tss
: Time series to compute the matrix profile.m
: Subsequence length.profileLeft
: The matrix profile distance to the left.indexLeft
: The subsequence index of the matrix profile to the left.profileRight
: The matrix profile distance to the right.indexRight
: The subsequence index of the matrix profile to the right.
-
void
getChains
(const af::array &tss, long m, af::array &chains)¶ Calculates all the chains within ‘tss’ using a subsequence length of ‘m’.
[1] Yan Zhu, Makoto Imamura, Daniel Nikovski, and Eamonn Keogh. Matrix Profile VII: Time Series Chains: A New Primitive for Time Series Data Mining. IEEE ICDM 2017
Notice that the size of the first dimension is the maximum possible size which is n - m + 1. If the number of values belonging to a chain is lower than the maximum, the remaining values and indexes are 0. It implies that 0 is an invalid chain index.
- Parameters
tss
: Time series to compute the chains within them.m
: Subsequence length.chains
: The calculated chains with the following topology:- 1st dimension corresponds to the chains indexes flattened.
- 2nd dimension:
- [0] corresponds to all the indexes in the chains flattened
- [1] corresponds to the index of the chain that the value in [0] belongs to.
- 3rd dimension corresponds to the number of time series.
-
namespace
internal
¶ Typedefs
-
using
khiva::matrix::internal::DistancesVector = typedef std::vector<double>
-
using
khiva::matrix::internal::IndexesVector = typedef std::vector<unsigned int>
-
using
khiva::matrix::internal::MatrixProfilePair = typedef std::pair<DistancesVector, IndexesVector>
-
using
khiva::matrix::internal::LeftRightProfilePair = typedef std::pair<MatrixProfilePair, MatrixProfilePair>
-
using
khiva::matrix::internal::Chain = typedef std::vector<unsigned int>
-
using
khiva::matrix::internal::ChainVector = typedef std::vector<Chain>
Functions
-
af::array
slidingDotProduct
(const af::array &q, const af::array &t)¶ Calculates the sliding dot product of the time series ‘q’ against t.
- Return
- array Returns an array with as many elements as ‘t’ in the first dimension and as many elements as the last dimension of ‘q’ in the last dimension.
- Parameters
q
: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.t
: Array with the second time series in the first dimension.
-
void
meanStdev
(const af::array &t, af::array &a, long m, af::array &mean, af::array &stdev)¶ Calculates the moving average and standard deviation of the time series ‘t’.
- Parameters
t
: Input time series. Multiple time series.a
: Auxiliary array to be used in the function calculateDistanceProfile. Use the overloaded method without this parameter.m
: Window size.mean
: Output array containing the moving average.stdev
: Output array containing the moving standard deviation.
-
void
meanStdev
(const af::array &t, long m, af::array &mean, af::array &stdev)¶ Calculates the moving average and standard deviation of the time series ‘t’.
- Parameters
t
: Input time series. Multiple time series.m
: Window size.mean
: Output array containing the moving average.stdev
: Output array containing the moving standard deviation.
-
void
calculateDistances
(const af::array &qt, const af::array &a, const af::array &sum_q, const af::array &sum_q2, const af::array &mean_t, const af::array &sigma_t, const af::array &mask, af::array &distances)¶ Calculates the distance between ‘q’ and the time series ‘t’, which produced the sliding. Multiple queries can be computed simultaneously in the last dimension of ‘q’.
- Parameters
qt
: The sliding dot product of ‘q’ and ‘t’.a
: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.sum_q
: Sum of the values contained in ‘q’.sum_q2
: Sum of squaring the values contained in ‘q’.mean_t
: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.sigma_t
: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.mask
: Mask band matrix to filter the trivial match of a subsequence with itself.distances
: Resulting distances.
-
void
calculateDistances
(const af::array &qt, const af::array &a, const af::array &sum_q, const af::array &sum_q2, const af::array &mean_t, const af::array &sigma_t, af::array &distances)¶ Calculates the distance between ‘q’ and the time series ‘t’, which produced the sliding. Multiple queries can be computed simultaneously in the last dimension of ‘q’.
- Parameters
qt
: The sliding dot product of ‘q’ and ‘t’.a
: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.sum_q
: Sum of the values contained in ‘q’.sum_q2
: Sum of squaring the values contained in ‘q’.mean_t
: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.sigma_t
: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.distances
: Resulting distances.
-
bool
tileIsFarFromDiagonal
(long bandSize, long numRows, long row, long numColumns, long column)¶ Given a tile indices and sizes it returns true when tile would not be affected by a identity band matrix.
- Return
- If it is far or not.
- Parameters
bandSize
: The band size.numRows
: Number of rows of the tile.row
: Starting row of the tile.numColumns
: Number of columns of the tile.column
: Starting column of the tile.
-
af::array
generateMask
(long m, long numRows, long row, long numColumns, long column, long nTimeSeries = 1)¶ Generate an identity band matrix for a given tile indices.
- Return
- The mask.
- Parameters
m
: The query size.numRows
: Number of rows of the tile.row
: Starting row of the tile.numColumns
: Number of columns of the tile.column
: Starting column of the tile.nTimeSeries
: Number of time series.
-
void
massWithMask
(af::array q, const af::array &t, const af::array &a, const af::array &mean_t, const af::array &sigma_t, const af::array &mask, af::array &distances)¶ Calculates the Mueen distance.
[1] Chin-Chia Michael Yeh, Yan Zhu, Liudmila Ulanova, Nurjahan Begum, Yifei Ding, Hoang Anh Dau, Diego Furtado Silva, Abdullah Mueen, Eamonn Keogh (2016). Matrix Profile I: All Pairs Similarity Joins for Time Series: A Unifying View that Includes Motifs, Discords and Shapelets. IEEE ICDM 2016.
- Parameters
q
: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.t
: Array with the second time series in the first dimension.a
: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.mean_t
: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.sigma_t
: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.mask
: Specifies the elements that should not be considered in the computation.distances
: Resulting distances.
-
void
mass
(af::array q, const af::array &t, const af::array &a, const af::array &mean_t, const af::array &sigma_t, af::array &distances)¶ Mueen’s Algorithm for Similarity Search.
- Parameters
q
: Array whose first dimension is the length of the query time series and the last dimension is the number of time series to calculate.t
: Array with the second time series in the first dimension.a
: Auxiliary array computed using the meanStdev function. This array contains a precomputed fixed value to speed up the distance calculation.mean_t
: Moving average of ‘t’ using a window size equal to the number of elements in ‘q’.sigma_t
: Moving standard deviation of ‘t’ using a window size equal to the number of elements in ‘q’.distances
: Resulting distances.
-
void
stomp_batched
(const af::array &ta, af::array tb, long m, long batch_size, af::array &profile, af::array &index)¶
-
void
stomp_batched_two_levels
(af::array ta, af::array tb, long m, long batch_size_b, long batch_size_a, af::array &profile, af::array &index)¶
-
void
stomp_parallel
(const af::array &ta, af::array tb, long m, af::array &profile, af::array &index)¶
-
void
stomp_batched_two_levels
(af::array t, long m, long batch_size_b, long batch_size_a, af::array &profile, af::array &index)¶
-
void
stomp_parallel
(af::array t, long m, af::array &profile, af::array &index)¶
-
void
findBestN
(const af::array &profile, const af::array &index, long m, long n, af::array &distance, af::array &indices, af::array &subsequenceIndices, bool selfJoin, bool lookForMotifs)¶
-
void
scamp
(af::array tss, long m, af::array &profile, af::array &index)¶
-
void
scamp
(af::array ta, af::array tb, long m, af::array &profile, af::array &index)¶
-
void
getChains
(af::array tss, long m, af::array &chains)¶
-
ChainVector
extractAllChains
(const IndexesVector &profileLeft, const IndexesVector &profileRight)¶
-
LeftRightProfilePair
scampLR
(std::vector<double> &&ta, long m)¶
-
void
scampLR
(af::array tss, long m, af::array &profileLeft, af::array &indexLeft, af::array &profileRight, af::array &indexRight)¶
-
using
-
void