The R implementation
Here is the R source code of the main algorithm:
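GSP <- function(d, I, MIN_SUP) {
  f <- NULL
  c <- list()  # candidate prefix trees, one per sequence length k
  c[[1]] <- CreateInitalPrefixTree(NULL)
  len4I <- GetLength(I)
  # Seed the first tree with every single item as a length-1 candidate
  for (idx in seq_len(len4I)) {
    SetSupportCount(I[idx], 0)
    AddChild2Node(c[[1]], I[idx], NULL)
  }
  k <- 1
  while (!IsEmpty(c[[k]])) {
    ComputeSupportCount(c[[k]], d)
    # Visit each leaf: keep frequent candidates, prune the rest
    while (TRUE) {
      r <- GetLeaf(c[[k]])
      if (is.null(r)) {
        break
      }
      if (GetSupport(r) >= MIN_SUP) {
        AddFrequentItemset(f, r, GetSupport(r))
      } else {
        RemoveLeaf(c[[k]], r)
      }
    }
    # Build the length-(k+1) candidates from the surviving length-k ones
    c[[k + 1]] <- ExtendPrefixTree(c[[k]])
    k <- k + 1
  }
  f
}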
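The listing above relies on helper functions (the prefix-tree operations and support counters) that are not defined here, so it cannot be run on its own. As a rough, self-contained illustration of the same level-wise idea, the following sketch replaces the prefix tree with a plain list of candidates. It is a simplification rather than the book's implementation: it assumes every element of a sequence is a single item (not an itemset), it omits GSP's pruning of candidates that have an infrequent subsequence, and it assumes min_sup is at least 1 so the loop terminates. The names is_subsequence, support_count, and gsp_sketch are hypothetical.

```r
# Hypothetical helper: TRUE if `pattern` occurs in `sequence` in order
# (gaps between matched items are allowed).
is_subsequence <- function(pattern, sequence) {
  pos <- 1
  for (item in sequence) {
    if (pos <= length(pattern) && item == pattern[pos]) {
      pos <- pos + 1
    }
  }
  pos > length(pattern)
}

# Number of sequences in `db` that contain `pattern`.
support_count <- function(pattern, db) {
  sum(vapply(db, function(s) is_subsequence(pattern, s), logical(1)))
}

# Level-wise search: count support, keep frequent candidates, join survivors.
# Assumes single-item elements and min_sup >= 1 (see the note above).
gsp_sketch <- function(db, items, min_sup) {
  frequent <- list()
  candidates <- as.list(items)  # length-1 candidates
  while (length(candidates) > 0) {
    counts <- vapply(candidates, support_count, integer(1), db = db)
    keep <- counts >= min_sup
    for (i in which(keep)) {
      key <- paste(candidates[[i]], collapse = " -> ")
      frequent[[key]] <- counts[i]
    }
    survivors <- candidates[keep]
    # GSP-style join: s1 without its first item must equal
    # s2 without its last item; the candidate is s1 plus s2's last item.
    next_candidates <- list()
    for (s1 in survivors) {
      for (s2 in survivors) {
        if (identical(s1[-1], s2[-length(s2)])) {
          next_candidates[[length(next_candidates) + 1]] <-
            c(s1, s2[length(s2)])
        }
      }
    }
    candidates <- next_candidates
  }
  frequent
}

# Toy database of four sequences (made up for illustration).
db <- list(
  c("a", "b", "c"),
  c("a", "c"),
  c("a", "b", "c", "d"),
  c("b", "d")
)
result <- gsp_sketch(db, items = c("a", "b", "c", "d"), min_sup = 2)
print(names(result))
```

Each pass of the outer loop corresponds to one value of k in the listing above: support is counted, infrequent candidates are dropped, and the survivors are joined to form the next, longer candidates.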
The SPADE algorithm
Sequential Pattern Discovery using Equivalence classes (SPADE) is a sequence-mining algorithm that works on a vertical representation of the sequence database and explores candidate patterns depth-first. Here are the features of the SPADE algorithm: ...
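To make the vertical representation concrete, here is a minimal sketch in base R. It is an illustration under stated assumptions, not the SPADE implementation itself: the toy data, the column names sid (sequence id) and eid (event id), and the functions temporal_join and support are all hypothetical, and the join shown only handles appending one item that occurs strictly later in the same sequence.

```r
# Horizontal view: one row per (sequence id, event id, item). Toy data.
events <- data.frame(
  sid  = c(1, 1, 1, 2, 2, 3, 3, 3),
  eid  = c(1, 2, 3, 1, 2, 1, 2, 3),
  item = c("a", "b", "c", "a", "c", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Vertical view: for each item, the list of (sid, eid) pairs where it occurs.
id_lists <- split(events[, c("sid", "eid")], events$item)

# Hypothetical temporal join: occurrences of "x followed later by y".
# The result keeps the eid of y so it can be extended again.
temporal_join <- function(x, y) {
  pairs <- merge(x, y, by = "sid", suffixes = c(".x", ".y"))
  pairs <- pairs[pairs$eid.y > pairs$eid.x, ]
  unique(data.frame(sid = pairs$sid, eid = pairs$eid.y))
}

# Support is the number of distinct sequences in an id-list.
support <- function(id_list) length(unique(id_list$sid))

# Depth-first extension: a, then a -> b, then a -> b -> c.
ab  <- temporal_join(id_lists$a, id_lists$b)
abc <- temporal_join(ab, id_lists$c)
ac  <- temporal_join(id_lists$a, id_lists$c)

print(c(a = support(id_lists$a), ab = support(ab),
        abc = support(abc), ac = support(ac)))
```

The point of the vertical layout is that the support of a longer pattern is obtained by joining id-lists, without rescanning the original sequences.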