The LearnClust package helps users learn how hierarchical clustering algorithms reach their solution.
The package implements several distances between clusters.
It includes main functions that apply the algorithms and return the solution.
It also contains .details functions that explain, step by step, the process used to reach the solution.
We first initialize some datasets to use with the algorithms:
cluster1 <- matrix(c(1,2),ncol=2)
cluster2 <- matrix(c(2,4),ncol=2)
weight <- c(0.2,0.8)
vectorData <- c(1,1,2,3,4,7,8,8,8,10)
# vectorData <- c(1:10)
matrixData <- matrix(vectorData,ncol=2,byrow=TRUE)
print(matrixData)
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#> [4,] 8 8
#> [5,] 8 10
dfData <- data.frame(matrixData)
print(dfData)
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#> 4 8 8
#> 5 8 10
plot(dfData)
cMatrix <- matrix(c(2,4,4,2,3,5,1,1,2,2,5,5,1,0,1,1,2,1,2,4,5,1,2,1), ncol=3, byrow=TRUE)
cDataFrame <- data.frame(cMatrix)
The package includes different types of distance:
edistance(cluster1,cluster2)
#> [1] 2.236068
mdistance(cluster1,cluster2)
#> [1] 3
canberradistance(cluster1,cluster2)
#> [1] 0.6666667
chebyshevDistance(cluster1,cluster2)
#> [1] 2
octileDistance(cluster1,cluster2)
#> [1] 2.414214
Each function has a .details version that explains how the calculation is done.
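As a rough illustration of what these values mean, the five distances can be reproduced with base R alone. This is a minimal sketch, not the package's implementation: the variable names are hypothetical, and the octile formula (largest difference plus `sqrt(2) - 1` times the smallest, for two coordinates) is assumed because it agrees with the printed result.

```r
# Two single-point clusters, as in the examples above
a <- c(1, 2)
b <- c(2, 4)
d <- abs(a - b)                          # per-coordinate differences: 1 2

euc <- sqrt(sum(d^2))                    # Euclidean
man <- sum(d)                            # Manhattan
can <- sum(d / (abs(a) + abs(b)))        # Canberra
che <- max(d)                            # Chebyshev
oct <- max(d) + (sqrt(2) - 1) * min(d)   # octile (two coordinates, assumed formula)

round(c(euc, man, can, che, oct), 6)
```

The rounded values agree with the outputs of `edistance`, `mdistance`, `canberradistance`, `chebyshevDistance` and `octileDistance` shown above.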
There are also versions of these functions that apply a weight to each element. They are used in the extra algorithm. These functions are:
edistanceW(cluster1,cluster2,weight)
#> [1] 1.843909
mdistanceW(cluster1,cluster2,weight)
#> [1] 1.8
canberradistanceW(cluster1,cluster2,weight)
#> [1] 0.3333333
chebyshevDistanceW(cluster1,cluster2,weight)
#> [1] 1.6
octileDistanceW(cluster1,cluster2,weight)
#> [1] 1.682843
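The weighted results can be checked by hand in the same way. The sketch below assumes that each weight multiplies the corresponding term of the distance (the squared difference in the Euclidean case); this is inferred from the printed values rather than taken from the package source, and the variable names are hypothetical.

```r
a <- c(1, 2)
b <- c(2, 4)
w <- c(0.2, 0.8)
d <- abs(a - b)                                    # 1 2

eucW <- sqrt(sum(w * d^2))                         # sqrt(0.2 * 1 + 0.8 * 4)
manW <- sum(w * d)                                 # 0.2 * 1 + 0.8 * 2
canW <- sum(w * d / (abs(a) + abs(b)))             # 0.2 / 3 + 0.8 / 3
cheW <- max(w * d)                                 # max(0.2, 1.6)
octW <- max(w * d) + (sqrt(2) - 1) * min(w * d)    # 1.6 + 0.2 * (sqrt(2) - 1)

round(c(eucW, manW, canW, cheW, octW), 6)
```

Under that assumption the rounded values match the five weighted outputs above.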
The agglomerative algorithm is built from the following functions, one for each step of the theoretical process:
list <- toList(vectorData)
# list <- toList(matrixData)
# list <- toList(dfData)
print(list)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 1
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 4 7 1
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 8 10 1
matrixDistance <- mdAgglomerative(list,'MAN','AVG')
print(matrixDistance)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 3 9 14 16
#> [2,] 3 0 6 11 13
#> [3,] 9 6 0 5 7
#> [4,] 14 11 5 0 2
#> [5,] 16 13 7 2 0
minDistance <- minDistance(matrixDistance)
print(minDistance)
#> [1] 2
groupedClusters <- getCluster(minDistance, matrixDistance)
print(groupedClusters)
#> [1] 4 5
updatedClusters <- newCluster(list, groupedClusters)
print(updatedClusters)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 1
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 4 7 1
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 8 8 0
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 8 10 0
#>
#> [[6]]
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#> [2,] 8 10 1
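One iteration of this process can also be sketched in base R. The code below is an illustration under simplifying assumptions, not the package's code: clusters are plain matrix rows, there is no active/inactive flag column, and because every cluster holds a single point the 'AVG' approach reduces to the plain point-to-point distance.

```r
pts <- matrix(c(1,1, 2,3, 4,7, 8,8, 8,10), ncol = 2, byrow = TRUE)

# Pairwise Manhattan distances between the five single-point clusters
md <- as.matrix(dist(pts, method = "manhattan"))

# Smallest distance, ignoring the zeros on the diagonal
minD <- min(md[md > 0])

# Indices of the pair at that distance (upper triangle only, to skip the mirrored entry)
pair <- which(md == minD & upper.tri(md), arr.ind = TRUE)

# Join the two rows into a new cluster
merged <- pts[as.integer(pair), , drop = FALSE]
```

Here `minD` is 2, `pair` points at clusters 4 and 5, and `merged` holds the rows (8, 8) and (8, 10), which is the new cluster that `newCluster` appends to the list.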
The complete function that implements the algorithm is:
agglomerativeExample <- agglomerativeHC(dfData,'EUC','MAX')
plot(agglomerativeExample$dendrogram)
print(agglomerativeExample$clusters)
#> [[1]]
#> X1 X2
#> 1 1 1
#>
#> [[2]]
#> X1 X2
#> 1 2 3
#>
#> [[3]]
#> X1 X2
#> 1 4 7
#>
#> [[4]]
#> X1 X2
#> 1 8 8
#>
#> [[5]]
#> X1 X2
#> 1 8 10
#>
#> [[6]]
#> X1 X2
#> 1 8 8
#> 2 8 10
#>
#> [[7]]
#> X1 X2
#> 1 1 1
#> 2 2 3
#>
#> [[8]]
#> X1 X2
#> 1 4 7
#> 2 8 8
#> 3 8 10
#>
#> [[9]]
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#> 4 8 8
#> 5 8 10
print(agglomerativeExample$groupedClusters)
#> cluster1 cluster2
#> 1 4 5
#> 2 1 2
#> 3 3 6
#> 4 7 8
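For comparison, base R's `hclust` can be run on the same data. This sketch assumes that the package's 'EUC' distance with the 'MAX' approach corresponds to Euclidean distance with complete linkage; that correspondence is an interpretation made here, not a statement from the package.

```r
dfData <- data.frame(X1 = c(1, 2, 4, 8, 8), X2 = c(1, 3, 7, 8, 10))

# Complete linkage joins the two clusters whose farthest members are closest
hc <- hclust(dist(dfData, method = "euclidean"), method = "complete")

hc$merge    # negative entries are single rows, positive entries are earlier merges
hc$height   # distance at which each merge happens
```

The merge order is the same as in `groupedClusters`: rows 4 and 5 first, then rows 1 and 2, then row 3 with the first merged cluster, and finally the two remaining clusters. Calling `plot(hc)` draws the corresponding dendrogram.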
The package includes some auxiliary functions used to implement the algorithm. These functions are:
cleanClusters <- usefulClusters(updatedClusters)
print(cleanClusters)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 1
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 4 7 1
#>
#> [[4]]
#> NULL
#>
#> [[5]]
#> NULL
#>
#> [[6]]
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#> [2,] 8 10 1
distances <- c(2,4,6,8)
clusterDistanceByApproach <- clusterDistanceByApproach(distances,'AVG')
print(clusterDistanceByApproach)
#> [1] 5
“clusterDistanceByApproach” gets the value according to the approach type, which can be “MAX”, “MIN”, or “AVG”.
clusterDistance <- clusterDistance(cluster1,cluster2,'MAX','MAN')
print(clusterDistance)
#> [1] 3
“clusterDistance” gets the distance between each element of one cluster and the elements of the other, using the given distance type, which can be “EUC”, “MAN”, “CAN”, “CHE”, or “OCT”.
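The role of the approach type can be illustrated with a few lines of base R. The helper `byApproach` below is hypothetical and only mirrors the behaviour described above: it reduces a vector of pairwise distances to a single value.

```r
# Combine a vector of pairwise distances according to the approach type
byApproach <- function(distances, approach = c("MAX", "MIN", "AVG")) {
  approach <- match.arg(approach)
  switch(approach,
         MAX = max(distances),
         MIN = min(distances),
         AVG = mean(distances))
}

byApproach(c(2, 4, 6, 8), "AVG")   # 5, as in the example above
byApproach(c(2, 4, 6, 8), "MAX")   # 8
byApproach(c(2, 4, 6, 8), "MIN")   # 2
```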
The .details version of each function explains the corresponding step of the agglomerative algorithm:
list <- toList.details(vectorData)
#> 'toList' creates a list initializing datas by creating clusters with each one
# list <- toList(matrixData)
# list <- toList(dfData)
print(list)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 1
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 4 7 1
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 8 10 1
matrixDistance <- mdAgglomerative.details(list,'MAN','AVG')
#>
#> 'mdAgglomerative' creates the matrix distance of every cluster given by 'list' parameter.
#>
#> It usesMANdistance andAVGapproach.
#>
#> It returns the matrix with all the distances between clusters depending on distance and approach.
#>
#> The matrix distance is:
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 3 9 14 16
#> [2,] 3 0 6 11 13
#> [3,] 9 6 0 5 7
#> [4,] 14 11 5 0 2
#> [5,] 16 13 7 2 0
minDistance <- minDistance.details(matrixDistance)
#>
#> 'minDistance' function gets the minimal value from a matrix.
#>
#> It returns the minimal value avoiding 0 values.
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 3 9 14 16
#> [2,] 3 0 6 11 13
#> [3,] 9 6 0 5 7
#> [4,] 14 11 5 0 2
#> [5,] 16 13 7 2 0
#>
#>
#> 2 is the minimal value of the matrix.
groupedClusters <- getCluster.details(minDistance, matrixDistance)
#>
#> 'getCluster' method searches the clusters which have the distance given.
#> Search for 2 in:
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0 3 9 14 16
#> [2,] 3 0 6 11 13
#> [3,] 9 6 0 5 7
#> [4,] 14 11 5 0 2
#> [5,] 16 13 7 2 0
#>
#> The clusters with the minimum distance are: 4, 5
updatedClusters <- newCluster.details(list, groupedClusters)
#>
#> 'newCluster' function creates a new cluster from the clusters given.
#>
#> It adds the new cluster to 'list' and disables the clusters used to create the new one.
#>
#> Using 4 and 5 it searches the clusters in 'list' parameter and creates a new one with
#> their components.
#>
#> After the new cluster is created, the initial clusters must be disabled.
#>
#> The new cluster:
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#> [2,] 8 10 1
#>
#>
#> is added to the list:
The complete function that explains the algorithm is:
agglomerativeExample <- agglomerativeHC.details(vectorData,'EUC','MAX')
#> Agglomerative hierarchical clustering is a classification technique that initializes
#> a cluster for each data.
#>
#> It calculates the distance between datas depending on the approach type given and
#>
#> it creates a new cluster joining the most similar clusters until getting only one.
#> 'toList' creates a list initializing datas by creating clusters with each one
#>
#> These are the clusters with only one element:
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 1 1 1
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 1
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 4 7 1
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 8 8 1
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 8 10 1
#>
#> In each step:
#> - It calculates a matrix distance between active clusters depending on the approach and distance type.
#> - It gets the minimum distance value from the matrix.
#> - It creates a new cluster joining the minimum distance clusters.
#> - It repeats these steps while final clusters do not include all datas.
#>
#> _____________________________________________________________________________________________
#> STEP => 1
#>
#> Matrix Distance (distance type = EUC, approach type = MAX):
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.000000 2.236068 6.708204 9.899495 11.401754
#> [2,] 2.236068 0.000000 4.472136 7.810250 9.219544
#> [3,] 6.708204 4.472136 0.000000 4.123106 5.000000
#> [4,] 9.899495 7.810250 4.123106 0.000000 2.000000
#> [5,] 11.401754 9.219544 5.000000 2.000000 0.000000
#>
#> The minimum distance is: 2
#>
#> The closest clusters are: 4, 5
#>
#> The grouped clusters are added to the solution.
#>
#> Grouping clusters 4 and cluster 5, it is created a new cluster:
#> X1 X2
#> 1 8 8
#> 2 8 10
#>
#> The new cluster is added to the solution.
#>
#> _____________________________________________________________________________________________
#> STEP => 2
#>
#> Matrix Distance (distance type = EUC, approach type = MAX):
#> [,1] [,2] [,3] [,4] [,5] [,6]
#> [1,] 0.000000 2.236068 6.708204 0 0 11.401754
#> [2,] 2.236068 0.000000 4.472136 0 0 9.219544
#> [3,] 6.708204 4.472136 0.000000 0 0 5.000000
#> [4,] 0.000000 0.000000 0.000000 0 0 0.000000
#> [5,] 0.000000 0.000000 0.000000 0 0 0.000000
#> [6,] 11.401754 9.219544 5.000000 0 0 0.000000
#>
#> The minimum distance is: 2.23606797749979
#>
#> The closest clusters are: 1, 2
#>
#> The grouped clusters are added to the solution.
#>
#> Grouping clusters 1 and cluster 2, it is created a new cluster:
#> X1 X2
#> 1 1 1
#> 2 2 3
#>
#> The new cluster is added to the solution.
#>
#> _____________________________________________________________________________________________
#> STEP => 3
#>
#> Matrix Distance (distance type = EUC, approach type = MAX):
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0 0 0.000000 0 0 0.00000 0.000000
#> [2,] 0 0 0.000000 0 0 0.00000 0.000000
#> [3,] 0 0 0.000000 0 0 5.00000 6.708204
#> [4,] 0 0 0.000000 0 0 0.00000 0.000000
#> [5,] 0 0 0.000000 0 0 0.00000 0.000000
#> [6,] 0 0 5.000000 0 0 0.00000 11.401754
#> [7,] 0 0 6.708204 0 0 11.40175 0.000000
#>
#> The minimum distance is: 5
#>
#> The closest clusters are: 3, 6
#>
#> The grouped clusters are added to the solution.
#>
#> Grouping clusters 3 and cluster 6, it is created a new cluster:
#> X1 X2
#> 1 4 7
#> 2 8 8
#> 3 8 10
#>
#> The new cluster is added to the solution.
#>
#> _____________________________________________________________________________________________
#> STEP => 4
#>
#> Matrix Distance (distance type = EUC, approach type = MAX):
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
#> [1,] 0 0 0 0 0 0 0.00000 0.00000
#> [2,] 0 0 0 0 0 0 0.00000 0.00000
#> [3,] 0 0 0 0 0 0 0.00000 0.00000
#> [4,] 0 0 0 0 0 0 0.00000 0.00000
#> [5,] 0 0 0 0 0 0 0.00000 0.00000
#> [6,] 0 0 0 0 0 0 0.00000 0.00000
#> [7,] 0 0 0 0 0 0 0.00000 11.40175
#> [8,] 0 0 0 0 0 0 11.40175 0.00000
#>
#> The minimum distance is: 11.4017542509914
#>
#> The closest clusters are: 7, 8
#>
#> The grouped clusters are added to the solution.
#>
#> Grouping clusters 7 and cluster 8, it is created a new cluster:
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#> 4 8 8
#> 5 8 10
#>
#> The new cluster is added to the solution.
#>
#> This loop has been repeated until the last cluster contained every single clusters.
The divisive algorithm is built from the following functions, one for each step of the theoretical process:
# list <- toListDivisive(vectorData)
# list <- toListDivisive(matrixData)
list <- toListDivisive(dfData[1:4,])
print(list)
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
clustersList <- initClusters(list)
print(clustersList)
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#>
#> [[7]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 8 8
#>
#> [[8]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#>
#> [[9]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 8 8
#>
#> [[10]]
#> [,1] [,2]
#> [1,] 4 7
#> [2,] 8 8
#>
#> [[11]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#>
#> [[12]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 8 8
#>
#> [[13]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[14]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[15]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#> [4,] 8 8
matrixDistance <- mdDivisive(clustersList,'MAN','AVG',list)
print(matrixDistance)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [2,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [3,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [4,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 10
#> [5,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 10 0
#> [6,] 0.000000 0.000000 0.000000 0 0 0 0 0 7 0 0
#> [7,] 0.000000 0.000000 0.000000 0 0 0 0 7 0 0 0
#> [8,] 0.000000 0.000000 0.000000 0 0 0 7 0 0 0 0
#> [9,] 0.000000 0.000000 0.000000 0 0 7 0 0 0 0 0
#> [10,] 0.000000 0.000000 0.000000 0 10 0 0 0 0 0 0
#> [11,] 0.000000 0.000000 0.000000 10 0 0 0 0 0 0 0
#> [12,] 0.000000 0.000000 6.666667 0 0 0 0 0 0 0 0
#> [13,] 0.000000 6.666667 0.000000 0 0 0 0 0 0 0 0
#> [14,] 8.666667 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [15,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [,12] [,13] [,14] [,15]
#> [1,] 0.000000 0.000000 8.666667 0
#> [2,] 0.000000 6.666667 0.000000 0
#> [3,] 6.666667 0.000000 0.000000 0
#> [4,] 0.000000 0.000000 0.000000 0
#> [5,] 0.000000 0.000000 0.000000 0
#> [6,] 0.000000 0.000000 0.000000 0
#> [7,] 0.000000 0.000000 0.000000 0
#> [8,] 0.000000 0.000000 0.000000 0
#> [9,] 0.000000 0.000000 0.000000 0
#> [10,] 0.000000 0.000000 0.000000 0
#> [11,] 0.000000 0.000000 0.000000 0
#> [12,] 0.000000 0.000000 0.000000 0
#> [13,] 0.000000 0.000000 0.000000 0
#> [14,] 0.000000 0.000000 0.000000 0
#> [15,] 0.000000 0.000000 0.000000 0
maxDistance <- maxDistance(matrixDistance)
print(maxDistance)
#> [1] 10
dividedClusters <- getClusterDivisive(maxDistance, matrixDistance)
print(dividedClusters)
#> [1] 56
Two new subclusters are created from the initial one and added to the solution.
Steps 2 to 5 are repeated until no cluster can be divided again.
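The first iteration can be sketched in base R to show where the numbers in the distance matrix come from. This is an illustration, not the package's code: clusters are represented by vectors of row indices, every non-empty subset of the four points is enumerated with `combn`, and the average Manhattan distance is stored only for pairs of subsets that are complementary.

```r
pts <- matrix(c(1,1, 2,3, 4,7, 8,8), ncol = 2, byrow = TRUE)
n <- nrow(pts)
d <- as.matrix(dist(pts, method = "manhattan"))

# All non-empty subsets of the row indices, ordered by size (2^n - 1 of them)
subsets <- unlist(lapply(seq_len(n), function(k) combn(n, k, simplify = FALSE)),
                  recursive = FALSE)

# Average Manhattan distance for complementary pairs, 0 everywhere else
md <- matrix(0, length(subsets), length(subsets))
for (i in seq_along(subsets)) {
  for (j in seq_along(subsets)) {
    a <- subsets[[i]]
    b <- subsets[[j]]
    complementary <- length(intersect(a, b)) == 0 && length(union(a, b)) == n
    if (complementary) md[i, j] <- mean(d[a, b])
  }
}

length(subsets)   # 15 candidate clusters
max(md)           # 10, the largest distance between complementary clusters
```

The number of candidate clusters grows as 2^n - 1, which is why this step dominates the cost of the divisive algorithm.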
The complete function that implements the algorithm is:
divisiveExample <- divisiveHC(dfData[1:4,],'MAN','AVG')
print(divisiveExample)
#> [[1]]
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#> 4 8 8
#>
#> [[2]]
#> X1 X2
#> 1 8 8
#>
#> [[3]]
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#>
#> [[4]]
#> X1 X2
#> 1 4 7
#>
#> [[5]]
#> X1 X2
#> 1 1 1
#> 2 2 3
#>
#> [[6]]
#> X1 X2
#> 1 1 1
#>
#> [[7]]
#> X1 X2
#> 1 2 3
The package uses the same auxiliary functions as for the previous algorithm, together with one more. These functions are:
clusterDistanceByApproach
clusterDistance
complementaryClusters: checks whether the clusters we are going to divide are complementary, that is, every initial cluster is in one cluster or in the other, but never in both. This condition ensures that no cluster is lost when the division is done.
data <- c(1,2,1,3,1,4,1,5)
components <- toListDivisive(data)
cluster1 <- matrix(c(1,2,1,3),ncol=2,byrow=TRUE)
cluster2 <- matrix(c(1,4,1,5),ncol=2,byrow=TRUE)
cluster3 <- matrix(c(1,6,1,7),ncol=2,byrow=TRUE)
complementaryClusters(components,cluster1,cluster2)
#> [1] TRUE
complementaryClusters(components,cluster1,cluster3)
#> [1] FALSE
Its “.details” version explains how the function checks this condition:
complementaryClusters.details(components,cluster1,cluster2)
#>
#> 'complementaryClusters' checks if clusters are complementary.
#>
#> Each element from 'components' list has to be in one cluster, but never included in both.
#>
#> If every element is in one cluster, the function will return 'TRUE',
#>
#> if not, the result will be 'FALSE'
#>
#> The clusters to be checked are:
#> [,1] [,2]
#> [1,] 1 2
#> [2,] 1 3
#> [,1] [,2]
#> [1,] 1 4
#> [2,] 1 5
#>
#> And they have to include:
#> [[1]]
#> [,1] [,2]
#> [1,] 1 2
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 1 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 1 4
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 1 5
#>
#>
#> So, the result is TRUE.
#> [1] TRUE
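The complementarity condition can also be written as a short base R check. The helpers `rowIn` and `isComplementary` are hypothetical names used only for this sketch, and the components are given as plain vectors rather than the one-row matrices the package builds.

```r
# TRUE if the vector 'row' appears as a row of the matrix 'm'
rowIn <- function(row, m) any(apply(m, 1, function(r) all(r == row)))

# Every component must be in exactly one of the two clusters
isComplementary <- function(components, a, b) {
  all(vapply(components,
             function(comp) xor(rowIn(comp, a), rowIn(comp, b)),
             logical(1)))
}

components <- list(c(1, 2), c(1, 3), c(1, 4), c(1, 5))
c1 <- matrix(c(1,2, 1,3), ncol = 2, byrow = TRUE)
c2 <- matrix(c(1,4, 1,5), ncol = 2, byrow = TRUE)
c3 <- matrix(c(1,6, 1,7), ncol = 2, byrow = TRUE)

isComplementary(components, c1, c2)   # TRUE
isComplementary(components, c1, c3)   # FALSE: (1, 4) and (1, 5) are in neither cluster
```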
The .details version of each function explains the corresponding step of the divisive algorithm:
# list <- toListDivisive.details(vectorData)
# list <- toListDivisive(matrixData)
list <- toListDivisive(dfData[1:4,])
print(list)
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
clustersList <- initClusters.details(list)
#>
#> 'initClusters' method initializes the clusters used in the divisive algorithm.
#>
#> To know which are the most different clusters, we need to know the distance between
#> every possible clusters that could be created with the initial elements.
#>
#> This step is the most computationally complex, so it will make the algorithm to get the
#> solution with delay, or even, not to find a solution because of the computers capacities.
#>
#> The clusters created using
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
#>
#> are:
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#>
#> [[7]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 8 8
#>
#> [[8]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#>
#> [[9]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 8 8
#>
#> [[10]]
#> [,1] [,2]
#> [1,] 4 7
#> [2,] 8 8
#>
#> [[11]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#>
#> [[12]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 8 8
#>
#> [[13]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[14]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[15]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#> [4,] 8 8
matrixDistance <- mdDivisive.details(clustersList,'MAN','AVG',list)
#>
#> 'mdDivisive' creates a matrix distance including every cluster from 'list'.
#>
#>
#> The function checks if the clusters are not valid, if they are the same cluster and
#>
#> if the clusters are not complementary. It will allocate a 0 value if any condition is not 'TRUE'.
#>
#>
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 2 3
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 8 8
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#>
#> [[7]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 8 8
#>
#> [[8]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#>
#> [[9]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 8 8
#>
#> [[10]]
#> [,1] [,2]
#> [1,] 4 7
#> [2,] 8 8
#>
#> [[11]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#>
#> [[12]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 8 8
#>
#> [[13]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[14]]
#> [,1] [,2]
#> [1,] 2 3
#> [2,] 4 7
#> [3,] 8 8
#>
#> [[15]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#> [4,] 8 8
#>
#> The matrix distance for the list above is:
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [2,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [3,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [4,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 10
#> [5,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 10 0
#> [6,] 0.000000 0.000000 0.000000 0 0 0 0 0 7 0 0
#> [7,] 0.000000 0.000000 0.000000 0 0 0 0 7 0 0 0
#> [8,] 0.000000 0.000000 0.000000 0 0 0 7 0 0 0 0
#> [9,] 0.000000 0.000000 0.000000 0 0 7 0 0 0 0 0
#> [10,] 0.000000 0.000000 0.000000 0 10 0 0 0 0 0 0
#> [11,] 0.000000 0.000000 0.000000 10 0 0 0 0 0 0 0
#> [12,] 0.000000 0.000000 6.666667 0 0 0 0 0 0 0 0
#> [13,] 0.000000 6.666667 0.000000 0 0 0 0 0 0 0 0
#> [14,] 8.666667 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [15,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [,12] [,13] [,14] [,15]
#> [1,] 0.000000 0.000000 8.666667 0
#> [2,] 0.000000 6.666667 0.000000 0
#> [3,] 6.666667 0.000000 0.000000 0
#> [4,] 0.000000 0.000000 0.000000 0
#> [5,] 0.000000 0.000000 0.000000 0
#> [6,] 0.000000 0.000000 0.000000 0
#> [7,] 0.000000 0.000000 0.000000 0
#> [8,] 0.000000 0.000000 0.000000 0
#> [9,] 0.000000 0.000000 0.000000 0
#> [10,] 0.000000 0.000000 0.000000 0
#> [11,] 0.000000 0.000000 0.000000 0
#> [12,] 0.000000 0.000000 0.000000 0
#> [13,] 0.000000 0.000000 0.000000 0
#> [14,] 0.000000 0.000000 0.000000 0
#> [15,] 0.000000 0.000000 0.000000 0
maxDistance <- maxDistance.details(matrixDistance)
#>
#> 'maxDistance' function gets the maximal value from a matrix.
#>
#> It returns the maximal value avoiding 0 values.
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [2,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [3,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [4,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 10
#> [5,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 10 0
#> [6,] 0.000000 0.000000 0.000000 0 0 0 0 0 7 0 0
#> [7,] 0.000000 0.000000 0.000000 0 0 0 0 7 0 0 0
#> [8,] 0.000000 0.000000 0.000000 0 0 0 7 0 0 0 0
#> [9,] 0.000000 0.000000 0.000000 0 0 7 0 0 0 0 0
#> [10,] 0.000000 0.000000 0.000000 0 10 0 0 0 0 0 0
#> [11,] 0.000000 0.000000 0.000000 10 0 0 0 0 0 0 0
#> [12,] 0.000000 0.000000 6.666667 0 0 0 0 0 0 0 0
#> [13,] 0.000000 6.666667 0.000000 0 0 0 0 0 0 0 0
#> [14,] 8.666667 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [15,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [,12] [,13] [,14] [,15]
#> [1,] 0.000000 0.000000 8.666667 0
#> [2,] 0.000000 6.666667 0.000000 0
#> [3,] 6.666667 0.000000 0.000000 0
#> [4,] 0.000000 0.000000 0.000000 0
#> [5,] 0.000000 0.000000 0.000000 0
#> [6,] 0.000000 0.000000 0.000000 0
#> [7,] 0.000000 0.000000 0.000000 0
#> [8,] 0.000000 0.000000 0.000000 0
#> [9,] 0.000000 0.000000 0.000000 0
#> [10,] 0.000000 0.000000 0.000000 0
#> [11,] 0.000000 0.000000 0.000000 0
#> [12,] 0.000000 0.000000 0.000000 0
#> [13,] 0.000000 0.000000 0.000000 0
#> [14,] 0.000000 0.000000 0.000000 0
#> [15,] 0.000000 0.000000 0.000000 0
#>
#>
#> 10 is the maximal value of the matrix.
dividedClusters <- getClusterDivisive.details(maxDistance, matrixDistance)
#>
#> 'getCluster' method searches the cluster which have the distance to the target given.
#> Search for 10 in:
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [2,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [3,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [4,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 10
#> [5,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 10 0
#> [6,] 0.000000 0.000000 0.000000 0 0 0 0 0 7 0 0
#> [7,] 0.000000 0.000000 0.000000 0 0 0 0 7 0 0 0
#> [8,] 0.000000 0.000000 0.000000 0 0 0 7 0 0 0 0
#> [9,] 0.000000 0.000000 0.000000 0 0 7 0 0 0 0 0
#> [10,] 0.000000 0.000000 0.000000 0 10 0 0 0 0 0 0
#> [11,] 0.000000 0.000000 0.000000 10 0 0 0 0 0 0 0
#> [12,] 0.000000 0.000000 6.666667 0 0 0 0 0 0 0 0
#> [13,] 0.000000 6.666667 0.000000 0 0 0 0 0 0 0 0
#> [14,] 8.666667 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [15,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [,12] [,13] [,14] [,15]
#> [1,] 0.000000 0.000000 8.666667 0
#> [2,] 0.000000 6.666667 0.000000 0
#> [3,] 6.666667 0.000000 0.000000 0
#> [4,] 0.000000 0.000000 0.000000 0
#> [5,] 0.000000 0.000000 0.000000 0
#> [6,] 0.000000 0.000000 0.000000 0
#> [7,] 0.000000 0.000000 0.000000 0
#> [8,] 0.000000 0.000000 0.000000 0
#> [9,] 0.000000 0.000000 0.000000 0
#> [10,] 0.000000 0.000000 0.000000 0
#> [11,] 0.000000 0.000000 0.000000 0
#> [12,] 0.000000 0.000000 0.000000 0
#> [13,] 0.000000 0.000000 0.000000 0
#> [14,] 0.000000 0.000000 0.000000 0
#> [15,] 0.000000 0.000000 0.000000 0
#>
#> The cluster with the minimum distance is: 56
The complete function that explains the algorithm is:
divisiveExample <- divisiveHC.details(dfData[1:4,],'MAN','AVG')
#>
#> Divisive hierarchical clustering is a classification technique that initializes a cluster
#> with every data.
#>
#> It calculates the distance between datas depending on the approach type given and
#>
#> it divides the most different clusters until any cluster can be divided again.
#>
#> These are the clusters with only one element:
#> [[1]]
#> X1 X2
#> 1 1 1
#>
#> [[2]]
#> X1 X2
#> 1 2 3
#>
#> [[3]]
#> X1 X2
#> 1 4 7
#>
#> [[4]]
#> X1 X2
#> 1 8 8
#>
#> And this is the initial cluster with every element:
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#> 4 8 8
#>
#> In each step:
#> - It calculates a matrix distance between valid clusters depending on the approach
#> and the distance types.
#> - It gets the maximal distance value from the matrixes of every cluster. We have to
#> look for the most different clusters available in every cluster.
#> - It divides the selected cluster in two new and complementary clusters using tha maximal
#> distance between clusters.
#> - It repeats these steps while there isn't any cluster that can be divided again.
#>
#> _____________________________________________________________________________________________
#> STEP => 1
#>
#> The algorithm calculates a matrix distance from every active cluster, but it has to find
#> the maximal value from all matrix and then chooses which is the selected one.
#>
#> It gets the maximal value from every matrix and then chooses the maximal between them. So...
#>
#> Matrix Distance (distance type = MAN, approach type = AVG):
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
#> [1,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [2,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [3,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [4,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 10
#> [5,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 10 0
#> [6,] 0.000000 0.000000 0.000000 0 0 0 0 0 7 0 0
#> [7,] 0.000000 0.000000 0.000000 0 0 0 0 7 0 0 0
#> [8,] 0.000000 0.000000 0.000000 0 0 0 7 0 0 0 0
#> [9,] 0.000000 0.000000 0.000000 0 0 7 0 0 0 0 0
#> [10,] 0.000000 0.000000 0.000000 0 10 0 0 0 0 0 0
#> [11,] 0.000000 0.000000 0.000000 10 0 0 0 0 0 0 0
#> [12,] 0.000000 0.000000 6.666667 0 0 0 0 0 0 0 0
#> [13,] 0.000000 6.666667 0.000000 0 0 0 0 0 0 0 0
#> [14,] 8.666667 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [15,] 0.000000 0.000000 0.000000 0 0 0 0 0 0 0 0
#> [,12] [,13] [,14] [,15]
#> [1,] 0.000000 0.000000 8.666667 0
#> [2,] 0.000000 6.666667 0.000000 0
#> [3,] 6.666667 0.000000 0.000000 0
#> [4,] 0.000000 0.000000 0.000000 0
#> [5,] 0.000000 0.000000 0.000000 0
#> [6,] 0.000000 0.000000 0.000000 0
#> [7,] 0.000000 0.000000 0.000000 0
#> [8,] 0.000000 0.000000 0.000000 0
#> [9,] 0.000000 0.000000 0.000000 0
#> [10,] 0.000000 0.000000 0.000000 0
#> [11,] 0.000000 0.000000 0.000000 0
#> [12,] 0.000000 0.000000 0.000000 0
#> [13,] 0.000000 0.000000 0.000000 0
#> [14,] 0.000000 0.000000 0.000000 0
#> [15,] 0.000000 0.000000 0.000000 0
#>
#> The maximal distance is: 10
#>
#> The most distant clusters are:
#> X1 X2
#> 1 8 8
#>
#> X1 X2
#> 1 1 1
#> 2 2 3
#> 3 4 7
#>
#> The divided clusters are added to the solution.
#>
#> - The second divided cluster is added to the active clusters list because it can be divided again.
#>
#>
#> If the divided clusters can't be divided again, then they don`t be added to the active clusters.
#>
#> _____________________________________________________________________________________________
#> STEP => 2
#>
#> The algorithm calculates a matrix distance from every active cluster, but it has to find
#> the maximal value from all matrix and then chooses which is the selected one.
#>
#> It gets the maximal value from every matrix and then chooses the maximal between them. So...
#>
#> Matrix Distance (distance type = MAN, approach type = AVG):
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#> [1,] 0 0.0 0.0 0.0 0.0 6 0
#> [2,] 0 0.0 0.0 0.0 4.5 0 0
#> [3,] 0 0.0 0.0 7.5 0.0 0 0
#> [4,] 0 0.0 7.5 0.0 0.0 0 0
#> [5,] 0 4.5 0.0 0.0 0.0 0 0
#> [6,] 6 0.0 0.0 0.0 0.0 0 0
#> [7,] 0 0.0 0.0 0.0 0.0 0 0
#>
#> The maximal distance is: 7.5
#>
#> The most distant clusters are:
#> X1 X2
#> 1 4 7
#>
#> X1 X2
#> 1 1 1
#> 2 2 3
#>
#> The divided clusters are added to the solution.
#>
#> - The second divided cluster is added to the active clusters list because it can be divided again.
#>
#>
#> If the divided clusters can't be divided again, then they don`t be added to the active clusters.
#>
#> _____________________________________________________________________________________________
#> STEP => 3
#>
#> The algorithm calculates a matrix distance from every active cluster, but it has to find
#> the maximal value from all matrix and then chooses which is the selected one.
#>
#> It gets the maximal value from every matrix and then chooses the maximal between them. So...
#>
#> Matrix Distance (distance type = MAN, approach type = AVG):
#> [,1] [,2] [,3]
#> [1,] 0 3 0
#> [2,] 3 0 0
#> [3,] 0 0 0
#>
#> The maximal distance is: 3
#>
#> The most distant clusters are:
#> X1 X2
#> 1 1 1
#>
#> X1 X2
#> 1 2 3
#>
#> The divided clusters are added to the solution.
#>
#> If the divided clusters can't be divided again, then they don`t be added to the active clusters.
#>
#> This loop has been repeated until there aren't any active cluster, that is, any cluster
#> can be divided again.
print(divisiveExample)
#> [[1]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#> [4,] 8 8
#>
#> [[2]]
#> [,1] [,2]
#> [1,] 8 8
#>
#> [[3]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#> [3,] 4 7
#>
#> [[4]]
#> [,1] [,2]
#> [1,] 4 7
#>
#> [[5]]
#> [,1] [,2]
#> [1,] 1 1
#> [2,] 2 3
#>
#> [[6]]
#> [,1] [,2]
#> [1,] 1 1
#>
#> [[7]]
#> [,1] [,2]
#> [1,] 2 3
This example shows how the algorithm works step by step.
1. The input data is initialized by creating a cluster from each row of the data frame.
initData <- initData(cDataFrame)
print(initData)
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 2 4 4
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 5
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 1 1 2
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 2 5 5
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 1 0 1
#>
#> [[6]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
#>
#> [[7]]
#> [,1] [,2] [,3]
#> [1,] 2 4 5
#>
#> [[8]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
target <- c(1,2,3)
initTarget <- initTarget(target,cDataFrame)
print(initTarget)
#> [,1] [,2] [,3]
#> [1,] 1 2 3
weight <- c(5,7,6)
weights <- normalizeWeight(TRUE,weight,cDataFrame)
print(weights)
#> [1] 0.2777778 0.3888889 0.3333333
cluster1 <- matrix(c(1,2,3),ncol=3)
cluster2 <- matrix(c(2,5,8),ncol=3)
weight <- c(3,7,4)
distance <- distances(cluster1,cluster2,'CHE',weight)
print(distance)
#> [1] 21
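The weight normalization amounts to dividing each weight by the sum of all weights. The helper below is a hypothetical base R sketch of that rule, including a fallback to a vector of ones when no weights are given; the function and argument names are not part of the package.

```r
# Divide each weight by the total; with no weights, fall back to a vector of ones
normWeights <- function(weight = NULL, nCols, doNormalize = TRUE) {
  if (is.null(weight)) weight <- rep(1, nCols)
  if (doNormalize) weight / sum(weight) else weight
}

normWeights(c(5, 7, 6), nCols = 3)                   # 5/18, 7/18, 6/18
normWeights(NULL, nCols = 3, doNormalize = FALSE)    # 1 1 1
```

Normalized weights always sum to 1, so they change the relative importance of each column without changing the overall scale of the distance.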
Finally, the complete algorithm sorts the distances and the clusters accordingly. It presents the solution as a sorted list of clusters, as the list of their distances, or as a dendrogram.
target <- c(5,5,1)
weight <- c(3,7,5)
correlation <- correlationHC(cDataFrame, target, weight)
print(correlation$sortedValues)
#> cluster X1 X2 X3
#> 1 1 2 4 4
#> 2 4 2 5 5
#> 3 6 1 2 1
#> 4 8 1 2 1
#> 5 7 2 4 5
#> 6 2 2 3 5
#> 7 3 1 1 2
#> 8 5 1 0 1
print(correlation$distances)
#> cluster sortedDistances
#> 1 1 2.294922
#> 2 4 2.670830
#> 3 6 2.720294
#> 4 8 2.720294
#> 5 7 2.756810
#> 6 2 3.000000
#> 7 3 3.316625
#> 8 5 3.855732
plot(correlation$dendrogram)
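The sorted distances can be reproduced by hand with base R. The sketch assumes that `correlationHC` normalizes the weights and uses the weighted Euclidean distance to the target by default; this is inferred from the printed values, not from the package source.

```r
cMatrix <- matrix(c(2,4,4, 2,3,5, 1,1,2, 2,5,5, 1,0,1, 1,2,1, 2,4,5, 1,2,1),
                  ncol = 3, byrow = TRUE)
target <- c(5, 5, 1)
w <- c(3, 7, 5) / sum(c(3, 7, 5))    # normalized weights

# Weighted Euclidean distance from every row to the target
d <- apply(cMatrix, 1, function(row) sqrt(sum(w * (row - target)^2)))

# Closest rows first; ties keep their original order
ord <- order(d)
data.frame(cluster = ord, sortedDistances = d[ord])
```

Under these assumptions the cluster order is 1, 4, 6, 8, 7, 2, 3, 5, the same as in `correlation$distances`.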
This example explains how the algorithm works step by step, using the .details version of each function.
initData <- initData.details(cDataFrame)
#>
#> This function initializes the input data creating a cluster with each row of the data frame.
#>
#> It gets this data from the user:
#> X1 X2 X3
#> 1 2 4 4
#> 2 2 3 5
#> 3 1 1 2
#> 4 2 5 5
#> 5 1 0 1
#> 6 1 2 1
#> 7 2 4 5
#> 8 1 2 1
#>
#> Each cluster will be a matrix with a row and the same columns as the initial data frame.
#>
#> Initialized data will be:
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 2 4 4
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 5
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 1 1 2
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 2 5 5
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 1 0 1
#>
#> [[6]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
#>
#> [[7]]
#> [,1] [,2] [,3]
#> [1,] 2 4 5
#>
#> [[8]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
targetValid <- c(1,2,3)
targetInvalid <- c(1,2)
initTarget <- initTarget.details(targetValid,cDataFrame)
#>
#> This function initializes the target and checks if it is a valid target.
#>
#> It gets this target from the user:
#> [1] 1 2 3
#>
#> After transforming the target into matrix, it checks if it is acceptable
#>
#> If the target has the same columns as main data and only one row, it is a valid target!
#>
#> Target used will be:
#> [,1] [,2] [,3]
#> [1,] 1 2 3
initTarget <- initTarget.details(targetInvalid,cDataFrame)
#>
#> This function initializes the target and checks if it is a valid target.
#>
#> It gets this target from the user:
#> [1] 1 2
#>
#> After transforming the target into matrix, it checks if it is acceptable
#>
#> If the target does not have the same columns as main data or more than one row,
#> it is not a valid target!
#>
#> The function will initialize as a '0's matrix.
#>
#> Target used will be:
#> [,1] [,2] [,3]
#> [1,] 0 0 0
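The validation rule printed above (one row, same number of columns as the data, otherwise fall back to zeros) can be sketched like this. `check_target` is a hypothetical helper written for illustration only; the package's own checks may differ in detail.

```r
# Hypothetical helper, not the LearnClust implementation.
check_target <- function(target, data) {
  # Plain vectors become a one-row matrix; matrices and data frames keep their shape.
  tm <- if (is.null(dim(target))) matrix(target, nrow = 1) else as.matrix(target)
  valid <- nrow(tm) == 1 && ncol(tm) == ncol(data)
  # An invalid target is replaced by a one-row matrix of zeros.
  if (valid) tm else matrix(0, nrow = 1, ncol = ncol(data))
}

data <- data.frame(X1 = c(2, 2), X2 = c(4, 3), X3 = c(4, 5))
print(check_target(c(1, 2, 3), data))  # valid: kept as a 1 x 3 matrix
print(check_target(c(1, 2), data))     # invalid: replaced by zeros
```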
weight <- c(5,7,6)
weights <- normalizeWeight.details(TRUE,weight,cDataFrame)
#>
#> This function normalizes weight values.
#>
#> This are the initial weights:
#> 5
#> 7
#> 6
#>
#> It checks if there is a weights vector. If not, it creates a vector with 3 '1's.
#>
#> Due to the fact that 'normalize' = TRUE, weight vector changes every weight as the
#> initial value divided between the total sum of the vector.
#>
#> FinalWeight[i] = WeightValue[i]/TotalSum
#>
#> These are the new weights:
#> 0.277777777777778
#> 0.388888888888889
#> 0.333333333333333
weights <- normalizeWeight.details(FALSE,weight,cDataFrame)
#>
#> This function normalizes weight values.
#>
#> This are the initial weights:
#> 5
#> 7
#> 6
#>
#> It checks if there is a weights vector. If not, it creates a vector with 3 '1's.
#>
#> Due to the fact that 'normalize' = FALSE, weight vector does not change.
#>
#> These are the new weights:
#> 5
#> 7
#> 6
weights <- normalizeWeight.details(FALSE,NULL,cDataFrame)
#>
#> This function normalizes weight values.
#>
#> It checks if there is a weights vector. If not, it creates a vector with 3 '1's.
#>
#> Due to the fact that 'normalize' = FALSE, weight vector does not change.
#>
#> These are the new weights:
#> 1
#> 1
#> 1
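The three cases above follow directly from the rule FinalWeight[i] = WeightValue[i]/TotalSum. A minimal base-R sketch, assuming a hypothetical helper `normalize_weights` with the same argument order as the call above (it is not the package's code):

```r
# Hypothetical helper, not the LearnClust implementation.
normalize_weights <- function(normalize, weight, data) {
  # Without a weights vector, use one weight of 1 per column.
  if (is.null(weight)) weight <- rep(1, ncol(data))
  # Divide each weight by the total sum only when normalization is requested.
  if (normalize) weight / sum(weight) else weight
}

data <- data.frame(X1 = 1, X2 = 2, X3 = 3)
print(normalize_weights(TRUE, c(5, 7, 6), data))   # 5/18, 7/18, 6/18
print(normalize_weights(FALSE, c(5, 7, 6), data))  # unchanged: 5 7 6
print(normalize_weights(FALSE, NULL, data))        # default: 1 1 1
```

Normalized weights always sum to 1, so the resulting distances stay on a comparable scale regardless of the magnitude of the raw weights.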
cluster1 <- matrix(c(1,2,3),ncol=3)
cluster2 <- matrix(c(2,5,8),ncol=3)
weight <- c(3,7,4)
distance <- distances.details(cluster1,cluster2,'CHE',weight)
#>
#> This function calculates CHE distance applying weights.
#>
#> It calculates the distance between:.
#> X1 X2 X3
#> 1 1 2 3
#>
#>
#> X1 X2 X3
#> 1 2 5 8
#>
#> Applying these weights:
#> [1] 3 7 4
#>
#> The distance value is: 21.
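The value 21 is consistent with a weighted Chebyshev distance, that is, the largest weighted absolute difference between coordinates: max(3*|1-2|, 7*|2-5|, 4*|3-8|) = max(3, 21, 20) = 21. The sketch below assumes that formula; `weighted_chebyshev` is an illustrative name, not a package function.

```r
# Hypothetical helper, not the LearnClust implementation.
# Assumes the weighted Chebyshev distance is max(w * |x - y|).
weighted_chebyshev <- function(x, y, w) max(w * abs(x - y))

cluster1 <- matrix(c(1, 2, 3), ncol = 3)
cluster2 <- matrix(c(2, 5, 8), ncol = 3)
weight <- c(3, 7, 4)

print(weighted_chebyshev(cluster1, cluster2, weight))  # 21
```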
The correlationHC.details function explains the complete algorithm:
target <- c(5,5,1)
weight <- c(3,7,5)
correlation <- correlationHC.details(cDataFrame, target, weight)
#> Correlation hierarchical function is a classification technique that initializes a cluster for each data.
#>
#> It calculates the distance between clusters and a target given depending on the distance type.
#>
#> The function applies weights to each property from main data to get weighted results.
#>
#> Due to normalized = TRUE, the initial weights change to a [0,1] values.
#>
#> These are the weight to be used:
#> [1] 0.2000000 0.4666667 0.3333333
#>
#> Initialized data are (more information about how to initialize data in 'initData.details'):
#> [[1]]
#> [,1] [,2] [,3]
#> [1,] 2 4 4
#>
#> [[2]]
#> [,1] [,2] [,3]
#> [1,] 2 3 5
#>
#> [[3]]
#> [,1] [,2] [,3]
#> [1,] 1 1 2
#>
#> [[4]]
#> [,1] [,2] [,3]
#> [1,] 2 5 5
#>
#> [[5]]
#> [,1] [,2] [,3]
#> [1,] 1 0 1
#>
#> [[6]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
#>
#> [[7]]
#> [,1] [,2] [,3]
#> [1,] 2 4 5
#>
#> [[8]]
#> [,1] [,2] [,3]
#> [1,] 1 2 1
#>
#> Initialized target is (more information about how to initialize data in 'initTarget.details'):
#> [,1] [,2] [,3]
#> [1,] 5 5 1
#>
#> The function calculates the distances between each cluster and the target. It applies
#> weights and uses EUC distance type.
#>
#>
#> The calculated distances are:
#> [1] 2.294922 3.000000 3.316625 2.670830 3.855732 2.720294 2.756810 2.720294
#>
#> The previous distances sorted are:
#>
#> [1] 2.294922 2.670830 2.720294 2.720294 2.756810 3.000000 3.316625 3.855732
#>
#> Then, using sorted distances, the function order the clusters.
#>
#>
#> Finally, the sorted distances are:
#> cluster sortedDistances
#> 1 1 2.294922
#> 2 4 2.670830
#> 3 6 2.720294
#> 4 8 2.720294
#> 5 7 2.756810
#> 6 2 3.000000
#> 7 3 3.316625
#> 8 5 3.855732
#>
#> The sorted clusters are:
#>
#> cluster X1 X2 X3
#> 1 1 2 4 4
#> 2 4 2 5 5
#> 3 6 1 2 1
#> 4 8 1 2 1
#> 5 7 2 4 5
#> 6 2 2 3 5
#> 7 3 1 1 2
#> 8 5 1 0 1
#>
#> And the final dendrogram is on the image.
#>
print(correlation$sortedValues)
#> cluster X1 X2 X3
#> 1 1 2 4 4
#> 2 4 2 5 5
#> 3 6 1 2 1
#> 4 8 1 2 1
#> 5 7 2 4 5
#> 6 2 2 3 5
#> 7 3 1 1 2
#> 8 5 1 0 1
print(correlation$distances)
#> cluster sortedDistances
#> 1 1 2.294922
#> 2 4 2.670830
#> 3 6 2.720294
#> 4 8 2.720294
#> 5 7 2.756810
#> 6 2 3.000000
#> 7 3 3.316625
#> 8 5 3.855732
plot(correlation$dendrogram)