A Comparison between Methods of Selecting Cluster Head

Abstract: In wireless sensor networks (WSNs), hierarchical network structures provide scalable and resource-efficient solutions. Routing methods for WSNs use clustering to reduce the amount of data transmitted and thereby save energy. A cluster-based architecture is an effective way to gather data in WSNs. Dynamic clustering addresses the problem of high energy demand by distributing energy consumption through re-selection of the cluster head node. However, this method modifies the cluster structure each time the cluster head node is re-selected, which itself increases energy demand. In this paper we compare Selecting Cluster Head Randomly (SCHR) methods with Well Selected Cluster Head (WSCH) methods. The results indicate that WSCH methods perform better than SCHR methods.

I. INTRODUCTION
Wireless sensor networks (WSNs) have become an invaluable research area because they connect the natural world with the world of computation by digitizing useful sensory information. One of the most important challenges in WSN design is to develop a method or protocol under which the numerous, randomly deployed sensor nodes behave in a collaborative and organized way while each sensor node maximizes its own utility function. In addition, the network as a whole needs balanced resource assignment in order to operate usefully and efficiently. Routing protocol design is far more critical to WSN performance than it is in conventional communication networks. In hierarchical networks, nodes are assigned different roles, such as Cluster Head (CH) and cluster member. The higher-level nodes, the CHs, manage the grouped lower-level nodes (cluster members) and collect data from them. Each CH collects data from the cluster members within its cluster, aggregates the data, and then transmits the aggregated data to the sink. Fig. 1 shows the connection between the CHs and the sink (Base Station, BS). Designing and operating such large networks requires scalable architectural and management strategies. In addition, sensors in such environments are energy constrained and their batteries cannot be recharged.
Figure 1. Connection between CH and BS
From the perspective of energy consumption, it is better to communicate using short, multi-hop paths between the sender and the receiver [1]. Therefore, designing energy-aware algorithms is an important factor in extending the lifetime of sensors. Another application-centric design objective is deciding which clustering technique to use. Many clustering algorithms are mainly concerned with node reachability and route stability, without much concern for critical WSN design goals such as network connectivity and coverage. Recently, a number of clustering algorithms have been designed specifically for WSNs [2][3][4][5][6].
These proposed clustering techniques vary widely depending on the node deployment and bootstrapping schemes, the chosen network architecture, the characteristics of the CH nodes, and the network operation model. A CH may be elected by the sensors in a cluster or preassigned by the network designer. Additionally, a CH may be one of the sensors or a node with more resources. The cluster membership may be fixed or variable.
CHs may form a second-tier network or may simply ship the data to interested parties, such as base stations or command centers. Furthermore, CHs can be selected randomly or selected deliberately (well selected). In this paper, we present both approaches and compare them to determine which methods suit particular applications. All hierarchical routing protocols aim to select the best CHs and to group the nodes into appropriate clusters in order to save energy. Since the CHs are responsible for collecting, aggregating, and transmitting data over longer distances to the sink, they consume more energy than the other cluster members. Hence, we aim to show that the selection of ideal CHs plays a very important role in WSN lifetime and efficiency. CHs can be selected randomly, as in Section III, or deliberately, as in Section IV. A hierarchical clustering protocol may re-cluster and re-select CHs periodically in order to distribute the load uniformly across the whole network. Because the CH is responsible for gathering the sensory data of all group members, aggregating it, and sending it to the base station(s) (BS), we study different methods of selecting CHs, deliberately or randomly, and show the effect of each on energy consumption and WSN lifetime. The attributes of the clustering process are: (1) objective of node grouping (load balancing, fault tolerance, network connectivity); (2) methodology (hybrid, centralized, distributed); (3) algorithm complexity (constant, variable); and (4) cluster-head selection (preassigned, random). The cluster head capabilities are: (1) mobility (stationary, re-locatable, mobile); (2) node type (sensor, resource-rich); and (3) cluster head role (data aggregation as a normal node, relaying node, sink node).
The cluster properties are: (1) cluster connectivity (multi-hop or direct link); (2) cluster topology (fixed or adaptive); (3) cluster count (preset or variable); and (4) stability (provisioned or assumed). The remainder of this paper is organized as follows: Section II covers background and related work; Section III, methods of Selecting Cluster Head Randomly (SCHR); Section IV, methods of Well Selected Cluster Head (WSCH); Section V, the application of hierarchical agglomerative clustering (HAC) to WSNs and the practical results; and Section VI, our conclusions.

II. BACKGROUND AND RELATED WORK
Network routing protocols are responsible for the network structure and routing scheme. Many researchers have proposed routing solutions for WSNs. The proposed routing protocols can be broken down into different groups based on various criteria [7][8][9]. Network structure, resource awareness, and protocol operation method are the basic taxonomies of WSN routing protocols. For example, RRCH [10], the Analytic Hierarchy Process (AHP) [11], and Hybrid Energy Efficient Distributed clustering (HEED) [5] are hierarchical protocols based on the network structure. HEED is also an energy-aware protocol when considering resource awareness. In this paper, we focus on flat and hierarchical routing schemes based on network structure.
In hierarchical networks, as shown in Figure 2, nodes are assigned different roles, such as CHs and cluster members. The higher-level nodes, the cluster heads (CHs), manage the grouped lower-level nodes (cluster members) and collect data from them.
Using a minimum hop count, as in Figure 3, or equivalently a shortest-path routing protocol, as the main criterion for selecting a data-forwarding route has the following drawbacks.
First, nodes that lie along the shortest paths become overutilized, and their batteries are depleted earlier than those of less-used nodes. Thus, the network lifetime may be reduced. Second, from both the energy consumption point of view and the capacity point of view, it is better to communicate using short multi-hop routes than using a long single-hop route [1].
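The energy argument for short hops can be illustrated with a small calculation. The sketch below assumes a first-order radio model in which transmit energy grows with the square of the distance; the paper does not specify a radio model, so the model, the constants `E_ELEC` and `EPS_AMP`, and the function names are all illustrative assumptions.

```python
# Assumed first-order radio model; the constants are illustrative, not taken from the paper.
E_ELEC = 50e-9      # J/bit spent by the transmit or receive electronics (assumed)
EPS_AMP = 100e-12   # J/bit/m^2 spent by the transmit amplifier (assumed)


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m` metres."""
    return bits * (E_ELEC + EPS_AMP * distance_m ** 2)


def rx_energy(bits):
    """Energy to receive `bits`."""
    return bits * E_ELEC


def path_energy(bits, total_distance_m, hops):
    """Energy to move `bits` over `hops` equal-length hops.

    Counts every transmission plus the reception at each relay node;
    the reception at the final destination is left out in all cases.
    """
    hop_length = total_distance_m / hops
    return hops * tx_energy(bits, hop_length) + (hops - 1) * rx_energy(bits)


if __name__ == "__main__":
    for hops in (1, 2, 4, 10):
        print(hops, path_energy(1, 100.0, hops))
```

Under these assumed constants, sending one bit over a single 100 m hop costs about 1.05 µJ, while two 50 m hops cost about 0.65 µJ. The gain is not unlimited: with ten 10 m hops the per-hop electronics cost brings the total back to roughly the single-hop figure.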
Each CH collects data from the cluster members within its cluster, aggregates the data, and then transmits the aggregated data to the sink. All hierarchical routing protocols aim to select the best CHs and to group the nodes into appropriate clusters in order to save energy. The CHs are responsible for collecting, aggregating, and transmitting data over longer distances to the sink, and consequently consume more energy than the other cluster members. The hierarchical clustering protocol may re-cluster and re-select CHs periodically in order to distribute the load uniformly across the whole network.

III. METHODS OF SELECTING CLUSTER HEAD RANDOMLY (SCHR)
(1) The Low-Energy Adaptive Clustering Hierarchy (LEACH) protocol [12][13] was proposed to balance the energy dissipation in sensor networks. The main idea of LEACH is that sensor nodes are randomly selected as CHs, taking into account whether they have previously served as a CH. In the cluster formation phase, each sensor node generates a random number between 0 and 1. Each sensor node has a threshold that is related to the predefined percentage of CHs in the network. If the generated random number is less than the threshold, the node becomes a CH; otherwise it joins a cluster as a cluster member. LEACH specifies how these node thresholds are calculated. After the clusters are established, each CH broadcasts a transmission schedule within its cluster and asks its members to send data using a TDMA approach. In the steady-state phase, the CHs are responsible for aggregating data and sending it to the sink. After a certain period of time in the steady-state phase, the network returns to the cluster formation phase to repeat the clustering. LEACH uses this periodic re-clustering to alleviate the deterioration of cluster quality. LEACH is completely distributed and requires no global knowledge of the network. LEACH clustering terminates within a constant number of iterations, but it does not guarantee a good CH distribution and it assumes uniform energy consumption for CHs.
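The LEACH election step described above can be sketched as follows. The threshold formula used here, T(n) = p / (1 - p (r mod 1/p)) for nodes that have not served as CH in the current cycle and 0 otherwise, is the one given in the original LEACH formulation, not in this paper, so it should be read as an assumption. The function and parameter names are hypothetical.

```python
import random


def leach_threshold(p, round_number, eligible):
    """Threshold T(n) from the original LEACH formulation (assumed here).

    p            -- desired fraction of cluster heads, e.g. 0.05
    round_number -- current round r
    eligible     -- False if the node already served as CH in this cycle
    """
    if not eligible:
        return 0.0
    period = round(1 / p)  # number of rounds in one rotation cycle
    return p / (1 - p * (round_number % period))


def elect_cluster_heads(node_ids, p, round_number, recent_heads, rng=random):
    """Each node draws a number in [0, 1) and becomes CH if it is below its threshold."""
    heads = []
    for node in node_ids:
        threshold = leach_threshold(p, round_number, node not in recent_heads)
        if rng.random() < threshold:
            heads.append(node)
    return heads


if __name__ == "__main__":
    nodes = list(range(100))
    print(elect_cluster_heads(nodes, p=0.05, round_number=0, recent_heads=set()))
```

The threshold rises as the cycle progresses, so nodes that have not yet served become increasingly likely to be elected, and it is zero for nodes that served recently. The sketch also makes the weakness noted above visible: nothing in the draw looks at node position or residual energy, so the number and placement of CHs vary from round to round.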
Furthermore, dynamic clustering brings extra overhead, e.g., head changes and advertisements, which may diminish the gain in energy consumption. (2) The Belief Propagation (BP) approach [14] adopts the BP algorithm, based on a probabilistic graph model, to iteratively compute marginal probabilities on trees by local message passing. The method considers the performance of a multi-hop network. Its performance was evaluated against HEED [15] using the TinyOS simulator [16]. The paper shows that re-clustering is triggered less frequently with this approach, at the expense of a high initial clustering overhead. Overall, the clustering scheme based on the BP method is more efficient. (3) The Energy Residue Aware (ERA) clustering algorithm [17] improves on LEACH by including the communication cost in the clustering. The communication cost includes the residual energy, the communication energy from the CH to the sink, and the communication energy from the cluster members to the CH. Unlike HEED, ERA uses the same CH selection scheme as LEACH but provides an improved scheme that helps non-CH nodes choose a "better" CH to join by calculating the clustering cost and choosing the CH according to the maximum energy residue. (4) RRCH [18] performs cluster formation only once to avoid the high energy consumption of the clustering phase. RRCH uses a method similar to LEACH to establish clusters. Once the clusters are established, RRCH keeps them fixed and uses a round-robin method to choose the node that acts as CH within each cluster. Every node has a chance to be CH during a frame. When a node is detected as abnormal, the CH modifies the scheduling information and broadcasts it to the entire cluster during frame modification; its cluster members then delete the abnormal node based on the received schedule information. RRCH has the same defect as LEACH: there is no guarantee of cluster quality.
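The round-robin rotation that RRCH applies inside a fixed cluster can be sketched in a few lines. This is only an illustration of the rotation idea; the function name, the representation of a cluster as a list of node identifiers, and the way abnormal nodes are passed in are assumptions, not details from [18].

```python
def round_robin_heads(members, rounds, abnormal=()):
    """Return the cluster head for each round of a fixed cluster.

    members  -- node identifiers of the cluster, in schedule order
    rounds   -- number of rounds to schedule
    abnormal -- nodes removed from the schedule (assumed to be known to the CH)
    """
    excluded = set(abnormal)
    active = [node for node in members if node not in excluded]
    if not active:
        raise ValueError("cluster has no active members")
    # Rotate through the active members; the cluster itself is never rebuilt.
    return [active[i % len(active)] for i in range(rounds)]


if __name__ == "__main__":
    print(round_robin_heads(["n1", "n2", "n3"], rounds=5))
    print(round_robin_heads(["n1", "n2", "n3"], rounds=4, abnormal=["n2"]))
```

Because the member list never changes apart from removing abnormal nodes, no re-clustering messages are needed, which is the source of the energy saving and also the reason a poorly formed cluster stays poorly formed.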
Without periodic re-clustering, RRCH cannot handle clusters of bad quality, such as overlapping clusters or very small or very large clusters. (5) CPEQ [19] is a cluster-based routing protocol in which nodes with more residual energy are selected as CHs. CPEQ adopts the CH selection scheme of LEACH. Instead of using the randomly selected node as a CH directly, CPEQ uses the randomly picked node to choose the node with the highest residual energy from among its neighbors.
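The CPEQ selection step just described, in which a randomly picked node nominates its most energetic neighbor, can be sketched as below. The data structures (`neighbors` and `residual_energy` dictionaries) and the fallback to the picked node when it has no neighbors are assumptions made for the example.

```python
def cpeq_style_head(picked_node, neighbors, residual_energy):
    """Return the neighbor of `picked_node` with the highest residual energy.

    neighbors       -- dict mapping a node to a list of its neighbors (hypothetical)
    residual_energy -- dict mapping a node to its remaining energy (hypothetical)
    """
    candidates = neighbors.get(picked_node, [])
    if not candidates:
        # Assumption: an isolated node simply keeps the cluster head role itself.
        return picked_node
    return max(candidates, key=lambda node: residual_energy[node])


if __name__ == "__main__":
    neighbors = {"a": ["b", "c"]}
    residual_energy = {"a": 0.4, "b": 0.9, "c": 0.7}
    print(cpeq_style_head("a", neighbors, residual_energy))
```

The random draw still decides where a cluster forms, but the energy comparison decides which node in that neighborhood carries the CH load.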
To build clusters, CPEQ uses a time-to-live (TTL) parameter to limit the size of the cluster and to calculate the optimized routes from cluster members to their CHs. For inter-cluster communication, CPEQ also uses the optimal multi-hop routes among the CHs and the sink. By performing data aggregation within clusters and calculating optimized routes, CPEQ reduces traffic collisions and data transmission delay. In a large-scale WSN, the flooding mechanism that CPEQ adopts in its initial stage may become problematic: flooding incurs redundancy because a node sends data to its neighbor regardless of whether the neighbor already has the data or requires it. Further, CPEQ is appropriate only for static, fixed networks because of the high cost of addressing all the nodes in the system, which makes the addresses hard to maintain. (6) The HEED protocol [20] is an energy-aware hierarchical approach that improves on the LEACH protocol. HEED focuses on choosing appropriate CHs by adding more network information. It uses residual energy as the primary clustering parameter to select a number of tentative CHs. Figure 4 indicates that only the CH nodes communicate with the Base Station (BS) to send or receive sensing data, so appropriate CHs must be chosen to reduce power consumption.
Those tentative CHs inform their neighbors of their intention to become CHs. These advertisement messages include a secondary cost measure that is a function of neighbor proximity or node degree. This secondary cost guides the regular nodes in choosing the best cluster to join and helps avoid elected CHs being within the same range of each other.

IV. METHODS OF WELL SELECTED CLUSTER HEAD (WSCH)
These can be classified into the following protocols. First, the LEACH-centralized (LEACH-C) protocol is similar to the LEACH protocol in that it forms clusters at the beginning of each round. However, in LEACH-C the sink runs a centralized algorithm, as opposed to the random self-selection of nodes [21]. The sink collects location information from the nodes and broadcasts its decision about which nodes are to become CHs. The overall performance of LEACH-C is better than that of LEACH because it shifts the duty of cluster formation to the sink. However, LEACH-C is sensitive to the sink location. For instance, LEACH-C performs worse once the energy cost of communicating with the sink becomes higher than the energy cost of cluster formation. Sinks may be located far from the network in most WSN applications. Hence, the dependence on the sink location is a major disadvantage of LEACH-C. Second, PEBECS [22] addresses the hot-spot problem by dividing a WSN into several partitions of equal area and then grouping the nodes into clusters of non-uniform size. The shorter the distance between a partition and the sink, the more clusters are created within the partition. Furthermore, to select the CH, PEBECS uses the node's residual energy, degree difference, and relative location in the network. PEBECS mitigates the hot-spot problem by grouping nodes into smaller clusters to save more energy on their intra-cluster communication. As a result, PEBECS achieves a longer network lifetime by efficiently balancing node energy consumption.
iJOE - Volume 6, Issue 4, November 2010
Third, the MHP protocol [23] proposes an energy-efficient scheme to collect data from a two-layered heterogeneous sensor network. The carefully deployed cluster heads have more energy than the basic sensor nodes. Each cluster head launches a discovery process to accept basic sensor nodes into its cluster. After the clusters are established, MHP minimizes the intra-cluster communication energy consumption by using a polling scheme [22][23] to collect data from the sensor nodes. MHP presents a fast online polling algorithm to solve the problem of finding a contention-free polling schedule. However, MHP has stricter network deployment requirements. The cluster head nodes must be carefully deployed; otherwise, part of the network becomes non-functional. Moreover, MHP requires knowledge of the sensor nodes' locations. Fourth, the Dynamic/Static Clustering (DSC) protocol [24] extends the LEACH-C protocol. Under this scheme, each node obtains its current location using a global positioning system (GPS) and sends its location information and energy status to the sink. The sink then determines the number of CHs based on the collected information and broadcasts the clustering result to each node. Each CH then determines a TDMA schedule for its cluster members, as in LEACH. In comparison with LEACH-C, the number of messages received at the sink under DSC is significantly smaller. However, DSC suffers from problems similar to those encountered in LEACH-C. Fifth, the energy-efficient data aggregation protocol based on static clustering (EDASC) [25] aims to reduce the overhead of dynamic clustering. This approach also adopts the LEACH model. However, EDASC uses the sink to select an initiator that begins the clustering process. The sink then broadcasts the CH schedule to the sensor nodes.
EDASC calculates the Hausdorff distance to determine CHs, and it alternates the role of CH in order to prolong the network lifetime. EDASC also has issues similar to those that LEACH-C encounters. The principal idea of EDASC is to form clusters statically, which is similar to DHAC. Nevertheless, DHAC is fully distributed and does not rely on a centralized sink to start the cluster formation. Sixth, the Analytical Hierarchy Process (AHP) protocol [26] runs its CH selection algorithm at the sink node. AHP supports mobile sensor nodes. Three factors are considered in AHP: energy, mobility, and the distance to the centroid of the cluster involved. AHP calculates local weight values and a global weight from those three factors, and then selects the CHs by combining the results of these two weights. To maintain the clusters, CH re-selection occurs only when selected CHs die or move to other clusters. Compared to LEACH, AHP improves the network lifetime as measured by the time at which the last node becomes inactive [26]. Comparing the two centralized protocols, it is evident that AHP is more complex than LEACH-C, since AHP considers additional factors. AHP needs to transmit more information from the network to the sink, with the consequence that communication between the nodes and the sink may significantly increase energy consumption. Seventh, the EAD protocol [27] presents an energy-aware algorithm that builds a broadcast tree spanning all the sensor nodes with the maximum number of leaves. EAD deactivates the radios of the leaf nodes and uses only the non-leaf nodes for data aggregation and relaying. In this way, EAD ensures that the leaf nodes save more energy without compromising the connectivity of the network. After each data-transmit phase, EAD rebuilds the broadcast tree to identify all the dead nodes and orphaned nodes.
EAD requires global knowledge of the network to build the optimized spanning tree, which imposes stricter constraints and consumes more energy. The advantages and disadvantages of the two kinds of hierarchical routing protocols are summarized in the following paragraphs.
Randomly selected cluster head (SCHR) protocols offer more flexibility and tolerance. SCHR approaches have three main disadvantages. First, the randomly picked CH may have a higher communication cost because it is chosen without knowledge of intra-cluster or inter-cluster communication. Second, if periodic CH rotation is used to reduce the effect of random CH selection, the re-selection itself uses extra energy to rebuild the clusters. Third, random selection cannot guarantee good protocol performance. The well-selected cluster head (WSCH) protocols can provide better cluster quality, but they usually have a more complex scheme and a higher overhead for optimizing the CH selection and cluster formation.

V. HIERARCHICAL AGGLOMERATIVE CLUSTERING (HAC) APPLICATION TO WSNS
The Hierarchical Agglomerative Clustering (HAC) algorithm [28][29] is a conceptually and mathematically simple clustering approach to data analysis. It can provide very informative descriptions and visualizations of potential data clustering structures, especially when real hierarchical relationships exist in the data, as is evident in [30]. To apply the HAC algorithm in WSNs, we proposed a six-step clustering approach to generate appropriate clusters [31]. This section presents the performance comparison between WSCH and SCHR. The simulation parameters are presented in Table 1. For binary data, the similarity between columns is determined with a similarity coefficient, as in the sample in Table 1.
Figure 5. A simple 10-node network.
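To make the HAC idea concrete, the following is a minimal sketch of generic agglomerative clustering over node coordinates. It is not the six-step approach of [31], whose details are not given here; the single-linkage merge rule, the use of Euclidean distance between 2-D positions, the stopping criterion (a target number of clusters), and all names are assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def hac(points, num_clusters):
    """Merge the two closest clusters until `num_clusters` remain.

    points -- list of (x, y) node positions (hypothetical input)
    Returns a list of clusters, each a list of indices into `points`.
    Uses single linkage: the distance between two clusters is the
    distance between their closest pair of members.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > max(num_clusters, 1):
        best = None  # (distance, index of first cluster, index of second cluster)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    positions = [(0, 0), (1, 0), (10, 0), (11, 0), (50, 50)]
    print(hac(positions, 3))
```

Each merge joins the two clusters that are currently most similar, so the sequence of merges forms the hierarchy from which a cluster structure can be read off at any level. This brute-force version recomputes all pairwise distances at every step and is meant only for small examples.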
Simple match: The simple match method can be used only if negative matches are meaningful.
Dice (Sorenson's) coefficient: Probably the most widely used, especially in data sets with few positive matches.
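The binary similarity coefficients named in this section (simple match, Dice/Sorenson, Simpson, and Jaccard) can be sketched as follows. The formulas are the standard textbook definitions in terms of the match counts a (both 1), b and c (mismatches), and d (both 0); the paper's own equations did not survive in this text, so these definitions and all function names are assumptions.

```python
def match_counts(x, y):
    """Count a (1/1), b (1/0), c (0/1) and d (0/0) for two equal-length binary vectors."""
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def _ratio(numerator, denominator):
    """Return 0.0 instead of failing when there is nothing to compare."""
    return numerator / denominator if denominator else 0.0


def simple_match(x, y):
    a, b, c, d = match_counts(x, y)
    return _ratio(a + d, a + b + c + d)  # negative matches count as agreement


def dice(x, y):
    a, b, c, _ = match_counts(x, y)
    return _ratio(2 * a, 2 * a + b + c)  # positive matches weighted double


def jaccard(x, y):
    a, b, c, _ = match_counts(x, y)
    return _ratio(a, a + b + c)  # ignores negative matches


def simpson(x, y):
    a, b, c, _ = match_counts(x, y)
    return _ratio(a, min(a + b, a + c))  # relative to the smaller set


if __name__ == "__main__":
    u = [1, 1, 0, 0, 1]
    v = [1, 0, 0, 1, 1]
    print(simple_match(u, v), dice(u, v), jaccard(u, v), simpson(u, v))
```

For any pair of vectors the Jaccard value is no larger than the Dice value, which matches the remark that Jaccard gives the more conservative result.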
Simpson's coefficient: Not a very good method; it is useful only when there are many variables (more than 100) and few mismatches (0's).
Jaccard coefficient: A very common method, especially in data sets with many matches. It provides a more conservative result than Sorenson's coefficient.
For quantitative data, the resemblance matrix can be obtained using the Euclidean distance $D_{ab}$, where $D_{ab} = \sqrt{\sum_{i} (x_{ai} - x_{bi})^2}$ and $x_{ai}$ denotes the value of variable $i$ for object $a$.
Both methods start with an equal number of alive sensors. After 30% of the total energy has been consumed, 520 sensors are alive under WSCH but only 450 under SCHR, as shown in Fig. 6. This shows that the Total Energy Power Dissipation (TEPD) of WSCH is lower than that of SCHR for transferring data packets of the same size, whether the sink is at (50, 50) or at (50, 300). Figure 7 shows that, after 800 ns, the total number of alive sensors was 87 nodes for SCHR but 83 nodes for WSCH, and after 100 ns the total number of alive sensors was more than 23 nodes for WSCH and fewer than 10 nodes for SCHR. When the percentage of power dissipation is 60%, the Number of Data Packets Received (DRP) at the sink was 1.6 x 10^4 data packets for WSCH but 1.3 x 10^4 for SCHR, with the sink located at (50, 50), as shown in Fig. 8. Fig. 9 gives the same result when the sink was at (50, 300).

VI. CONCLUSIONS
Wireless sensor networks (WSNs) have attracted significant attention over the past few years. A WSN, composed of a large number of micro sensors, can be an effective tool for gathering data in a variety of environments. To adapt to the constraints of WSNs, many hierarchical routing protocols have been proposed with different design goals, clustering criteria, and basic assumptions. The results demonstrate that WSCH is better than SCHR in terms of network lifetime and in reducing power consumption and dissipation.
This study shows that the improved clustering method can efficiently distribute the power consumption among the nodes from a global perspective, and consequently significantly enhance the lifetime of the system. We highlighted the effect of the cluster head (CH) selection method for the network models on the reviewed approaches and summarized a number of schemes, stating their strengths and limitations.