The state tree stores only those datasets that can be preassigned an identifier or address as an 8-byte number. They are addressed by this identifier.
For example, each account has its own identifier, which derives from the public key. This identifier acts as the address of the object within the state tree. In order to clearly identify the state of an account at a particular point in time, all the states of the accounts are combined into a hash tree of states. Subsequently, the top of the state tree is used as an identifier of the overall state of the application.
Merkle's Discharged Tree
In general, the state stree is similar to the Merkle tree. Classic Merkle tree is focused on storing a fixed list of elements or chronologically added elements ()continuous flow of elements without gaps).
The state tree stores relatively little amount of elements, but has a clear value area and a specific address for each element. These elements are then combined into a Merkle tree. We call this the "unstacked Merkle tree".
The state tree is a binary tree, in which each state of an item has a fixed position (set by item ID).
The tree is built according to the following principles:
The item ID clearly defines the sheet where the state of the item should be stored. The tree sheet stores the current state of the element.
The hash is calculated for each sheet and node.
The sheet hash is calculated from the data it contains (by the state of the item).
The hash of a node is calculated from the hashes of the left and right subtrees. If a node has no left or right subtree, the hash equals to the hash of the subtree it has.
Thanks to this type of structure it was possible to achieve the following:
Storing the state of an element for each point in time (history of changes).
Minimizing the time of receiving the value of an element by its identifier.
Minimizing changes in the state tree over time (changes affect only new or changed sheets and the "path" to the top of the tree.)
In addition to quick access to the state of an item, the state tree provides an additional level of validation of the network state.
If after processing a block of data, the hash at the top of the state tree does not match what was obtained on the remote peer, then the dataset is not accepted, since it was processed differently on different peers.
Merkle's discharged tree work example
For example, the state tree for 16 (2^4) elements looks like the following (the real tree has 9'223'372'036'854'775'808 (2^64) possible positions):
There is a distinction between the virtual tree state and its real representation. For example, for a tree of 1 element (with ID A-0101) only 1 sheet will actually be stored (Virtual - virtual tree representation, Real - real representation, which will be used to search data in the snapshot (the current version)):
If one more element is added to the tree (A-0111), we will get the following (for a binary tree the number of nodes is one less than the number of sheets):
If the state of an existing element is changed, then the change occurs only in the nodes on the path from the node to the element (for example, the state of A-1010 has changed):