Anonymous Asynchronous Ratchet Tree Protocol for Group Messaging

Signal is the first application to apply the double ratchet in its end-to-end encryption protocol. The core of the double ratchet protocol was later adopted by WhatsApp, the most popular messaging application in the world. Asynchronous Ratchet Tree (ART) extends the ratchet and the Diffie-Hellman tree. It is the first group protocol that achieves Forward Secrecy (FS) together with Post-Compromised Security (PCS). However, it does not protect the privacy of user identities. Therefore, it makes sense to provide anonymity under the conditions of FS and PCS. In this paper, the concepts of Internal Group Anonymity (IGA) and External Group Anonymity (EGA) are formalized. On the basis of IGA and EGA, we develop the “Anonymous Asynchronous Ratchet Tree (AART)” to realize anonymity while preserving FS and PCS. Then, we prove that AART meets the requirements of IGA and EGA as well as FS and PCS. Finally, the performance and related issues of AART are discussed.


Background
With the development of the Internet, Instant Messaging (IM) applications have become very important in people's lives. According to statistics, WhatsApp is the most popular IM application in the world, with more than 2 billion active users. Facebook Messenger has 1.3 billion users. The third is WeChat, with about 1 billion. In 2018, people spent 27.6 h a week online, of which 15.6% was used for instant messaging. In addition, WeChat is the second IM application of China, and LINE is popular in East Asian countries. A large amount of data containing private personal information is generated through these platforms.
End-to-end encryption (E2EE) is used to protect user privacy so that neither the server nor any attacker can read messages during IM communication. When the secret key is not compromised, Indistinguishability under Chosen Ciphertext Attack (IND-CCA) is considered the standard for protecting IM communication; in this setting, an attacker can request the decryption of prepared ciphertexts [1]. However, when the secret key is compromised, there should be Forward Secrecy (FS) [2] and Post-Compromised Security (PCS) [3]. FS ensures that the adversary cannot obtain the keys or plaintexts of past secret messages. PCS guarantees that, after multiple interactions, the compromised communication is restored to a secure state.
The group message protocol is extended from one-to-one IM with at least three users during the communication. The sender transmits a message, and the other group members

Related Works
In this section, we analyze the group protocols of IM applications and show that these protocols do not provide anonymity along with FS and PCS.

iMessage
Apple's iMessage is the first popular E2EE application, but it turns out to be insecure under IND-CCA [9]. According to the iMessage white paper [10], before sender A transmits a message to receiver B via iMessage, A must get the address of B from Apple's server, called APN, because APN stores the addresses of all users. Furthermore, the group messaging protocol of iMessage is "sender keys". Thus, iMessage cannot provide anonymity.

LINE
LINE [11] is an E2EE application that is popular in East Asia. LINE's protocol, called Letter Sealing, has some issues such as impersonation attacks [12]. In group messaging, a group master key is calculated by the creator and sent to the other members via "sender keys". This master key is never changed, so if it is compromised, the attacker can reveal the contents of the communication. Thus, PCS is not satisfied in LINE.

Signal
OTR [13] is the first application to provide a ratchet. In the ratchet protocol, users negotiate new Diffie-Hellman (DH) keys in each session, and the old session keys are deleted and cannot be derived again. Signal's protocol is called the double ratchet. It has been proven that the double ratchet satisfies FS and PCS [5]. However, long-term public keys are included in the associated data of this protocol, so the identities of users are disclosed to the message server. Signal's group messaging protocol is pair-wise: each member maintains a one-to-one Signal channel with every other member instead of using "sender keys". Because of the pair-wise protocol, anonymity cannot be satisfied.

ART
ART [4] is extended from ratchet and Diffie-Hellman tree, which first applies FS and PCS to group messaging protocol. The creator of a group generates DH key pairs for others. DH key pairs are set as the leaves of the DH tree, and the parents' DH key pairs are generated from the ones of their children. The public DH tree is sent to other members. When sending messages, the sender needs to refresh his leaf DH key and the public DH keys from the corresponding leaf to the root of the group tree. The new public keys are sent to others to update their DH trees according to the location of the sender. Because the position of the sender is public and bound to the identity of the sender, ART cannot satisfy anonymity.

WeChat and QQ
WeChat [14] and QQ [15] are the most popular IM applications in China. They apply Secure Sockets Layer (SSL) or Transport Layer Security (TLS) to protect message security. A TLS connection is set up with the server to which the user is logged in, and group messages are transferred through this channel. TLS 1.3 has been proven to satisfy FS. PCS, however, cannot be satisfied in TLS, because each later session key is derived from the former one; the same holds for WeChat and QQ. It is claimed that the identities of users can be protected. However, neither the technical details nor the source code is available. Moreover, it is not clear whether they are E2EE protocols.

Some Anonymous Approaches Applied in E2EE
Tor [16] is an anonymous network composed of many volunteer users. Tok [17] is an IM application based on Tor. When communicating, the sender randomly selects several volunteer points and derives long-term session keys with them. These keys are used to encrypt the outgoing message in sequence. Following this sequence, the message is passed from point to point, and each point decrypts it with its derived key until it is delivered to the receiver. Thus, the address of the sender is known only to the first point, and the address of the receiver is known only to the last point. However, FS and PCS cannot be satisfied when the long-term keys are compromised.
Identity-based encryption (IBE) is used to validate and authenticate the anonymous public keys in E2EE [18]. Because of its low efficiency, KEM/DEM is applied to encrypt the secret key of the authenticator [19]. The encrypted secret key is sent to a proxy, and the proxy relays this message to the service provider for validation. As the proxy is trusted, the identity of the sender can be protected. Just like Tor, the secret key of the sender is long-term, so this approach cannot provide FS and PCS.

Security Definitions
The following fundamental tools are used in the security definitions. M is the message space. K is the key space. C is the cipher space. Σ is the MAC space. U is a finite set of user identities. E = (E, D) is the encryption scheme, where E(k, m) = c : K × M → C is the encryption algorithm and D(k, c) = m : K × C → M is the decryption algorithm. I = (S, V) is a MAC system where S(k, c) = σ : K × C → Σ and V(k, (c, σ)) ∈ {0, 1} : K × (C × Σ) → {0, 1}. The output of V is 1 if the MAC pair comes from S; if it is 0, V rejects the pair.
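As a rough illustration of these tools, the following Python sketch instantiates E = (E, D) and I = (S, V) with toy constructions. The SHA-256 counter keystream, the use of HMAC-SHA256, the helper name keystream, and the placeholder keys are all assumptions made for this example; the paper does not fix concrete primitives, and the toy cipher is only meaningful if each key is used for a single message.

```python
import hashlib
import hmac

def keystream(k: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (an assumption, not from the paper)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(k + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def E(k: bytes, m: bytes) -> bytes:
    """E(k, m) = c: XOR the message with the keystream."""
    return bytes(a ^ b for a, b in zip(m, keystream(k, len(m))))

def D(k: bytes, c: bytes) -> bytes:
    """D(k, c) = m: XOR with the same keystream undoes E."""
    return E(k, c)

def S(k: bytes, c: bytes) -> bytes:
    """S(k, c) = sigma: tag the ciphertext with HMAC-SHA256."""
    return hmac.new(k, c, hashlib.sha256).digest()

def V(k: bytes, c: bytes, sigma: bytes) -> int:
    """V(k, (c, sigma)): 1 if the pair comes from S, otherwise 0 (reject)."""
    return 1 if hmac.compare_digest(S(k, c), sigma) else 0

# Placeholder keys for the demonstration only.
enc_key, mac_key = b"k" * 32, b"m" * 32
c = E(enc_key, b"hello group")
sigma = S(mac_key, c)
```

In this sketch a receiver would first evaluate V and only call D when V returns 1, which mirrors the behaviour in which an invalid MAC pair is rejected.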

Algorithm Definition
The AART is the protocol with the following algorithms:
• Init: it is the initialization algorithm to create the group tree and generate the public group key gpk and the secret group key gsk.
• (C, σ) ← Enc(gpk, gsk, m): it is the encryption algorithm to encrypt the message m with gpk and gsk. The outputs are a ciphertext C and a MAC σ.
• m ∪ ⊥ ← Dec(gpk, gsk, C, σ): it is the decryption algorithm to check σ and decrypt the ciphertext C. The output is the message m if it is decrypted correctly, or ⊥ if it does not pass the validation of σ.
The sub-algorithms involved in the AART are defined as follows:
• {k_1, ..., k_n} ← SKG(gpk, gsk): it is the session key generation algorithm, where {k_1, ..., k_n} ∈ K^n.
• (pos, path) ← Update(gpk, gsk, pos): it is the update algorithm to refresh the leaf of the sender after he encrypts a message. pos is the position of the leaf to be updated, and path is the set of updated public keys in the group tree.
• gpk ← UpdateGpk(gpk, pos, path): it is the update algorithm to replace part of the public keys of the group tree according to path and pos after the receiver decrypts a message.
The encryption oracle Enc and the decryption oracle Dec, made up of these sub-algorithms and tools, are illustrated in Figure 1.

The encryption scheme is assumed to meet the chosen-plaintext attack (CPA) and ciphertext integrity (CI) requirements. The attack game of AE is shown in Figure 1.

Security Model
In the security models, messages queried by A are from M and have the same length. In the challenge phase, the messages from A are different from the queried messages. The adversaries mentioned in each definition are all probabilistic polynomial time (PPT) attackers.
Unforgeability of MAC. The adversary on a MAC system attacks a chosen message and tries to forge a MAC pair that passes the MAC system. The attacking game of unforgeability is shown in Figure 2. If Adv_UNF = |Pr(V(k, m*, σ*) = 1)| is negligible, the MAC system satisfies unforgeability.
Chosen Ciphertext Attack. The adversary of IND-CCA can not only make plaintext encryption queries but can also access the decryption of ciphertexts. The attacking game of IND-CCA is shown in Figure 2.
Forward Secrecy. The definition shows that the adversary cannot reveal the forward session keys when the keys are compromised. The attack game of FS is shown in Figure 2.
Oracle O illustrates the forward encryption. After the challenge phase, the adversary can run decryption oracle Dec.

The size of all groups and spaces is super-polynomial. Each adversary is a PPT adversary, which means that exhausting all group and space elements is impossible.
Post-Compromised Security. This definition shows that when the key is compromised, after at most Q queries the channel will be refreshed and become secure again. The attacking game of PCS is shown in Figure 2. The adversary can access the decryption oracle before and after the challenge phase.
Internal Group Anonymity. This definition shows that the adversary who knows the secret key cannot distinguish the identity of the target message sender. The attacking game of IGA is shown in Figure 3.
External Group Anonymity. The security model of EGA is shown in Figure 3.

Figure 2. Forward Secrecy and Post Compromised Security
For indistinguishability, the only clue available to the adversary is the output of Enc. It includes three parts: the associated data pos and path, the ciphertext c, and the MAC σ. For ART and Signal, identity is an important part of the associated data and is easily distinguished. If an adversary cannot distinguish the associated data, he cannot locate a user in a specific group.

Security Goals
Our construction aims to ensure security against the five kinds of adversaries in IND-CCA, FS, PCS, IGA, and EGA. All of the adversaries can deliver and modify messages, control the message server, and access the decryption oracle. Except for IND-CCA, current random values including secret keys, session keys, and leaf keys can be compromised. To break the security features, the adversary can access the Key Derivation Function (KDF) as a random oracle. Our construction does not consider impersonation attacks when the keys are compromised. Besides, the case in which the initial stage is compromised is not considered; the initial stage is assumed to rely on a trusted third party.

Security Assumption and Notation
In this subsection, the necessary assumptions and notations for AART are defined.
x ←$ X means choosing a group element x from group X randomly. A secure pseudorandom generator (PRG) prg is used to pick the update position for group members. Sig is a secure signature scheme.
Decisional Diffie-Hellman Problem (DDHP). DDHP is to distinguish the two tuples (a · P, b · P, ab · P) and (a · P, b · P, z · P), where a, b ∈ Z_q and z ←$ Z_q. The advantage of any PPT adversary in solving DDHP is negligible.
Computational Diffie-Hellman Problem (CDHP). CDHP is to compute ab · P, given a tuple (a · P, b · P), where a, b ∈ Z_q. The advantage of any PPT adversary in solving CDHP is negligible.
Pseudo-Random Function Oracle Diffie-Hellman (PRF-ODH) [20]. Assume a secure PRF t(·) : P → Z_q, which maps a group element of P to an element of Z_q. If DDHP holds in group P and t is a secure PRF over P, the general PRF-ODH assumption is satisfied on P: if z ←$ Z_q, given (a · P, b · P, t(ab · P)) and (a · P, b · P, t(z · P)), the probability that the adversary distinguishes t(ab · P) from t(z · P) is negligible. Because of PRF-ODH, CDHP still holds over P and t: given (a · P, b · P), the advantage of the adversary in computing t(ab · P) is negligible.
Node. node is the basic unit of the group tree. Other operations are outlined as follows: push appends an element to the end of a list. pop gets and removes the first element of a list. agt is the tree of public and private keys. size() returns the number of group members or the length of a list. KeyExchange can be any authenticated key exchange (AKE) function or protocol. In Signal, KeyExchange is the X3DH [5] protocol.
This design involves several random values. The one-time secret key node[i].sk is owned by user i, and node[i].pk is the corresponding public key. (ik, IK) is the identity key pair, and (ek, EK) is the short-term key pair. ik and ek are kept by the user, and IK and EK are published. j denotes the sequence number of the current stage. The session keys mk_j, r_j, ck_j are derived from KDF(ck_{j−1}, tk_j). mk_j is used to encrypt the message, r_j is used to calculate the one-time address, and ck_j is used to generate the MAC and the session keys for stage j + 1.
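The chaining of session keys across stages can be sketched in Python as follows. This is a minimal illustration, assuming HMAC-SHA256 with distinct labels as the KDF; the paper models the KDF as a random oracle and does not prescribe this construction, and the label strings and placeholder key values are hypothetical.

```python
import hashlib
import hmac

def kdf(ck_prev: bytes, tk: bytes):
    """Return (mk_j, r_j, ck_j) derived from ck_{j-1} and tk_j."""
    def expand(label: bytes) -> bytes:
        # One HMAC call per output, separated by a label (assumed construction).
        return hmac.new(ck_prev, label + tk, hashlib.sha256).digest()
    return expand(b"mk"), expand(b"r"), expand(b"ck")

ck = b"\x00" * 32                 # ck_0, placeholder value
root_keys = [b"tk-1", b"tk-2"]    # tk_1 and tk_2, placeholder values
stages = []
for tk in root_keys:
    mk, r, ck = kdf(ck, tk)       # ck is carried over into stage j + 1
    stages.append((mk, r, ck))
```

Because each ck_j feeds the next call, a party that holds tk_{j+1} but not ck_j cannot recompute the keys of stage j + 1 in this sketch.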

Group Setup
Considering a three-member group, let A, B, and C be the group members. The initialization algorithm Init creates an anonymous group tree and sets up a communication channel. The leaves A, B, and C stand for the group members. This tree is created by the group initiator A. An overview of the group tree is shown in Figure 4. The Init procedure is as follows:
• Ask for the public key pairs (IK_i, EK_i) of each group member through the third channel.
• Generate the setup key suk. Send IK_A, SUK, ck_0 to the other group members via a trusted third party, which means that the adversary cannot access these messages or reveal the identity of other group members in the initial session.
• Generate the leaf keys of the other members.
• Set up the group tree by agt ← Create(). Let the root private key and public key be (tk_1, TK_1). Set gpk as the public group tree obtained by deleting all secret keys from agt.
• Run σ_0 ← Sig(ik_A, gpk_1) and broadcast (gpk_1, σ_0) to the other group members.
Create and Init algorithms are illustrated in Algorithm 1. When initiating the anonymous group tree, the initiator has the full view of the group tree, including the private leaf key of each node. After receiving this tree, the other group members should check whether (IK_A, gpk_1, σ_0) is valid. If σ_0 is valid, each group member accepts this tuple. He obtains only the public part gpk_1 and his own private leaf key θ_0^i. After getting θ_0^i, each group member calculates his public leaf key to determine his position i: if the pk of the kth leaf in gpk_1 is equal to θ_0^i · P, the position of this group member is i ← k. Then, he generates the group shared key tk_1 according to the procedure KeyGen(i, node[i], gpk_1): starting from his own leaf s, he finds s's sibling node s.sibling and moves up to the parent p; if p is null, tk ← s.sk, else the sibling step is repeated.
According to Equation (1), the group initiator knows the location of each member in gpk_1. However, every other member only knows his own location.
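The way a member locates his leaf and derives the root key from the public tree can be sketched as follows. This is a toy model under several assumptions: the elliptic-curve operation x · P is replaced by modular exponentiation in a small multiplicative group, t is replaced by a SHA-256 based map, the tree is stored heap-style in a list, and the parent rule sk_parent = t(sk · sibling.pk) is taken from Algorithm 2. The names P_MOD, G, pub, create, find_position, and key_gen are hypothetical.

```python
import hashlib
import secrets

P_MOD = 2**127 - 1   # toy prime modulus (assumption, not a secure choice)
G = 3                # toy base element standing in for the point P

def pub(sk: int) -> int:
    """sk · P in the paper's notation, here G^sk mod P_MOD."""
    return pow(G, sk, P_MOD)

def t(elem: int) -> int:
    """Stand-in for the PRF t: group element -> exponent."""
    digest = hashlib.sha256(elem.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (P_MOD - 1)

def create(leaf_sks):
    """Initiator view: build secret and public trees (root at index 1)."""
    n = len(leaf_sks)                       # assumed to be a power of two
    sk = [0] * (2 * n)
    sk[n:] = leaf_sks
    for i in range(n - 1, 0, -1):           # parent key from its two children
        sk[i] = t(pow(pub(sk[2 * i + 1]), sk[2 * i], P_MOD))
    pk = [0] + [pub(s) for s in sk[1:]]     # gpk: public keys only
    return sk, pk

def find_position(leaf_sk: int, pk) -> int:
    """Find the leaf k whose public key equals leaf_sk · P."""
    n = len(pk) // 2
    return next(k for k in range(n, 2 * n) if pk[k] == pub(leaf_sk))

def key_gen(leaf_index: int, leaf_sk: int, pk) -> int:
    """Walk from the leaf to the root using one secret and the public tree."""
    s, s_sk = leaf_index, leaf_sk
    while s > 1:                            # the root has no parent
        sibling = s ^ 1                     # heap index of s's sibling
        s_sk = t(pow(pk[sibling], s_sk, P_MOD))
        s //= 2
    return s_sk                             # tk, the root secret key

leaf_sks = [secrets.randbelow(P_MOD - 2) + 1 for _ in range(4)]
sk_tree, gpk = create(leaf_sks)
positions = [find_position(s, gpk) for s in leaf_sks]
root_keys = [key_gen(p, s, gpk) for p, s in zip(positions, leaf_sks)]
```

Every leaf holder ends up with the same root key in this sketch, while only the party that ran create sees all leaf secrets, which matches the observation that the initiator knows every member's location.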

Direct Updating
In order to satisfy FS and PCS, the group tree should be updated whenever a participant sends a message. In stage j, the root key tk_j is generated from gpk_j and the user's leaf secret key. After sending or receiving a message, gpk_j is updated to gpk_{j+1}, which means that each session key is used only once. In the update phase, group members can decide to update the group tree anonymously or directly. An overview of direct updating is illustrated in Figure 5. Its procedure includes the following steps (B stands for the position of the updated node):
• Update sk_2 ← t(θ_1^B θ_1^y · P); pk_2 ← sk_2 · P.
• Broadcast B, node[B].pk, pk_2, pk_3 to all group members.
(Node labels of Figure 5: tk = t(sk_3 sk_4 · P) with public key tk · P; sk_3 = t(sk_1 sk_2 · P) with public key sk_3 · P.)
After receiving the updated public keys, the others update the public keys of B and its ancestor nodes, and tk_{j+1} is derived according to KeyGen.

Anonymous Updating
Because the group initiator knows the location of each member, he can see which member updates the group tree. So, the initiator knows who sent the target message. In order to limit the authority of the initiator, the relation between the updated location and the identity should be separated. By using a random node, this feature can be obtained as shown in Figure 6. The procedure includes the following steps (b stands for the updated node's position):
• Update sk_2 ← t(θ_1^B θ_1^y · P); pk_2 ← sk_2 · P.
• Broadcast b, node[b].pk, pk_2, pk_3 to all group members.
(Node labels of Figure 6: tk = t(sk_3 sk_4 · P) with public key tk · P; sk_3 = t(sk_1 sk_2 · P) with public key sk_3 · P.)
In the group tree, the nodes node[i], i ∈ {2, 4, 6, ..., 2n} are random nodes: their leaf keys are generated randomly, and no group member is located at them. In this way, the initiator cannot bind the sender to a random node. Therefore, he cannot reveal the identity of the sender.
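The two update modes can be sketched together in Python. As a toy model, x · P is again replaced by modular exponentiation, t by a SHA-256 based map, the tree is a heap-style list, and prg is replaced by secrets.choice; all of these, together with the names update, update_gpk, and fresh_secret, are assumptions for illustration rather than the paper's construction.

```python
import hashlib
import secrets

P_MOD = 2**127 - 1   # toy prime modulus (assumption, not a secure choice)
G = 3                # toy base element standing in for the point P

def pub(sk):
    return pow(G, sk, P_MOD)

def t(elem):
    digest = hashlib.sha256(elem.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % (P_MOD - 1)

def fresh_secret():
    return secrets.randbelow(P_MOD - 2) + 1

def create(leaf_sks):
    """Heap-style secret and public trees, root at index 1."""
    n = len(leaf_sks)
    sk = [0] * (2 * n)
    sk[n:] = leaf_sks
    for i in range(n - 1, 0, -1):
        sk[i] = t(pow(pub(sk[2 * i + 1]), sk[2 * i], P_MOD))
    return sk, [0] + [pub(s) for s in sk[1:]]

def key_gen(leaf_index, leaf_sk, pk):
    """Derive the root key from one leaf secret and the public tree."""
    s, s_sk = leaf_index, leaf_sk
    while s > 1:
        s_sk = t(pow(pk[s ^ 1], s_sk, P_MOD))
        s //= 2
    return s_sk

def update(own_pos, pk, update_type):
    """Refresh own leaf (type 0) or a random even-numbered leaf (type 1)."""
    num_leaves = len(pk) // 2
    if update_type == 0:
        pos = own_pos
    else:
        pos = secrets.choice(range(2, num_leaves + 1, 2))  # stand-in for prg
    cur = num_leaves + pos - 1            # heap index of leaf position pos
    leaf_sk = cur_sk = fresh_secret()     # new secret for the chosen leaf
    path = []
    while cur > 1:
        path.append(pub(cur_sk))          # new public key of the current node
        cur_sk = t(pow(pk[cur ^ 1], cur_sk, P_MOD))
        cur //= 2
    return pos, path, leaf_sk, cur_sk     # cur_sk is the new root key

def update_gpk(pos, pk, path):
    """Receiver side: overwrite public keys from leaf pos upwards."""
    new_pk = list(pk)
    cur = len(pk) // 2 + pos - 1
    for node_pk in path:
        new_pk[cur] = node_pk
        cur //= 2
    return new_pk

# Positions 1 and 3 hold members; positions 2 and 4 are random nodes.
leaf_sks = [fresh_secret() for _ in range(4)]
sk_tree, gpk = create(leaf_sks)
pos, path, _, new_tk = update(own_pos=1, pk=gpk, update_type=1)
new_gpk = update_gpk(pos, gpk, path)
sender_tk = key_gen(4, leaf_sks[0], new_gpk)     # member at position 1
receiver_tk = key_gen(6, leaf_sks[2], new_gpk)   # member at position 3
```

Anyone can refresh a random leaf because the walk only needs a fresh leaf secret and the public keys of the siblings along the path, so the broadcast position does not point at the sender's own leaf.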

One-Time Address
Although ratchet tree can provide PCS and FS, it delivers messages through central servers. If those servers are controlled by the adversaries, they can know the relations of all users. With the help of the topological net, attackers can perform behavior analysis to infer the identities of the user.
The one-time address applied in Monero [8] tries to hide the identity of the receiver using Equation (2).
Here, PK_B^s ← sk_B^s · P and PK_B^v ← sk_B^v · P are the long-term public keys of user Bob. H : P → Z_q is a collision-resistant hash function. If user Alice wants to trade with Bob, she first generates r ←$ K, calculates addr, and then puts r, addr, and the transaction onto the blockchain. Bob uses r and his secret keys to validate addr. Because addr depends on r and r is randomly chosen, addr changes in each transaction. Because DDHP is hard in PRF-ODH, the adversary cannot reveal the identity of Bob from addr. However, because Bob must check every addr, the validation costs a lot of time. The idea taken from Monero's one-time address is to hide the group public key, so that cloud servers cannot distinguish messages of different groups according to the one-time address. The SKG of our construction contains two parts: Equations (3) and (4).
(mk_j, r_j, ck_j) ← KDF(ck_{j−1}, tk_j)    (3)
addr_j ← H(t(r_j · P)) · P + tk_j · P    (4)
AART generates the pseudorandom values mk_j, r_j, ck_j from tk_j and ck_{j−1} based on KDF : K × Z_q → K^3, modeled as a random oracle, so that group members can pre-calculate the one-time address for each message.
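The address computation addr_j ← H(t(r_j · P)) · P + tk_j · P can be illustrated with a toy group. In this sketch, which rests on assumptions rather than the paper's concrete choices, x · P becomes modular exponentiation, point addition becomes multiplication modulo a prime, and both t and H are modeled by a labeled SHA-256 map; the names hash_to_exponent and one_time_address are hypothetical.

```python
import hashlib

P_MOD = 2**127 - 1   # toy prime modulus (assumption, not a secure choice)
G = 3                # toy base element standing in for the point P
ORDER = P_MOD - 1    # exponents are reduced modulo P_MOD - 1

def pub(x: int) -> int:
    """x · P in the paper's notation, here G^x mod P_MOD."""
    return pow(G, x, P_MOD)

def hash_to_exponent(label: bytes, value: int) -> int:
    """Labeled SHA-256 map used as a stand-in for both t and H."""
    digest = hashlib.sha256(label + value.to_bytes(16, "big")).digest()
    return int.from_bytes(digest, "big") % ORDER

def one_time_address(r_j: int, tk_j: int) -> int:
    inner = hash_to_exponent(b"t", pub(r_j))    # t(r_j · P)
    outer = hash_to_exponent(b"H", inner)       # H(t(r_j · P))
    # Point addition in the paper corresponds to multiplication here.
    return (pub(outer) * pub(tk_j)) % P_MOD

addr_a = one_time_address(r_j=5, tk_j=7)
addr_b = one_time_address(r_j=6, tk_j=7)
```

Any member who knows r_j and tk_j can recompute the same address before the message arrives, whereas a server that sees only addr_j learns neither value in this model.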

Encryption and Decryption
Here type ∈ {0, 1} is the update type: 0 is a direct update, 1 is an anonymous update.
Update is the algorithm to update the group tree during encryption, and UpdateGpk is the algorithm to update the group tree after receiving the updated path. The details of these two algorithms are illustrated in Algorithm 2. Send(msg, addr, server) means putting the message msg on the server at the position addr. Get(addr, server) means getting the message from the position addr on the server. If sending fails or nothing is obtained, the response of the server is ⊥. These messages can be observed and accessed by the adversary.

Algorithm 2 Update Group Tree
function Update(i, gpk_j, type_j, node_j)
    if type_j = 0, pos_j = i, otherwise pos_j ← prg({2, 4, 6, ..., 2n})
    return pos, UpdatePath(gpk_j, node_{j+1}, pos_j)
end function
function UpdatePath(gpk_j, node_j, pos_j)
    while current node cur is not the root do
        the sk of cur's parent is t(cur.sk · cur.sibling.pk), the pk of cur's parent is its sk · P
        path_j.push(cur.pk), let cur move to the parent of cur
    end while
    return path_j, cur
end function
function UpdateGpk(pos_j, gpk_j, path_j, node_j)
    tmp ← node[pos_j]
    while path_j ≠ [ ] do
        tmp.pk ← path_j.pop(), tmp ← tmp.p
    end while
end function

Proof. In each Game_j, b is randomly chosen by C, and b̂ is the output of A. W_j is the event that b = b̂ in Game_j. The decryption query is defined in Game_0 as a two-step procedure: step 1 checks the queried pair; in step 2, if the check is true, the reply is D(k_1, c_j), else ⊥. The goal is to prove the bound stated in Theorem 1. Game_0 is then changed into Game_1.
Step 1 is deleted and step 2 is changed to send "reject" except when j = ω ∈ {1, Q_d}. The difference between Game_0 and Game_1 is the event that c_ω is queried. According to the definition of unforgeability, this difference is bounded as in Equation (7). To simplify, we remove the decryption query from our proofs in accordance with Equation (7). Thus, Game_1 is the IND-CPA game of AART; it is then modified into Game_2.
The random oracle is recorded by MAP. Game_2 is the same as Game_1 except that the MAP operation of step 8 of Game_1 is deleted. Event Z is defined as A querying tk_{Q_1+1}, ck_{Q_1+1} in domain(MAP). The two games differ only when event Z happens, so the gap between Pr(W_1) and Pr(W_2) is bounded by Pr(Z).
Using CDHP. If event Z happens, A queries tk_{Q_1+1}, ck_{Q_1+1} ∈ domain(MAP), which can be used to break CDHP and to construct B_PRF−ODH. To break CDHP, one (tk, ck) pair should be picked out, but B_PRF−ODH is not sure which pair in domain(MAP) is the right answer. Assume there are at most Q_2 random oracle queries; the probability of selecting the right pair is at most Pr(Z)/Q_2. We use Game_2 to construct Game_CDHP. Instead of running Init, KeyGen, and Update, B_PRF−ODH queries them from C_PRF−ODH. The gray parts with boxes of the Game_2 challenger are constructed as C_PRF−ODH. Thus, from A's view, there is no difference between Game_2 and Game_CDHP. Event Z happens ⇐⇒ tk_{Q_1}, ck_{Q_1} ∈ domain(MAP) when B_PRF−ODH finishes the game. Let Q ← Q_2; because the pairs may be queried more than once, the size of domain(MAP) is no greater than Q. So Pr(Z) is bounded in terms of Q and the CDHP advantage of B_PRF−ODH. According to Game_2, to deal with Pr(W_2) means to deal with IND-CPA.
Using CPA. Game_CPA can be constructed from Game_2. Let the Game_2 challenger be B_CPA, except that after receiving a message from A, B_CPA runs an encryption query to C_CPA, as in the gray parts with no boxes in Figure 7. So the advantage of A in Game_2 is the IND-CPA advantage of B_CPA. Combining Equations (6)-(11), Theorem 1 can be derived. Because CDHP in PRF-ODH is hard, E is an IND-CPA cipher, and I is a secure MAC system, A cannot win Game_0. So Adv_CCA[A, AART] is negligible, and IND-CCA of AART is satisfied.

Forward Secrecy
Theorem 2. Let KDF : K × Z_q → K^3 be modeled as a random oracle. When the keys of stage j + 1 are leaked, if adversary A can break FS of AART, there exists an adversary B_CCA that can break the IND-CCA of stage j with a corresponding advantage.
Proof. Assume there are Q stages. According to SKG and Update, tk_j is derived from gpk_j, and the session keys of stage j are generated from tk_j and ck_{j−1}. So if all random values, including the sk of each user, tk_j, and the session keys mk_j, r_j, ck_j, are compromised, and adversary A wants to get the session keys of stage j − 1, he needs to know tk_{j−1}. If the current leaf key of each user is not compromised, each stage can be reduced to an IND-CCA game as in Theorem 1. If the current leaf key is compromised, he can get tk_{j−1} when the leaf key has not been updated. So he can try to get ck_{j−2} to break FS. In order to get ck_{j−2}, he must get ck_{j−3}, and so on recursively back to the initial stage. However, the initial stage is run through a secure AKE and a trusted third party, so the adversary cannot break FS in this way. Assume challenger C is the group creator. Game_0 is illustrated in Figure 8. For the ith message query, if b̂_i = b_i, A wins Game_0. By querying each session key, root key, and plaintext encryption from the IND-CCA challenger of Game_0 in Figure 7, Game_0 can be changed into Game_{CCA,i} for each stage i. According to Theorem 1, the advantage in each Game_{CCA,i} is bounded by the IND-CCA advantage. There are Q instances of Game_i, so Theorem 2 holds. Because the IND-CCA advantage in the random oracle model is negligible, the advantage of A against the FS of AART is negligible too. Forward Secrecy of AART is satisfied.

Post-Compromised Security
PCS is proved with Theorem 3.
Theorem 3. Let KDF : K × Z_q → K^3 be modeled as a random oracle. When the keys of stage j are compromised, if all leaf keys are updated in the challenge stage, the advantage of adversary A in breaking PCS of AART is equal to the advantage of A in breaking IND-CCA of stage j + 1.
Proof. When the keys of the jth session other than ck_j are compromised, the adversary cannot derive the keys of the next session j + 1, because they are based on ck_j. So, the only way left for the adversary is to break the IND-CCA of session j + 1, and Theorem 3 follows by reduction. When all keys are compromised, if the leaf keys held by the adversary are not updated until session Q has finished, the advantage of the adversary is 1. However, once each leaf key of the group tree has been updated, the advantage of A is reduced to the IND-CCA of the Qth session and becomes negligible.

Internal Group Anonymity
IGA of AART is proven with Theorem 4.
Proof. The random leaf used in the anonymous update is chosen by a secure PRG. Therefore, if the adversary can distinguish two anonymous users from each other based on their update messages, he can break the security of the PRG.

External Group Anonymity
Proof. As illustrated in Figure 9, Game_EGA includes two parts, Game_0(0) and Game_0(1), simulating two groups. Challenger C plays Game_0(b) with adversary A, where b ←$ {0, 1}. A should distinguish which game is played. If the output of A is b̂ and b̂ = b, A wins Game_EGA. For each Game_0(b), a DDHP game can be constructed such that tk_b is generated at random, as Game_1(b). W_0^b denotes that Game_0(b) is played and W_1^b denotes that Game_1(b) is played. According to the definition of EGA, the advantage of A is expressed through these events, and according to the definition of DDHP in PRF-ODH, the gap between Game_0(b) and Game_1(b) is bounded by the DDHP advantage. Then, Theorem 5 holds. Because DDHP is hard in PRF-ODH, Adv_EGA is negligible. So EGA of AART is satisfied.

Discussion
We further discuss the performance and some issues that arise when running AART. Performance. The performance comparison can be seen in Table 1. For n group members, the number of nodes in ART is 2n. The number of nodes in AART is 4n because of the additional random nodes. Thus, the number of exponentiations and the storage cost needed to generate the public tree of AART are twice those of ART. Also, the height of the group tree is log(2n) + 1 in AART, which is one more than the log(n) + 1 of ART. The complexity and storage in the update phase follow the same relationship as the heights. Moreover, there is an additional addr in AART. Overall, the complexity and storage of AART are close to those of ART. The number of exponentiations is 4n for the sender in AART because of the tree structure. Because of the Update algorithm, the time cost in the following stages is log(2n). The sender in pair-wise Signal has to update all of his channels with the others. Thus, it costs n, which is worse than AART. "Sender keys" does not refresh its channels, so its cost is 0.
For encryption times, only "sender keys" encrypts the message keys for the others. For all of these protocols, there is only one encryption operation in each stage.
For communication storage, the sender of AART stores the n − 1 long-term public keys of the others and broadcasts the 4n public keys to the others, which gives 5n − 1. A group member other than the creator does not know the long-term public keys of the other group members, so his cost is 4n + 1. In ART, each member has to get the identity keys of the others. The ongoing cost is log(2n) because of the outputs of the Update operation. "Sender keys" costs n for sending keys at the beginning, but only 1 in the ongoing stages, since the ciphertext is the same for each member. Because of its one-to-all channels, pair-wise Signal takes n for both the sender and the others. In the following sessions, it costs n to refresh all channels between the sender and the receivers. The computation storage is the sum of the storage spent on exponentiation and encryption. It can be seen that the cost of AART at the setup stage is the largest. However, because of the tree structure, AART is more efficient in the ongoing stages than pair-wise Signal.
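The node counts and tree heights quoted above can be tabulated with a few lines of Python. This is only a worked arithmetic example; the function name tree_costs and the dictionary keys are hypothetical, and the logarithm is assumed to be base 2 because the group tree is binary.

```python
import math

def tree_costs(n: int) -> dict:
    """Node counts, heights, and ongoing costs for a group of n members."""
    return {
        "art_nodes": 2 * n,
        "aart_nodes": 4 * n,                      # extra random nodes
        "art_height": math.log2(n) + 1,
        "aart_height": math.log2(2 * n) + 1,      # one level higher than ART
        "aart_ongoing_exponentiations": math.log2(2 * n),
        "pairwise_signal_ongoing_exponentiations": n,
    }

costs = tree_costs(8)
```

For n = 8 this gives 16 nodes and height 4 for ART against 32 nodes and height 5 for AART, while the ongoing cost of AART grows only logarithmically compared with the linear cost of pair-wise Signal.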
Although iMessage provides E2EE features, it cannot resist a CCA-level attacker [9]. LINE applies E2EE, but it cannot achieve FS and PCS. Tor is not an E2EE protocol because the last node of Tor knows the plaintext of the sender. ART is the first group protocol applying PCS, but it cannot cope with identity protection. At an additional cost, AART achieves FS, PCS, and anonymity at the same time, unlike the other protocols. The security comparison can be seen in Table 2.
About trusted third parties. In ART, there is no efficient way to protect the initial stage from being attacked. If the first stage is compromised, all of the users' long-term secret keys can be accessed, and the identities of the group members are obtained at the beginning. We follow this setting, and we initialize the first-stage session key with tk_1 and ck_0. Each later ck_j is generated from the root key tk_j and ck_{j−1}. Thus, ck_0 should be either empty or decided by the group creator. If ck_0 is empty, FS cannot be satisfied when tk_j is compromised. The details can be found in the proof of Theorem 2.
In a message recovery situation, AART cannot resist the IND-CCA adversary. If IND-CCA is required, message recovery should be given up.
Malicious group member. We consider malicious users who want to compromise keys, and malicious users who want to combine two group trees without the help of leaked keys. In the former situation, messages as well as identities are protected because of FS, PCS, IGA, and EGA. In the latter, although a malicious user can replace his leaf key in group A with the root of another group B, the chain keys of the two groups are different, so members of group A cannot get the addr of B. Therefore, the two groups cannot be combined.
About collusion attacks. In a group of n members, if n − 1 members collude, including the creator, and the remaining member sends a message, they can reveal his identity. However, if the creator is trustworthy, the colluding attackers only learn that one member sent a message; they cannot reveal his identity because they do not know the long-term public key of the sender.
Dynamic group member and device. It is easy to add a new group member through KeyExchange. The initial leaf key is obtained by the creator, who then creates a three-node agt with one root, a new member leaf, and a new random leaf. The creator inserts the three-node agt into the current agt so that it remains a complete binary tree (two leaves and their parent are treated as one unit). The creator uses tk and the root public key of the three-node agt to generate the root key tk and the public key TK of the new agt. Finally, he publishes the new gpk of the agt. Deleting a member is easy as well. Consider one unit as a three-node agt including a user leaf, its sibling random leaf, and their parent node; the sibling of a unit has the same parent node as this unit. To remove a user, the creator replaces the parent of the unit in which the target user is located with the sibling unit, uses the random leaf in the sibling unit to update the agt, and publishes the new gpk to all group members.
Regarding the dynamic device, the user can share tk and ck with multiple devices, create a subtree, and let the root of the subtree replace the user leaf. When updating agt, the user should update this subtree and output the path except for the path in this subtree to group members. Then, other group members will believe that they are chatting with a multi-device user.

Conclusions
In this paper, we propose a multi-stage anonymous group messaging protocol called AART, which is based on the design of ART. It can provide anonymity features including IGA and EGA, while it retains the previous features such as FS and PCS of ART. The security of AART is analyzed formally. Finally, we discuss the performance of AART compared with ART, pair-wise Signal, and "sender keys" protocols as well as other problems that may exist in AART and the related solutions to them. In our future work, the effort will be focused on how to limit anonymity by tracing the secret keys and revealing the identity of malicious users.
Author Contributions: Conceptualization, K.C. and J.C.; methodology, K.C.; validation, J.C. and J.Z.; formal analysis, K.C. and J.Z.; writing-original draft preparation, K.C.; writing-review and editing, K.C. and J.C.; supervision, J.C. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.