Causality Based Propagation History Ranking in Social Networks

Causality Based Propagation History Ranking In Social Networks-Free PDF

  • Date:13 Jan 2020
  • Views:80
  • Downloads:0
  • Pages:7
  • Size:797.09 KB

Share Pdf : Causality Based Propagation History Ranking In Social Networks

Download and Preview : Causality Based Propagation History Ranking In Social Networks


Report CopyRight/DMCA Form For : Causality Based Propagation History Ranking In Social Networks


Transcription:

not always available e g for security reasons The distribu and P x true are two causal effect measures consider. tion records of gateways a kind of propagation history might ing how necessary and sufficient the edge is for the diffusion. be considered to prioritize the diagnosis work respectively It is natural to consider the above two effects. Contributions Our contributions are as follows together i e the Difference of Causal Effects DCE Pearl. 1 To the best of our knowledge we are the first to study the 2000. propagation history ranking problem in SNS We further DCE X P x true P x0 true 1. discuss and adapt Difference of Causal Effects DCE as. Consequently DCE considers both necessary and sufficient. the ranking criterion, faces of causal effects and a rigorous proof can be found. 2 We propose a resp cap ranking strategy to avoid the in Pearl 2000 The similar idea is widely adopted in other. hardness of DCE calculation by adopting indicators re research areas such as network reliability Page and Perry. sponsibility and capability to capture the necessary and 1994 and economics Campbell et al 1997. sufficient faces of causal effects respectively DCE Calculation To calculate this causal measure. 3 We give an approximate algorithm for responsibility cal adopting randomized experiments is the preferred golden. culation since this problem is generally NP hard method Fisher 1925 We briefly describe this procedure. as follows Given the input edge X we first enforce X to. 4 Extensive experiments on real world datasets demon. be non failed or failed Then in each simulation we. strate the feasibility of our ranking mechanism, randomly set the other participant edges to be failed or non. failed and further check if the diffusion would be successful. 2 Preliminary and Problem Statement After a great number of simulations we get the probability of. In this paper we restrict our discussion to the information dif P x true or P x0 true Finally we would get the. fusion from one source node to one target node Without loss DCE value according to Eq 1. of generality we consider edges as propagation participants. Following this we give a formal definition of propagation 4 Integrated resp cap Ranking Strategy. history and its ranking problem Although conducting randomized experiments is preferred. Definition 1 Propagation history of a diffusion The prop for DCE calculation this method needs to run simulations. agation history of a diffusion in SNS records all the actual many times over to obtain a convergent value Wasserstein. propagation trails of an event E from the source S to the tar 1997 To avoid this complication we propose a resp cap. get T and is usually formalized as ranking strategy by adopting two indicators i e responsibil. t1 tn t11 t1l1 tn1 tnln ity and capability to capture the intuition of causal effects. from two faces, where each ti is called a trace constructed by li ordered. edges ti1 tili We drop the subscript E S T when there 4 1 Responsibility. is no ambiguity To capture the necessary face of causal effects in a diffu. sion we introduce the concept of responsibility Chockler. Propagation history ranking Let be the propagation. and Halpern 2004 based on the following definition inspired. history of a diffusion and let T be the edge set of The. by Pearl 2000, goal is to rank those edges in T by their contribution to this. diffusion Definition 2 Causality in a diffusion Suppose the edge set. of the propagation history is T Let t 2 T be a participant. 3 DCE as Ranking Criterion edge and let T be an edge set t is called a cause for this. diffusion w r t if the following two conditions are satisfied. In this section we discuss how to estimate the edge im. 1 The diffusion from s to t remains with T and, portance in propagation histories and introduce the selected.
ranking criterion 2 after removing the removal of t would make this dif. Importance Estimation The concern with regard to the im fusion fail. portance of a specific edge concentrates on two effects What is called the contingency set for t. is the overall effect on this diffusion if this edge fails Or Although checking causality i e evaluating each cause. what is the effect if the edge is non failed Considering the and its related contingency set is NP complete in gen. likelihood of a successful transmission these two effects can eral Eiter and Lukasiewicz 2002 Meliou et al 2010b. be estimated as gave a PTime solution for provenance data of relational. P x0 true is the probability of a successful informa databases This method would be directly applied to the prop. tion transmission if edge X is intervened to be failed agation history in SNS In this paper we only focus on the. P x true is the probability of a successful infor causality based ranking problem. mation transmission if edge X is intervened to be non Definition 3 Responsibility Suppose the edge set of the. failed propagation history is T and let t 2 T be a participant edge. Note that x and x0 stand for interventions to set edge The responsibility of t for this diffusion is. X to be non failed and failed in randomized experi 1. ments Fisher 1925 respectively Therefore P x0 true t. where ranges over all contingency sets for t Example 3 Example 1 continued The capability of edge A. Example 2 Example 1 continued The propagation his is 1 2 since it needs edge B or C to ensure the diffusion. tory of this diffusion is t1 t2 t3 where t1 A B The capability values of B and C are both 1 2 because both. t2 A C and t3 D The responsibility of edge A is 1 2 of them need A to get information transmitted Edge D s. because the smallest contingency set for A is D Similarly capability is 1 because D itself can guarantee the diffusion. the responsibility of D is 1 2 with the smallest contingency The capability of edge t is determined by the minimum. set A The smallest contingency sets for B and C are C edge set whose addition could make t indispensable for a. D and B D respectively Therefore their responsibility successful information transmission which leads to the fol. values are both 1 3 lowing proposition, Responsibility of edge t is determined by the minimum Proposition 2 Capability measures the sufficient face of. edge set whose removal will make t indispensable for a suc causal effects. cessful information transmission which leads to the follow. ing proposition Proof The proof is based on a causal measure Probability. of Sufficiency PS As stated in Pearl 2000 PS is defined as. Proposition 1 Responsibility measures the necessary face of the probability of enabling x would produce y in a situation. causal effects where x and y are in fact absent Therefore PS measures the. Proof The proof is based on a causal measure Probabil sufficient face of causal effects. ity of Necessary PN which is defined as the probability of Continuing with the same definitions of X x x0 S S s. event y would not have occurred in the absence of event x s s0 and s 0 in the proof of Proposition 1 we can calculate the. given that x and y did in fact occur Pearl 2000 Therefore PS value of edge X PS P s 0 s x x0 P s 0 x0 First of. PN measures the necessary face of causal effects all since S consists of the traces which do not contain edge. Let X be a participant edge and let x and x0 stand for the X P s and P s 0 are not affected by X As x the capabil. propositions X non failed and X failed respectively Let ity of X increases P s x increases i e S becomes easier. set S contain all propagation traces which go through edge X to ensure the diffusion Since P s 0 and P x0 are not af. and the rest of the traces are put into another set denoted as fected by x PS will increase when x increases Therefore. S Let s and s stand for the cases that S and S can success capability has a positive relationship with PS i e capability. fully transmit the information respectively and let s0 and s 0 measures the sufficient face of causal effects. denote their complements We could calculate the PN value. of edge X PN P s 0 s P s x P s 0 s Suppose every Complexity of Capability Suppose the propagation history. edge follows the same failure probability 50 Then we get contains N traces and M edges Using an inverted index. the following equations calculating the capability values of all edges can be done in. O LN with O M space complexity where L is the av,PN 1 P s x P s 0 s 1. erage length of all propagation traces Generally L is not a. 1 P s P x P s 0 s 1 large number according to the concept of six degrees of sep. 1 0 5 P s P s 0 s 1 aration Milgram 1967 Therefore the capability problem. has a linear complexity with respect to the number of propa. P s 0 s P s gation traces, As the responsibility of X increases S becomes easier to get. broken P s 0 increases and P s decreases In this case if 4 3 Integrated resp cap Ranking. P s keeps the same PN will increase Therefore responsi By combining the above two indicators we get the integrated. bility has a positive relationship with PN i e responsibility responsibility capability ranking strategy short for resp. measures the necessary face of causal effects cap defined as follows. Complexity of Responsibility In theory to compute the re score f n responsibility 1 f n capability 2. sponsibility one has to iterate over all contingency sets i e where f n stands for a normalized function calculating the. computing responsibility in general is NP hard Chockler and standard score Strang and Aarikka 1986 and 0 1 is a. Halpern 2004 Therefore we propose an approximate algo balance factor Note that these two indicators can be incor. rithm and more details can be found in Section 5 porated into other more complex ranking methods which we. 4 2 Capability leave as our future work, To capture the sufficient face of causal effects in a diffusion. we define the concept of capability 5 Approximate Algorithm for Responsibility. Definition 4 Capability Suppose the edge set of the propa Since responsibility is hard to calculate in this section we. gation history is T and let t 2 T be a participant edge The propose an approximate algorithm which guarantees a fea. capability of t for this diffusion is sible solution We first compare the responsibility problem. with the classical set cover problem SCP Chvatal 1979. t Responsibility vs SCP Suppose the propagation history. min st is organized for SCP c1 cn where ci is a sub. where ranges over all propagation traces going through t set of the whole participant edge set T t1 tm We. and function st returns the edge set of assume function sc tj ci tj 2ci ci 2 sc tj for. short and intuitively sc tj covers contains all the sets in Case B If Appresp cannot find a contingency set In this. containing tj Given an edge t the intuition of SCP is to case suppose we have gotten temporary results SA0 and ST 0. find the minimum k which satisfies sc t1 sc tk when no edge satisfies our constraint rule i e for each left. sc t Responsibility problem has the same intuition edge x we get SA0 sc x SA0 Suppose ct is a set in. besides which it has another constraint that the removal of ST 0 and ca is a set in SA0 For each edge x in ct we will. t1 tk must ensure t is still a cause for this diffusion find sc x SA0 contains ca Thus we get ct ca i e. i e sc t1 sc tk 6 ca is redundant This is opposite to our hypothesis of non. Approximate Algorithm for Responsibility If is the se redundancy. lected contingency set for edge t must satisfy two con Therefore Appresp can guarantee a feasible solution for. straints i if all edges in are removed the diffusion remains the propagation history without redundancy. but the removal of t would make this S diffusion fail and ii. must be the minimum set satisfying x2 sc x sc t Complexity of Appresp Suppose the propagation history. Based on these two constraints we propose a greedy algo has N traces and M edges the average length of traces is. rithm named Appresp described in Alg 1 L and the corresponding contingency set size is k On aver. age the time complexity of Appresp is O k L N M, Note that both k and L are usually small numbers 2 i e our.
Algorithm 1 Appresp method is a linear algorithm, Input The SCP form of the propagation history the in. volved edge set T and an edge t 2 T Proof Given the input edge t we need to loop the following. Output The approximate responsibility of t steps k times to calculate its responsibility. 1 We first get the covered set SA sc t and the uncov 1 Step 2 performs two tasks Firstly it calculates. ered set ST SA sc x ST for each edge x We can use an inverted index. 2 We choose edge x2T which satisfies these two rules to speed up the time complexity is O L N Then it. 1 sc x ST covers the sets in ST as many as possi selects edge x which satisfies these two rules the time. ble complexity is at most O M since SA is usually small. 2 SA 6 sc x SA 2 Step 3 removes all sets in sc x ST from ST the time. 3 We add x to remove sc x SA from SA and remove complexity is O N k and does the similar task in SA. sc x ST from ST Therefore the overall time complexity is O k L. 4 Repeat Step 2 and 3 until ST gets empty N M 2 N k O k L N M. 5 Output the responsibility of t by 1 1,Compared to our method Qin et al 2013 directly. adopted a greedy strategy to solve the corresponding SCP. The key aspects of Appresp are the two rules listed in Step problem i e their method cannot guarantee a feasible so. 2 in Alg 1 The first rule is a heuristic rule for ranking the lution for the responsibility problem We will compare these. Causality Based Propagation History Ranking in Social Networks Zheng Wang1 2 Chaokun Wang1 2 Jisheng Pei1 2 Xiaojun Ye1 2 Philip S Yu3 4 1School of Software Tsinghua University Beijing 100084 P R China 2Tsinghua National Laboratory for Information Science and Technology TNList

Related Books