On the Use of Feedback in an Introduction-Based Reputation Protocol∗

Patrick Caldwell¹ and O. Patrick Kreidl²

Abstract: Consider a network environment with no central authority in which each node gains value when transacting with behaving nodes but risks losing value when transacting with misbehaving nodes. One recently proposed mechanism for curbing the harm by misbehaving nodes is that of an introduction-based reputation protocol [1]: transactions are permitted only between two nodes who (i) consent to being connected through introduction via a third node and (ii) provide binary-valued feedback about one another to that introducer when the connection closes. This paper models probabilistically the decision processes by which this feedback is both generated and interpreted; the associated reputation management algorithms account for different modes of misbehavior, respect the inherent information decentralization and are consistent with the utility-maximizing decisions established previously for other parts of the protocol.

I. INTRODUCTION

This paper builds upon the authors’ previous works [2], [3] to mathematically model and analyze an introduction-based reputation protocol, formally introduced in [1] in the context of secure Internet packet routing. The premise of the protocol is to emulate the “word-of-mouth” mechanism that prevails in any well-functioning system of commerce, where an individual becomes associated to a positive or negative reputation based upon others’ ratings of the experiences of past transactions with that individual. Automated forms of such mechanisms are referred to as reputation systems, tracking the trustworthiness of all parties as a means to preserve the value of transactions between behaving parties while curbing the harm of transactions with misbehaving parties [4]–[7].
Familiar modern-day Internet-based instantiations include eBay’s “feedback forum,” the “Web-of-Trust” browser plug-in and “Angie’s List.” An introduction-based approach strives to similarly disincentivise repeated misbehavior but in a manner that eliminates the dependence upon a central reputation authority (e.g., “Angie”).

∗ Work supported by the Air Force Research Laboratory (AFRL) under contract FA8750-10-C-0178. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.
¹ Patrick Caldwell is a Graduate Research Assistant in the Signal Processing and Network Science (SPaNS) Laboratory, School of Engineering, University of North Florida, 1 UNF Drive, Jacksonville, FL 32224, USA [email protected]
² O. Patrick Kreidl is an Assistant Professor of Electrical Engineering, University of North Florida, 1 UNF Drive, Jacksonville, FL 32224, USA [email protected]

Fig. 1. Sequence Diagram of an Introduction-Based Reputation Protocol

Fig. 1 illustrates the fundamental aspects of an introduction-based reputation protocol: transactions are allowed only between two parties, or nodes, that are connected, where both nodes consent to the connection through an introduction sequence involving a third node. In other words, every new connection between two nodes, the introducees, is preceded by an introduction sequence involving a third node, the introducer, already connected to each introducee. The introducer may or may not offer the introduction (based on its reputations of the introducees) and each introducee may or may not accept the introduction (based on its reputation of the introducer), but if offered and accepted then the connection between the introducees is established and the two nodes can transact and/or request introductions to others. The connection exists indefinitely until either introducee elects to close it and each then provides feedback to the introducer.
Note that, depending on the state of all nodes’ connections, forming a new connection may require multiple consecutive introductions; moreover, it is also assumed that the network initializes with every node having at least one a-priori connection in place. Clearly the success of such a protocol depends upon a constructive interplay among the different decision processes within the different roles. The utility of each connection with a behaving node is the sum-reward of (always non-harmful) transactions, but each connection with a misbehaving node yields negative utility due to the risk of harmful transaction (and because non-harmful transactions with misbehaving nodes have zero reward). At the heart of the problem for each node is that the true behavior of any other node cannot be known with certainty but is rather summarized through its reputation. However, as a consequence of there being no central reputation authority, every node must manage its own private pool of reputations to drive its different decisions (e.g., whether to forcibly close an established connection, whether to accept an offered introduction, how feedback is generated in the role of introducee or interpreted in the role of introducer), including how to evolve those reputations based on evidence from only its own connections. These decisions local to each node are prescribed by a so-called policy, which for the encouraging simulation results reported in [1] was selected somewhat ad-hoc i.e., a set of heuristic reputation management rules with parameters tuned via a time-consuming simulation-based search. This paper, building upon its predecessors [2] and [3], presents a key step towards a more model-based optimization approach to policy selection in an introduction-based reputation protocol. As indicated in Fig. 1, the focus of this paper is on the decisions that occur upon closing a connection, namely how each introducee generates feedback and then how the introducer interprets that feedback. 
The decision on whether to continue an established connection was analyzed in [2], while the decision on whether to accept an offered introduction was analyzed in [3]. Section II summarizes the results in [2] and [3] for these upstream decisions of the protocol, emphasizing the aspects that materialize at the interface to the feedback analysis presented in Section III. In addition to treating a different part of the overall protocol, the contributions of this paper beyond the work of [2] and [3] are twofold:

• generalization to a multi-mode misbehavior model and the associated notion of multi-mode reputations; and
• explicit consideration of information decentralization across all nodes and the associated notion of multi-perspective reputation management.

As will be discussed, both contributions rest upon maintaining at each node a correspondence between optimal reputation management and performing probabilistic inference over a dynamic Bayesian network [8], [9]. As a result of decentralization, the observed evidence and the hidden variables influenced by that evidence will be different in an introducee’s perspective than in the introducer’s perspective. We conclude the paper in Section IV, also suggesting items for future work in the broader context of emerging control-theoretic approaches to other cyber-security problems.

II. PRELIMINARIES

This section summarizes the models and results used in our previous analyses of the introduction-based reputation protocol [2], [3]. The basis of these analyses is to uphold a correspondence between evolving a reputation and revising the probability of misbehavior conditioned on new evidence, which is formalized in the following definition. It is worth noting that the explicit dependence on the mode of misbehavior is looking ahead to the analysis of Section III, when this generalization becomes essential.
Definition 1 (Single-Mode Reputation): Consider a probabilistic model that jointly defines a (hidden) binary-valued state variable $X_i^m$, indicating whether remote node $i$ is misbehaving in mode $m$, and a vector $Z(n)$ of random variables to be observed across all connections up to and including time period $n$. Letting

$$p_i^m(n) = P\left[X_i^m = 0 \mid Z(n)\right]$$

denote the conditional probability that node $i$ is behaving in mode $m$ given the evidence $Z(n)$ realized through time period $n$, the corresponding reputation is defined by

$$R_i^m(n) = \log \frac{p_i^m(n)}{1 - p_i^m(n)}.$$

Observe that as the posterior behaving probability $p_i^m(n)$ approaches unity (zero), the reputation $R_i^m(n)$ approaches positive (negative) infinity. Also observe that the function mapping probability to reputation is bijective, having inverse

$$p_i^m(n) = \frac{\exp\left[R_i^m(n)\right]}{1 + \exp\left[R_i^m(n)\right]}.$$

We will suppress the subscript, superscript or parenthetical notation in Definition 1 when the remote node, misbehavior mode or time period in question is clear from context.

A. Summary of Continue vs. Close Decision Analysis [2]

Consider the protocol sequence of Fig. 1 between the time that a particular connection is first established to the time that it is eventually closed. On a per-transaction basis, each introducee receives new evidence and is then faced with the decision on whether to continue or (forcibly) close the connection. As fully developed in [2], this continue vs. close decision process can be cast as a variant of the sequential detection problem first studied by Wald [10]. The associated utility-maximizing policy is, in turn, a variant of Wald’s original solution (the so-called Sequential Probability Ratio Test) combined with Definition 1 to translate the policy’s probability parameters into units of reputation. The probabilistic model at each introducee involves a hidden variable $X^T$, indicating whether the remote introducee is a misbehaving transactor.
It is Bernoulli with parameter $p^T(0)$, which is assigned from a given initial reputation $R^T(0)$ via Definition 1. The evidence $Z(n)$ evolves in accordance with the detector’s alert stream $Y$ associated to the sequence of transactions, each successive alert or non-alert indicating (perhaps erroneously) that the corresponding transaction is harmful or benign, respectively. This alert stream $Y$ (conditioned on $X^T$) is a Bernoulli process with a per-transaction alert probability

$$q_A = \begin{cases} q_{FP}, & \text{if } X^T = 0 \\ (1 - q^T)\, q_{FP} + q^T (1 - q_{FN}), & \text{if } X^T = 1 \end{cases} \tag{1}$$

where probabilities $q_{FP}$ and $q_{FN}$ capture the false-positive/false-negative rates of the detector and probability $q^T$ captures the attack rate of the misbehaving transactor.

Under the assumptions discussed in [2], the utility-maximizing continue vs. close policy consists of just three parameters: a reputation increment $R_{INC}$ and decrement $R_{DEC}$ that are applied per non-alert and alert, respectively, as well as a reputation threshold $R_{THR}$ that renders the optimal continue vs. close decision. The increment and decrement are both determined in closed form from model parameters $q_{FP}$, $q_{FN}$ and $q^T$. The threshold, however, requires solution of a dynamic program [11] that also yields the optimal utility function $V^*(p^T(0))$, expressing the infinite-horizon expected total discounted reward given initial reputation $R^T(0) \Leftrightarrow p^T(0)$. This optimal threshold policy provably strikes the best balance between foregone reward if forcibly closing on a behaving transactor (type-I misclassification) and increased harm if naturally closing on a misbehaving transactor (type-II misclassification).
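The per-transaction update described above can be illustrated with a minimal Python sketch. The closed forms of the increment and decrement from [2] are not reproduced in this paper, so the sketch assumes they take the standard log-likelihood-ratio form that follows from applying Bayes' rule to Definition 1 and (1); that form and all function names are assumptions made for illustration.

```python
import math

def reputation_from_prob(p):
    """Definition 1: reputation is the log-odds of the behaving probability."""
    return math.log(p / (1.0 - p))

def prob_from_reputation(r):
    """Inverse map of Definition 1 (logistic function)."""
    return 1.0 / (1.0 + math.exp(-r))

def alert_probs(q_fp, q_fn, q_t):
    """Per-transaction alert probabilities from (1) for X^T = 0 and X^T = 1."""
    q_a0 = q_fp
    q_a1 = (1.0 - q_t) * q_fp + q_t * (1.0 - q_fn)
    return q_a0, q_a1

def llr_steps(q_fp, q_fn, q_t):
    """Assumed log-likelihood-ratio steps (not quoted from [2])."""
    q_a0, q_a1 = alert_probs(q_fp, q_fn, q_t)
    r_inc = math.log((1.0 - q_a0) / (1.0 - q_a1))  # added per non-alert
    r_dec = math.log(q_a0 / q_a1)                  # added per alert
    return r_inc, r_dec

def update_reputation(r, alerts, r_inc, r_dec):
    """Apply one step per transaction; alerts is an iterable of 0/1 values."""
    for y in alerts:
        r += r_dec if y else r_inc
    return r

if __name__ == "__main__":
    r_inc, r_dec = llr_steps(q_fp=0.1, q_fn=0.2, q_t=0.5)
    r = update_reputation(reputation_from_prob(0.5), [0, 0, 1, 0], r_inc, r_dec)
    print(r_inc, r_dec, prob_from_reputation(r))
```

Under this assumed form the update is additive in reputation units, which is why a single threshold test on the running sum suffices for the continue vs. close decision; the threshold itself still has to come from the dynamic program and is not computed here.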
The associated misclassification rates are similarly policy-dependent and a function of initial reputation $R^T(0) \Leftrightarrow p^T(0)$; specifically, letting binary random variable $U$ indicate whether the connection is closed forcibly, the optimal threshold policy achieves a type-I rate function

$$\alpha^*\left(p^T(0)\right) = P\left[U = 1 \mid X^T = 0;\ R_{INC}, R_{DEC}, R_{THR}\right] \tag{2}$$

and a type-II rate function

$$\beta^*\left(p^T(0)\right) = P\left[U = 0 \mid X^T = 1;\ R_{INC}, R_{DEC}, R_{THR}\right].$$

Exact computation of these rates is often intractable for general optimal stopping problems, but for binary state numerous approximations are known (e.g., [10]–[12]).

On any active connection, a natural closure occurs because both nodes have elected to continue until all transactions are exhausted, while a forced closure occurs if either node elects to discontinue before transactions are exhausted. In the sequence diagram of Fig. 1, there are actually three active connections: the two previously-established (and presumed continuing) ones between the introducer and each introducee as well as the newly-established (and eventually closed) one between the two introducees. It is worth noting that, during actual operation, any of these six continue vs. close decision processes can result in a forced closure. Moreover, a forced closure on a connection with the introducer will automatically trigger a forced closure on any connection brokered by that introducer. In any case, whether a connection is closed naturally or forcibly is assumed to be observable to both ends of the connection (i.e., analogously to a phone call ending with a two-way farewell or with an abrupt one-way hang up), raising the question of how each introducee combines evidence from a close event with the accrued transaction evidence. Closing a connection also triggers each introducee to generate feedback for the introducer, raising the similar question of how the introducer combines the feedback evidence with the evidence observed directly from its transactions with both introducees.
Indeed, these types of “combining evidence” questions are addressed in Section III.

B. Summary of Accept vs. Decline Decision Analysis [3]

As described in the preceding section, the utility-maximizing continue vs. close policy of any modeled connection is a reputation threshold rule defined by solving a dynamic program. We next consider the protocol sequence of Fig. 1 at the time each introducee is faced with the decision on whether to accept or decline an offered introduction. As fully developed in [3], this accept vs. decline decision is akin to deciding whether to continue or close the prospective connection, or the connection that would be established if the offered introduction is in fact accepted. Specifically, upon modeling the prospective connection and solving its associated dynamic program, the offered connection should be accepted only if the initial reputation is above the optimal threshold. It follows that the only additional degree-of-freedom in this accept vs. decline decision process is the reputation initialization rule. The analysis in [3] appeals again to Definition 1 and identifies the following initialization rule, expressed assuming that (an already-connected) node A is offering an introduction to remote node B:

$$p_B^T(0) = \tilde{p}_B^T\, p_A^I(0) + \left(1 - p_A^I(0)\right)\left(1 - q^I\right). \tag{3}$$

Here, $\tilde{p}_B^T$ denotes an initial reputation for node B as supplied by introducer A, $p_A^I$ denotes the current (locally-supplied) reputation of the introducer and probability $q^I$ denotes the attack rate of a misbehaving introducer. The assumptions are that a behaving introducer only offers introductions to presumed behaving transactors and always truthfully selects $\tilde{p}_B^T$ from its pool of reputations, while a misbehaving introducer deliberately offers a fraction $q^I$ of its introductions to presumed misbehaving transactors and can select $\tilde{p}_B^T$ arbitrarily.
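The accept vs. decline step can be sketched in a few lines of Python. The sketch assumes that (3) is read as a mixture over the introducer's two possible states (behaving with probability $p_A^I(0)$, in which case the supplied value is trusted, and misbehaving otherwise, in which case only the fraction $1 - q^I$ of introductions lead to behaving transactors); the function names and the example threshold are hypothetical, and the threshold would in practice come from the dynamic program of [2].

```python
import math

def initial_behaving_prob(p_tilde_b, p_a_intro, q_i):
    """Initialization rule (3), under the mixture reading stated in the lead-in.

    p_tilde_b : behaving probability for B as supplied by introducer A
    p_a_intro : local probability that A is a behaving introducer
    q_i       : attack rate of a misbehaving introducer
    """
    return p_tilde_b * p_a_intro + (1.0 - p_a_intro) * (1.0 - q_i)

def accept_introduction(p_tilde_b, p_a_intro, q_i, r_thr):
    """Accept only if the initialized reputation exceeds the threshold r_thr."""
    p0 = initial_behaving_prob(p_tilde_b, p_a_intro, q_i)
    r0 = math.log(p0 / (1.0 - p0))  # Definition 1
    return r0 > r_thr

if __name__ == "__main__":
    # Hypothetical numbers: a well-trusted introducer vouching strongly for B.
    print(accept_introduction(p_tilde_b=0.9, p_a_intro=0.8, q_i=0.5, r_thr=0.0))
```

The coupling noted in the text is visible directly: lowering `p_a_intro` pulls the initialized value away from the supplied `p_tilde_b` and toward `1 - q_i`.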
Note the underlying multi-mode misbehavior model here: node A’s reputation as a transactor is not reflected in (3) and there need not be any relationship between the attack rates $q^I$ and $q^T$ of the two misbehavior modes. Clearly this setup couples the reputation of remote node B to that of its introducer A, a matter to be addressed further in Section III.

III. FEEDBACK ANALYSIS

As summarized in the preceding section, the results in [2] and [3] characterize the decision-making of each introducee in the sequence diagram of Fig. 1 from the moment an introduction is offered up to the moment that the connection is closed. The analysis in this section leverages these results and focuses on the decisions that occur upon closing the connection, namely how each introducee generates feedback and how the introducer interprets that feedback. Fig. 2 illustrates the main questions underlying these decisions for the two¹ perspectives, assuming a (just-closed) connection between local node L and remote node B that was originally introduced by (still-connected) node A. The introducee’s perspective is presented in full in Subsection III-A, while the introducer’s perspective is more elaborate (and also still under analysis) and thus only its summary is presented in Subsection III-B.

¹ There are technically three perspectives, one per node, but in the scope of the feedback analysis that of introducee B parallels that of introducee L.

Fig. 2. A Closed Connection in an Introduction-Based Protocol

As with the protocol’s upstream decisions, these feedback decisions rest upon maintaining at each node a correspondence between optimal reputation management and probabilistic inference. However, an increased number of random variables are involved, so tools from dynamic Bayesian networks [8], [9] are used. The setup also requires generalization to a multi-mode misbehavior model and the associated notion of multi-mode reputations, which is formalized by the following definition.
Definition 2 (Multi-Mode Reputation): Consider a collection of $M$ binary state variables $X_i = (X_i^1, X_i^2, \ldots, X_i^M)$, together indicating whether remote node $i$ is misbehaving in any one of $M$ different modes (with each $X_i^m$ as described in Definition 1). Letting

$$p_i(n) = P\left[X_i^1 = X_i^2 = \cdots = X_i^M = 0 \mid Z(n)\right]$$

denote the conditional probability that node $i$ is behaving in every mode given the evidence $Z(n)$ realized through time period $n$, the corresponding multi-mode reputation is defined by

$$R_i(n) = \log \frac{p_i(n)}{1 - p_i(n)}.$$

A few remarks on Definition 1 and Definition 2:

1) Random variable $X_i$ takes its values in a finite set of cardinality $2^M$ and thus its probabilistic description consists of up to $2^M - 1$ independent parameters. This becomes unmanageable for even moderate values of $M$ unless there is a known special structure (e.g., sparsity in the full probability vector, conditional independencies among subsets of the $M$ per-mode variables) that admits a more compact representation.

2) Knowing all $M$ per-mode reputations $R_i^1, \ldots, R_i^M$ is, in general, not sufficient to deduce the multi-mode reputation $R_i$. The one exception is if the collection of per-mode random variables $X_i^1, \ldots, X_i^M$ are mutually independent, in which case $p_i = \prod_{m=1}^{M} p_i^m$. Similarly, knowing only the multi-mode reputation is not sufficient to deduce the $M$ per-mode reputations, but it is always true that $R_i^m \geq R_i$ for every $m$, or $R_i \leq \min_m R_i^m$.

Another generalization in this section will be to probabilistic models that consider multiple active connections. The main model in Section II was in the scope of only a single active connection, whether to continue or close it on a per-transaction basis or whether to accept or decline it on a per-introduction basis. However, as suggested by Fig. 2, the feedback decisions in each perspective involve evidence that accrues over two connections.
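The second remark above can be checked numerically with a minimal Python sketch. It covers only the special case of mutually independent per-mode variables, where the multi-mode behaving probability is the product of the per-mode probabilities; the function names are hypothetical, and the general (dependent) case would need the full joint table of up to $2^M - 1$ parameters.

```python
import math

def prob_from_reputation(r):
    """Inverse map of Definition 1."""
    return 1.0 / (1.0 + math.exp(-r))

def multi_mode_reputation(per_mode_reps):
    """Definition 2 under the independence assumption: p_i = prod_m p_i^m."""
    p = 1.0
    for r in per_mode_reps:
        p *= prob_from_reputation(r)
    # Very large per-mode reputations can round p to 1.0 and make this
    # log-odds undefined; the sketch does not guard against that case.
    return math.log(p / (1.0 - p))

def joint_parameter_count(num_modes):
    """Independent parameters in an unstructured joint over M binary modes."""
    return 2 ** num_modes - 1

if __name__ == "__main__":
    reps = [2.0, 1.0, 3.0]  # hypothetical per-mode reputations
    r_multi = multi_mode_reputation(reps)
    print(r_multi, min(reps), joint_parameter_count(len(reps)))
```

Because each per-mode probability is at most one, the product can only shrink, so the multi-mode reputation never exceeds the smallest per-mode reputation.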
The underlying probabilistic models will have to represent possible cross-connection dependencies in how the different remote nodes misbehave or in the evidence provided by the different misbehavior detectors. Throughout this section, the following simplifying assumptions will be made.

Assumption 1 (Simple Cross-Connection Dependencies): In any multi-node network with multiple active connections,
(a) if two or more nodes are misbehaving, they do so without collusion between them;
(b) if any one node is misbehaving in multiple modes, its per-mode attack sequences are independent processes;
(c) the detector on any one connection makes its errors independently of the error sequences on other connections.

Fig. 3. Bayesian Network Local to Introducee L’s Perspective

A. Decisions from Introducee’s Perspective

Fig. 3 shows the Bayesian network that captures the inference problem faced by an introducee, in this case node L, when a connection is closed. The total evidence accrued over the lifetime of the connection (indicated by the shaded nodes) comprises the alert streams $Y_A$ and $Y_B$ on the respective connections with A and B as well as the (natural or forced) close action $U_B$ taken by node B. The hidden variables $X_A$ and $X_B$ comprise the collection of misbehavior indicators that are influenced by this evidence. In the scope of the analysis of Section II, node A has interacted with node L as both a transactor and as an introducer, whereas node B has so far interacted with node L only as a transactor. Upon observing the close event, however, node L faces the questions depicted in Fig. 2 that begin with whether node B is misbehaving as a closer, deliberately aiming to inject confusion into the nominal workings of the protocol. That is, if a forced close occurs ($U_B = 1$) then either node B (whether behaving or not) has innocently made a type-I misclassification or node B has opted to attack as a misbehaving closer.
Alternatively, if a natural close occurs ($U_B = 0$), then node B has made a correct classification and, if a misbehaving closer, also declined to attack on this particular opportunity. Letting $X_B^C$ indicate whether node B is a misbehaving closer with attack rate $q^C$ and letting $\alpha_B$ denote B’s type-I misclassification rate as modeled by (2), the associated probabilistic setup is

$$P\left[U_B = 1 \mid X_B^C = x\right] = \begin{cases} \alpha_B, & x = 0 \\ \alpha_B + (1 - \alpha_B)\, q^C, & x = 1 \end{cases} \tag{4}$$

and $P\left[U_B = 0 \mid X_B^C = x\right] = 1 - P\left[U_B = 1 \mid X_B^C = x\right]$.

Altogether, node A can misbehave as a transactor or as an introducer and thus $X_A = (X_A^T, X_A^I)$, whereas node B can misbehave as a transactor or as a closer and thus $X_B = (X_B^T, X_B^C)$. To achieve optimal reputation management via probabilistic inference, it remains to specify the joint distribution

$$P\left[Y_A, Y_B, U_B \mid X_A, X_B\right] P\left[X_A, X_B\right]$$

between all of the evidence and state variables identified in Fig. 3. Assumption 1 refines the structure implied by the Bayesian network alone: in the priors we have

$$P\left[X_A, X_B\right] = P\left[X_A\right] P\left[X_B^T \mid X_A^I\right] P\left[X_B^C \mid X_A^I\right],$$

while in the likelihood we have

$$P\left[Y_A, Y_B, U_B \mid X_A, X_B\right] = P\left[Y_A \mid X_A^T\right] P\left[Y_B, U_B \mid X_B\right]$$

with

$$P\left[Y_B, U_B \mid X_B\right] = P\left[Y_B \mid X_B^T\right] P\left[U_B \mid X_B^C\right].$$

Here, the quantity $P[X_A]$ refers to the length-4 probability vector that node L held on the connection to node A at the time the introduction to node B was accepted. The quantities $P[X_B^m \mid X_A^I]$ derive from initialization rules of the type in (3), where now the introducer supplies $\tilde{p}_B^m$ for every misbehavior mode $m$ of node B. The quantities $P[Y_i \mid X_i^T]$ derive from the (conditional) Bernoulli process description in (1) and are thus (conditionally) binomial distributions, while the quantity $P[U_B \mid X_B^C]$ is given in (4).

Having represented the introducee’s Bayesian network, the associated reputation management decisions follow from solving the inference problem and appealing to Definition 1 and Definition 2. The impact of the close event is particularly straightforward: the closer-mode reputation $R_B^C$ is additively adjusted by

$$\log\left(\frac{P\left[U_B = u \mid X_B^C = 0\right]}{P\left[U_B = u \mid X_B^C = 1\right]}\right),$$

which for a natural close ($u = 0$) yields increment

$$R_{INC}^C = \log \frac{1}{1 - q^C}$$

and for a forced close ($u = 1$) yields decrement

$$R_{DEC}^C = \log \frac{\alpha_B}{\alpha_B + (1 - \alpha_B)\, q^C}.$$

That the increment does not depend on the type-I misclassification rate $\alpha_B$ stems from the fact that, in node L’s misbehavior model, node B can close naturally only if it has not misclassified node L. Fig. 4 plots these policy parameters versus the closer attack rate $q^C$, showing the decrement for different values of $\alpha_B$. Observe the softening impact of the close event against stealthier misbehavers (i.e., decreasing $q^C$) or more error-prone classifiers (i.e., increasing $\alpha_B$).

Fig. 4. Increment/Decrement Parameters for Closer-Mode Reputation. Left panel (Upon Natural Close): reputation increment $R_{INC}^C$ versus closer attack rate $q^C$. Right panel (Upon Forced Close): reputation decrement $R_{DEC}^C$ versus closer attack rate $q^C$, for $\alpha_B$ = 0.01, 0.10, 0.50 and 0.90.

Fig. 5. Bayesian Network Local to Introducer A’s Perspective

The alert stream $Y_i$ for each node $i$ is processed via the per-transaction increment/decrement parameters applied to the single-mode reputation $R_i^T$ as described in Section II. However, the threshold test on whether to continue or close a connection is now applied to the multi-mode reputation, reflecting a conservative posture that any form of misbehavior is equally costly. That is, compared to what was described in Section II, it is now the multi-mode reputation $R_i(n)$ that drives the accept vs. decline and continue vs. close decisions.

Having characterized the introducee’s reputation management decisions, generalized to the case of multi-mode misbehavior models, it remains to specify the rule by which feedback to the introducer is generated.
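Before turning to that rule, the closer-mode adjustment derived above from (4) can be evaluated with a short Python sketch. The formulas are those stated in the text; the function names and the sample values are illustrative assumptions, and the attack rate is assumed to be strictly less than one so that the increment is finite.

```python
import math

def forced_close_prob(alpha_b, q_c, x):
    """Equation (4): P[U_B = 1 | X_B^C = x] for x in {0, 1}."""
    return alpha_b if x == 0 else alpha_b + (1.0 - alpha_b) * q_c

def closer_adjustment(u, alpha_b, q_c):
    """Log-likelihood ratio added to the closer-mode reputation R_B^C."""
    p0 = forced_close_prob(alpha_b, q_c, 0)
    p1 = forced_close_prob(alpha_b, q_c, 1)
    if u == 1:  # forced close
        return math.log(p0 / p1)
    return math.log((1.0 - p0) / (1.0 - p1))  # natural close

def closer_increment(q_c):
    """Closed form for a natural close; independent of alpha_B."""
    return math.log(1.0 / (1.0 - q_c))

def closer_decrement(alpha_b, q_c):
    """Closed form for a forced close."""
    return math.log(alpha_b / (alpha_b + (1.0 - alpha_b) * q_c))

if __name__ == "__main__":
    for alpha_b in (0.01, 0.10, 0.50, 0.90):
        print(alpha_b, closer_increment(0.5), closer_decrement(alpha_b, 0.5))
```

Printing the decrement over the four values of $\alpha_B$ reproduces the qualitative trend described for Fig. 4: the more error-prone the classifier, the less a forced close counts against node B.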
We equate deciding to send positive or negative feedback to the introducer with deciding whether to accept or decline an immediate reintroduction. That is, given the connection was opened in time period 0 and closed in time period $n$, node L again applies (3) but using $p_A^I(n)$ in place of $p_A^I(0)$ and the locally-supplied $p_B^T(n)$ in place of $\tilde{p}_B^T$. If the resulting reputation is above threshold $R_{THR}$, then positive feedback is sent and otherwise negative feedback is sent.

B. Decisions from Introducer’s Perspective (Summary)

Fig. 5 shows the Bayesian network that captures the inference problem faced by an introducer, in this case node A, when a connection it brokered is closed. The total evidence accrued over the lifetime of the connection comprises the alert streams $Y_L$ and $Y_B$ on the respective connections with L and B, but the close actions are not observed; rather, the introducer observes only the (positive or negative) feedback $F_L$ and $F_B$ sent by L and B, respectively. As illustrated in Fig. 2, the primary question to the introducer is how to interpret this composite feedback in relation to the misbehavior modes indicated by hidden variables $X_L$ and $X_B$. Previous sections have already introduced three misbehavior modes (as transactor in the continue vs. close decision analysis, as introducer in the accept vs. decline decision analysis and as closer in the feedback analysis for the introducee perspective). This section introduces the misbehaving feedbacker, a fourth mode representing the possibility that either introducee is distorting its feedback to the introducer to deliberately confuse the nominal workings of the protocol. A misbehaving feedbacker’s attack sequence is modeled as a Bernoulli process with rate $q^F$, each attack generating the opposite of the feedback message that would be sent by a behaving node.
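The feedback path just described, from the introducee's threshold test to the message the introducer actually receives, can be sketched as follows in Python. The sketch takes the reputation resulting from the reintroduction test as an input rather than recomputing it, and models a single feedback opportunity; the function names, the use of a seeded generator and the example values are assumptions for illustration.

```python
import random

def honest_feedback(resulting_reputation, r_thr):
    """Behaving introducee: positive feedback iff reputation exceeds R_THR."""
    return resulting_reputation > r_thr  # True = positive, False = negative

def sent_feedback(resulting_reputation, r_thr, misbehaving_feedbacker, q_f, rng):
    """Feedback received by the introducer on one closed connection.

    A misbehaving feedbacker attacks with probability q_f (one Bernoulli
    trial per feedback opportunity) and then sends the opposite message.
    """
    message = honest_feedback(resulting_reputation, r_thr)
    attack = misbehaving_feedbacker and rng.random() < q_f
    return (not message) if attack else message

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demonstration repeatable
    flips = sum(
        sent_feedback(1.2, 0.0, True, 0.3, rng) is False for _ in range(10000)
    )
    print(flips / 10000.0)  # empirical fraction of inverted messages
```

Keeping the honest rule and the attack as separate steps mirrors the model in the text, where the introducer must reason about both the introducee's possible misclassification and its possible distortion of the message.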
Because behaving nodes can make erroneous classifications, an interesting property of this misbehavior model is that the introducer can experience “two wrongs making a right.” For example, consider a feedback attack after the occurrence of a type-I misclassification that would have triggered negative feedback from a behaving node.

Altogether, in the introducer’s Bayesian network of Fig. 5, each hidden variable $X_L = (X_L^T, X_L^C, X_L^F)$ and $X_B = (X_B^T, X_B^C, X_B^F)$ is its own collection of three binary variables. The structure of this Bayesian network, in combination with Assumption 1, implies that the priors simplify to

$$P\left[X_L, X_B\right] = P\left[X_L\right] P\left[X_B\right]$$

and the likelihood simplifies to

$$P\left[Y_L, Y_B, F_L, F_B \mid X_L, X_B\right] = P\left[Y_L \mid X_L^T\right] P\left[Y_B \mid X_B^T\right] P\left[F_L, F_B \mid X_L, X_B\right].$$

Here, the quantity $P[X_i]$ for $i \in \{L, B\}$ refers to the length-8 probability vector that node A held on the connection to node $i$ at the time the introduction was established. The quantities $P[Y_i \mid X_i^T]$ derive from the (conditional) Bernoulli process description in (1) and, as in the preceding sections, are thus (conditionally) binomial distributions. The only remaining quantity is $P[F_L, F_B \mid X_L, X_B]$, which is indeed the crux of the model. Numerous avenues for its definition are under exploration, all involving both introducees’ misclassification rates $\alpha_L$, $\beta_L$, $\alpha_B$ and $\beta_B$ as well as both introducees’ attack rates $q_L^C$, $q_B^C$, $q_L^F$ and $q_B^F$ in the closer and feedbacker modes. The differences rest mainly in the assumptions about how the introducer’s model compensates for not having access to information about the policies that each introducee employs, which in turn affects the introducer’s approximation of the true probabilistic processes generating the feedback messages and, ultimately, the achievable network-wide utility.

IV. SUMMARY AND FUTURE WORK

Four core decision processes of a recently proposed introduction-based reputation protocol [1], aiming to retain the attractive properties of trust systems but without the assumption of a centralized reputation server, have been modeled within a utility-maximizing probabilistic framework. Previous work [2], [3] showed that the decision to accept introductions to new connections is entwined with the decision process by which established connections are managed. This work addressed the decision processes that occur upon closing a connection, concerning the protocol’s use of feedback to signal whether misbehavior is present. For each introducee’s perspective we detailed how its experience with the other introducee is rated, whereas for the introducer’s perspective we summarized how that pair of ratings is subsequently interpreted. These analyses rest upon maintaining an equivalence between evolving reputations of all interacting nodes across time/connections and solving standard inference problems on Bayesian network models. The details of each such model were seen to depend upon the perspective under analysis, reflecting the information decentralization inherent to the protocol.

Ongoing work includes (i) the impact of different assumptions in the way an introducer interprets feedback, (ii) extensions that apply to multiple levels of introductions or multiple introductions by one introducer and (iii) analysis of decisions by introducers on whether to offer a requested introduction in the first place.

The application of probabilistic graphical models to reputation-driven trust networks appears in multiple fields (e.g., computing, communications, control), but relatively few approaches assume no central authority or allow a multistage analysis like the introduction-based scheme considered here.
References [13] and [14], for example, each employ graph-based inference techniques to map a collection of “opinions” into binary-valued trust relationships with minimum error, but sequential decision-making is not represented. Even so, our work has yet to consider the impact of richer adversary models than just the per-mode Bernoulli attack sequences here, such as strategic on/off misbehaviors or the possibility of collusion among multiple nodes. Inquiries along these lines are likely related to the growing body of work on network security games (e.g., [15]–[17]).

ACKNOWLEDGMENT

The authors are grateful to Dr. Gregory L. Frazier and Dr. Brian DeCleene for numerous helpful discussions.

REFERENCES

[1] G. Frazier, et al., “Incentivising responsible networking via introduction-based routing,” in Proc. 4th Int. Conf. on Trust and Trustworthy Computing, Springer-Verlag, 2011.
[2] R. Al-Bayaty and O. P. Kreidl, “On optimal decisions in an introduction-based reputation protocol,” in Proc. 38th IEEE Int. Conf. on Acoustics, Speech and Signal Processing, May 2013.
[3] R. Al-Bayaty, P. Caldwell and O. P. Kreidl, “On trusting introductions from a reputable source: a utility-maximizing probabilistic approach,” in Proc. 1st IEEE Global Conf. on Signal and Info. Proc., Dec. 2013.
[4] A. Josang, R. Ismail, and C. Boyd, “A survey of trust and reputation systems for online service provision,” Decision Support Systems, vol. 43, no. 2, pp. 618–644, 2007.
[5] Y. Yang, et al., “Defending online reputation systems against collaborative unfair raters through signal modeling and trust,” in Proc. 2009 ACM Symposium on Applied Computing, 2009.
[6] K. Hoffman, D. Zage, and C. Nita-Rotaru, “A survey of attack and defense techniques for reputation systems,” ACM Comput. Surv., vol. 42, no. 1, pp. 1–31, Dec. 2009.
[7] Y. Sun and Y. Liu, “Security of online reputation systems: The evolution of attacks and defenses,” IEEE Signal Processing Magazine, vol. 29, no. 2, pp. 87–97, 2012.
[8] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann, 1988.
[9] K. P. Murphy, Dynamic Bayesian Networks: Representation, Inference and Learning. Ph.D. Thesis, Computer Science, UC Berkeley, 2002.
[10] A. Wald, Sequential Analysis. New York, NY: Wiley and Sons, 1947.
[11] D. P. Bertsekas, Dynamic Programming and Optimal Control (Vols. 1 and 2). Belmont, MA: Athena Scientific, 1995.
[12] B. K. Ghosh and P. K. Sen, Handbook of Sequential Analysis. New York, NY: Marcel Dekker, 1991.
[13] G. Theodorakopoulos and J. S. Baras, “On trust models and trust evaluation metrics for ad hoc networks,” IEEE J. on Selected Areas in Communications, vol. 24, no. 2, pp. 318–328, 2006.
[14] S. Ermon, L. Schenato and S. Zampieri, “Trust estimation in autonomic networks: a statistical mechanics approach,” in Proc. 48th IEEE Conf. on Decision and Control, 2009.
[15] A. Gueye and J. C. Walrand, “Security in networks: a game-theoretic approach,” in Proc. 47th IEEE Conf. on Decision and Control, 2008.
[16] T. Alpcan and T. Basar, Network Security: A Decision and Game-Theoretic Approach. Cambridge University Press, UK, 2010.
[17] N. Bao, O. P. Kreidl and J. Musacchio, “A network security classification game,” GAMENETS, pp. 265–280, 2011.