Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Social Network Data Outline 1. Foundations: Data basics & Software (review/catch up) 2. Collecting Data 1. Relations 2. Level of analysis 3. Sources 3. Data Accuracy 1. How accurate is it? 2. What can we do about it 3. Effect on measurement 4. Network Ethics 1. Data collection 1. Informed Consent 2. Deductive Disclosure 3. Building a frame (closed-answer sets) 4. Illicit Relations 5. Novel Compilations of “extant” data 2. Data Use 1. Risks of identifying R’s (and named non-R’s) positions 2. Action in response to nets (military, police, firm) Foundations Methods From pictures to matrices b b d a c e Undirected, binary a b 1 a b 1 c 1 d e c d 1 1 c e a 1 a b 1 c 1 d e 1 1 a e Directed, binary 1 1 d b 1 c 1 d e 1 1 1 Foundations Methods From matrices to lists a a b 1 c d e b 1 c d e 1 1 1 1 1 1 1 1 Adjacency List ab bac cbde dce ecd Arc List ab ba bc cb cd ce dc de ec ed Foundations Basic Measures Basic Measures & A little graph theory For greater detail, see: http://www.analytictech.com/networks/graphtheory.htm Volume The first measure of interest is the simple volume of relations in the system, known as density, which is the average relational value over all dyads. Under most circumstances, it is calculated as: D= X N ( N - 1) Foundations Basic Measures Basic Measures & A little graph theory Volume At the individual level, volume is the number of relations, sent or received, equal to the row and column sums of the adjacency matrix. a a b 1 c d e b 1 c 1 1 d e 1 1 1 Node In-Degree Out-Degree a 1 1 b 2 1 c 1 3 d 2 0 e 1 2 Mean: 7/5 7/5 Foundations Data Basic Measures & A little graph theory Reachability Indirect connections are what make networks systems. One actor can reach another if there is a path in the graph connecting them. b a a d c b e f c f d e Foundations Basic Matrix Operations One of the key advantages to storing networks as matrices is that we can use all of the tools from linear algebra on the socio-matrix. Some of the basics matrix manipulations that we use are as follows: 1) Definition A matrix is any rectangular array of numbers. We refer to the matrix dimension as the number of rows and columns a b c d e a 0 1 0 0 0 b 1 0 0 0 0 c 0 1 0 1 1 d 0 e 0 0 0 0 0 0 1 1 0 (5 x 5) W B 1 0 1 0 0 1 0 1 0 1 1 0 (5x2) Age 13 10 7 8 16 11 (5x1) Foundations Basic Matrix Operations Matrix operations work on the elements of the matrix in particular ways. To do so, the matrices must be conformable. That means the sizes allow the operation. For addition (+), subtraction (-), or elementwise multiplication (#), both matrices must have the same number of rows and columns. For these operations, the matrix value is the operation applied to the corresponding cell values. 1 3 A= 4 7 2 5 2 3 B= 7 1 0 4 3 6 A+B = 11 8 2 9 3 9 Multiplication by a scalar: 3A = 12 21 6 15 A-B = -1 0 -3 6 2 1 2 9 A#B = 28 7 0 20 Foundations Basic Matrix Operations The transpose (` or T) of a matrix reverses the row and column dimensions. Atij=Aji So a M x N matrix becomes an N x M matrix. a b c d e f T = a c e b d f Foundations Basic Matrix Operations The matrix multiplication (x) of two matrices involves all elements of the matrix, and will often result in a matrix of new dimensions. In general, to be conformable, the inner dimension of both matrices must match. So: A3x2 x B2x3 = C3 x 3 But A3x3 x B2x3 is not defined Substantively, adding ‘names’ to the dimensions will help us keep track of what the resulting multiplications mean: So multiplying (send x receive)x (send x receive) = (send x receive), giving us the two-step distances (the sender’s recipient's receivers). Foundations Basic Matrix Operations The multiplication of two matrices Amxn and Bnxq results in Cmxq n C mq = amk bkq k =1 a b c d e f g h a b c d e f g h i j k l (3x2) (2x3) = = ae+bg ce+dg ag+bj cg+dj eg+fg af+bh cf+dh ah+bk ch+dk eh+fk (3x3) ai+bl ci+dl ei+fl Foundations Basic Matrix Operations The powers (square, cube, etc) of a matrix are just the matrix times itself that many times. A2 = AA or A3 = AAA We often use matrix multiplication to find types of people one is tied to, since the ‘1’ in the adjacency matrix effectively captures just the people each row is connected to. (Preview: This is also how we do compound relations: Mother x Brother “Uncle”) Foundations Data Basic Measures & A little graph theory Reachability The distance from one actor to another is the shortest path between them, known as the geodesic distance. If there is at least one path connecting every pair of actors in the graph, the graph is connected and is called a component. Two paths are independent if they only have the two endnodes in common. If a graph has two independent paths between every pair, it is biconnected, and called a bicomponent. Similarly for three paths, four, etc. Foundations Data Basic Measures & A little graph theory Calculate reachability through matrix multiplication. (see p.162 of W&F) 0 1 0 0 0 1 e d c b a f 1 0 1 0 0 0 X 0 0 1 0 0 1 1 0 1 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 2 0 2 0 0 0 0 2 0 1 1 2 X2 2 0 0 1 4 1 1 2 1 1 0 1 Distance . 1 2 0 1 . 1 2 2 1 . 1 0 2 1 . 0 2 1 1 1 2 1 2 0 1 1 1 2 1 0 2 1 1 . 2 4 0 6 1 1 0 X3 0 2 6 1 2 5 5 2 5 3 6 1 0 2 0 1 1 2 0 4 0 2 2 4 2 1 5 3 2 1 4 0 6 1 1 0 1 2 1 2 2 . Distance . 1 2 3 3 1 . 1 2 2 2 1 . 1 1 3 2 1 . 1 3 2 1 1 . 1 2 1 2 2 1 2 1 2 2 . Foundations Data Basic Measures & A little graph theory Mixing patterns Matrices make it easy to look at mixing patterns: connections among types of nodes. Simply multiply an indicator of category by the adjacency matrix. e d c b a f 0 1 0 0 0 1 1 0 1 0 0 0 X 0 0 1 0 0 1 1 0 1 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 Race 1 0 1 0 0 1 0 1 0 1 1 0 R G R 4 2 Race`(X)Race= G 2 6 X(Race) 2 0 1 1 2 2 0 2 0 2 1 1 Foundations Data Basic Measures & A little graph theory Matrix manipulations allow you to look at direction of ties, and distinguish symmetric from asymmetric ties. To transform an asymmetric graph to a symmetric graph, add it to its transpose. 0 1 0 0 0 1 0 1 0 0 X 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 XT 0 0 1 0 0 0 1 0 1 0 0 0 1 1 0 0 2 0 0 0 2 0 1 0 0 0 1 0 1 2 0 0 1 0 1 0 0 2 1 0 Max Sym 0 1 0 0 0 1 0 1 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 1 0 MIN Sym 0 1 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 Social Network Software UCINET •The Standard network analysis program, runs in Windows •Good for computing measures of network topography for single nets •Input-Output of data is a special 2-file format, but is now able to read PAJEK files directly. •Not optimal for large networks, but much better than it used to be! •Available from: Analytic Technologies Social Network Software PAJEK •Program for analyzing and plotting very large networks •Intuitive windows interface •Used for most of the real data plots in this presentation •Started mainly a graphics program, but has expanded to a wide range of analytic capabilities •Can link to the R & SPSS statistical package •Free •Available from: Social Network Software SPAN - Sas Programs for Analyzing Networks (Moody, ongoing) •is a collection of IML and Macro programs that allow one to: a) create network data structures from nomination data b) import/export data to/from the other network programs c) calculate measures of network pattern and composition d) analyze network models •Allows one to work with multiple, large networks •Easy to move from creating measures to analyzing data http://www.soc.duke.edu/~jmoody77/span/span.zip Social Network Software STATNET •Program designed to estimate statistical models on networks in R. Statnet Team http://csde.washington.edu/statnet/ Other R Resources: Carter Butts (UC-Irvine, Sociology) – SNA & PermNet •Program for general network analysis in R •Does most of what we’ve discussed today… Social Network Software STATNET •Program designed to estimate statistical models on networks in R. Statnet Team http://csde.washington.edu/statnet/ Other R Resources: iGraph Social Network Software Other R Resources: McFarland’s R Labs: Social Network Data Collecting: theory concepts What information do you want to collect? This is ultimately a theory question – about how you think the social network setting matters. Some dimensions to this question: •“actually existing social relations” or “perceived relations” •“Who do you eat lunch with?” vs. “Who is your friend” •“Who do you talk to” vs. “who is important in your life” •Are you more interested in getting the right contacts or the right type of contacts? •Dynamism: “Episodic” relations or “typical”/ “long-term” ties? •Research shows that people have a bias toward naming the normal – so we include people who are “usually” there – is that what you want? •Do you need to be able to distinguish naming flux from structural dynamics? Social Network Data Level of Analysis What scope of information do you want? •Boundary Specification: key is what constitutes the “edge” of the network Local “Realist” (Boundary from actors’ Point of view) Nominalist (Boundary from researchers’ point of view) Global Everyone connected to ego in the relevant manner (all friends, all (past?) sex partners) All relations relevant to social action (“adolescent peers network” or “Ruling Elite” ) Relations defined by a name-generator, typically limited in number (“5 closest friends”) Relations within a particular setting (“friends in school” or “votes on the supreme court”) Social Network Data Level of Analysis Boundary Specification Problem While students were given the option to name friends in the other school, they rarely do. As such, the school likely serves as a strong substantive boundary Social Network Data Level of Analysis Boundary Specification Problem Granis: Boundary can be temporal: effects on characteristics of the PhD exchange graph. Social Network Data Level of Analysis Network Sampling 1. The level of analysis implies a perspective on sampling: 1. Local random probability sampling 2. Complete Census These two are not as dissimilar as they may appear: a) Local nets imply global connectivity: a) Every ego-network is a sample from the population-level global network, and thus should be consistent with a constrained range of global networks. See Jeff Smith’s work on this. b) If you have a clustered setting, many alters in a local network may overlap, making partial connectivity information possible. c) For attribute mixing (proportion of whites with black friends, etc.), egonetwork data is sufficient to draw population inference Social Network Data Level of Analysis Network Sampling 1. The level of analysis implies a perspective on sampling: 1. Local random probability sampling 2. Complete Census These two are not as dissimilar as they may appear: a) Complete networks never are: a) Lots of efficiency in data collection is had because you don’t have to ask ego about alter’s characteristics if you are going to interview alter anyway. But, taking advantage of this efficiency assumes very high coverage rates. Social Network Data Level of Analysis Network Sampling 2. Snowball and “link trace” designs Link-Tracing Designs Ego-networks Complete Census Basic idea is to use “adaptive sampling” – start with (a) seed node(s), identify the network partners, and then interview them. Earliest “snowball” samples are of this type. Most recent work is “respondent driven sampling. (RDS)” -- If done systematically, some inference elements are knowable. Else, you have to try and disentangle the sampling process from the real structure Social Network Data Level of Analysis Snowball Samples: • Start with a name generator, then any demographic or relational questions. • Have a sample strategy, examples include: • Random Walk designs (Klovdahl) • Strong tie designs • All names designs • Cross Links (Project 90) • Get contact information from the people named • If time, as at least some of the people a “network density” module as you would for any ego-network design Snowball samples are very effective at providing network context around focal nodes, RDS tools allow generalizations from these data. Social Network Data Sources Existing Sources of Social Network Data: There are lots of network data archived. Check INSNA for a listing. The PAJEK data page includes a number of exemplars for large-scale networks. Local Network data: • Fairly common, because it is easy to collect from sample surveys. • GSS, NHSL, Urban Inequality Surveys, etc. • Pay attention to the question asked • Key features are (a) number of people named and (b) whether alters are able to nominate each other or not. Social Network Data Sources Existing Sources of Social Network Data: Partial network data: • Much less common, because cost goes up significantly once you start tracing to contacts. • Snowball data: start with focal nodes and trace to contacts • CDC style data on sexual contact tracing • Limited snowball samples: • Colorado Springs drug users data • Geneology data • Small-world network samples • Limited Boundary data: select data within a limited bound • Cross-national trade data • Friendships within a classroom • Family support ties Social Network Data Sources Existing Sources of Social Network Data: Complete network data: • Significantly less common and never perfect. • Start by defining a theoretically relevant boundary • Then identify all relations among nodes within that boundary • Co-sponsorship patterns among legislators • Friendships within strongly bounded settings (sororities, schools) • Examples: • Add Health on adolescent friendships • Hallinan data on within-school friendships • McFarland’s data on verbal interaction • Electronic data on citations or coauthorship (see Pajek data page) • See INSNA home page for many small-scale networks Social Network Data Sources Existing Sources of Social Network Data: Complete network data: • Electronic Trace Data • Examples: • Sensor data (PNAS on high-school) • Cell Phone logs • Email logs • Bluetooth devices • Web traffic cookie data (which sites do you visit) • Often complete and non-intrusive; but meaning is still ambiguous and there are potential ethical issues that run deep. Social Network Data Sources - Survey a) Network data collection can be time consuming. It is better (I think) to have breadth over depth. Having detailed information on <50% of the sample will make it very difficult to draw conclusions about the general network structure. b) Question format: • If you ask people to recall names (an open list format), fatigue will result in under-reporting • If you ask people to check off names from a full list, you can often get over-reporting c) It is common to limit people to a small number if nominations (~5). This will bias network measures, but is sometimes the best choice to avoid fatigue. d) People answer the question you ask, so be clear in what you ask. Social Network Data Sources - Survey Local Network data: • When using a survey, common to use an “ego-network module.” • First part: “Name Generator” question to elicit a list of names • Second part: Working through the list of names to get information about each person named • Third part: asking about relations among each person named. GSS Name Generator: “From time to time, most people discuss important matters with other people. Looking back over the last six months -- who are the people with whom you discussed matters important to you? Just tell me their first names or initials.” Why this question? •Only time for one question •Normative pressure and influence likely travels through strong ties •Similar to ‘best friend’ or other strong tie generators •Note there are significant ambiguities with this name generator Social Network Data Sources - Survey Electronic Small World name generator: Social Network Data Sources - Survey Local Network data: The second part usually asks a series of questions about each person GSS Example: “Is (NAME) Asian, Black, Hispanic, White or something else?” ESWP example: Will generate N x (number of attributes) questions to the survey Social Network Data Sources - Survey Local Network data: The third part usually asks about relations among the alters. Do this by looping over all possible combinations. If you are asking about a symmetric relation, then you can limit your questions to the n(n-1)/2 cells of one triangle of the adjacency matrix: 1 2 3 4 5 1 2 3 4 5 GSS: Please think about the relations between the people you just mentioned. Some of them may be total strangers in the sense that they wouldn't recognize each other if they bumped into each other on the street. Others may be especially close, as close or closer to each other as they are to you. First, think about NAME 1 and NAME 2. A. Are NAME 1 and NAME 2 total strangers? B. ARe they especially close? PROBE: As close or closer to eahc other as they are to you? Social Network Data Sources - Survey Local Network data: The third part usually asks about relations among the alters. Do this by looping over all possible combinations. If you are asking about a symmetric relation, then you can limit your questions to the n(n-1)/2 cells of one triangle of the adjacency matrix: Social Network Data Sources - Survey Snowball Samples: Social Network Data Sources - Survey Complete Network data • Data collection is concerned with all relations within a specified boundary. • Requires sampling every actor in the population of interest (all kids in the class, all nations in the alliance system, etc.) • The network survey itself can be much shorter, because you are getting information from each person (so ego does not report on alters). • Two general formats: • Recall surveys (“Name all of your best friends”) • Check-list formats: Give people a list of names, have them check off those with whom they have relations. Social Network Data Sources - Survey Complete network surveys require a process that lets you link answers to respondents. •You cannot have anonymous surveys. •Recall: •Need Id numbers & a roster to link, or handcode names to find matches •Checklists •Need a roster for people to check through Social Network Data Sources - Survey Some important points that should (?) be obvious: •You cannot have anonymous complete network surveys (you have to match names). •With a recall format: •Get some way to uniquely identify nodes • ID Numbers from a roster link, phone numbers, etc. • If you have to hand-code, compare to a roster or w. an expert informant •Need to have uniform nomination opportunity: •If A is asked to nominate B, but B is never given the opportunity…standard measures don’t work. •Checklists •Need a roster for people to check through Social Network Data Sources - Archive We often have information on links among people or organizations from archival records. Examples: •Citation or Acknowledgements in Science Networks •Co-membership in boards of directors •See as examples: Olimoney.net or theyrule.org both projects that use electronic tools to “scrape” the web for data on companies or campaign contributions. •http://dirtyenergymoney.com/view.php •http://www.theyrule.net/ Social Network Data Name Generator Effects Key Questions – How does the question affect responses? “Cloning Headless Frogs” & “Trivial Topics & Rich Ties” 1) What do people talk about? 2) Given the heterogeneity of the topics discussed, is there a foundation from which one could use the GSS data to describe anything meaningful about core discussion networks? Social Network Data Accuracy & Missing Data – Cloning Headless Frogs Key Questions: 1) What do people talk about? & Who did they talk to? Note that the topic was heavily dependent on the questionnaire order. In this survey, it was the first question. Social Network Data Accuracy & Missing Data – Cloning Headless Frogs Social Network Data Accuracy & Missing Data – Cloning Headless Frogs talks about what with who? Connections are significant cells from table 5. Social Network Data Accuracy & Missing Data – Rich Ties talks about what with who? Brashears Social Network Data Accuracy & Missing Data – Rich Ties talks about what with who? Brashears Social Network Data Accuracy & Missing Data 1) Why do so many people not report talking about anything with anybody? • 44% report nobody to talk to • More likely to be without spouses, unemployed and non-white • 56% report nothing important to talk about. Other key bits on data collection: 2) Structure of the survey matters: lines given one of the best predictors of number of relations observed 3) Reciprocity is not expected, particularly w. limited response formats. - Better to symmetrize based on “or” Social Network Data Effects of missing data Whatever method is used, data will always be incomplete. What are the implications for analysis? Example 1. Ego is a matchable person in the School Out Un Ego M M True Network Out Un Ego M M Out Un M M M M Observed Network Social Network Data Effects of missing data Example 2. Ego is not on the school roster M M Un Un M M M M M M M Un Un Un True Network M Observed Network Social Network Data Effects of missing data Example 3: Node population: 2-step neighborhood of Actor X Relational population: Any connection among all nodes 1-step 2-step 3-step F 1.1 1.2 1.3 1.4 1.5 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 3.1 3.2 3.3 F 1 2 3 4 5 1 2 3 4 5 6 7 8 1 2 3 Full (0) Full Full (0) F Full Full Full (0) F (0) Full Full UK F (0) Full (0) Unknown UK Social Network Data Effects of missing data Example 4 Node population: 2-step neighborhood of Actor X Relational population: Trace, plus All connections among 1-step contacts 1-step 2-step 3-step F 1.1 1.2 1.3 1.4 1.5 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 3.1 3.2 3.3 F 1 2 3 4 5 1 2 3 4 5 6 7 8 1 2 3 Full (0) Full Full (0) Full Full (0) F Full F (0) Full Unknown UK F (0) Full (0) Unknown UK Social Network Data Effects of missing data Example 5. Node population: 2-step neighborhood of Actor X Relational population: Only tracing contacts 1-step 2-step 3-step F 1.1 1.2 1.3 1.4 1.5 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 3.1 3.2 3.3 F 1 2 3 4 5 1 2 3 4 5 6 7 8 1 2 3 Full (0) Full Full (0) Full Full (0) F Unknown F (0) Full Unknown UK F (0) Full (0) Unknown UK Social Network Data Effects of missing data Example 6 Node population: 2-step neighborhood from 3 focal actors Relational population: All relations among actors Focal 1-Step 2-Step 3-Step Focal Full Full Full (0) Full (0) 1-Step Full Full Full Full (0) 2-Step Full (0) Full Full UK 3-Step Full (0) Full (0) Unknown UK Social Network Data Effects of missing data Example 7. Node population: 1-step neighborhood from 3 focal actors Relational population: Only relations from focal nodes Focal 1-Step 2-Step 3-Step Focal Full Full Full (0) Full (0) 1-Step Full Unknown Unknown Full (0) 2-Step Full (0) Unknown Unknown UK 3-Step Full (0) Full (0) Unknown UK Social Network Data Effects of missing data on measures Smith & Moody, 2014 Identify the practical effect of missing data as a measurement error problem: induce error and evaluate effect. Randomly select nodes to delete, remove their edges & recalculate statistics of interest. Social Network Data Effects of missing data on measures Smith & Moody, 2014 Social Network Data Smith & Moody, 2014 Effects of missing data on measures Centrality Social Network Data Smith & Moody, 2014 Effects of missing data on measures Homophily Social Network Data Effects of missing data on measures What to do about missing data? Easy: • Do nothing. If associated error is small ignore it. This is the default, not particularly satisfying. Harder: Impute ties • If the relation has known constraints, use those (symmetry, for example) • If there is a clear association, you can use those to impute values. • If imputing and can use a randomization routine, do so (akin to multiple imputation routines) • All ad hoc. Hardest: • Model missingness with ERGM/Latent-network models. • Build a model for tie formation on observed, include structural missing & impute. Handcock & Gile have new routines for this. • Computationally intensive…but analytically not difficult. Social Network Data Ethics – Data Collection Goal: To gain key social insights in a manner than helps without hurting. - Responsibility to respondents - Fair, honest, safe treatment in return for participation. - Safety has typically relied on some combination of: a) Informed consent (they know what they are in for) b) Anonymity / Confidentiality - Responsibility to other network scientists - Our works should not jeopardize other’s ability to work Key problems: - Need to link respondents means anonymity cannot be complete - Some people named may not be respondents – thus have not given consent - Position within the network may create social hardship Social Network Data Ethics – Data Collection Informed Consent - Respondents have a right to refuse to participate if they feel the work is unethical, burdensome, dangerous, or just plain don’t want to play. What standing do “secondary” respondents have? What dangers? - Imagine link-tracing from a mistress to a spouse…. Social Network Data Ethics – Data Collection Anonymity & Confidentiality Can work without names (in Ego-network) survey. But if the setting is highly clustered, you may still be able to identify through “deductive disclosure” - Deductive Disclosure Risks: Social Network Data Ethics – Data Collection Start with: 536 White, Male, 10th Graders in Two parent Households: Who are Jewish: 10 And Have No Siblings: 1 Start with: 484 White, Male, 7th Graders in Two parent Households: Who Have Ever Been Held Back A Grade in School: 87 And Play Basketball: 5 And Smoke: 1 Deductive Disclosure Risks: Social Network Data Ethics – Data Collection Start with: 87 Black, Female, 12th Graders in Two parent Households: Who have Never been Held Back: 77 And Smoke Regularly: 5 And Have 2 siblings 1 And are Catholic 1 Deductive Disclosure Risks: Social Network Data Ethics – Data Collection Start with: 98 Black, Female, 7th Graders in One parent Households: Who Are Baptist: 41 And have no Siblings: 9 And Play Baskettball: 1 And have one Sibling: 13 And Smoke: 1 And have > one Sibling: 19 And are Born in April: 1 Deductive Disclosure Risks: Social Network Data Ethics – Data Collection This same feature can allow for anonymous interviewing of sensitive partners: use simple information to find implicit matches 0.5 0.5 Long Duration Proportion of Group Proportion of Group Married 0 0 0.4 0.48 0.56 0.5 0.64 0.72 0.8 0.88 0.96 0.4 0.48 0.5 0.64 0.72 0.8 0.88 0.96 Same Sex (known to be false) Proportion of Group Proportion of Group Short Duration 0.56 0 0 0.4 0.48 0.56 0.64 0.72 0.8 0.88 Pair Mean Match Score 0.96 Observed Distribution 0.4 0.48 0.56 0.64 0.72 0.8 0.88 Pair Mean Match Score Random Distribution 0.96 Social Network Data Ethics – Data Collection Other Data Collection issues: -- Building a roster? You may need people’s permission to be on the roster. -- Binding limits of past research promises (if the interviewer knows, does that violate a “we will not tell anyone your answer” clause? -- Illicit or Illegal relations. If you find evidence of a crime in your snowball, do you have to report? (think age of sex partners vs selling/buying drugs) -- “non-invasive” data collection – bluetooth device readers, email logs, web-click or purchase / marketing data? Social Network Data Ethics – Data use How the data are ultimately used is a key issue for the analyst. Consider this diagram. What role does the social network analyst have in deconstructing terrorist networks? Criminal Cartels? Gangs? Covert connectivity is a hallmark of many illicit relations, how do we fit into that work? Social Network Data Ethics – Data use What about performace in a firm? Burt shows that one’s position in the network is key to having “good ideas” – if true, you could imagine linking position in the network to evaluations and promotions…