Carnegie Mellon University
Research Showcase @ CMU
Tepper School of Business
3-2011
A Structural Model of Employee Behavioral
Dynamics in Enterprise Social Media
Yan Huang
Carnegie Mellon University, [email protected]
Param Vir Singh
Carnegie Mellon University, [email protected]
Anindya Ghose
New York University
Follow this and additional works at: http://repository.cmu.edu/tepper
Part of the Management Information Systems Commons
This Working Paper is brought to you for free and open access by Research Showcase @ CMU. It has been accepted for inclusion in Tepper School of
Business by an authorized administrator of Research Showcase @ CMU. For more information, please contact [email protected].
A Structural Model of Employee Behavioral Dynamics in
Enterprise Social Media
Yan Huang
School of Information Systems and Management & i-lab,
Heinz College,
Carnegie Mellon University
[email protected]
Param Vir Singh
Tepper School of Business & i-lab, Heinz College,
Carnegie Mellon University
[email protected]
Anindya Ghose
Stern School of Business, New York University
& i-lab, Heinz College, Carnegie Mellon University
[email protected]
Electronic copy available at: http://ssrn.com/abstract=1785724
Abstract
We develop and estimate a dynamic structural framework to analyze social media content creation and
consumption behavior by employees within an enterprise. We focus in particular on blogging behavior by
employees. The model incorporates two key features, which are ubiquitous in blogging forums: users face
a trade-off between blog posting and blog reading as well as a trade-off between work-related content and
leisure-related content. We apply the model to a unique dataset that comprises the complete details of
blog posting and reading behavior of 2396 employees over a 15-month period at a Fortune 1000 IT
services and consulting firm. We find that blogging has a significant long-term effect in that it is only in
the long term that the benefits from blogging outweigh the costs of content creation. There is also
evidence of strong competition among employees with regard to attracting readership for their posts.
While readership of leisure posts provides little direct utility, employees still post a significant amount of
these posts because there is a significant spillover effect on the readership of work posts from the creation
of leisure posts. In addition, we find that the utility employees derive from posting work-related
blogs is about 4 times what they derive from leisure blogging. Reading and writing work-related posts
is, on average, more costly than reading and writing leisure-related posts. We conduct counterfactual experiments that
provide insights into how different policies may affect employee behavior. We find that a policy of
prohibiting leisure-related activities can hurt knowledge sharing in an enterprise setting. By
demonstrating that there are positive spillovers from leisure-related blogging to work-related blogging,
our results suggest that a policy of abolishing leisure-related content creation can inadvertently have
adverse consequences on work-related content creation in an enterprise setting.
Keywords: Structural modeling, Dynamics, Enterprise social media, Blog posting, Blog reading, Work-related content, Leisure-related content.
1. Introduction
Over the last couple of years, several firms have adopted social media technologies such as blogs
and wikis for internal use. These technologies, often termed collectively Enterprise 2.0, enable firms
to accelerate organizational performance by supporting not just inward-facing collaboration but also
coordinated responses to customer support, innovation, and sales and marketing opportunities. The
presence of Enterprise 2.0 tools paves the way for a future where employee-to-employee collaboration
and interaction become extensions of each other and thus a useful way of providing mutual value for
the company and employees alike. Such social media tools share three fundamental properties
(McAfee 2009). First, they are typically easy to learn and use. Second, on a social media platform,
everyone typically starts on an equal footing, contributing to a blank slate. External reputations
typically matter less than internal reputation. Third, and perhaps most remarkable, they allow
individuals to engage in both work- and leisure-related content creation and consumption.
In this study, we focus on one specific social media tool, namely enterprise-wide blogging.
Increasingly, leading organizations have systems in place to encourage their employees to
blog (Lee et al. 2006, Aggarwal et al. 2011, Singh et al. 2010a). Prominent adopters include General Motors,
IBM, HP, Microsoft, Infosys, Google, and Charles Schwab. The general thinking within such firms is that
enterprise blogging forums can provide a structured platform that supports employee participation in
brainstorming and idea sharing, thus deepening a company’s pool of knowledge. Potentially this could
lead to an increase in the number of successful innovations for new products or services (McKinsey 2009).
Thus, the use of blogs as a mechanism for knowledge creation and dissemination has increased their
adoption in enterprises (Lee et al. 2006, Huh et al. 2007, Yardi et al. 2009).
That being said, employees’ motivation to blog may not always be in alignment with a firm's
objectives. As noted above, a unique attribute of social media is that it allows individuals to engage
in both work- and leisure-related content generation.¹ Hence, firms often observe that some employees’
blog postings are not relevant to their work knowledge or expertise. If unchecked, such behavior can
undermine the very goal of enterprise blogging. Therefore, it is becoming increasingly important to
understand employees’ blogging behavior in an enterprise social media setting in order to help firms
devise strategies for influencing user behavior.
Typically, an employee’s propensity to create or consume content on a blog is constrained by the
time available for such activities. In most cases, employees spend a certain amount of time on blogging
activities on a periodic (daily or weekly) basis. That is, they typically have a fixed budget of time
dedicated to such activities. Hence, their utility from blogging depends on how they split that time
between work-related and leisure-related topics, which is the first trade-off they face. In addition, there is
another type of trade-off that employees need to make in this context: how much time to spend on
content generation (i.e., posting blogs) vs. content consumption (i.e., reading blogs).
¹ In this paper, by leisure-related blogs we refer to all blog content that is not work-related.
In general, there can be some reputational benefits from blogging in an enterprise setting.
Work-related blogging allows individuals to express their expertise to a broader audience at a relatively low
cost. Once employees are identified as “experts”, they may receive indirect economic incentives, such as
promotion, salary hike, etc. (Kavanaugh et al. 2006, Aggarwal et al. 2011). Leisure-related blog posting,
on the other hand, can help employees become popular among peers and become “opinion leaders”. There
are benefits to reading blogs as well. An increase in work-related knowledge from reading may help employees
become more productive professionally, be more informed about new ideas, open up new
opportunities for collaboration, etc. (Huh et al. 2007, Yardi et al. 2009). Leisure-related information can
cater to employees' interests and allow them to relax (Singh et al. 2010a), thereby indirectly improving
their subsequent productivity.
From the discussion above one can imagine that the benefits of blogging are dependent on the
accumulated “reputation” of an employee over time instead of simply the contemporaneous reputation
accruing from current period participation. Hence, employees in an enterprise-wide setting need to be
forward-looking, rather than myopic, when making their decisions about the two kinds of trade-offs
described above. Take work-related posting and reading as examples. As mentioned above, work-related
posting can bring many indirect benefits to employees. These benefits are based on their reputational
status from blogging, which is a long-term measure of the quality of one’s contribution over an extended
period of time. An employee with a high work-related reputation accumulated over time can still obtain
reputation-based benefits even if he/she does not post in the current period. In contrast, an employee with
a low work-reputation will find it much harder to build a high reputational status based only on the
current period posting. In other words, employees typically cannot get an immediate benefit from a single
posting in a given period; rather this posting-related reputation accrues over time based on their blogging
contributions over a longer time horizon. Similarly, an employee’s knowledge gain from reading blogs
can lead to an improvement in productivity and performance but only over a longer time horizon
(McAfee, 2006).² All this suggests the presence of dynamics in employee behavior in this setting.
Another important characteristic of enterprise blogging is that the audience for enterprise
bloggers is purely internal, consisting of the employees themselves. Since employees have
limited time to devote to blog reading, bloggers compete intensely for readers' attention.
Blogs, in general, exhibit a strong preferential attachment effect, i.e., a "rich get richer,
poor get poorer" syndrome. Employees with high reputation tend to keep receiving more new readership,
which further reinforces their high reputations, while employees with low reputation find it hard to
attract readers’ attention, and consequently their reputation tends to decrease further over time.
² In addition, the CEO of the Fortune 1000 firm that provided us the data has stated on record that "Employees’
blogging activities are strongly correlated with their productivity."
Though “work reputation” and “leisure reputation” provide employees with very different
benefits, these two types of reputation can have a strong interdependence. This interdependence could be
either positive or negative. Readers may follow bloggers, not specific topics or posts. Work or leisure
posts by a given blogger appear on the same forum. Hence, employees with high levels of leisure
reputation can attract readers to their work-related posts, and vice versa. If this were to occur,
there could be positive spillovers between work and leisure reputation. On the other hand, it is also
possible that employees with high levels of leisure reputation are seen as “unprofessional” in a work
environment. If this were to occur, readers would not expect their work-related posts to be of high quality.
In this case, leisure reputation can have a negative impact on work reputation. Because of these two
countervailing effects, it becomes useful to understand the direction and magnitude of the
interdependency between work and leisure reputation, since they each have different business and policy
implications.
Our paper aims to understand employees’ motivations for reading and writing work-related and
leisure-related posts within an enterprise setting and then draw policy implications based on potential
incentive structures. To be more specific, in this paper, we examine the following questions: 1) How do
employees in an enterprise social media setting allocate their time to the different types of blogging
activities when facing the trade-offs between work- and leisure-related content and between posting and
reading content? 2) Is there any interdependence between work- and leisure-related reputations accruing
from blogging dynamics, and if so, are these spillovers positive or negative in nature? 3) What are the key
factors that explain the observed participation patterns resulting from inter-temporal trade-offs that
employees make in such a context?
The nature of these questions and the unique characteristics of enterprise blogging discussed thus far
necessitate the use of a dynamic structural model. We formulate an employee's decision on when to write
or read and what to write or read (in terms of work-related or leisure-related) as a dynamic competition
game in the tradition of structural dynamic competition games such as Bajari et al. (2007). Following
Ericson and Pakes (1995), we focus on Markov Perfect Equilibrium (MPE) as a solution concept. We
exploit revealed preference arguments from economic theory to estimate the parameters in the model.
There are two main components of the employee's utility function, which forms the core of this
paper. The first component is reputation. The reputation incentive is proportional to the cumulative
readership of an employee's blog. In each period, an employee receives utility from her cumulative
readership, or “reputation status”. Further, the work-related and leisure-related posts provide separate
kinds of reputation incentives. Individuals also derive utility from the “knowledge state” they are in,
which can be improved by reading blog posts (Singh et al. 2010, Yardi et al. 2009, Huh et al. 2007). This
knowledge state forms the second component. Similar to the reputation incentive, we allow the
knowledge-based utility derived from work and leisure-related posts to be different.³ Our model allows for
competition among employees for attracting readers to their blogs. The model also incorporates an
employee-specific blogging time constraint. Each employee maximizes his/her utility subject to this
constraint. This captures the inherent tradeoff between devoting time to work-related blogging,
leisure-related blogging and other non-blogging activities.
We apply the model to a unique dataset that comprises the complete details of blog posting and
reading behavior of 2396 employees over a 15-month period at a Fortune 1000 IT
services and consulting firm. Our results show that employees are indeed forward-looking. Blogging has a
significant long-term effect in that it is only in the long term that the benefits of blogging outweigh the
costs. Employees derive higher utility from readership of their work-related posts than their
leisure-related posts. There is also evidence of strong competition among employees with regard to attracting
readership for their posts. While readership of leisure posts provides less direct utility compared to work
posts, employees still create leisure-related posts as there is a significant spillover effect on the readership
of work posts from the creation of leisure posts. In addition, we find that the utility employees derive
from posting work-related blogs is about 4 times what they derive from leisure blogging. On
average, we find that reading and writing work-related posts is more costly than reading and writing leisure-related posts.
Using the point estimates of the parameters, we then implement two counterfactual experiments
to analyze the effects of two different policies. In the first experiment, we disallow leisure-related posting.
Interestingly, this policy does not help increase work-related posting. Instead, it discourages work-related
posting. This suggests that a policy of prohibiting leisure-related activities can hurt knowledge sharing
in an enterprise setting. Basically, two countervailing forces are created in this setting. On one hand, a policy
of disallowing leisure posting can provide employees with more time to generate work posting. On the
other hand, this policy can also eliminate spillover effects in readership from leisure to work posting and
hence reduce employees’ incentives to create work-related posting. In our setting, the second force
dominates the first, leading to an overall reduction in work-related posting.
In another policy experiment, employees are provided incentives for blogging by enabling a
reduction in the cost of blogging. Two different levels of incentives are examined - a “Medium-Incentive”
(indicating a scenario with a medium cost of blogging) and a “High-Incentive” (indicating a scenario with
zero cost of blogging). It turns out that such policies lead to an increase in both work-related and
leisure-related activities. More interestingly, we find that reading (both work and leisure) increases much more
than posting. In addition, we see that the probabilities of both work and leisure posting increase by a
larger amount than the probabilities of work and leisure reading when the firm switches from a
“Medium-Incentive” to a “High-Incentive” policy. More generally, this suggests that the tradeoff between posting
and reading is different under different levels of incentives.
³ Note that in this paper, an employee’s knowledge state measures the knowledge he/she gains from blogging
internally. The knowledge he/she gains from external offline activities is not included since we do not observe
it in our data.
Our study aims to make a number of contributions. First, it provides insights into an emerging
and important phenomenon of why employees in an enterprise social media setting incur the cost of
reading and writing content when there are no explicit monetary incentives for doing so. Second, to our
knowledge, ours is the first study to provide a structural framework to analyze employee blogging
activities in order to derive some important insights into the tradeoffs between work and leisure-related
blogging and between content creation (blog writing) and content consumption (blog reading). While
prior studies have investigated why individuals create content on blogs, those studies are based on
surveys/questionnaires, which can be affected by self-reporting bias. In contrast, our study uses actual
micro-level blogging activity data from a large enterprise-wide setting to shed light on why individuals
blog and model employee behavior accordingly. Third, we quantify the nature and magnitude of the
interdependence between work and leisure reputation. Our model thus generates some policy implications
for firms that may be contemplating prohibiting leisure-related blogging within the enterprise. By
demonstrating that there are positive spillovers from leisure-related blogging to work-related blogging,
our results suggest that a policy of abolishing leisure-related content creation can have unintended adverse
consequences on work-related content creation. Finally, a unique methodological contribution of our
study is providing a way to account for unobserved states in the Bajari et al. (2007) framework through the
use of a hidden Markov model (HMM) framework.
The rest of the paper is organized as follows. In Section 2, we discuss the literature relevant to this
paper. Section 3 presents our model of employee blogging dynamics. Section 4 describes our data,
estimation strategies and estimation results. Some counterfactual experiments are discussed in Section 5.
Section 6 summarizes and concludes our study.
2. Literature Review
Our research is related to multiple streams of literature. The first stream of relevant literature
relates to the impact of social media and user-generated content. Several studies have investigated how
social media or user-generated content influences variables of economic interest. Many studies have
focused on how different aspects of user-generated blogs affect product sales and market structure
(Venkataraman et al. 2009, Dhar and Chang 2009, Chintagunta et al. 2010, Dewan and Ramaprasad
2011). Hofstetter et al. (2009) study the causal effect that content generated in blogs has on users’ ability to
form social ties, and the positive feedback that users’ social ties have on their propensity to create content.
Another stream of work is related to enterprise adoption of social media. Internally,
organizational adoption of social media has been found to improve access to knowledge experts,
increase the rate of solving problems, increase the rate of new product development, and reduce the costs of
internal communication (McAfee, 2006, Grossman et al. 2007, Gurram et al. 2008). Some recent research
examines how employees’ participation influences the benefits firms can get from implementing Web 2.0
applications (Levy 2009). Huh et al. (2007) reveal that blogs facilitate access to tacit knowledge and
resources vetted by experts, and, most importantly, contribute to the emergence of collaboration across a
broad range of communities within the enterprise. Yardi et al. (2009) study a large internal corporate
blogging community using log files and interviews and find that employees expect to receive attention
when they contribute to blogs, but these expectations often are not met commensurately. Singh et al.
(2010a) study blog reading dynamics of employees within a large firm and find that most of the
employees' time is devoted to reading and writing leisure-related posts. Aggarwal et al. (2010) find that
negative posts act as a catalyst and can exponentially increase the readership of employee blogs, and that this
effect is generally enough to offset the negative effect of a few negative posts on employee blogs. Wattal
et al. (2010) study the role of network externalities in the use of blogs in an organization and show that
usage of blogs within an individual’s network is associated with an increase in one’s own usage and that the
network effects are stronger for younger generations.
This paper is also related to the emerging literature on the dual role played by users in social
media. In online settings, users need to allocate resources between content generation and content usage
activities since they can take on the dual role of creators as well as consumers (Trusov et al. 2010).
However, there is little research that models or quantifies the interdependencies between how content on a
social media platform is created and consumed. The primary reason for such a gap in literature is that
while data on content creation is easily available, data on content consumption is typically not available to
researchers. However, the consumption information can be retrieved from access logs. A small but
emerging stream of work has begun to look at both content creation and consumption data (Ghose and
Han 2009, 2010, Albuquerque et al. 2010). Ghose and Han (2011) analyze a dataset encompassing users’
multimedia content creation and consumption behavior using mobile phones and find that there exists a
negative temporal interdependence between the content generation and usage behavior for a given user.
Ghose and Han (2010) find evidence of dynamic learning in such two-sided forums created by mobile
Internet based multimedia content. Albuquerque et al. (2010) use data from print-on-demand service of
user-created magazines and find that content price and content creator marketing actions have strong
effects on purchases.
Another relevant research stream concerns the motivation for individuals to contribute to online
communities, or social media. Many studies consistently show that status is an important factor that
explains knowledge creation, retention and transfer (Argote et al. 2003, Thomas-Hunt et al. 2003). At the
same time, status, or reputation is also one of the most important motivations for individuals to contribute
to online communities (Roberts et al. 2006). Others like Kumar and Sun (2009) and Lu et al. (2010)
empirically model various kinds of inter-temporal tradeoffs that contributors to online communities have
to make. Forman et al. (2008) and Moe and Schweidel (2010) examine how previously posted opinions in
a ratings environment may affect a subsequent individual’s posting behavior. The former study provides
evidence that community norms are an antecedent to reviewer disclosure of identity-descriptive
information; while the latter study shows that individuals vary in their underlying behavior and in their
reactions to previously posted product ratings: less frequent posters exhibit bandwagon behavior, and
more active customers reveal differentiation behavior when posting online opinions.
Finally, our paper is related to the literature on dynamic structural models. The dynamic game
model, in which multiple agents make decisions simultaneously and the utility each agent gets depends on
others' decisions, is one specific type of dynamic structural model. It has been widely applied in
industrial organization research (for example, Pakes and McGuire 1994, Ericson and Pakes 1995, Bajari
et al. 2007, Aguirregabiria and Mira 2007, Aguirregabiria and Ho 2009) and marketing (for example,
Dube et al. 2005, Sweeting 2007, Ryan and Tucker 2008, Chung et al. 2009, Misra and Nair 2009, Kumar
and Sun 2009, Ching 2010, Lu et al. 2010).⁴ Of these papers, the most closely related to our work are
Kumar and Sun (2009), who use a dynamic game to examine why users contribute to connected goods in
social networking sites, and Lu et al. (2010), who study how the social structure of individuals on a social
media platform affects their willingness to share knowledge with peers. However, none of these papers
examines the implications of a firm’s adoption of enterprise social media for internal employee behavior,
nor do they examine employees' incentives for creating and consuming content internally within a firm.
3. Model
3.1 Per-Period Utility
Employees i = 1, …, I make blogging decisions on a periodic basis for time periods t = 1, …, T.
In this paper, we define a period as one week. In enterprise blogging there are two types of blog posts
that employees can generate, represented by j ∈ {w, l}, where w represents work-related posts and l
represents leisure-related posts. In each period, an employee decides whether to post, and how many
posts to read, of each type j. We use p to indicate ‘post’ and r to indicate ‘read’. In other words, the
action that an employee takes in each period is composed of four discrete elements, i.e.,
$p_{it}^{w}$, $p_{it}^{l}$, $r_{it}^{w}$ and $r_{it}^{l}$, where $p_{it}^{j}$ is
an indicator variable, which equals 1 if employee i posts one type j post at time t; and $r_{it}^{j}$ is a count
variable, which takes values 0, 1, 2 and 3. We allow individuals to post up to one blog of each type in each
period, but read up to 3 blogs of each type in each period. If an individual posts more than one blog or reads
more than 3 blogs, the number is truncated to the maximum.⁵ Hence, in total, there are 2 × 2 × 4 × 4 = 64
possible combinations of choices an employee can make. For notational convenience, we convert the
four-dimensional action space to a one-dimensional action space $A$, which is defined as
$A = \{0, 1, 2, \ldots, 63\}$, a finite set of 64 elements.
In every period, every employee chooses an action $a_{it} \in A$. In addition, $a_{t} = (a_{1t}, \ldots, a_{It})$ denotes the
set of actions that all employees choose at time t.
⁴ For a complete review of the literature, see Dube et al. (2005).
⁵ This assumption is based on the descriptive statistics of the data. It is motivated by the fact that fewer than 0.1% of
users post more than 1 blog or read more than 3 blogs. Further, it is also necessary for computational tractability.
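Note that each value of $a_{it}$ is associated with only one combination of the four activities. For
instance, $a_{it} = 0$ corresponds to the situation where $p_{it}^{w} = 0$, $p_{it}^{l} = 0$, $r_{it}^{w} = 0$ and $r_{it}^{l} = 0$;
$a_{it} = 1$ indicates the situation where $p_{it}^{w} = 0$, $p_{it}^{l} = 0$, $r_{it}^{w} = 0$ and $r_{it}^{l} = 1$, etc. In other words,
knowing $a_{it}$ is equivalent to knowing $(p_{it}^{w}, p_{it}^{l}, r_{it}^{w}, r_{it}^{l})$.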
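The mapping between the four activities and the single action index can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and the ordering (work post, leisure post, work reads, leisure reads, with the last element varying fastest) is an assumption chosen to agree with the two examples given above, not a convention confirmed by the paper.

```python
MAX_POSTS = 1  # at most one post of each type per period
MAX_READS = 3  # at most three reads of each type per period


def encode_action(post_w, post_l, read_w, read_l):
    """Map the four activity choices to one index in {0, ..., 63}.

    Assumed ordering (hypothetical): leisure reads vary fastest.
    Values above the caps are truncated, as described in the text.
    """
    post_w = min(post_w, MAX_POSTS)
    post_l = min(post_l, MAX_POSTS)
    read_w = min(read_w, MAX_READS)
    read_l = min(read_l, MAX_READS)
    return ((post_w * 2 + post_l) * 4 + read_w) * 4 + read_l


def decode_action(action):
    """Invert encode_action: recover the four activity choices."""
    read_l = action % 4
    action //= 4
    read_w = action % 4
    action //= 4
    post_l = action % 2
    post_w = action // 2
    return post_w, post_l, read_w, read_l


if __name__ == "__main__":
    print(encode_action(0, 0, 0, 0))  # 0: no blogging activity
    print(encode_action(0, 0, 0, 1))  # 1: one leisure post read
    print(decode_action(63))          # (1, 1, 3, 3): every activity at its cap
```

Because the mapping is one-to-one, working with the index or with the four-element tuple carries the same information.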
We assume that an employee's per-period utility function at time t comprises utility from
reputation (denoted by R), knowledge (denoted by K), an unobserved private shock, and everything else
in the form of an outside good. An employee's utility at time t, $U_{it}$, is given by:
$$U_{it} = U^{R}(R_{it}^{w}, R_{it}^{l}; \theta_{w}, \theta_{l}) + U^{K}(K_{it}^{w}, K_{it}^{l}; \lambda_{w}, \lambda_{l}) + O_{it} + \gamma_{it}(a_{it}). \tag{1}$$
Here $U^{R}$ denotes reputation-based utility and $U^{K}$ denotes knowledge-based utility. $R_{it}^{j}$ is the
discounted cumulated readership employee i receives from type j posts up until period t, and $\theta_{w}$ and $\theta_{l}$ are the
corresponding parameters that capture the effect of reputation on utility. $K_{it}^{j}$ is the knowledge level of
type j for employee i at the end of time period t, and $\lambda_{w}$ and $\lambda_{l}$ are the corresponding parameters that
capture the effect of knowledge on utility. $O_{it}$ indicates the consumption of outside goods, with the utility
from a per unit consumption of the outside good being normalized to one. The outside good option
denotes different kinds of activities that an employee can engage in besides blogging. $\gamma_{it}(a_{it})$ is the
action-specific random shock associated with the utility that may affect an employee's decisions. Before
choosing his/her actions, employee i receives a vector of choice-specific shocks,
$\gamma_{it} = (\gamma_{it}(0), \gamma_{it}(1), \ldots, \gamma_{it}(63))$. Each element in $\gamma_{it}$ has a type-1 extreme value distribution and is i.i.d.
across individuals and actions.⁶ When employee i chooses an action $a_{it}$, the choice-specific shock
associated with this particular action, i.e., $\gamma_{it}(a_{it})$, is realized and enters the individual’s current period
utility. The first four elements in the utility function are further explained and operationalized later. A
summary of all notations and variables is given in Table 1.
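A standard consequence of i.i.d. type-1 extreme value shocks is that choice probabilities take the multinomial logit form. The sketch below illustrates this with the Python standard library only. It is a generic illustration, not the authors' estimation code: the function names and the numeric payoffs are hypothetical, and in the full model the deterministic values would be choice-specific value functions rather than fixed numbers.

```python
import math
import random


def gumbel_shock(rng):
    """Draw one standard type-1 extreme value (Gumbel) shock via the inverse CDF."""
    u = rng.random()
    while u == 0.0:  # random() returns values in [0, 1); avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))


def choice_probabilities(values):
    """Logit choice probabilities implied by i.i.d. type-1 extreme value shocks."""
    top = max(values)  # subtract the max for numerical stability
    weights = [math.exp(v - top) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_choice(values, rng):
    """Pick the action whose deterministic value plus private shock is largest."""
    shocked = [v + gumbel_shock(rng) for v in values]
    return max(range(len(values)), key=lambda a: shocked[a])


if __name__ == "__main__":
    rng = random.Random(0)
    values = [0.0, 1.0, 0.5]  # hypothetical deterministic payoffs for three actions
    draws = 20000
    counts = [0] * len(values)
    for _ in range(draws):
        counts[simulate_choice(values, rng)] += 1
    print("simulated:", [round(c / draws, 3) for c in counts])
    print("logit:    ", [round(p, 3) for p in choice_probabilities(values)])
```

With enough draws, the simulated choice frequencies should be close to the closed-form logit probabilities, which is what makes this distributional assumption convenient in estimation.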
⁶ The error term captures the private information of an individual i, and it is i.i.d. across individuals, time and choices.
Suppose an employee comes back from an interesting leisure trip and generates a blog posting describing his
experiences during the trip. This would satisfy the independence assumption between his reading and writing.
Suppose an employee uses some technical tool in the workplace and discovers that by using this tool he is better
able to increase his productivity or he is able to acquire some deeper set of knowledge. This would be an example
that satisfies the independence assumption between individuals. We need to assume the distribution of the error
term to achieve identification. If we assume another distribution for the error term, the new model will be
observationally equivalent to our current model (Arcidiacono and Miller, 2006).
Table 1: Summary of Notations
Variable: Meaning (corresponding parameter in parentheses)
$\log R_{it}^{w}$ ($\theta_{w}$)⁷: Natural log of cumulative reputation (measured by depreciated past readership and current period readership) of work-related posts for employee i in period t.
$\log R_{it}^{l}$ ($\theta_{l}$): Natural log of cumulative reputation (measured by depreciated past readership and current period readership) of leisure-related posts for employee i in period t.
$K_{it}^{w}$ ($\lambda_{w}$): Work-related knowledge state (measured in levels; the number of levels is specified after HMM estimation).
$K_{it}^{l}$ ($\lambda_{l}$): Leisure-related knowledge state (measured in levels; the number of levels is specified after HMM estimation).
$p_{it}^{w}$: Binary variable with 1 denoting that employee i posts a work-related post in period t and 0 otherwise (cost of posting work-related posts in terms of time).
$p_{it}^{l}$: Binary variable with 1 denoting that employee i posts a leisure-related post in period t and 0 otherwise (cost of posting leisure-related posts in terms of time).
$r_{it}^{w}$: Number of work-related posts employee i reads in period t (cost of reading one work-related post in terms of time and effort).
$r_{it}^{l}$: Number of leisure-related posts employee i reads in period t (cost of reading one leisure-related post in terms of time and effort).
Other notations
$i, j, t$: Indices of employee, post types and periods (week).
$a_{it}, s_{it}$: Employee i’s action and states in period t.
$a_{t}, s_{t}$: Vectors of actions and states of all employees in period t.
$\gamma_{it}$: Employee i’s choice-specific private shock in period t.
$U^{R}$: Employee i’s reputation-based utility in period t.
$U^{K}$: Employee i’s knowledge-based utility in period t.
$c^{jp}$ ($c^{jr}$): Time cost of posting (reading) type j posts each period.
$O_{it}, c^{O}$: Consumption of outside goods and its associated time cost.
$B_{it}$: Budget constraint assigned to individual i in period t.
$n_{it}^{j}$: New readership employee i receives in period t from type j posting.
$\beta, \delta$: Discount factor in the lifetime utility function and depreciation factor in the accumulation of readership.
3.1.1 Reputation
Prior work has shown that those individuals who are identified as possessing expertise are often
afforded power and status within the organization (French and Raven 1959). Hence, expertise sharing can
produce significant personal benefit in terms of increase in reputation within the organization (Constant et
al. 1994, Thomas-Hunt et al. 2003, Lu et al. 2010). These benefits are applicable in blog settings as well,
since intuitively, when employees blog they are sharing their expertise with others. Employees derive
reputation benefits from the posts they write based on their area of expertise (Nardi et al 2004). We
assume that an employee derives reputational benefit consistent with her readership. In the utility
function, the reputation-based utility is incorporated as $U^R_{it}$, where
$U^R_{it}(R^w_{it}, R^l_{it}; \gamma_w, \gamma_l) = \gamma_w \log(R^w_{it}) + \gamma_l \log(R^l_{it})$ (2)
7
Note that the natural log of zero is -∞. Hence we use the following transformations: $R^w_{it} = R^w_{it} + 1$ and
$R^l_{it} = R^l_{it} + 1$.
We argue that employees derive reputation-based utility from their cumulative readership
(accumulated over time), and not just the contemporaneous readership (readership they receive in the
current period). This is because employees’ readership in previous periods can be carried on to the next
period with a certain discount rate. Even if one does not write any posts in the current period, he/she can
still enjoy the benefits from the reputation he/she has built in the past. We apply a log transformation here
to adjust the over dispersion of the cumulative readership. The log transformation also results in a
concave relationship between reputation-based utility and cumulative readership. This makes sense
because one additional unit of readership does not provide as much additional utility to those who have a
very high level of cumulative readership compared to those who have a low level of cumulative
readership.
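The concavity argument can be illustrated with a small numerical sketch. It assumes the log(R + 1) transformation described in footnote 7; the function name and the coefficient values are hypothetical and are not the paper's estimates.

```python
import math


def reputation_utility(r_work, r_leisure, gamma_w, gamma_l):
    """Reputation-based utility: gamma_w*log(R_w + 1) + gamma_l*log(R_l + 1)."""
    if r_work < 0 or r_leisure < 0:
        raise ValueError("cumulative readership cannot be negative")
    return gamma_w * math.log(r_work + 1.0) + gamma_l * math.log(r_leisure + 1.0)


# Illustrative coefficients only (assumption, not estimates from the paper).
GAMMA_W, GAMMA_L = 1.0, 0.5

# Utility gain from 10 extra work readers at a low and at a high reputation level.
gain_low = (reputation_utility(10, 0, GAMMA_W, GAMMA_L)
            - reputation_utility(0, 0, GAMMA_W, GAMMA_L))
gain_high = (reputation_utility(1010, 0, GAMMA_W, GAMMA_L)
             - reputation_utility(1000, 0, GAMMA_W, GAMMA_L))

print(f"gain at low reputation:  {gain_low:.4f}")
print(f"gain at high reputation: {gain_high:.4f}")
```

Under these assumed values the same ten additional readers are worth far more to an employee starting from zero readership than to one who already has a thousand readers, which is the diminishing-returns property the log transformation is meant to capture.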
We allow the work and leisure reputations to enter the model separately, as they may lead
to different kinds of incentives for blogging. Work-related posts can express a user's expertise in work-related knowledge, which may help an employee derive direct or indirect economic incentives within the
enterprise (McAfee 2006). Leisure-related posts may benefit the individual by making him/her more popular
and by boosting ego or self-satisfaction (Nardi et al. 2004).
3.1.2 Knowledge
Blogs facilitate access to tacit knowledge and resources vetted by experts. The primary reason
why corporations allow their employees to participate in blogging activities during their work hours is
because the employee blogs act as a new source of work relevant knowledge sharing within the enterprise
(Huh et al. 2007; Lee et al. 2006; Singh et al. 2010a; Yardi et al. 2009). Employees can acquire
knowledge by reading others' posts. When employees read others' work-related posts they can become
more productive, more informed about new ideas, and more aware of their colleagues’ expertise, all of
which opens up new opportunities for collaborations and for expressing their own opinions in future posts
(Singh et al 2010). Leisure posts can help the reader relax and refresh. Further, individuals have an
inherent need for leisure, which leisure reading and posting can provide. As in the case of reputation, the
two types of posts affect the type and level of incentives differently. Hence, we incorporate them
separately. In the utility function, the knowledge-based utility is captured by $U^K_{it}$, where
$U^K_{it}(K^w_{it}, K^l_{it}; \theta_w, \theta_l) = \theta_w K^w_{it} + \theta_l K^l_{it}$ (3)
3.1.3 Budget Constraint
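Every week an employee has a time budget ($B_{it}$) to allocate across different activities. Note
that $B_{it}$ is basically 24×7 hours per week. Let $\tau^q_j$ be the cost of identifying and reading a type j post and
$\tau^p_j$ be the cost of developing and writing a type j post. Then we have the following budget constraint:
$B_{it} = \sum_j \tau^p_j p^j_{it} + \sum_j \tau^q_j q^j_{it} + \tau_o o_{it}$.
Here, $p^j_{it}$ indicates whether employee i writes a type j post in period t, $q^j_{it}$ is the number of type j posts
he/she reads, $o_{it}$ is the outside good consumption and $\tau_o$ is the associated coefficient that captures the
per unit time cost of consuming the outside good. This budget constraint allows us to capture the tradeoff that
an individual would consider while deciding how to allocate time between blogging and non-blogging
activities. Let us define $c_{pw} = \tau^p_w / \tau_o$; $c_{pl} = \tau^p_l / \tau_o$; $c_{qw} = \tau^q_w / \tau_o$; $c_{ql} = \tau^q_l / \tau_o$, which can be interpreted
as the relative cost of participating in the four kinds of blogging activities compared to participating in
non-blogging activities. Then the budget constraint can be written as:
$B_{it} / \tau_o = c_{pw} p^w_{it} + c_{pl} p^l_{it} + c_{qw} q^w_{it} + c_{ql} q^l_{it} + o_{it}$.
Solving for $o_{it}$, we get
$o_{it} = B_{it} / \tau_o - c_{pw} p^w_{it} - c_{pl} p^l_{it} - c_{qw} q^w_{it} - c_{ql} q^l_{it}$ (4)
Substituting Equations (2), (3) and (4) into the utility function (1) gives:
$U_{it} = \gamma_w \log(R^w_{it}) + \gamma_l \log(R^l_{it}) + \theta_w K^w_{it} + \theta_l K^l_{it} + B_{it} / \tau_o - c_{pw} p^w_{it} - c_{pl} p^l_{it} - c_{qw} q^w_{it} - c_{ql} q^l_{it} + \varepsilon_{it}$
Since $B_{it} / \tau_o$ affects all choices in the same way (footnote 8), we drop it from the utility function and rewrite the
utility function as
$U_{it} = \gamma_w \log(R^w_{it}) + \gamma_l \log(R^l_{it}) + \theta_w K^w_{it} + \theta_l K^l_{it} - c_{pw} p^w_{it} - c_{pl} p^l_{it} - c_{qw} q^w_{it} - c_{ql} q^l_{it} + \varepsilon_{it}$ (5)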
Note that here, the total time budget $B_{it}$ is exogenous, which is basically 24×7 hours per week. However,
the blogging budget is endogenous. One of the many (64) choices available to employees is 0
work-related posting, 0 leisure-related posting, 0 work-related reading and 0 leisure-related reading. If an
individual chooses this option, he/she spends the entire time budget on the outside good, and none on
blogging activities. If an individual chooses to post both work-related and leisure-related blogs and read
multiple work-related posts and multiple leisure-related posts, he/she spends a lot of time on blogging and
less time on the outside good. In other words, by including the possibility of 0 posting and 0 reading in
the action space, we are able to capture the trade-off individuals make between blogging and spending
time on the outside good.
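The count of 64 choices is consistent with two posting decisions (post or not) for each topic and four reading levels for each topic (2 × 2 × 4 × 4). The sketch below enumerates such an action space and evaluates a utility of the form in Equation (5). The reading levels 0 to 3, the function names and all parameter values are illustrative assumptions rather than the authors' implementation.

```python
from itertools import product

POST_CHOICES = (0, 1)        # do not post / post, for each topic
READ_CHOICES = (0, 1, 2, 3)  # assumed grouping of reading levels

# Each action is (post_work, post_leisure, read_work, read_leisure).
ACTIONS = list(product(POST_CHOICES, POST_CHOICES, READ_CHOICES, READ_CHOICES))


def blogging_cost(action, costs):
    """Relative time cost of an action: sum of cost_k * quantity_k."""
    return sum(c * a for c, a in zip(costs, action))


def flow_utility(action, log_rep, knowledge, gammas, thetas, costs, shock=0.0):
    """Per-period utility in the form of Equation (5)."""
    reputation_part = gammas[0] * log_rep[0] + gammas[1] * log_rep[1]
    knowledge_part = thetas[0] * knowledge[0] + thetas[1] * knowledge[1]
    return reputation_part + knowledge_part - blogging_cost(action, costs) + shock


# Hypothetical relative costs: post work, post leisure, read work, read leisure.
COSTS = (2.0, 1.5, 0.3, 0.2)

outside_only = (0, 0, 0, 0)          # all time goes to the outside good
heavy_blogging = (1, 1, 3, 3)        # post both types and read at the top level
print(len(ACTIONS))
print(blogging_cost(outside_only, COSTS), blogging_cost(heavy_blogging, COSTS))
```

Holding the states fixed, the only part of this utility that varies across the 64 actions within a period is the cost term and the private shock; the benefits of posting and reading arrive through the future evolution of the reputation and knowledge states.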
8
Please refer to General Appendix 1 for details.
3.2 Dynamic Game
3.2.1 State Variables
Note that the first four elements in Equation (5) are all state variables that evolve according to the
actions employees take in each period. The reputation states evolve as follows:
$R^j_{it} = \delta R^j_{i,t-1} + r^j_{it}$, where $r^j_{it} = f^j(R^w_{it}, R^w_{-it}, R^l_{it}, R^l_{-it}, K^w_{it}, K^l_{it}, \mathit{orgstatus}_i)$ (6)
Here, $r^j_{it}$ is the number of people who read i's type j post in period t. It is a function of individual i's own
reputation states and his/her competitors' reputation states, both work-related and leisure-related, as well
as his/her own knowledge states. We also include the variable "orgstatus" to capture any systematic
differences in readership of blogs posted by employees with different organizational status. We have data
on employees' organizational rank in the firm and use that to categorize employees into "high" or "low"
status groups. We allow a very flexible non-parametric specification of the dependence between $r^j_{it}$ and
the state variables. In addition, δ is a depreciation factor, which is set at 0.95 (footnote 9). This implies that the
contribution of past readership to current reputation declines as time passes. Individuals will have to
continue posting to maintain a high reputation.
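The accumulation rule (depreciated past reputation plus new readership) is easy to trace numerically. The following sketch uses the depreciation factor of 0.95 stated above; the function names and the readership numbers are hypothetical.

```python
DELTA = 0.95  # depreciation factor, as set in the paper


def update_reputation(prev_reputation, new_readership, delta=DELTA):
    """One step of the rule: R_t = delta * R_{t-1} + r_t."""
    return delta * prev_reputation + new_readership


def reputation_path(initial, readership_by_week, delta=DELTA):
    """Apply the update week by week and return the whole path."""
    path = []
    rep = initial
    for r in readership_by_week:
        rep = update_reputation(rep, r, delta)
        path.append(rep)
    return path


# An employee with reputation 100 who stops posting for 10 weeks.
idle = reputation_path(100.0, [0.0] * 10)

# An employee who attracts 5 new readers every week for a long time.
steady = reputation_path(0.0, [5.0] * 2000)

print(round(idle[-1], 2), round(steady[-1], 2))
```

With a constant weekly readership r, the recursion converges to r / (1 - δ), so at δ = 0.95 a steady flow of readers supports a reputation stock twenty times the weekly flow, while an idle employee loses about 40% of his/her reputation in ten weeks.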
The third and fourth state variables are the knowledge levels of employee i, $K^w_{it}$ and $K^l_{it}$. While
we observe the reading behavior among employees, the evolution of individuals' knowledge states is
unobserved to us. Hence, we follow the spirit of Arcidiacono and Miller (2011) and employ a hidden Markov model (HMM)
framework to identify the unobserved knowledge states and their evolution. An HMM is a model of a
stochastic process that cannot be observed directly but can only be viewed through another set of
stochastic processes that produce a set of observations (Rabiner 1989). We treat the knowledge evolution
as an unobserved stochastic process. Further, we treat the combined four actions as the state-dependent observed process. Specifically, we consider knowledge to be a discrete unobserved state and
assume that knowledge evolution follows a Markov process. This Markovian transition of knowledge
states is consistent with the literature on knowledge evolution (Singh et al 2010b). We further enforce that the
probability of transitioning to a high knowledge state is higher if an individual reads a post than
otherwise. The two knowledge state variables are assumed to evolve independently of each other.
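A two-state version of this setup can be sketched as follows. The sketch simplifies the observed process to a single binary reading decision (the paper uses the combined four actions), and every probability in it is a made-up illustration rather than an estimate; it only respects the restriction that reading raises the chance of moving to the high knowledge state.

```python
# Hypothetical transition matrices over (low, high) knowledge; rows = current state.
P_NO_READ = [[0.97, 0.03],
             [0.05, 0.95]]
P_READ = [[0.90, 0.10],
          [0.02, 0.98]]

# Hypothetical state-dependent behavior: P(employee reads | knowledge state).
P_READS_GIVEN_STATE = [0.10, 0.40]


def forward_filter(reads, prior):
    """Filtered P(state_t | reads_1..t) for a sequence of 0/1 reading decisions."""
    belief = list(prior)
    history = []
    for did_read in reads:
        # Condition on this period's observed reading decision.
        likelihood = [p if did_read else 1.0 - p for p in P_READS_GIVEN_STATE]
        weighted = [b * l for b, l in zip(belief, likelihood)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
        # Propagate through the action-dependent transition matrix.
        transition = P_READ if did_read else P_NO_READ
        belief = [sum(belief[k] * transition[k][j] for k in range(2))
                  for j in range(2)]
    return history


frequent_reader = forward_filter([1, 1, 1], [0.5, 0.5])
non_reader = forward_filter([0, 0, 0], [0.5, 0.5])
print(round(frequent_reader[-1][1], 3), round(non_reader[-1][1], 3))
```

The filter shows how observed behavior is informative about a state that is never seen directly: repeated reading pushes the inferred probability of the high knowledge state up, and repeated non-reading pushes it down.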
We define $s_{it}$ as the set of the state variables for an employee i at period t. Then $s_{it} =
\{R^w_{it}, R^l_{it}, K^w_{it}, K^l_{it}, \mathit{orgstatus}_i\}$ is the set of values of the state variables for an employee i at time t. We
further define $s_{-it} = \{R^w_{-it}, R^l_{-it}, \mathit{orgstatus}_{-i}\}$ as the set of observed state variables of i's peers. Here,
we assume individuals do not observe other people's knowledge states. This is a reasonable assumption
because it is unlikely that individuals would know the extent of knowledge gain from blog reading by
others. In our Robustness Check section, we discuss how the data also supports this assumption. All
9
The qualitative nature of the results is robust to several other values of the depreciation factor as shown in the
section on robustness checks.
employees of the firm constitute the group of peers for a given employee. Then the strategy profile for i
depends on $s_t = (s_{it}, s_{-it})$ (footnote 10).
3.2.2 Sequence of Events
The specific sequence of events in our model is as follows:
1. Employees observe their states $s_{it}$ and their peers' states $s_{-it}$.
2. Employees receive a set of choice-specific random shocks ($\varepsilon_{it}$).
3. Employees form an expectation of the readership ($r^j_{it}$) they may get in the current period if they write
a post.
4. The employee's probability of moving up a knowledge state is a function only of his/her current
knowledge state and reading and writing actions. The employee has perfect information about the
knowledge state transition probabilities.
5. Employees make decisions as to what to write or read in the current period.
6. Employees receive utility because of their decisions.
7. Employee states evolve to $s_{i,t+1}$ because of their and their peers' decisions.
3.2.3 Long Term Utility Function
We model the employee's writing and reading decisions as a dynamic optimization problem. The
employee's task is to decide when and whether to read work posts, read leisure posts, write a work post,
and write a leisure post to maximize the sum of the discounted expected future utility over the infinite
horizon:
$\max_{a_{it}} E\left[ \sum_{k=t}^{\infty} \beta^{k-t} U_i(a_k, s_k, \varepsilon_{ik}) \,\middle|\, s_t, \varepsilon_{it} \right]$ (7)
The variable $\beta$ is the common discount factor. The operator $E[\cdot]$ denotes the conditional
expectation operator given the employee's information at time t. There are three components of the model
that need to be emphasized. The employee in our model maximizes his lifetime utility, which makes the
model dynamic. Since, the present actions and states of an employee affect his/her future utility by
affecting the states, the employee is forward looking. The employee balances his/her time between
reading and writing, and work and leisure due to the budget constraint. Hence, his/her actions are
interdependent. At the same time, the utility of an employee is a function of the decisions made by his/her
10
One notable point about this strategy profile is that ideally, $s_t$ represents the states of all employees. A key
assumption of Markov perfect equilibrium is that each agent considers the states of all other agents. However, in our
case, there are 2396 employees. Tracking every employee's state would make our model intractable. To deal with
this problem, following Aguirregabiria and Ho (2010), we introduce a simplifying but more realistic assumption that
employees make their decisions in accordance with their own state and the average state of other employees. This is
a reasonable assumption because in reality, it is infeasible for employees to track all other employees' states and
make decisions accordingly. Instead, employees may only want to get a general idea of the other employees'
states and make decisions based on their own state and the moments of others' states. In other words, a strategy
is a function of an employee's own states and the moments of the states of the competing group.
peers (through $r^j_{it}$), making it a multi-agent dynamic game. As will be explained later, the readership that
the employee gets for his/her posts is not only a function of his/her states but also a function of his/her
peers' states.
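Footnote 10 replaces the full vector of peers' states with their average. A minimal sketch of that summary statistic is shown below, assuming a simple leave-one-out mean; the function name and the sample values are hypothetical, and the paper may use additional moments.

```python
def peer_averages(values):
    """For each employee, the mean of everyone else's value (leave-one-out mean)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two employees")
    total = sum(values)
    # Subtracting the own value from the total avoids an O(n^2) double loop.
    return [(total - v) / (n - 1) for v in values]


# Hypothetical log work reputations of four employees.
log_work_reputation = [0.0, 2.0, 4.0, 6.0]
peer_means = peer_averages(log_work_reputation)
print(peer_means)
```

Each employee's state vector then has a fixed, small dimension (own states plus a few peer averages) regardless of whether the firm has four employees or 2396.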
3.2.4 Equilibrium concept
Following Ericson and Pakes (1995), we focus on Markov perfect equilibrium (MPE) as a solution concept. We assume that
behavior is consistent with MPE. In an MPE, each employee's behavior depends only on the current states
and his current private shock. Formally, a Markov strategy for an employee i is a function $\sigma_i: S \times \mathcal{E}_i \to A_i$,
where $S$ denotes the states of all individuals, $\mathcal{E}_i$ denotes employee i's private random shock, and $A_i$ denotes the
set of individual i's actions. A profile of Markov strategies is a vector, $\sigma = (\sigma_1, \ldots, \sigma_N)$, where $\sigma: S \times \mathcal{E}_1 \times
\ldots \times \mathcal{E}_N \to A$. Here, we drop the time index because the strategy profile is time invariant. If behavior is
driven by a Markov strategy profile $\sigma$, employee i's expected utility given state s can be written
recursively as a Bellman equation:
$V_i(s; \sigma) = E_{\varepsilon}\left[ U_i(\sigma(s, \varepsilon), s, \varepsilon_i) + \beta \int V_i(s'; \sigma) \, dP(s' \mid \sigma(s, \varepsilon), s) \,\middle|\, s \right]$.
Here, $V_i$ is a value function, which reflects the expected value for employee i at the beginning of a
period before private shocks are realized. Following the literature, a profile $\sigma$ is an MPE if, given the
opponent profile $\sigma_{-i}$, each employee i prefers his/her strategy $\sigma_i$ to all alternative strategies $\sigma_i'$. That is, for $\sigma$ to
be an MPE,
$V_i(s; \sigma_i, \sigma_{-i}) \ge V_i(s; \sigma_i', \sigma_{-i})$.
3.3 Identification and Normalization
There are a few identification issues that need to be addressed before the model can be
consistently estimated. First, we assume that the private shocks are extreme value type 1 distributed as is
common in the literature. Second, in the HMM we assume that the probability of knowledge increase is
higher for an individual as a result of reading than otherwise. We allow this by assuming that reading has
a non-negative impact on knowledge increase. Third, we cannot identify $\tau_o$, the per unit time cost of the outside good, and the cost-specific
parameters together. Hence, we normalize $\tau_o = 1$. Fourth, we assume that the knowledge state of an
employee is a private state and make an exclusion restriction. An employee's actions, while depending on
his/her own knowledge state, are independent of his/her peers' knowledge states. Such exclusion
restrictions are commonly imposed in dynamic structural models (Bajari et al 2009). In Section 4.5 we
relax some of these assumptions to check for robustness.
We do not know the initial values of the state variables, which may raise the well-known “initial
conditions” problem (Erdem et al. 2003; Hendel and Nevo 2006). If there were no unobserved state in our
model, the Bajari et al. (2007) (hereafter BBL) estimation method could avoid this problem. One key
assumption of BBL estimation is that all agents are playing stationary equilibrium strategies (Bajari et al.
2007); therefore, one can back out the optimal decision rule directly from the data. One only
needs to make sure that only the data after the equilibrium is reached are used for estimation. To ensure
this, we use a part of our data (16 weeks) as the pre-estimation sample and estimate our model using the
rest of the data (48 weeks). We also do a robustness check with a 30-week hold out period and find no
change in the qualitative nature of the results. However, since knowledge states in our model are
unobserved states, there is another type of initial condition problem that we have to account for: we do
not know the initial distributions of unobserved states at the first data point we use and need to use them
to estimate the equilibrium decision rule. Hence, we follow an approach that is similar in spirit to
Arcidiacono and Miller (2011). We estimate the initial unobserved state distributions and decision rules
jointly (details are in a Technical Appendix to this paper).
4. Empirical Estimation
4.1 Data
Our data comes from a large, Fortune 1000 information technology services, business process
outsourcing, and consulting firm. Fortune named this firm one of the fastest growing companies in 2009
(Fortune 2009). Its annual revenues in year 2009 were a few billion dollars. It is a US-based firm with
significant presence and operations in several other countries across multiple continents: Europe, Asia,
and the Americas being the major areas. To encourage more knowledge and information sharing across as well
as within locations, the firm has undertaken several measures. Prominent among these measures is the use
of Web 2.0 technologies such as blogs within the enterprise.
These blogs are hosted on an internal platform and are not accessible by people outside the
organization. Every employee is allowed to host his/her own blog on this platform. These blogs are
accessible to all the employees of the firm across the entire hierarchy. There is no explicit reward
structure offered by the firm. The identity of the blogger is also revealed on the blog. Brief descriptions
of the most recent posts by an employee are displayed on her blog homepage in the order in which they
were posted (the most recent post appears on top). This listing is done irrespective of the category or topic
of post. That is, both work- and leisure-related posts appear on the homepage. The titles of these posts act
as hyperlinks and the users can click on these links to get to the full content of the post.
Bloggers classify their posts into one of several categories (for example, software testing, movies,
history, knowledge management, senior management, etc). To be able to measure the knowledge sharing
aspect, the firm tracks which employee reads which blog at what time. The readership of each post is
displayed along with the blog post. This readership count is observable to everyone. Since the blogs are
only internally accessible, the firm does not impose any restrictions on the kind of posts that can be
written by the employees. To analyze the type of content that is being shared in the internal blogosphere,
the firm broadly classifies the blog article categories into two topics: Work-related (w) and Leisure-related (l). Table 2 presents the sub-categories that constitute each topic.
Table 2: Blog Post Classification

| Topics | Sub Categories |
| --- | --- |
| Leisure-Related | Fun; Movies-TV-Music; Sports; Puzzles; Chip-n-putt; Religion-Spiritual-Culture; History-Culture; Photography; Arts; Poetry-Stories; Books; Geographies |
| Work-Related | FLOSS; Technology; Testing; Domains; Corporate Functions; Knowledge Management; Project Management; Business Development; Senior Management; Practices-Programs-Accounts |
Our data consists of blogging activities of 2396 employees over a 15-month (64 weeks) period.
We have data on exact timestamps of blog reading and posting activities. For the purpose of estimation,
we define a period as one week. This provides us data for 64 periods in total. We treat the first 16 weeks
as the hold-out period, and estimate the model using data from the remaining 48 weeks.
High-level descriptive statistics of our data are shown in Table 3. These employees wrote 26075
posts during the 64 week period, indicating that not every employee generates a blog posting in every
period. Out of these, 9967 posts are work-related and 16108 posts are leisure related. There were 26831
readings of work posts and 37265 readings of leisure posts by these employees during the 64 week
period. We classify the organizational status of an employee into “high” and “low” based on his position
in the organizational hierarchy. Because this firm is in the IT industry, anyone at or above the rank of a
Director in the company is classified as a “high status” individual. Of the 2396 employees in our sample,
214 employees are classified as “high status” while the remaining 2182 are classified as “low status”.
The descriptive statistics for the key variables used to construct the model variables are presented
in Table 4. Both reading and writing exhibit a long-tail distribution where a few employees read and write
a lot while the majority participate little. On average, every week, 6% of employees post a work-related blog, 11% post a leisure-related blog, 17% read work-related blogs and 24% read
leisure-related blogs. On average, 40.03 employees read a given work-related post and 107.48 employees
read a given leisure-related post.
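The aggregate counts reported in this section can be cross-checked with simple arithmetic. The sketch below uses only the totals stated in the text; the variable names are ours, and the per employee-week ratios are plain count ratios, so they only approximate the share of employees who post or read in a given week.

```python
EMPLOYEES = 2396
WEEKS = 64
WORK_POSTS, LEISURE_POSTS = 9967, 16108
WORK_READS, LEISURE_READS = 26831, 37265
HIGH_STATUS, LOW_STATUS = 214, 2182

total_posts = WORK_POSTS + LEISURE_POSTS
employee_weeks = EMPLOYEES * WEEKS

# Events per employee-week, by activity.
rates = {
    "work posts": WORK_POSTS / employee_weeks,
    "leisure posts": LEISURE_POSTS / employee_weeks,
    "work reads": WORK_READS / employee_weeks,
    "leisure reads": LEISURE_READS / employee_weeks,
}

print(total_posts, employee_weeks)
for name, rate in rates.items():
    print(f"{name}: {rate:.3f} per employee-week")
```

The post totals add up to the 26075 reported, the two status groups add up to the 2396 employees, and the per employee-week ratios are close to the weekly participation shares quoted above.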
Table 3: Overall Sample Statistics

| Statistic | Value |
| --- | --- |
| Number of Employees | 2396 |
| Period length | 1 week |
| Number of periods | 64 |
| Total posts written | 26075 |
| Work-related posts written | 9967 |
| Leisure-related posts written | 16108 |
| Work-related post reading | 26831 |
| Leisure-related post reading | 37265 |
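Table 4: Descriptive Statistics of Key Variables

| Variable | Mean | Standard Deviation | Minimum | Maximum |
| --- | --- | --- | --- | --- |
| Work reputation ($R^w_{it}$) | 16.28 | 91.34 | 0 | 4043.83 |
| Leisure reputation ($R^l_{it}$) | 86.19 | 305.22 | 0 | 7438.58 |
| Work knowledge state ($K^w_{it}$) | 0.12 | 0.33 | 0 | 1 |
| Leisure knowledge state ($K^l_{it}$) | 0.16 | 0.37 | 0 | 1 |
| Work posting | 0.06 | 0.18 | 0 | 1 |
| Leisure posting | 0.11 | 0.26 | 0 | 1 |
| Work reading = 1 | 0.08 | 0.16 | 0 | 1 |
| Work reading = 2 | 0.07 | 0.13 | 0 | 1 |
| Work reading = 3 | 0.02 | 0.06 | 0 | 1 |
| Leisure reading = 1 | 0.12 | 0.23 | 0 | 1 |
| Leisure reading = 2 | 0.07 | 0.14 | 0 | 1 |
| Leisure reading = 3 | 0.05 | 0.09 | 0 | 1 |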
4.2 Estimation Strategy
The model parameters that need to be estimated are $\Theta = (\gamma_w, \gamma_l, \theta_w, \theta_l, c_{pw}, c_{pl}, c_{qw}, c_{ql})$: the two
reputation coefficients, the two knowledge coefficients, and the four relative time costs of posting and reading. To
estimate the structural model, we follow the two-stage estimation procedure suggested by Bajari et al.
(2007) for this scenario. There are several other estimation procedures suggested by, for example, Pakes
et al. (2007), Pesendorfer et al. (2003), and Aguirregabiria and Mira (2007). However, these methods are
applicable to the case of dynamic games with discrete states. In contrast, the BBL estimator applies to
both the discrete and continuous state cases. We combine the BBL estimator with the approaches of Bajari et al. (2009)
and Arcidiacono and Miller (2011) to estimate the model.
In the first stage, we estimate the policy functions ($\sigma_i: S \times \mathcal{E}_i \to A_i$), the state transition
probabilities ($P: A \times S \to \Delta(S)$) (footnote 11), and the value functions. Our choice variables are discrete. In addition, the
utility function defined above implies additive separability and conditional independence (Rust 1994). We
define the choice-specific value function as
$v_i(a_i, s) = E_{\varepsilon_{-i}}\left[ U_i(a_i, \sigma_{-i}(s, \varepsilon_{-i}), s) + \beta \int V_i(s'; \sigma) \, dP(s' \mid a_i, \sigma_{-i}(s, \varepsilon_{-i}), s) \right]$
With this notation, action $a_i$ is employee i's optimal choice when
$v_i(a_i, s) + \varepsilon_i(a_i) \ge v_i(a_i', s) + \varepsilon_i(a_i'), \quad \forall a_i' \in A_i$.
Hotz and Miller (1993) showed how to recover the choice-specific value functions by inverting
the observed choice probabilities at each state:
$v_i(a_i', s) - v_i(a_i, s) = \ln \Pr(a_i' \mid s) - \ln \Pr(a_i \mid s)$
Given this, we only need to estimate the probability distribution of actions at each state, i.e. the
conditional choice probability (CCP), from the data. Note that we face two challenges: (1) reputation
states are continuous; and (2) knowledge states are unobserved. As discussed earlier, we follow
Arcidiacono and Miller (2011) and employ an HMM to identify the knowledge states, the Markovian
transition probabilities of the knowledge states as a function of actions, and the most probable knowledge
state for each individual at every time point. However, several of the state variables are continuous
(reputation states and the average states of peers). Therefore, $\Pr(a_i \mid s)$ will be a function, rather than
discrete values. Following Bajari et al. (2009), we use the sieve logit method to estimate the CCP, with a
second-degree orthogonal polynomial as basis functions (see Technical Appendix 1).
11
$\sigma_i: S \times \mathcal{E}_i \to A_i$ means that policy $\sigma_i$ is a mapping from current state S and current random shock to action.
$P: A \times S \to \Delta(S)$ means that state transition probability P is a mapping from current state S and current action A to the
change between next state and current state.
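The Hotz and Miller inversion used in the first stage can be checked numerically. The sketch below assumes type 1 extreme value shocks, so choice probabilities take the logit form; the function names and the choice-specific values are hypothetical.

```python
import math


def logit_ccp(values):
    """Choice probabilities implied by type 1 extreme value shocks (logit form)."""
    m = max(values)  # subtract the max for numerical stability
    weights = [math.exp(v - m) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def value_differences_from_ccp(probs, base=0):
    """Hotz-Miller inversion: v(a) - v(base) = ln Pr(a) - ln Pr(base)."""
    return [math.log(p) - math.log(probs[base]) for p in probs]


# Hypothetical choice-specific values for four actions at one state.
true_values = [0.0, 0.4, -1.2, 2.0]
probs = logit_ccp(true_values)
recovered = value_differences_from_ccp(probs)

print([round(p, 3) for p in probs])
print([round(d, 3) for d in recovered])
```

Only differences in choice-specific values are recovered, which is why one action serves as the base; in estimation the probabilities would come from the fitted CCP (here, the sieve logit) rather than from known values.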
The estimation strategy suggested by BBL is inspired by Hotz and Miller (1993). The estimation
process uses forward simulation to estimate an employee's value functions for a given strategy profile
(including the equilibrium profile) given an estimate of state transition probabilities. Let $V_i(s; \sigma; \Theta)$
denote the value function of employee i at state s, assuming i follows Markov strategy $\sigma_i$ and all his/her
peers follow $\sigma_{-i}$. Then
$V_i(s; \sigma; \Theta) = E\left[ \sum_{t=0}^{\infty} \beta^t U_i(\sigma(s_t, \varepsilon_t), s_t, \varepsilon_{it}; \Theta) \,\middle|\, s_0 = s; \Theta \right]$.
If the policy profile used in this step is the policy profile estimated in the first stage, then the
resultant value averaged over a number of simulated paths is an estimate of the payoff $\hat{V}_i(s; \hat{\sigma}_i, \hat{\sigma}_{-i}; \Theta)$ from
playing $\hat{\sigma}_i$ in response to all peers playing $\hat{\sigma}_{-i}$.
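Forward simulation can be sketched for a deliberately simplified, single-agent version of the model with one reputation state and one posting decision. The discount and depreciation factors follow the values stated in the paper; the readership rule, the policies, the truncation horizon and all names are assumptions made only for illustration.

```python
import math
import random

BETA = 0.95   # discount factor, as set in the paper
DELTA = 0.95  # reputation depreciation factor, as set in the paper


def simulate_discounted_terms(policy, r0, n_paths=200, horizon=300, seed=0):
    """Average discounted sums of log(R + 1) and of the posting indicator."""
    rng = random.Random(seed)
    sum_log_rep = 0.0
    sum_posts = 0.0
    for _ in range(n_paths):
        rep = r0
        for t in range(horizon):
            post = 1 if rng.random() < policy(rep) else 0
            # Toy readership rule (assumption): each post attracts 10 readers.
            rep = DELTA * rep + 10.0 * post
            weight = BETA ** t
            sum_log_rep += weight * math.log(rep + 1.0)
            sum_posts += weight * post
    return sum_log_rep / n_paths, sum_posts / n_paths


def value(terms, gamma, cost):
    """Utility is linear in parameters, so the value is linear in the terms."""
    log_rep_term, post_term = terms
    return gamma * log_rep_term - cost * post_term


always_post_terms = simulate_discounted_terms(lambda rep: 1.0, r0=0.0)
never_post_terms = simulate_discounted_terms(lambda rep: 0.0, r0=0.0)
print(value(always_post_terms, gamma=1.0, cost=2.0))
print(value(never_post_terms, gamma=1.0, cost=2.0))
```

Because a utility of the form in Equation (5) is linear in its parameters, the simulated discounted terms can be computed once per policy and reused for every candidate parameter vector, which keeps the second stage cheap.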
In the second stage, we use the estimates from the first stage combined with the equilibrium
conditions of the model to estimate the underlying structural parameters.
Let $z \in Z$ index the equilibrium conditions, so that each z denotes a particular $(i, s, \sigma_i')$ combination. Let us further define:
$g(z; \Theta, \aleph) = V_i(s; \sigma_i, \sigma_{-i}; \Theta; \aleph) - V_i(s; \sigma_i', \sigma_{-i}; \Theta; \aleph)$ (8)
Here, $\aleph$ reflects that $\sigma$ and the state transitions are parameterized by $\aleph$. The equilibrium condition is
satisfied at $(\Theta, z)$ if $g(z; \Theta, \aleph) \ge 0$. This is estimated through a simulated minimum distance estimator. One
needs to find a vector $\Theta$ that minimizes the objective function:
$Q(\Theta, \aleph) = \int \left( \min\{ g(z; \Theta, \aleph), 0 \} \right)^2 dH(z)$
where $H$ is the distribution of the alternative policies. The function $\min\{g(z; \Theta, \aleph), 0\}$ can be
interpreted as a penalty when the alternative policy can generate higher value. The estimate of $\Theta$ should
minimize the penalty score. In our estimation, we choose 1000 alternative policies from $H$. In addition,
for each of the alternative policies, we run 1500 iterations of simulations. Please refer to a Technical
Appendix to this paper for full details of the estimation.
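The penalty objective can be sketched in a few lines once simulated value terms are available. The sketch assumes, as in the BBL objective, that violations are squared before averaging, and that the value is linear in the parameters so that each value is a dot product of parameters with simulated terms; the numbers and names are hypothetical.

```python
def penalty_objective(theta, eq_terms, alt_terms):
    """Mean of min(g, 0)^2, where g = V(equilibrium) - V(alternative)."""
    total = 0.0
    for w_eq, w_alt in zip(eq_terms, alt_terms):
        g = sum(t * (a - b) for t, a, b in zip(theta, w_eq, w_alt))
        total += min(g, 0.0) ** 2
    return total / len(eq_terms)


# Hypothetical simulated terms for two (state, alternative policy) draws.
# Each tuple is (discounted log reputation, minus discounted posting count).
eq_terms = [(12.0, -4.0), (9.0, -3.0)]
alt_terms = [(14.0, -8.0), (6.0, -1.0)]

for theta in [(1.0, 0.2), (1.0, 1.0), (1.0, 2.0)]:
    print(theta, penalty_objective(theta, eq_terms, alt_terms))
```

In this toy example a posting cost that is too low makes the heavy-posting alternative look better than the observed policy in the first draw, and a cost that is too high does the same for the light-posting alternative in the second draw; only the intermediate value leaves no profitable deviation and therefore a zero penalty.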
4.3 Equilibrium Behavior12 (First Stage Results)
As discussed earlier, organizations adopt Enterprise 2.0 systems to facilitate knowledge sharing
by employees. Firms typically want to encourage work posting and discourage leisure posting. Hence, it
becomes interesting to investigate how the decisions to engage in work or leisure posting are affected by
the reputation and knowledge of an employee in both work and leisure.
Spillover from Leisure to Work and from Reading to Posting
The key to understanding the relationship between work and leisure posting is to first investigate
how work and leisure reputations affect the readership of an employee's new post. In Figure 1 (Figure 2),
we plot how the readership one can gain from a new work (leisure) post varies with his/her reputation
states. In constructing these figures, we assume all other state variables are at their mean values. The
figures highlight several interesting insights, which we discuss briefly here.
First, the readership for a new post that an employee may receive increases with his/her
reputation. On average, for low status individuals the log readership that a new work post may receive
increases from 2.7 to 4.6 (70%) as work reputation increases from 0 to 10 and the leisure reputation is at
its minimum value of 0. In the same way, for low status individuals on average, the log readership that a
new leisure post may receive increases from 3.5 to 4.6 (31%) as leisure reputation increases from 0 to 10
and the work reputation is at its minimum value of 0. This finding highlights the presence of a strong
"rich get richer" effect. Once an employee builds up reputation, his/her new posts will attract more
readers.
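The percentages quoted here are changes in the log of readership, not in readership itself. The short calculation below reproduces them and, as an illustration, converts the same moves into multiples of readership in levels, assuming natural logs and ignoring any offset used in the log transformation; the function name is ours.

```python
import math


def pct_change(start, end):
    """Percentage change from start to end."""
    return 100.0 * (end - start) / start


# Changes in log readership reported for low status individuals.
work_log_change = pct_change(2.7, 4.6)
leisure_log_change = pct_change(3.5, 4.6)

# The same moves expressed as multiples of readership in levels.
work_multiple = math.exp(4.6 - 2.7)
leisure_multiple = math.exp(4.6 - 3.5)

print(round(work_log_change), round(leisure_log_change))
print(round(work_multiple, 1), round(leisure_multiple, 1))
```

Read in levels, the effect is larger than the percentages of the log values suggest: a rise of 1.9 in log readership corresponds to several times as many readers.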
Interestingly, we also observe that readership of new work-related posting by a given employee
increases with that employee’s leisure reputation. Leisure reputation could influence work readership in
two countervailing ways. First, a blogger with a high leisure reputation can attract people who are
interested in leisure-related posts. In the process of doing so, they can get exposed to the work-related
posts by this same blogger. This leads to a positive spillover effect from leisure-based reputation to work
readership. Second, there could be a negative signaling effect. A high leisure reputation could be harmful,
because this might signal lack of commitment to work-related blogging and to work-related activities in
general. Thus, potential readers could have a lower expectation of the quality of their work-related blogs.
We also find that high status individuals receive higher readership for their work-related posts compared
to low status individuals. However, for leisure-related posts, we do not find any effect of orgstatus of
blogger on readership.
12
Note that in the BBL estimation a key assumption is that all agents play equilibrium strategies. Hence, the data
that we observe is generated from the equilibrium. Following BBL, one does not need to solve for an equilibrium
policy function and it is obtained directly by performing reduced form regression on the observed data. This is the
main difference between BBL and other dynamic game structural model estimations such as Aguirregabiria and
Mira (2007). Also note that the policy function is time invariant. The change in blogging behavior is captured
through change in agent states.
From Figure 1, we can see that when an individual's work reputation is low, the positive spillover
effect is the dominant effect, and hence leisure reputation is positively related to readership
of a new work-related post. As an employee's work reputation goes up, the negative signaling effect
dominates the positive spillover effect, and hence in this region, leisure reputation negatively affects
readership of a new work-related post. On average, as the leisure reputation of a blogger increases from 0
to 10, the log readership that a new work post receives increases from 2.7 to 3.7 (37%) when work
reputation is 0. In contrast, on average, as the leisure reputation of the blogger increases from 0 to 10, the
log readership that a new work post receives decreases from 3.3 to 2.6 (21%) when work reputation is 10.
In comparison, if we look at the relation between new leisure readership and reputation states (Figure 2),
work reputation seems to have no impact on readership of a new leisure-related post. It seems that the two
countervailing effects, positive spillover and negative signaling, cancel each other out. In other words,
reputation on work-related posts has no effect on a blogger's ability to attract readers to his/her
leisure-related posts.
[Figure 1: Readership for New Work Post. Surfaces: Low Work Knowledge and High Work Knowledge. Axes: Log (Employee Work Reputation), Log (Employee Leisure Reputation), Mean Log (Readership of New Work Post).]
[Figure 2: Readership for New Leisure Posts. Surfaces: Low Leisure Knowledge and High Leisure Knowledge. Axes: Log (Employee Work Reputation), Log (Employee Leisure Reputation), Mean Log (Readership of New Leisure Post).]
Comparing the two planes in Figure 1, we can see that work-related posts written by bloggers
with a higher level of work knowledge attract more readers than those written by bloggers with a lower
level of work knowledge. This relationship also holds for leisure-related posts and the leisure knowledge
of their authors as shown in Figure 2. These two relationships reveal the interdependence between posting
and reading.
Decisions on Work Posting and Leisure Posting
We find that given the same level of leisure reputation, employees are more likely to post work-related blogs when their work reputation is high than otherwise. This is because individuals expect higher
readership when they have higher levels of reputation. The same reasoning also applies to the probability of
leisure posting. Further, leisure posting always increases with leisure reputation, thereby confirming the
rich get richer effect. However, leisure posting increases when work reputation is low but decreases when
work reputation is high. This shows the interdependence among work and leisure, which was also
highlighted by the readership spillover (please refer to Figure A1 and Figure A2 in the General Appendix
1). Not surprisingly, we also find that high-status employees post more work-related and less leisure-related content than low-status individuals, given the same reputation level.
Knowledge States and Reading Decisions
Employees also face a trade-off between work and leisure when choosing which blogs to read.
We find that the model with two knowledge levels for each type (leisure and work) fits our data best. We
define the two levels as “high knowledge” and “low knowledge”. The choice-specific state transition
probabilities of the two types of knowledge states are given in Tables A1-A4 in the General Appendix 1.
Overall, the transition probabilities show that knowledge states are very sticky, suggesting that it is hard
to move from one state to another, which is consistent with the finding in Singh et al. (2011). We find that
individuals learn by both reading and posting, and that individuals with high knowledge and
reputation have a higher probability of reading (please refer to Figures A3 and A4 in the General Appendix
1 for a graphical illustration). We notice that there is little interdependence between work knowledge and
leisure knowledge: the probability of work reading is affected only by employees’ work knowledge level,
not by their leisure knowledge level, and vice versa.
Competitive Dynamics
The readership of a new work post decreases with the average work reputation of peers, and the
readership of a new leisure post decreases with the average leisure reputation of peers (please refer to
Figure and Figure in the General Appendix 1). When peers' work readership is high, the proportion of
readers that an employee can attract from his/her new work-related post is smaller. This provides
evidence of the readership competition among employees.
4.4 Second stage results
The results for the second stage (structural parameters) are presented in Table 5. For estimation
purposes, the discount factor is set at 0.95 (footnote 13). The results show that the coefficients on work
reputation and leisure reputation are positive and significant. This indicates that employees gain positive
utility from work-related reputation and leisure-related reputation, and verifies that reputation gain
provides employees with incentives to write blogs. The coefficient on work reputation (4.277) is more than
4 times that on leisure reputation (0.972), which means that the utility increase resulting from a one
percent increase in work reputation is over four times that resulting from a one percent increase in
leisure reputation. More importantly, because log is concave and average leisure reputation is much higher
than average work reputation (see Table 6), the marginal utilities favor work over leisure even more
strongly. This tells us that in the enterprise blog environment, people appreciate work-related reputation
more than leisure-related reputation.
13 The qualitative nature of the results is robust to several other values of the discount factor.
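The comparison of marginal utilities can be made concrete with a small calculation. The sketch below assumes that reputation enters utility as a coefficient times log reputation, so that the marginal utility of reputation is the coefficient divided by the reputation level, and it evaluates this at the real-data average reputations reported in Table 6. The function and variable names are illustrative and do not come from the paper.

```python
# Illustrative check of marginal utilities under log utility.
# Assumption: utility from reputation is coef * log(reputation),
# so d(utility)/d(reputation) = coef / reputation.

WORK_COEF, LEISURE_COEF = 4.277, 0.972   # Table 5 estimates
WORK_REP, LEISURE_REP = 16.28, 86.19     # Table 6 real-data averages


def marginal_utility(coef: float, reputation: float) -> float:
    """Derivative of coef * log(reputation) with respect to reputation."""
    return coef / reputation


# Ratio of the two coefficients (utility gain per one percent change).
coef_ratio = WORK_COEF / LEISURE_COEF

# Ratio of marginal utilities per unit of reputation at the sample averages.
marginal_ratio = (
    marginal_utility(WORK_COEF, WORK_REP)
    / marginal_utility(LEISURE_COEF, LEISURE_REP)
)

print(f"coefficient ratio: {coef_ratio:.2f}")
print(f"marginal utility ratio at sample averages: {marginal_ratio:.2f}")
```

Under these assumptions the gap in marginal utilities is several times larger than the gap in coefficients, because the average leisure reputation is much higher than the average work reputation.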
From the HMM, we know that both work knowledge and leisure knowledge have two state
levels. The output of the HMM also implies that employees have a higher probability of staying at the high
level at the end of a period if they read in the current period, and a lower probability of staying at the
high level if they do not read. The knowledge states thus account for both the past knowledge base and the
additional knowledge acquired in the current period. Individuals derive knowledge-based utility from reading
others’ posts, as indicated by the positive and significant coefficients on work knowledge and leisure
knowledge. Again, individuals value work-related knowledge much more than leisure-related knowledge
(0.619 vs. 0.057).
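Table 5: Structural Parameter Results (***p<0.01)
Parameter                        Estimate     Standard error
Work reputation (log)            4.277***     0.719
Leisure reputation (log)         0.972***     0.342
Work knowledge                   0.619***     0.096
Leisure knowledge                0.057***     0.007
Cost of work posting             1.419***     0.252
Cost of leisure posting          0.745***     0.195
Cost of work reading             1.206***     0.441
Cost of leisure reading          0.264***     0.069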
The four cost parameters (work posting, leisure posting, work reading, and leisure reading) are
positive and significant. Note that in our utility function, we have a negative sign in front of these four
parameters. Hence, the parameters are expected to be positive. Also note that they are normalized costs,
which are relative to the cost of consuming outside goods. Hence, the costs can be compared with each
other. Our analyses show that reading or writing a work post is more costly than reading or writing a
leisure post. One possible explanation is that individuals who read or write work posts spend a lot of
time digesting or creating the knowledge involved. On comparison, one can see that the utility of work
knowledge (0.619) is smaller than the cost of work reading (1.206), and that the utility of leisure
knowledge (0.057) is smaller than the cost of leisure reading (0.264). This provides further evidence that
employees are forward looking, as the cost of reading is much higher than the utility one can get from
staying at the high knowledge state in the current period. If employees were myopic, they would not read at all.
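The myopic versus forward-looking argument can be illustrated with a deliberately simplified calculation. The sketch below is not the paper's dynamic model: it assumes a hypothetical probability of remaining in the high knowledge state each period (STAY_PROB, chosen only for illustration), treats the benefit as lost once the state is left, and takes the payoff from not reading as zero. Only the discount factor and the two Table 5 estimates come from the paper.

```python
# Hypothetical illustration of myopic versus forward-looking reading decisions.

BETA = 0.95              # discount factor used in estimation
THETA_WORK = 0.619       # per-period utility of high work knowledge (Table 5)
COST_WORK_READ = 1.206   # cost of reading one work post (Table 5)
STAY_PROB = 0.9          # ASSUMED persistence of the high state, not an estimate


def myopic_net_payoff(theta: float, cost: float) -> float:
    """Current-period utility of the high state minus the cost of reading."""
    return theta - cost


def forward_looking_net_payoff(theta: float, cost: float,
                               beta: float, stay_prob: float) -> float:
    """Discounted stream of utility while the high state persists, minus cost.

    Uses the geometric sum: sum over t >= 0 of (beta * stay_prob)**t * theta.
    """
    return theta / (1.0 - beta * stay_prob) - cost


myopic = myopic_net_payoff(THETA_WORK, COST_WORK_READ)
forward = forward_looking_net_payoff(THETA_WORK, COST_WORK_READ, BETA, STAY_PROB)

print(f"myopic net payoff: {myopic:.3f}")
print(f"forward-looking net payoff: {forward:.3f}")
```

With these illustrative numbers the myopic payoff is negative while the discounted payoff is positive, which is the pattern the text relies on when it infers forward-looking behavior from the estimates.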
Following Xu et al. (2008), to evaluate the overall fit of the estimated model, we use the estimated
parameters to simulate the blogging dynamics and then compare the simulated data moments with real
data moments. The results are reported in Table 6. From the comparison, we can see that simulated data
moments are all reasonably consistent with the real data moments, indicating that our model does a good
job in fitting the observed data.
Table 6: Model Fit
Key Statistics                     Real Data Moments    Simulated Data Moments
Average Work Reputation            16.28                16.74
Average Leisure Reputation         86.19                87.81
Average Work Knowledge             0.12                 0.12
Average Leisure Knowledge          0.16                 0.15
Probability of Work Posting        0.06                 0.07
Probability of Leisure Posting     0.11                 0.11
Probability of Work Reading        0.17                 0.18
Probability of Leisure Reading     0.24                 0.23
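One simple way to summarize the fit in Table 6 is the relative gap between each simulated and real moment. The sketch below computes that gap from the reported values; the choice of relative gap as the summary measure is an assumption made here for illustration, not a statistic reported in the paper.

```python
# Relative gap between simulated and real data moments from Table 6.

MOMENTS = {
    # name: (real, simulated)
    "Average Work Reputation": (16.28, 16.74),
    "Average Leisure Reputation": (86.19, 87.81),
    "Average Work Knowledge": (0.12, 0.12),
    "Average Leisure Knowledge": (0.16, 0.15),
    "Probability of Work Posting": (0.06, 0.07),
    "Probability of Leisure Posting": (0.11, 0.11),
    "Probability of Work Reading": (0.17, 0.18),
    "Probability of Leisure Reading": (0.24, 0.23),
}


def relative_gap(real: float, simulated: float) -> float:
    """Signed gap of the simulated moment relative to the real moment."""
    return (simulated - real) / real


gaps = {name: relative_gap(real, sim) for name, (real, sim) in MOMENTS.items()}
worst = max(gaps, key=lambda name: abs(gaps[name]))

for name, gap in gaps.items():
    print(f"{name:32s} {gap:+.1%}")
print(f"largest relative gap: {worst}")
```

Because the probabilities are reported to two decimals, a difference of 0.01 on a small base produces a large relative gap, so the relative measure should be read alongside the absolute differences.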
4.5 Robustness Checks
We performed several tests to check the robustness of our results. First, we allow employees with
different status to have different sets of parameters, i.e. they may value reputation and knowledge
differently and incur different relative time cost when they participate in blogging activities. To do this,
we add an indicator H for high-status individuals, and interact H with every variable in Equation (5). The
estimation results are displayed in Table 7.
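\[
\begin{aligned}
U ={}& (\gamma_w + \gamma_w^H H)\,\log R^w + (\gamma_l + \gamma_l^H H)\,\log R^l
      + (\theta_w + \theta_w^H H)\,K^w + (\theta_l + \theta_l^H H)\,K^l \\
     & - (c_{wp} + c_{wp}^H H)\,a^{wp} - (c_{lp} + c_{lp}^H H)\,a^{lp}
       - (c_{wr} + c_{wr}^H H)\,a^{wr} - (c_{lr} + c_{lr}^H H)\,a^{lr} + \varepsilon
\end{aligned}
\]
where H is the high-status indicator; R^w and R^l are work and leisure reputation; K^w and K^l are the work
and leisure knowledge states; a^{wp}, a^{lp}, a^{wr}, and a^{lr} are the work posting, leisure posting, work
reading, and leisure reading decisions; and a superscript H marks the coefficient on the interaction with H.
Table 7: Status Specific Structural Parameter Results (***p<0.01)
Parameter                             Estimate      Standard error
Work reputation (log)                 4.039***      0.542
Leisure reputation (log)              1.072***      0.276
Work knowledge                        0.618***      0.089
Leisure knowledge                     0.063***      0.007
Cost of work posting                  1.295***      0.231
Cost of leisure posting               0.632***      0.154
Cost of work reading                  1.134***      0.359
Cost of leisure reading               0.241***      0.054
Interactions with H (high status)
Work reputation (log) x H             1.093***      0.345
Leisure reputation (log) x H          -0.653***     0.219
Work knowledge x H                    0.004         0.017
Leisure knowledge x H                 -0.024**      0.012
Cost of work posting x H              0.826***      0.261
Cost of leisure posting x H           0.759***      0.219
Cost of work reading x H              0.048         0.031
Cost of leisure reading x H           0.158         0.093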
From Table 7, we can see that compared to low status employees, high status employees value
work reputation more, and leisure reputation less. Leisure knowledge also provides high status employees
with lower utility compared to work knowledge. These results suggest that high status employees
derive lower utility from leisure related activities than low status employees do. In addition, in
comparison to low status employees, high status employees incur a higher cost of posting blogs, either
work-related or leisure-related. However, reading costs do not differ significantly across high and low
status employees for either kind of blog post.
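The status comparison can be checked directly from Table 7 by adding each interaction estimate to its base estimate and by forming the ratio of each interaction estimate to its standard error. The sketch below does this; the 1.96 threshold is the usual two-sided 5% normal critical value and is an assumption about how significance is judged, and the dictionary keys are labels chosen here for readability.

```python
# Implied high-status parameters and rough significance of the status
# interactions, using the estimates reported in Table 7.

BASE = {
    "work_reputation": 4.039, "leisure_reputation": 1.072,
    "work_knowledge": 0.618, "leisure_knowledge": 0.063,
    "cost_work_posting": 1.295, "cost_leisure_posting": 0.632,
    "cost_work_reading": 1.134, "cost_leisure_reading": 0.241,
}

# name: (interaction estimate, standard error)
INTERACTION = {
    "work_reputation": (1.093, 0.345), "leisure_reputation": (-0.653, 0.219),
    "work_knowledge": (0.004, 0.017), "leisure_knowledge": (-0.024, 0.012),
    "cost_work_posting": (0.826, 0.261), "cost_leisure_posting": (0.759, 0.219),
    "cost_work_reading": (0.048, 0.031), "cost_leisure_reading": (0.158, 0.093),
}

CRITICAL_VALUE = 1.96  # assumed two-sided 5% normal critical value

# Parameter for high-status employees = base + interaction.
high_status = {name: BASE[name] + INTERACTION[name][0] for name in BASE}

# Ratio of each interaction estimate to its standard error.
t_ratio = {name: est / se for name, (est, se) in INTERACTION.items()}
differs = {name: abs(t) > CRITICAL_VALUE for name, t in t_ratio.items()}

for name in BASE:
    print(f"{name:22s} low={BASE[name]:.3f} high={high_status[name]:.3f} "
          f"t={t_ratio[name]:+.2f}")
```

On this rough criterion the posting cost interactions are clearly different from zero, while the two reading cost interactions are not, which matches the reading of Table 7 given in the text.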
Besides organizational status, unobserved cross-sectional heterogeneity among employees could
be a potential driver of the differences in blogging activity. To address this concern, we allow for
cross-sectional heterogeneity among employees following the approach by Arcidiacono and Miller (2011) and
estimate the model. Our results are robust to the inclusion of unobserved individual cross-sectional
heterogeneity.
Second, we try alternative constructions of reputation to see whether the results are robust
to the construction used. The results are given in Table 8. The three alternatives are: (1) reputation
depreciates at a faster rate, with a depreciation factor of 0.8; (2) reputation is measured as average
log-readership per post with a depreciation factor of 0.9; (3) reputation is measured as average
log-readership per post with a depreciation factor of 0.8. Under the second and third options, an
individual has to get a high readership for all her posts to achieve a high reputation. In comparison,
under the original and first options, an individual can achieve a high reputation by posting a lot but
getting only moderate readership per post. The most significant difference in results across alternatives
is in the estimates of the first two parameters. When reputation is measured by average readership, the
estimates of the work and leisure reputation coefficients are much bigger, due to the smaller value of
log reputation. For similar reasons, when the depreciation factor is smaller, the estimates of these two
coefficients are slightly larger. However, the nature of the results remains intact under the different
constructions of reputation.
Table 8: Structural Parameter Results under Different Construction of Reputation (***p<0.01)
Parameter                   Main Model         Alternative (1)    Alternative (2)    Alternative (3)
Work reputation (log)       4.277***(0.719)    5.419***(0.667)    6.949***(1.319)    7.048***(1.414)
Leisure reputation (log)    0.972***(0.342)    1.217***(0.351)    2.613***(0.711)    2.782***(0.816)
Work knowledge              0.619***(0.096)    0.617***(0.095)    0.663***(0.091)    0.671***(0.089)
Leisure knowledge           0.057***(0.007)    0.058***(0.009)    0.056***(0.008)    0.056***(0.009)
Cost of work posting        1.419***(0.252)    1.398***(0.261)    1.691***(0.246)    1.695***(0.243)
Cost of leisure posting     0.745***(0.195)    0.771***(0.201)    0.963***(0.132)    0.954***(0.137)
Cost of work reading        1.206***(0.441)    1.211***(0.445)    0.941***(0.219)    0.932***(0.231)
Cost of leisure reading     0.264***(0.069)    0.259***(0.065)    0.184***(0.045)    0.203***(0.044)
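The difference between the cumulative and the per-post constructions of reputation can be sketched as follows. The exact formulas are not restated in this section, so both functions below are assumptions made for illustration: a depreciated running sum of readership for the original and first alternative, and a depreciation-weighted average of log(1 + readership) per post for the second and third alternatives. The two example bloggers are hypothetical.

```python
import math

# Two ASSUMED constructions of reputation; each period is a list holding the
# readership of every post written in that period.


def cumulative_reputation(readership_by_period, depreciation):
    """Depreciated running sum of readership across periods."""
    reputation = 0.0
    for posts in readership_by_period:
        reputation = depreciation * reputation + sum(posts)
    return reputation


def average_log_reputation(readership_by_period, depreciation):
    """Depreciation-weighted average of log(1 + readership) per post."""
    weighted_sum = 0.0
    weighted_count = 0.0
    for posts in readership_by_period:
        weighted_sum = depreciation * weighted_sum + sum(math.log1p(r) for r in posts)
        weighted_count = depreciation * weighted_count + len(posts)
    return weighted_sum / weighted_count if weighted_count > 0 else 0.0


# Hypothetical bloggers observed for ten periods.
prolific = [[20, 20, 20, 20]] * 10   # many posts, moderate readership each
selective = [[60]] * 10              # one post per period, high readership

for factor in (0.9, 0.8):
    print(f"depreciation factor {factor}:")
    print("  cumulative  prolific/selective:",
          round(cumulative_reputation(prolific, factor), 1),
          round(cumulative_reputation(selective, factor), 1))
    print("  average-log prolific/selective:",
          round(average_log_reputation(prolific, factor), 3),
          round(average_log_reputation(selective, factor), 3))
```

Under these assumed formulas the prolific blogger ranks higher on the cumulative measure and the selective blogger ranks higher on the per-post measure, and a smaller depreciation factor lowers the cumulative measure, which is the mechanism the text gives for the larger reputation coefficients under the alternatives.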
Table 9: Structural Parameter Results under Different Robustness Checks (***p<0.01)
Parameter                   Main Model         Check (1)          Check (2)          Check (3)
Work reputation (log)       4.277***(0.719)    4.272***(0.711)    4.262***(0.608)    4.272***(0.738)
Leisure reputation (log)    0.972***(0.342)    0.964***(0.348)    1.011***(0.325)    0.928***(0.316)
Work knowledge              0.619***(0.096)    0.625***(0.103)    0.659***(0.086)    0.674***(0.121)
Leisure knowledge           0.057***(0.007)    0.067***(0.010)    0.058***(0.009)    0.061***(0.009)
Cost of work posting        1.419***(0.252)    1.404***(0.267)    1.496***(0.241)    1.483***(0.255)
Cost of leisure posting     0.745***(0.195)    0.728***(0.184)    0.825***(0.134)    0.801***(0.231)
Cost of work reading        1.206***(0.441)    1.213***(0.429)    0.972***(0.239)    1.168***(0.370)
Cost of leisure reading     0.264***(0.069)    0.269***(0.064)    0.216***(0.054)    0.257***(0.059)
Third, we conduct a series of additional robustness checks, which show that the (second stage)
structural parameter estimates remain essentially the same under different scenarios. As the first
robustness check (column 2 of Table 9), we allow individual decision making to be a function of peer
knowledge states. Note that in this robustness check, peer knowledge states are observable to employees
but unobserved to us. This changes step one of our estimation, where we estimate the CCP. We follow the
Expectation Maximization (EM) algorithm proposed by Arcidiacono and Miller (2011) to estimate such scenarios.
This method allows for an individual's decisions to be a function of own and peers’ observed and
unobserved states. The results are statistically similar to the main model results, indicating that peer
knowledge states do not significantly affect the decision making of individuals. This provides further
support for our assumption that peer knowledge states should not enter the decision making of individuals.
In addition, we conducted another robustness check (column 3 of Table 9) to ensure that the system
has reached a stationary equilibrium. We checked the CCP estimates with a 30-week hold-out period
instead of the 16-week hold-out period used in the main results. The CCP estimates in this case are not
very different from those in the main results. This shows that the CCP is not affected by the
length of the hold-out period.
A third robustness check (column 4 of Table 9) involved examining a scenario where an employee
may choose to track a subset of other employees who the employee may consider part of the relevant
group. We have classified individuals based on their status in the organization. We calculated first
moments of states of both high and low status individuals. In the CCP where we estimate how individuals
make decisions based on their and peers' states, we allow an individual to consider high status individuals'
states as well as low status individuals' states instead of just the population states. We find that
individuals care about the states of both their own status group and the other status group, because
individuals are competing with everyone for readership of their posts.
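A rough way to see that the Table 9 columns are statistically similar is to compare each robustness-check estimate with the main-model estimate using a z-ratio built from the two reported standard errors. The sketch below treats the two estimates as independent, which is an assumption made only for this illustration (they come from the same data), so the ratios are indicative rather than formal tests.

```python
import math

# (estimate, standard error) pairs in the row order of Table 9.
MAIN = [(4.277, 0.719), (0.972, 0.342), (0.619, 0.096), (0.057, 0.007),
        (1.419, 0.252), (0.745, 0.195), (1.206, 0.441), (0.264, 0.069)]

CHECKS = {
    "Check (1)": [(4.272, 0.711), (0.964, 0.348), (0.625, 0.103), (0.067, 0.010),
                  (1.404, 0.267), (0.728, 0.184), (1.213, 0.429), (0.269, 0.064)],
    "Check (2)": [(4.262, 0.608), (1.011, 0.325), (0.659, 0.086), (0.058, 0.009),
                  (1.496, 0.241), (0.825, 0.134), (0.972, 0.239), (0.216, 0.054)],
    "Check (3)": [(4.272, 0.738), (0.928, 0.316), (0.674, 0.121), (0.061, 0.009),
                  (1.483, 0.255), (0.801, 0.231), (1.168, 0.370), (0.257, 0.059)],
}


def z_ratio(first, second):
    """Difference of two estimates over the root sum of squared standard errors."""
    (est_a, se_a), (est_b, se_b) = first, second
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


max_abs_z = {
    name: max(abs(z_ratio(m, c)) for m, c in zip(MAIN, column))
    for name, column in CHECKS.items()
}

for name, value in max_abs_z.items():
    print(f"{name}: largest |z| across parameters = {value:.2f}")
```

Every ratio falls well below the conventional 1.96 threshold under this simplification, consistent with the statement that the estimates do not change materially across the checks.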
Finally, we conducted a series of additional robustness checks to examine the external validity of
the reputation and knowledge measures. These details are in the General Appendix 2.
5. Policy Experiments
Enterprise blogs are adopted primarily to promote work-relevant knowledge sharing. Hence,
firms typically encourage employees to engage in work-related blogging but discourage leisure-related
blogging. Firms can also implement some policies to incentivize employees’ blogging activities. In this
section, we explore the effects of two such policy interventions: 1) How do employees respond to the
policy where leisure-related blogging is prohibited? 2) How do employees adjust their blogging behavior
when the firm provides incentives for blogging?
In the policy experiment, we introduce a new policy by manipulating the parameter values in the
utility function, and then solve for the new equilibrium under the new policy. Since the reputation states
in our model are continuous, following the common practice in such scenarios (e.g. Snider 2009), we
discretize the state space. We split an individual’s reputation states (both work and leisure) into 50
equi-spaced intervals, and the average reputation states (both work and leisure) into 20 equi-spaced
intervals. We then use an algorithm similar to that described in Ericson and Pakes (1995) to solve for
the new equilibria.
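The discretization step can be sketched as follows. The interval counts (50 for own reputation, 20 for average peer reputation) come from the text; the bounds of the grid are an assumption made here for illustration (log reputation between 0 and 10), as are the function names.

```python
# Equi-spaced discretization of a continuous reputation state.

OWN_INTERVALS = 50    # intervals for an individual's reputation (from the text)
PEER_INTERVALS = 20   # intervals for average peer reputation (from the text)
LOW, HIGH = 0.0, 10.0 # ASSUMED bounds on log reputation, for illustration only


def interval_edges(low: float, high: float, n_intervals: int) -> list:
    """Return the n_intervals + 1 edges of an equi-spaced grid."""
    width = (high - low) / n_intervals
    return [low + i * width for i in range(n_intervals + 1)]


def interval_index(value: float, low: float, high: float, n_intervals: int) -> int:
    """Map a continuous value to the index of its interval, clamping at the ends."""
    if value <= low:
        return 0
    if value >= high:
        return n_intervals - 1
    return int((value - low) / (high - low) * n_intervals)


own_edges = interval_edges(LOW, HIGH, OWN_INTERVALS)
peer_edges = interval_edges(LOW, HIGH, PEER_INTERVALS)

# Number of reputation-state combinations: own work, own leisure,
# average work, and average leisure reputation.
n_reputation_states = OWN_INTERVALS ** 2 * PEER_INTERVALS ** 2

print("own-reputation interval of 5.0:", interval_index(5.0, LOW, HIGH, OWN_INTERVALS))
print("number of reputation-state combinations:", n_reputation_states)
```

The count of combinations shows why a coarser grid is used for the average reputation states: the size of the discretized state space grows multiplicatively with the number of intervals in each dimension.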
In the first experiment, we set the cost of leisure-posting to be extremely high. One would think
that this policy will eliminate leisure-blogging activities due to the existence of the budget constraint.
Hence, users will switch from leisure-related activities to work related activities and both work posting
and work reading will increase. However, our simulation results suggest a very different story. The “No
Leisure Policy” column of Table 10 shows probability of work posting and work reading after the policy
is implemented. As expected, the probability of work reading increases since there are more individuals
reading 2 or 3 posts and fewer individuals reading 1 or no post. However, the probability of work posting
decreases from 6.9% to 1.9% once the policy is implemented. This implies that not only do individuals
with low work reputation drop out, but also that individuals who have established a high work reputation
tend to post less.
This “counterintuitive” result is actually consistent with the spillover effect we discussed in the
results section. Recall that, employees with low to moderately high work reputation, who are the majority
of users in our sample, are the most responsive to the spillover from leisure reputation to work reputation.
A moderate to high level of leisure readership greatly increases the probability for this group to post
work-related blogs. When leisure activities are eliminated, those who used to have low to moderately high
work readership but moderately high leisure readership can no longer benefit from the spillover effect.
Hence, their probability of work posting will drop significantly. Although the probability of work reading
increases, which leads to a bigger readership pie to be shared by all the employees who post in each
period, a very high fraction of the increased readership accrues to those users who have built a very high
work reputation (due to the rich get richer effect).
These results have interesting implications. Firms typically want a larger and more diversified
internal knowledge pool. We see that a policy of prohibiting leisure-related blogging can actually hurt the
extent of knowledge creation and sharing within the firm. A dramatic decrease in probability of work
posting means that less new knowledge will be added to the knowledge pool and fewer people will be
willing to post work-related content. Even though people tend to read more work-related blogs under the
“No-Leisure” policy, at the firm level, the quantity and diversity of the knowledge transferred in the
enterprise blog system both decrease. Thus, we see an adverse consequence of prohibiting leisure-related
blogging.
Table 10: Policy Experiment Results
                      Current policy    No Leisure Policy    Medium Incentive    High Incentive
Work Reading
  Read One Post       0.078 (0.002)     0.067 (0.002)        0.361 (0.008)       0.411 (0.005)
  Read Two Posts      0.067 (0.003)     0.087 (0.003)        0.159 (0.003)       0.186 (0.002)
  Read Three Posts    0.025 (0.001)     0.044 (0.002)        0.089 (0.004)       0.149 (0.002)
Leisure Reading
  Read One Post       0.119 (0.002)     -                    0.268 (0.004)       0.342 (0.003)
  Read Two Posts      0.069 (0.002)     -                    0.136 (0.003)       0.206 (0.002)
  Read Three Posts    0.048 (0.003)     -                    0.071 (0.002)       0.088 (0.002)
Work Posting          0.069 (0.002)     0.019 (0.002)        0.112 (0.003)       0.397 (0.004)
Leisure Posting       0.113 (0.003)     -                    0.154 (0.002)       0.328 (0.006)
Reading rows: Mean of Average Probability of Reading (standard deviation of Average Probability of Reading).
We first calculate the average reading probability for each time period. The mean and standard deviation
are for this average probability across time periods.
Posting rows: Mean of Average Probability of Posting (standard deviation of Average Probability of Posting).
We first calculate the average posting probability for each time period. The mean and standard deviation
are for this average probability across time periods.
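The effect of the “No Leisure Policy” can be summarized from the Table 10 means. The sketch below adds up the probabilities of reading one, two, or three work posts to obtain the probability of reading at least one work post, computes the implied expected number of work posts read, and reports the relative change in work posting. The summary measures and names are choices made here for illustration; only the probabilities come from Table 10.

```python
# Summary of the first policy experiment using the means in Table 10.

# Probabilities of reading exactly one, two, or three work posts.
WORK_READING = {
    "current": [0.078, 0.067, 0.025],
    "no_leisure": [0.067, 0.087, 0.044],
}
WORK_POSTING = {"current": 0.069, "no_leisure": 0.019}


def prob_any_reading(probs):
    """Probability of reading at least one post (the outcomes are exclusive)."""
    return sum(probs)


def expected_posts_read(probs):
    """Expected number of posts read, with probs[k - 1] = P(read k posts)."""
    return sum(k * p for k, p in enumerate(probs, start=1))


any_reading = {policy: prob_any_reading(p) for policy, p in WORK_READING.items()}
expected_reads = {policy: expected_posts_read(p) for policy, p in WORK_READING.items()}
posting_change = (
    (WORK_POSTING["no_leisure"] - WORK_POSTING["current"]) / WORK_POSTING["current"]
)

for policy in WORK_READING:
    print(f"{policy:10s} P(read >= 1) = {any_reading[policy]:.3f}, "
          f"E[posts read] = {expected_reads[policy]:.3f}")
print(f"relative change in work posting: {posting_change:.1%}")
```

The calculation shows work reading rising moderately while work posting falls by roughly three quarters, which is the asymmetry behind the conclusion that the prohibition reduces knowledge creation.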
In the second experiment, the firm encourages employees to blog by giving them incentives for
blogging. For example, giving employees the liberty to blog during regular office hours without any
penalties can operationalize this policy. High tech companies like Google and IBM have been pioneers in
these kinds of employee-friendly policies in the hope that this can lead to increased innovation and
productivity.
Providing incentives for blogging is implicitly equivalent to cutting down the total costs of blogging.
Here, we try two levels of incentives or cost reduction, denoted by “Medium-Incentive” and
“High-Incentive”. This is in comparison to “No-Incentive”, which corresponds to the current policy. The
results are again shown in Table 10. Under the “Medium-Incentive” policy, employees only need to pay
half of the total cost incurred by participating in the four blogging activities. Under the “High-Incentive”
policy, employees incur no cost of blogging (footnote 14). The latter policy may not be very realistic due
to its extreme nature. However, by examining this extreme case, and combining it with the case when there
is a medium level of cost cutting, we can understand how users may react to different levels of
incentives, or in other words, how the substitution effects differ under different budget constraints.
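The cost arithmetic behind the two incentive levels can be reproduced from the Table 5 estimates. The sketch below follows the reading of footnote 14, in which the Medium-Incentive policy waives costs up to half of the maximum possible per-period cost and the High-Incentive policy waives all costs; the policy labels and function names are introduced here for illustration.

```python
# Per-period blogging cost under the three policies, following footnote 14.

# Normalized cost estimates from Table 5.
COST_WORK_POST = 1.419
COST_LEISURE_POST = 0.745
COST_WORK_READ = 1.206
COST_LEISURE_READ = 0.264


def raw_cost(work_posts: int, leisure_posts: int,
             work_reads: int, leisure_reads: int) -> float:
    """Total cost of the chosen activities under the current policy."""
    return (work_posts * COST_WORK_POST + leisure_posts * COST_LEISURE_POST
            + work_reads * COST_WORK_READ + leisure_reads * COST_LEISURE_READ)


# Maximum per-period cost: one post of each type and three reads of each type.
MAX_COST = raw_cost(1, 1, 3, 3)
ALLOWANCE = MAX_COST / 2  # cost waived under the Medium-Incentive policy


def cost_incurred(cost: float, policy: str) -> float:
    """Cost actually borne by the employee under a given policy label."""
    if policy == "current":
        return cost
    if policy == "medium":
        return max(0.0, cost - ALLOWANCE)
    if policy == "high":
        return 0.0
    raise ValueError(f"unknown policy: {policy}")


three_work_reads = raw_cost(0, 0, 3, 0)
print(f"maximum cost: {MAX_COST:.3f}, allowance: {ALLOWANCE:.3f}")
for policy in ("current", "medium", "high"):
    print(f"three work reads under {policy}: {cost_incurred(three_work_reads, policy):.3f}")
```

Reading three work posts costs 3.618 under the current policy, of which 3.287 is waived under the Medium-Incentive policy, which reproduces the remaining cost of 0.331 given in footnote 14.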
From Table 10, we can see that posting activity increases only modestly under the “Medium-Incentive”
policy: the probabilities of work and leisure posting increase from a base level of 6.9% to 11.2% and
from 11.3% to 15.4%, respectively. In contrast, the probabilities of work and leisure reading increase
greatly from the base level. The probability of reading one work- (leisure-) related post increases from
7.8% (11.9%) to 36.1% (26.8%); the probability of reading two work- (leisure-) related posts increases
from 6.7% (6.9%) to 15.9% (13.6%); and the probability of reading three work- (leisure-) related posts
increases from 2.5% (4.8%) to 8.9% (7.1%). This suggests that the marginal benefit of posting is lower
than the marginal benefit of reading when the additional incentives are small. When the “High-Incentive”
policy is introduced, the probabilities of work- and leisure-related posting increase to 39.7% and 32.8%,
respectively. In contrast, the probabilities of reading work or leisure posts only increase by between 5%
and 10% from the “Medium-Incentive” policy to the “High-Incentive” policy.
Thus, we see that the probabilities of both work and leisure posting increase by a larger amount
than the probabilities of work and leisure reading when the firm switches from “Medium-Incentive” to
“High-Incentive”. In other words, the tradeoff between posting and reading is different under different
budget constraints. As mentioned earlier, work posting is the main driver of the knowledge sharing within
enterprise blogs. Our policy simulation shows that by providing a large amount of additional time for
blogging, a firm can significantly enhance knowledge sharing among employees.
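The shift in the posting and reading tradeoff across incentive levels can be tabulated from Table 10. The sketch below measures each step (current to medium, medium to high) as the change in probability, with reading summarized as the probability of reading at least one post; this summary and the names used are choices made here for illustration.

```python
# Step-by-step changes in posting and reading probabilities (Table 10 means).

POSTING = {
    "work": {"current": 0.069, "medium": 0.112, "high": 0.397},
    "leisure": {"current": 0.113, "medium": 0.154, "high": 0.328},
}

# Probabilities of reading exactly one, two, or three posts.
READING = {
    "work": {"current": [0.078, 0.067, 0.025],
             "medium": [0.361, 0.159, 0.089],
             "high": [0.411, 0.186, 0.149]},
    "leisure": {"current": [0.119, 0.069, 0.048],
                "medium": [0.268, 0.136, 0.071],
                "high": [0.342, 0.206, 0.088]},
}

STEPS = [("current", "medium"), ("medium", "high")]


def step_changes(levels):
    """Change in probability for each policy step."""
    return {f"{a}->{b}": levels[b] - levels[a] for a, b in STEPS}


posting_steps = {kind: step_changes(levels) for kind, levels in POSTING.items()}
reading_steps = {
    kind: step_changes({policy: sum(p) for policy, p in by_policy.items()})
    for kind, by_policy in READING.items()
}

for kind in ("work", "leisure"):
    for step in posting_steps[kind]:
        print(f"{kind:8s} {step:16s} posting {posting_steps[kind][step]:+.3f}  "
              f"reading {reading_steps[kind][step]:+.3f}")
```

In the first step reading responds more than posting for both content types, and in the second step the ordering reverses, which is the pattern described in the text.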
14 Note from Table 5 that under the current policy, if an employee participates in all four activities
(writes one post each and reads three posts each of work and leisure) in a period, he/she will incur a
total cost of 6.574. Under the Medium-Incentive policy, the employee does not incur any cost up to 3.287
(half of the maximum possible cost of 6.574). For example, if in a period an employee reads three work
posts, he/she only incurs a cost of 0.331. In comparison, under the High-Incentive policy the same
individual will incur no cost.
6. Discussion and Conclusion
Over the past few years, we have seen that Web 2.0 and social media technologies have become a
powerful lure for organizations. When used effectively, they also may encourage participation in projects
and idea sharing. Blogs provide traces of personal expertise and practices. Making these visible helps
others get an idea of who knows what, which is a starting point for collaboration and for allowing knowledge to
spread more effectively. While the practice of blogging is ubiquitous in many businesses, research that
could inform them is rather limited.
In this paper, we present a dynamic structural model in which employees in an enterprise compete
with each other in the process of reading and writing blog posts. There are two kinds of blogs in our
context: work-related and leisure-related posts. Users make choices about reading and writing based on
their preferences for either type of content. We explicitly model the tradeoffs that employees have to
make in this context. Our paper aims to understand employees’ motivations for reading and writing
work-related and leisure-related posts within an enterprise setting and then draw policy implications based on
employees’ incentive structure. The model is estimated on a dataset consisting of employees from a large
Fortune 1000 IT services and consulting firm.
Bloggers have to deal with the effects of the visibility that comes with blogging. While
visibility might be a driving factor for blogging, it also comes with challenges of dealing with changes in
power distribution when crossing hierarchical boundaries, increased expectations from individuals
(especially those with higher reputations) and making errors that are publicly available and thus costly.
Our paper shows that it is only in the long term that the benefits of blogging outweigh the costs. This is an
indication of forward-looking behavior of employees because if employees were myopic they would not
read at all. Thus any incentive structure that an organization puts forth to encourage employees to blog
needs to consider such nuances in user behavior. These are the kind of nuances that a dynamic model can
capture while a static model cannot.
There is also evidence of strong competition among employees with regard to attracting
readership for their posts. While readership of leisure posts provides little direct utility, employees still
write a significant number of these posts, as there is a significant spillover effect on the readership of
work posts from the creation of leisure posts. Indeed, we find that there are two countervailing effects of
leisure posting. On one hand, there is a market expansion effect wherein people who come to read leisure
blogs are also exposed to work blogs, which expands the market of readers. On the other hand, a high
reputation for leisure posts can also have a negative signaling effect since it suggests that the blogger may
not be sufficiently committed to work-related blogging. Whether one effect dominates the other depends
on the level of the employee's work reputation. In addition, we find that the utility employees derive
from work-related blogging is about 4 times what they derive from leisure blogging. Reading and
writing work-related posts is more costly than reading and writing leisure-related posts on average.
Our policy simulations suggest that prohibiting leisure-related posting will be counter-productive
for organizations since it leads to a reduction in work-related posting too. There is a positive spillover
effect from leisure posting to work-posting that enterprises should account for before implementing such
policies. Overall, these results shed light on how enterprise adoption of social media tools is associated
with employee behavior and choices.
Our paper has several limitations. First, in terms of reading, we only model whether an individual
reads work-related or leisure-related posts. We do not model whose blog an individual reads. It is possible
that an individual may follow a blogger rather than specific work or leisure posts. Individual reading
dynamics may be affected by the dynamics of content created by the blogger he/she follows. Future
research can explicitly model this relationship. Second, commenting on blogs can be an important part of
the knowledge creation process but we do not have sufficient data on the comments on the blog posts in
our data to do any meaningful analyses. Finally, we accounted for blog reading behavior only within the
enterprise. Some of the blog creation and consumption dynamics may be affected by events outside of our
data context. For example, employees can also access outside blogs. However, such data are not available
to us. Notwithstanding these limitations, we hope our work can pave the way for future research in this
emerging area of enterprise social media.
7. References
Aggarwal, R., Gopal, R., Sankaranarayanan, R., and Singh, P. V. 2010. Blog, Blogger and the Firm: Can
Negative Posts by Employees Lead to Positive Outcomes? Information Systems Research forthcoming.
Aguirregabiria, V., and Ho, C. 2009. A Dynamic Oligopoly Game of the US Airline Industry: Estimation
and Policy Experiments. Working Paper.
Aguirregabiria, V., and Ho, C. 2010. A Dynamic Game of Airline Network Competition: Hub-and-Spoke
Networks and Entry Deterrence. International Journal of Industrial Organization (28:4), 377-382.
Aguirregabiria, V., and Mira, P. 2007. Sequential Estimation of Dynamic Discrete Games. Econometrica
(75:1), 1-53.
Albuquerque, P., Pavlidis, P., Chatow, U., Chen, K., Jamal, Z., and Koh, K. 2010. Evaluating
Promotional Activities in an Online Two-Sided Market of User-Generated Content. Working Paper,
University of Rochester.
Arcidiacono, P., and Miller, R. 2011. CCP Estimation of Dynamic Models with Unobserved
Heterogeneity. Econometrica (79:6), 1823-1867.
Argote, L., McEvily, B., and Reagans, R. 2003. Managing Knowledge in Organizations: An Integrative
Framework and Review of Emerging Themes, Management Science (49:4), 571-582.
Bajari, P., Benkard, C., and Levin, J. 2007. Estimating Dynamic Models of Imperfect Competition.
Econometrica (75:5), 1331-1370.
Bajari, P., Chernozhukov, V., Hong, H., and Nekipelov, D. 2009. Nonparametric and Semiparametric
Analysis of a Dynamic Discrete Game. Unpublished manuscript, Dep. Econ., Univ. Minn., Minneapolis.
Breusch, T. S., and Pagan, A. R. 1980. The Lagrange Multiplier Test and its Applications to Model
Specification in Econometrics. The Review of Economic Studies (47:1), 239-253.
Ching, A. 2010. A Dynamic Oligopoly Structural Model for the Prescription Drug Market After Patent
Expiration. International Economic Review (51:4), 1175-1207.
Chung, D., Steenburgh, T., and K. Sudhir 2009. Do Bonuses Enhance Sales Productivity? A Dynamic
Structural Analysis of Bonus Based Compensation Plans, Working Paper.
Constant, D., Kiesler, S., and Sproull, L. 1994. What's Mine Is Ours, or Is It? A Study of Attitudes About
Information Sharing. Information Systems Research (5:4), 400-421.
Dewan, S., and J. Ramaprasad 2011. Chicken and Egg? Interplay Between Music Blog Buzz and Album
Sales, Working Paper, UC Irvine.
Dhar, V., and Chang, E. 2009. Does Chatter Matter? The Impact of User-Generated Content on Music
Sales. Journal of Interactive Marketing (23:4), 300-320.
Dubé, J. P., Hitsch, G., and Manchanda, P. 2005. An Empirical Model of Advertising Dynamics.
Quantitative Marketing and Economics (3:2), 107-144.
Dubé, J. P., Sudhir, K., Ching, A., Crawford, G., Draganska, M., Fox, J., Hartmann, W., Hitsch,
G., Viard, B., Villas-Boas, M., Vilcassim, N. 2005. Recent Advances in Structural Econometric
Modeling: Dynamics, Product Positioning and Entry. Marketing Letters, (16:3-4), 209-224.
Erdem, T., Keane, M., and Sun, B. 2008. The Impact of Advertising on Consumer Price Sensitivity in
Experience Goods Markets. Quantitative Marketing and Economics (6:2), 139-176.
Ericson, R., and Pakes, A. 1995. Markov-Perfect Industry Dynamics: A Framework for Empirical Work.
The Review of Economic Studies (62:1), 53-82.
Forman, C., Ghose, A., and Wiesenfeld, B. 2008. Examining the Relationship Between Reviews and
Sales: The Role of Reviewer Identity Information in Electronic Markets, Information Systems Research
(19:3), 291-313.
French, J. R. P., and Raven, B. H. 1959. The bases of social power. In D. Cartwright (Ed.), Studies in
social power, Ann Arbor, MI: Institute for Social Research, 150-167.
Ghose, A., and Han, S. 2011. An Empirical Analysis of User Content Generation and Usage Behavior on
the Mobile Internet, Management Science (57:9), 1671-1691.
Ghose, A., and Han, S. 2010. A Dynamic Structural Model of User Learning in Mobile Media Content.
Working Paper, SSRN.
Grossman, M., and McCarthy, R. 2007. Web 2.0: is the Enterprise ready for the adventure, Issues in
Information Systems (8:2),180-185.
Gurram, R., Mo, M., and Gueldemeister, R. A. 2008. Web Based Mashup Platform for Enterprise 2.0,
Web Information Systems Engineering 2008 Workshops, 5176, 144-151.
Hofstetter, R., Shriver, S., and Nair, H. 2009. Social Ties and User Generated Content: Evidence from an
Online Social Network. Working Paper, Stanford University.
Hotz, V., and Miller, R. 1993. Conditional Choice Probabilities and the Estimation of Dynamic Models.
The Review of Economic Studies (60:3), 497-529.
Hausman, J. A. 1978. Specification Tests in Econometrics, Econometrica(46:6), 1251-1271.
Huh, J., Jones, L., Erickson, T., Kellogg, W., Bellamy, R., and Thomas, J. 2007. Blogcentral: The Role of
Internal Blogs at Work. Conference on Human Factors in Computing Systems, 2447-2452.
Kavanaugh, A., Zin, T., Carroll, J., Schmitz, J., Pérez-Quiñones, M., and Isenhour, P. 2006. When
Opinion Leaders Blog: New Forms of Citizen Interaction. Proceedings of the 2006 international
conference on Digital government research, 79-88.
Kumar, V., and Sun, B. 2010. Why do Consumers Contribute to Connected Goods? A Dynamic Game of
Competition and Cooperation in Social Networks. Working Paper.
Lee, S., Hwang, T., and Lee, H. 2006. Corporate Blogging Strategies of the Fortune 500 Companies.
Management Decision (44:3), 316-334.
Levy, M. 2009. WEB 2.0 implications on knowledge management, Journal of Knowledge Management
(13:1), 120-134.
Lu, Y., Singh, P. V, and Sun, B. 2010. Blind Men Can Judge No Color: A Dynamic Structural Analysis
of Enterprise Knowledge sharing. Working Paper.
McAfee, A. P. 2006. Enterprise 2.0: The Dawn of Emergent Collaboration. MIT Sloan Management
Review, (47:3), 21–28.
McAfee, A. P. 2009. Enterprise 2.0 is vital for business, Financial Times (9 December).
McKinsey. 2009. How companies are benefiting from Web 2.0: McKinsey Global Survey Results,
McKinsey Quarterly (18 September).
Misra, S. and H. Nair. 2009. A Structural Model of Sales-Force Compensation Dynamics: Estimation and
Field Implementation, Quantitative Marketing and Economics forthcoming.
Moe, W., and Schweidel, D. 2010. Online Product Opinion: Incidence, Evaluation and Evolution.
Working Paper.
Nardi, B., Schiano, D., Gumbrecht, M., and Swartz, L. 2004. Why We Blog. Communications of the ACM
(47:12), 41-46.
Pakes, A., and McGuire, P. 1994. Computing Markov-Perfect Nash Equilibria: Empirical Implications
of a Dynamic Model. Rand Journal of Economics, 555-589.
Pesendorfer, M., Schmidt-Dengler, P., and Street, H. 2003. Identification and Estimation of Dynamic
Games. NBER working paper.
Rabiner, L. R. 1989. A Tutorial on Hidden Markov Models and Selected Applications in Speech
Recognition. Proceedings of the IEEE (77), 257-286.
Ryan, S., and Tucker, C. 2008. Heterogeneity and the Dynamics of Technology Adoption, Working Paper,
SSRN.
Roberts, J. A, Hann, I. H, Slaughter, S. A. 2006. Understanding the Motivations, Participation, and
Performance of OSS Developers: A Longitudinal Study of the Apache Projects, Management Science
(52:7), 984-999.
Singh, P., Sahoo, N., and Mukhopadhyay, T. 2010a. Seeking Variety: A Dynamic Model of Employee
Blog Reading Behavior. Working Paper.
Singh, P.V., Tan, Y., and Youn, N. 2010b. A Hidden Markov Model of Developer Learning Dynamics in
Open Source Software Projects. Information Systems Research, forthcoming.
Snider, C. 2009. Predatory Incentives and Predation Policy: The American Airlines Case. Working Paper.
Sweeting, A. T. 2007. Dynamic Product Repositioning in Differentiated Product Markets: The Case of
Format Switching in the Commercial Radio Industry, Working Paper.
Thomas-Hunt, M. C., Ogden, T. Y., and Neale, M. A. 2003. Who's really sharing? Effects of social and
expert status on knowledge exchange within groups. Management Science (49:4), 464-477.
Trusov, M., A. Bodapati and R. E. Bucklin. 2010. Determining Influential Users in Internet Social
Networks. Journal of Marketing Research, (47:4), 643-658.
Venkataraman, S., Chintagunta, P., and Gopinath, S. 2009. Do Online Blogs Influence Offline Movie Box-office Performance, Working Paper, WCAI.
Wattal, S., Racherla, P., and Mandviwalla, M. 2010. Network Externalities and Technology Use: A Quantitative
Analysis of Intraorganizational Blogs. Journal of Management Information Systems, (27:1), 145-174.
Xu, Y. 2008. A Structural Empirical Model of R&D, Firm Heterogeneity, and Industry Evolution,
working paper.
Yardi, S., Golder, S., and Brzozowski, M. 2009. Blogging at Work and the Corporate Attention Economy.
International Conference on Human factors in Computing Systems, 2071-2080.
GENERAL APPENDIX 1
Modeling of Budget Constraint
Suppose $y_{it}$ is the exogenous time budget for each individual (24*7 hours a week). Then we have the
following budget constraint:
$$y_{it} = \sum_{j} t^{r}_{j}\, r^{j}_{it} + \sum_{j} t^{p}_{j}\, p^{j}_{it} + t_0\, O_{it}.$$
Here, $O_{it}$ is the outside good consumption and $t_0$ is the associated coefficient that captures the per unit
time cost of consuming the outside good. The outside good option captures employees' decision between
blogging and non-blogging activities. Here $t^{r}_{j}$ is the cost of identifying and reading a type j post and $t^{p}_{j}$
is the cost of developing and writing a type j post, with $j \in \{w, l\}$ for work and leisure, and $r^{j}_{it}$ and $p^{j}_{it}$
are the numbers of type j posts read and written. This budget constraint allows us to capture the tradeoff
that an individual would consider while allocating time between work and leisure blogging activities, and
non-blogging activities.
Let us define $\theta_5 = t^{r}_{w}/t_0$; $\theta_6 = t^{r}_{l}/t_0$; $\theta_7 = t^{p}_{w}/t_0$; $\theta_8 = t^{p}_{l}/t_0$, which can be interpreted as the
relative cost of participating in the four activities compared to participating in non-blogging activities.
Then the budget constraint can be written as:
$$\frac{y_{it}}{t_0} = \theta_5 r^{w}_{it} + \theta_6 r^{l}_{it} + \theta_7 p^{w}_{it} + \theta_8 p^{l}_{it} + O_{it}.$$
Solving for $O_{it}$, we get
$$O_{it} = \frac{y_{it}}{t_0} - \theta_5 r^{w}_{it} - \theta_6 r^{l}_{it} - \theta_7 p^{w}_{it} - \theta_8 p^{l}_{it}.$$
Our specification for the utility function is
$$U_{it} = f(R^{w}_{it}, R^{l}_{it}, \theta_1, \theta_2) + g(K^{w}_{it}, K^{l}_{it}, \theta_3, \theta_4) + O_{it} + \varepsilon_{it}.$$
Substituting for $O_{it}$ we get
$$U_{it} = \theta_1 \log(R^{w}_{it}) + \theta_2 \log(R^{l}_{it}) + \theta_3 K^{w}_{it} + \theta_4 K^{l}_{it} + \frac{y_{it}}{t_0} - \theta_5 r^{w}_{it} - \theta_6 r^{l}_{it} - \theta_7 p^{w}_{it} - \theta_8 p^{l}_{it} + \varepsilon_{it}.$$
Utility for every choice for an individual would include $y_{it}/t_0$. Let $\bar{U}_{iat}$ be the expected utility that
individual i will receive from taking action a at time t. Further, let $\tilde{U}_{iat}$ be the utility that individual i will
receive from taking action a at time t without including $y_{it}/t_0$. That is
$$\bar{U}_{iat} = \tilde{U}_{iat} + \frac{y_{it}}{t_0}.$$
We have 64 choices and we assume $\varepsilon_{it}$ is extreme value type 1 distributed, implying multinomial logit
choice probabilities. The probability that an individual will choose action k is given by
$$\Pr(a_{it} = k) = \frac{\exp(\tilde{U}_{ikt} + y_{it}/t_0)}{\sum_{a} \exp(\tilde{U}_{iat} + y_{it}/t_0)} = \frac{\exp(\tilde{U}_{ikt}) \exp(y_{it}/t_0)}{\exp(y_{it}/t_0) \sum_{a} \exp(\tilde{U}_{iat})} = \frac{\exp(\tilde{U}_{ikt})}{\sum_{a} \exp(\tilde{U}_{iat})}.$$
Mathematically, $y_{it}/t_0$ drops out because $\exp(y_{it}/t_0)$ is a factor of both the numerator and the denominator
and cancels out.
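The cancellation argument can be checked numerically. The following is a minimal sketch, assuming hypothetical utility values for the 64 choices and a hypothetical value of the time-budget term; none of these numbers come from the paper.

```python
import numpy as np

def choice_probabilities(utilities):
    """Multinomial logit probabilities for a vector of deterministic utilities."""
    shifted = utilities - np.max(utilities)  # subtract the max for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()

rng = np.random.default_rng(0)
u_tilde = rng.normal(size=64)      # hypothetical utilities excluding the time-budget term
time_budget_term = 168.0 / 2.5     # hypothetical value of y_it / t_0

p_without = choice_probabilities(u_tilde)
p_with = choice_probabilities(u_tilde + time_budget_term)

# Adding the same constant to every choice leaves the probabilities unchanged.
print(np.allclose(p_without, p_with))
```

Because only utility differences matter in the logit formula, the level of the time budget is not identified from choices and can be dropped from estimation.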
Decisions on Work Posting and Leisure Posting
Figure A1 (Figure A2) plots the probability of work (leisure) posting for a low-status employee as a
function of his/her work and leisure reputation and work (leisure) knowledge state. There are several
interesting findings in these figures.
First, Figure A1 shows that at a given level of leisure reputation, employees are more likely to post
work-related blogs when their work reputation is high than otherwise. This is because individuals are
forward-looking and expect higher readership when they have higher levels of reputation. The same
reasoning also applies to the probability of leisure posting (Figure A2). Further, leisure posting always
increases with leisure reputation, thereby confirming the rich-get-richer effect. However, leisure posting
increases when work reputation is low but decreases when work reputation is high. This shows the
interdependence between work and leisure, which was also highlighted by the readership spillover.
We find that while work reputation does not affect the readership of leisure posts, leisure reputation
increases work readership when work reputation is low and decreases work readership when work
reputation is high. While leisure readership always increases with leisure reputation, the marginal
benefit from new leisure readership when work reputation is high may not be sufficient to outweigh the
loss of new work readership. Further, more knowledgeable employees have a higher probability of
posting. This finding is again consistent with the earlier discussion where more knowledgeable
employees receive higher readership.
[Figure A1: Probability of Work Posting. Axes: Log (Employee Work Reputation), Log (Employee Leisure Reputation), Probability of Work Posting. Series: High Work Knowledge, Low Work Knowledge.]
[Figure A2: Probability of Leisure Posting. Axes: Log (Employee Work Reputation), Log (Employee Leisure Reputation), Probability of Leisure Posting. Series: High Leisure Knowledge, Low Leisure Knowledge.]
Not surprisingly, since high-status individuals receive more readership when they post work-related
blogs, the probability for them to post work-related blogs is higher than that of low-status
individuals, given the same reputation level. We also find that the probability for high-status individuals
to post leisure-related blogs is lower than that of low-status individuals.
Knowledge States and Reading Decisions
Employees also face a trade-off between work and leisure when choosing which blogs to read.
Again, firms typically would want their employees to invest in reading work-related blogs. Hence, a good
understanding of how employees’ knowledge states would affect their reading decisions is also important.
Using the HMM, we find that the model with two knowledge levels for each type (leisure and work) fits
our data best. We define the two levels as “high knowledge” and “low knowledge”. The choice-specific
state transition probabilities of the two types of knowledge states are given in Table A1 and Table A2.
Note that here we assume that an employee’s work (leisure) knowledge state is only affected by his/her
work (leisure) reading decisions. There are several interesting observations in the state transition
probabilities.
First, if an employee reads more work-related posts he/she will have a higher chance to move to
(or stay in) high knowledge state. There are several intuitive reasons as to why an individual moves down
a state if he/she does not participate in reading. The primary reason is that given the high tech industry
context of this research setting, the technologies, algorithms, new ways to solve problems evolve very
quickly leaving the current knowledge of an individual outdated soon. Further, individuals can also forget
what they know over time due to bounded rationality, which is documented across several studies on
individual learning behavior (Argote et al 2003). Posting work-related blogs can also help employees to
sustain themselves in high work knowledge states, possibly due to the re-thinking and re-searching
involved in blog composing processes. Similar results are also found in leisure knowledge state transition
probability estimation. Second, we see that it is harder to get to a high work knowledge state in
comparison to leisure knowledge state from a low value of the corresponding state. Third, once an
individual is in a high state it is easier for him/her to stay in the high leisure knowledge state than in the
high work knowledge state without posting; while with posting, an individual will have a better chance to
stay in work knowledge state than to stay in leisure knowledge state.
Overall, the state transitions show that knowledge states are very sticky, which is
consistent with the finding in Singh et al. (2011). The estimated state transition probabilities can be
expressed by the following equations, and the predicted state transition probabilities, from no activity
to maximum activity, are reported in Tables A1-A4. The results indicate that as an individual reads (or
posts) more she is more likely to stay in (or move to) a higher state. For example,
looking at Table A1, we can see that if an individual in a low work knowledge state does not engage in
any blogging activity (reading or writing) the probability that she will stay in the low state is 0.931. In
contrast, a person in a high work knowledge state (Table A2) will have a 0.852
probability of staying in the high state even if she does not read or write any work post. These probabilities
for state transitions are also consistent with those found in the literature on learning by software
developers (Singh et al. 2011). From Tables A3-A4 we can see that the leisure states also appear to be
quite sticky. With no blogging activity a person in a low (high) state stays in the low (high) state with a
probability of 0.88 (0.88).
The results further highlight that, compared with reading, posting a blog has a bigger impact on the
probability of moving to a high knowledge state in both work and leisure categories. From Table A1 we
can see that for work knowledge, reading one work post (and doing nothing else) increases the probability
of an employee moving from the low to the high state from 0.069 to 0.083. In comparison, writing one work
post (and doing nothing else) increases the probability from 0.069 to 0.101. In a similar vein, for leisure
knowledge, reading one leisure post increases the probability of an employee moving from the low to the
high state from 0.120 to 0.132 (Table A3). In comparison, writing one leisure post increases the
probability from 0.120 to 0.160.
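Work knowledge$^{15}$:
$$\Pr(\text{low to high}) = \frac{1}{\exp(2.60 - 0.19 \times r^{w}_{it} - 0.41 \times p^{w}_{it}) + 1}$$
$$\Pr(\text{low to low}) = 1 - \Pr(\text{low to high})$$
$$\Pr(\text{high to low}) = \frac{1}{\exp(1.75 + 0.12 \times r^{w}_{it} + 1.14 \times p^{w}_{it}) + 1}$$
$$\Pr(\text{high to high}) = 1 - \Pr(\text{high to low})$$
Leisure knowledge:
$$\Pr(\text{low to high}) = \frac{1}{\exp(2.00 - 0.11 \times r^{l}_{it} - 0.34 \times p^{l}_{it}) + 1}$$
$$\Pr(\text{low to low}) = 1 - \Pr(\text{low to high})$$
$$\Pr(\text{high to low}) = \frac{1}{\exp(1.99 + 0.09 \times r^{l}_{it} + 0.81 \times p^{l}_{it}) + 1}$$
$$\Pr(\text{high to high}) = 1 - \Pr(\text{high to low})$$
Here $r^{j}_{it}$ and $p^{j}_{it}$ denote the number of type j posts that employee i reads and writes in period t.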
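The transition equations above can be evaluated directly. The sketch below uses the coefficients reported in those equations, with the first slope applied to the number of posts read and the second to the number of posts written; the function and variable names are illustrative assumptions and not part of the original estimation code.

```python
import math

# (intercept, reading coefficient, posting coefficient) inside exp(.),
# as reported in the transition equations above.
UP = {"work": (2.60, -0.19, -0.41), "leisure": (2.00, -0.11, -0.34)}
DOWN = {"work": (1.75, 0.12, 1.14), "leisure": (1.99, 0.09, 0.81)}

def transition_prob(coefs, reads, posts):
    """Probability of leaving the current state: 1 / (exp(index) + 1)."""
    intercept, b_read, b_post = coefs
    return 1.0 / (math.exp(intercept + b_read * reads + b_post * posts) + 1.0)

def transition_matrix(kind, reads, posts):
    """2x2 matrix; rows are the current state (low, high), columns the next state."""
    up = transition_prob(UP[kind], reads, posts)      # low -> high
    down = transition_prob(DOWN[kind], reads, posts)  # high -> low
    return [[1.0 - up, up], [down, 1.0 - down]]

# Example: work knowledge transitions with no reading and no posting.
work_idle = transition_matrix("work", reads=0, posts=0)
print([[round(p, 3) for p in row] for row in work_idle])
```

With no activity, the first row should agree with the first cell of Table A1 and the second row with the first cell of Table A2, up to rounding of the reported coefficients.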
Work Knowledge States
Table A1: The probability that an individual will be at high (low) state in period t+1 if she is at low
work knowledge state in period t:
                 Work Reading
Work Posting     0              1              2              3
0                0.069(0.931)   0.083(0.917)   0.099(0.901)   0.117(0.883)
1                0.101(0.899)   0.119(0.881)   0.169(0.831)   0.235(0.765)
Table A2: The probability that an individual will be at high (low) state in period t+1 if she is at high
work knowledge state in period t:
                 Work Reading
Work Posting     0              1              2              3
0                0.852(0.148)   0.866(0.134)   0.880(0.120)   0.892(0.108)
1                0.947(0.053)   0.953(0.046)   0.958(0.042)   0.963(0.037)
Leisure Knowledge States
Table A3: The probability that an individual will be at high (low) state in period t+1 if she is at low
leisure knowledge state in period t:
                   Leisure Reading
Leisure Posting    0              1              2              3
0                  0.120(0.880)   0.132(0.868)   0.145(0.855)   0.158(0.842)
1                  0.160(0.840)   0.176(0.824)   0.193(0.808)   0.210(0.790)
Table A4: The probability that an individual will be at high (low) state in period t+1 if she is at high
leisure knowledge state in period t:
                   Leisure Reading
Leisure Posting    0              1              2              3
0                  0.880(0.120)   0.889(0.111)   0.898(0.102)   0.905(0.095)
1                  0.942(0.058)   0.947(0.053)   0.952(0.048)   0.956(0.044)
[Figure A3: Probability of Work Reading. Axes: Log (Employee Work Reputation), Average Number of Work Posts Read. Series: High Work Knowledge, Low Work Knowledge.]
[Figure A4: Probability of Leisure Reading. Axes: Log (Employee Leisure Reputation), Average Number of Leisure Posts Read. Series: High Leisure Knowledge, Low Leisure Knowledge.]
Given the choice-specific state transition probability matrices, we next discuss employees' reading
decisions. We notice that there is little interdependence between work knowledge and leisure knowledge.
The probability of work reading is only affected by employees' work knowledge level, but not by their
leisure knowledge level, and vice versa. In Figures A3 and A4, we use two sets of comparisons to
summarize the relation between current knowledge levels and reading decisions. First, when we fix the
work and leisure reputation status, and control for the knowledge level of the other type, people with a
high work (leisure) knowledge level read more work- (leisure-) related posts. This is probably because it
is easier for employees with high work (leisure) knowledge to maintain the high knowledge state by
reading in the current period than for those in the low knowledge state to move up by reading posts.
Second, when we fix the corresponding knowledge state and only look at the correlation between average
work- (leisure-) related reading and individuals' work (leisure) reputation, we find that people with a
higher reputation also tend to read more.
Competitive Dynamics
Figure A5 and Figure A6 show that the readership of a new work post decreases with the average work
reputation of peers, and the readership of a new leisure post decreases with the average leisure reputation
of peers for low-status individuals. When peers' work readership is high, the proportion of readers that an
employee can attract with his/her new work-related post is smaller. This is evidence of readership
competition among employees. A similar pattern is also observed among high-status individuals. The
slight difference between high-status and low-status individuals is that high-status individuals
can attract more work readership than low-status individuals. Therefore, given the same own
reputation and peers' reputation, a high-status individual receives higher new work readership.
When we plot the probabilities of work and leisure posting against peer work and leisure reputation,
then, consistent with the readership dynamics, we find that bloggers are more likely to post when their
reputation is higher. Further, the probabilities of work and leisure posting decrease with peer reputation.
[Figure A5: Readership for New Work Post. Axes: Log (Employee Work Reputation), Log (Peer Work Reputation), Mean Log (Readership of New Work Post).]
[Figure A6: Readership for New Leisure Post. Axes: Log (Employee Leisure Reputation), Log (Peer Leisure Reputation), Mean Log (Readership of New Leisure Post).]
GENERAL APPENDIX 2
External Validity for Reputation Measure
To further alleviate any remaining concerns, we collected additional data on blog citations to
provide external validity for our measure. If our measure of reputation is correct, then others should find
bloggers with higher reputation levels (as constructed by us) to be more authoritative or expert on the
topic. To this end, we acquired access to data on blog citations from the company that provided us the
original data. Citations or hyperlinks on the Internet have been used as a measure of quality and
influence, as indicated by their use in search engine design (Brin and Page 1998). As is well known, these
links are used to identify the most authoritative webpage for a given query. In fact the blog search engine
Technorati gives higher weights to posts with a large number of links in its results. If our measure of
reputation is correct, then posts by individuals whom we classify as having high reputation should have a
higher probability of being cited by others.
In our dataset there are about 600 citations. We performed a logit regression to predict the
probability that a post will be cited by others as a function of an individual's reputation and other
variables. The results for the key variables are reported in Tables A5 and A6 below:
Table A5: Logit regression predicting whether a work post will be cited or not.
Variable                    Coefficient (standard error)
Constant                    -9.894*** (1.014)
log(work reputation)        1.293*** (0.224)
Work knowledge              0.312** (0.157)
Status=High                 0.078 (0.047)
Notes: Variables controlled for but not reported here include peer reputation, peer knowledge, time
dummies, topic dummies, individual leisure reputation, and individual leisure knowledge. *** and **
denote significance at 1 and 5 percent, respectively.
Table A6: Logit regression predicting whether a leisure post will be cited or not.
Variable                    Coefficient (standard error)
Constant                    -8.741*** (2.795)
log(leisure reputation)     0.939*** (0.290)
Leisure knowledge           0.049* (0.027)
Status=High                 -0.054 (0.079)
Notes: Variables controlled for but not reported here include peer reputation, peer knowledge, time
dummies, topic dummies, individual work reputation, and individual work knowledge. *** and * denote
significance at 1 and 10 percent, respectively.
These results show that posts of individuals with high reputation as constructed by us are
considered as being more influential by others and hence more likely to be cited by others. This provides
some external validity to our construction of reputation measure.
To provide additional validity for our measure, we used the help of two external coders who were
experts on software testing to classify posts as high or low quality. We only focus on software testing
category as that corresponds with the external coders' expertise. We provided these coders with all posts
(447) of 119 individuals posted in Software Testing Category. We provided the coders with only the text
of the post. The coders disagreed on only 6 posts. We discard these 6 posts for further analysis. There are
59 posts that were classified as high quality by these coders.
If our construction of reputation measure is correct then the posts, which are classified as high
quality, should receive higher levels of readership. The basic idea about our reputation measure is that
highly reputed bloggers would produce high quality posts and the audience will be able to identify them
as experts on the topic, which could bring them indirect incentives. Note that this is also one of the
benefits that we argue individuals with high reputation get, i.e., get identified as an expert on a certain
topic within the enterprise. We find that the correlation between quality and readership is 0.91. This
indicates that readers are attracted towards high quality posts. Further, we test the likelihood of bloggers
coming up with high quality posts as a function of their reputation.
Table A7: Logit regression predicting whether a software testing post is high quality or not
(observation level is a post)
Variable                    Coefficient (standard error)
Constant                    -3.284*** (0.091)
log(work reputation)        0.578*** (0.149)
log(leisure reputation)     0.092 (0.174)
Work knowledge              0.881** (0.404)
Leisure knowledge           0.011 (0.083)
Status=High                 0.091 (0.054)
Notes: *** denotes significance at 1 percent. ** denotes significance at 5 percent.
The results indicate that individuals with high reputation are more likely to produce high
quality posts. This, combined with the earlier result that high quality posts receive higher levels of
readership, implies that readers identify the bloggers who receive higher readership as the ones who
produce high quality posts. Such identification by the readers would correspond to a higher reputation for
bloggers in the enterprise setting.
External Validity for Knowledge
An external validity for our knowledge measure could be that knowledge of an employee from
reading blogs as constructed by us leads to a higher performance rating for the employee. We have access
to performance data on a subset of employees (272) in our sample. The performance data corresponds to
an employee's performance from July 2007 to June 2008. The employee is evaluated by her immediate
superior and rated on a discrete scale from 1 to 4 with 1 being the lowest and 4 being the highest rating.
The ratings were provided at the end of June 2008. Because our blogging data corresponds to July 2007
to Oct 2008, it overlaps very well with the performance evaluation time period.
One of our arguments in the paper is that knowledge gained from reading blogs increases an
employee's job performance. We use the data to test this relationship. We construct a measure of average
work knowledge for an individual as the average of her knowledge from Nov 2007 to June 2008. (Recall
that data from July 2007 to Oct 2007 is used as the hold-out sample for initial state calculation.) We first
segment the individuals based on their job performance rating. For each rating, we calculate the average
of the average knowledge of all individuals who received that rating. Below we show this graphically. As
is obvious from the figure, individuals who are in a higher knowledge state on average receive a higher
job performance evaluation.
[Figure A7: Knowledge State versus Performance Rating of Employees. Axes: Performance Rating (2, 3, 4), Average Work Knowledge State.]
We also constructed an ordered logistic regression to further establish the relationship between
knowledge of an employee from reading blogs and her subsequent performance evaluation.
Table A8: Ordered Logistic Regression predicting Performance Rating
Dependent Variable: Performance Rating (observation level is employee)
Variable
@
Â
#Â
Z
# ÄÃ
Parameter Estimate Standard Deviation
2.17*
1.15
1.25
7.94***
Cut Point 1
1.65***
0.45
Cut Point 2
7.47***
0.75
Notes: Variables controlled for but not reported here include individual average reputations on work and
leisure, peer average reputations on work and leisure, peer average knowledge of work and leisure. ***
and * denote significance at 1 and 10 percent, respectively.
The results indicate that individuals who are on average more knowledgeable (as constructed by
us) receive a higher performance evaluation. We acknowledge that we only have access to a cross section
of performance data, and hence, we cannot conclude a causal relationship between knowledge gained
from blogging and job performance in the workplace. However, we can at least see that the measure of
knowledge as constructed by us for our analyses is strongly associated with job performance.
TECHNICAL APPENDIX 1
First Stage: CCP and Knowledge State Transition Probability
The HMM in our model is comprised of three elements: 1) the initial knowledge state distribution, $\pi$; 2)
the knowledge state transition probability matrix $Q(t, t+1)$; and 3) the observed outcome (action)
probability vector $\Lambda$. We assume that the evolution of the work knowledge state $K^{w}_{it}$ depends only on
work reading and posting, $r^{w}_{it}$ and $p^{w}_{it}$, and an unobserved error, and the evolution of the leisure
knowledge state $K^{l}_{it}$ depends only on $r^{l}_{it}$, $p^{l}_{it}$ and an unobserved error. Nevertheless, the
observed outcome probability vector, which is the CCP defined above, is jointly determined by
$s_{it} = (R^{w}_{it}, R^{l}_{it}, K^{w}_{it}, K^{l}_{it})$ and $s_{-it} = (R^{w}_{-it}, R^{l}_{-it})$, where $R$ denotes reputation. Note that for each
period we observe the work and leisure related reputations of all employees. Hence, the state transition
matrix in our HMM corresponds to only knowledge state transitions.
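Let $\lambda = (\pi, Q, \Lambda)$ denote the complete parameter set of the HMM model. Let $s_i = (s_{i1}, s_{i2}, \ldots, s_{iT})$
denote employee i's sequence of states from period 1 to period T and $a_i = (a_{i1}, a_{i2}, \ldots, a_{iT})$ denote an
observed action sequence. The probability of observing the sequence of actions conditional on the
current states is defined as
$$\Pr(a_i \mid \lambda, s_i, s_{-i}) = \prod_{t=1}^{T} \Pr(a_{it} \mid \lambda, s_{it}, s_{-it}).$$
Given that any two adjacent observed outcomes are linked only through the hidden states, we
have
$$\Pr(a_i \mid \lambda, s_i, s_{-i}) = \Pr(a_{i1} \mid \lambda, s_{i1}, s_{-i1}) \Pr(a_{i2} \mid \lambda, s_{i2}, s_{-i2}) \cdots \Pr(a_{iT} \mid \lambda, s_{iT}, s_{-iT}),$$
where $\Pr(a_{it} \mid \lambda, s_{it}, s_{-it})$ is the probability of observing action $a_{it}$ given states $s_{it}$ and $s_{-it}$. States
other than $K^{w}_{it}$ and $K^{l}_{it}$ are all observed in the data, so they are deterministic. For notational simplicity
we suppress all states other than $K^{w}_{it}$ and $K^{l}_{it}$. Since the evolutions of $K^{w}_{i} = (K^{w}_{i1}, K^{w}_{i2}, \ldots, K^{w}_{iT})$
and $K^{l}_{i} = (K^{l}_{i1}, K^{l}_{i2}, \ldots, K^{l}_{iT})$ are independent conditional on $(r^{w}_{it}, p^{w}_{it})$ and $(r^{l}_{it}, p^{l}_{it})$, according to the
Markov property, the probability of such an unobserved state sequence $K^{j}_{i}$ (j = w or l) is given by
$$\Pr(K^{j}_{i} \mid \lambda) = \pi^{j}_{i} \Pr(K^{j}_{i2} \mid K^{j}_{i1}) \Pr(K^{j}_{i3} \mid K^{j}_{i2}) \cdots \Pr(K^{j}_{iT} \mid K^{j}_{iT-1}).$$
Then the joint probability of $a_i$ and $K_i = (K^{w}_{i}, K^{l}_{i})$ is given by
$$\Pr(a_i, K_i \mid \lambda) = \Pr(a_i \mid \lambda, K^{w}_{i}, K^{l}_{i}) \Pr(K^{w}_{i}, K^{l}_{i} \mid \lambda).$$
Applying the rule of total probability, we get the likelihood of observing a sequence of outcomes:
$$L_i = \Pr(a_i \mid \lambda, s_{-K}) = \sum_{\forall K^{w}_{i}, K^{l}_{i}} \Pr(a_i \mid \lambda, K^{w}_{i}, K^{l}_{i}, s_{-K}) \Pr(K^{w}_{i}, K^{l}_{i} \mid \lambda, s_{-K}),$$
where $s_{-K}$ represents all observed state variables except for individuals' own knowledge states.
To simplify calculation of the likelihood, following MacDonald and Zucchini (1997), we rewrite the
equation above as
$$L_i = \pi_i \Lambda(i,1) Q(i,1,2) \Lambda(i,2) Q(i,2,3) \cdots Q(i,T-1,T) \Lambda(i,T) 1' \qquad \text{(A1)}$$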
In Equation (A1), $1'$ is a vector of ones with length $k_w \times k_l$. For consistency, we assume that
$K^{w}_{it}$ and $K^{l}_{it}$ have $k_w$ and $k_l$ possible levels. We try multiple values of $k_w$ and $k_l$ and implement maximum
likelihood estimation under each value. We choose the values of $k_w$ and $k_l$ that jointly minimize the Bayesian
Information Criterion (BIC). In Equation (A1), $\pi_i$ is the initial state distribution of $K_{i1}$. Then for each j
in {w, l}, define $Q^{j}(i, t, t+1) = \Pr(K^{j}_{it+1} \mid K^{j}_{it}, r^{j}_{it}, p^{j}_{it})$. Assuming that employees' knowledge states
can only transit between adjacent states, we have
$$\Pr(K^{j}_{it+1} = m+1 \mid K^{j}_{it} = m, r^{j}_{it}, p^{j}_{it}) = 1 - \frac{\exp(\mu^{j}(h,m) - \phi^{j}_{rh} r^{j}_{it} - \phi^{j}_{ph} p^{j}_{it})}{1 + \exp(\mu^{j}(h,m) - \phi^{j}_{rh} r^{j}_{it} - \phi^{j}_{ph} p^{j}_{it})},$$
$$\Pr(K^{j}_{it+1} = m-1 \mid K^{j}_{it} = m, r^{j}_{it}, p^{j}_{it}) = \frac{\exp(\mu^{j}(l,m) - \phi^{j}_{rl} r^{j}_{it} - \phi^{j}_{pl} p^{j}_{it})}{1 + \exp(\mu^{j}(l,m) - \phi^{j}_{rl} r^{j}_{it} - \phi^{j}_{pl} p^{j}_{it})},$$
$$\Pr(K^{j}_{it+1} = m \mid K^{j}_{it} = m, r^{j}_{it}, p^{j}_{it}) = \frac{\exp(\mu^{j}(h,m) - \phi^{j}_{rh} r^{j}_{it} - \phi^{j}_{ph} p^{j}_{it})}{1 + \exp(\mu^{j}(h,m) - \phi^{j}_{rh} r^{j}_{it} - \phi^{j}_{ph} p^{j}_{it})} - \frac{\exp(\mu^{j}(l,m) - \phi^{j}_{rl} r^{j}_{it} - \phi^{j}_{pl} p^{j}_{it})}{1 + \exp(\mu^{j}(l,m) - \phi^{j}_{rl} r^{j}_{it} - \phi^{j}_{pl} p^{j}_{it})}.$$
Here $\mu^{j}(h,m)$ and $\mu^{j}(l,m)$ are the thresholds for moving from m to a higher state and for moving from m
to a lower state, respectively. We also set $\mu^{j}(h,m) = \infty$ when m is the highest state and $\mu^{j}(l,m) = -\infty$
when m is the lowest state, so that no transition leaves the state space. $\phi^{j}_{rh}$ ($\phi^{j}_{ph}$) captures the
effect of one reading (posting) on the probability that the type j knowledge state moves up, while
$\phi^{j}_{rl}$ ($\phi^{j}_{pl}$) captures the effect of one reading (posting) on the probability that the type j knowledge
state moves down. $Q(i, t, t+1)$, the joint state transition matrix considering both the work knowledge
and the leisure knowledge state, is the Kronecker product of $Q^{w}(i, t, t+1)$ and $Q^{l}(i, t, t+1)$.
$$\Lambda(i,t) = \mathrm{diag}\big(p(a_{it} \mid \lambda, K^{w}_{it} = 1, K^{l}_{it} = 1),\ p(a_{it} \mid \lambda, K^{w}_{it} = 1, K^{l}_{it} = 2),\ \ldots,\ p(a_{it} \mid \lambda, K^{w}_{it} = k_w, K^{l}_{it} = k_l)\big)$$
is a $(k_w \times k_l)$ by $(k_w \times k_l)$ matrix with nonzero diagonal elements and zero off-diagonal elements.
Finally, we need to know $\Pr(a_{it} \mid \lambda, K^{w}_{it}, K^{l}_{it})$, or more generally, $p(a_{it} \mid \lambda, s_{it}, s_{-it})$. Following Bajari et al.
(2009), we use the sieve logit method to estimate the CCP. The sieve logit estimator is simply the standard
multinomial logit where the covariates are predefined basis functions, and a sieve of polynomial spaces
is selected. Here, the CCP is an unknown function of $s_{it}$ and $s_{-it}$. We approximate it with a
finite-dimensional sieve. The simplest form of a sieve is a space of polynomials. In order to approximate
the function better, we use a second degree orthogonal polynomial as the sieve basis. To be more specific,
the basis (sieve) functions are defined as
$b^{B}(s_{it}, s_{-it}) = (b_1(s_{it}, s_{-it}), b_2(s_{it}, s_{-it}), \ldots, b_B(s_{it}, s_{-it}))'$, where B is the dimension of the sieve.
The CCP can then be expressed as
$$\Pr(a_{it} = a \mid s_{it}, s_{-it}) = \frac{\exp\big(b^{B}(s_{it}, s_{-it})' \gamma^{B}_{a}\big)}{1 + \sum_{a'=1}^{63} \exp\big(b^{B}(s_{it}, s_{-it})' \gamma^{B}_{a'}\big)}.$$
The only difference between the standard multinomial logit and the sieve logit is that, in the
sieve logit, the exponential term is not a simple linear combination of states but a linear combination of
the sieve basis. As is standard for HMMs, we apply the maximum likelihood method to estimate $\lambda$. The model
with $k_w = 2$ and $k_l = 2$ has the smallest BIC, indicating that each type of knowledge state has 2 levels.
We define the two levels as high and low.
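The sieve logit idea can be illustrated with a short sketch. It is a simplification under stated assumptions: it uses plain second-degree monomials rather than the orthogonal polynomial basis described above, treats action 0 as the baseline, and evaluates the choice probabilities at hypothetical coefficients instead of estimating them. The function names and all numbers are illustrative.

```python
import numpy as np
from itertools import combinations_with_replacement

def polynomial_basis(states, degree=2):
    """Constant plus all monomials of the state variables up to `degree`."""
    states = np.asarray(states, dtype=float)
    n_obs, n_vars = states.shape
    columns = [np.ones(n_obs)]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(n_vars), d):
            columns.append(np.prod(states[:, list(idx)], axis=1))
    return np.column_stack(columns)

def sieve_logit_probs(basis, gamma):
    """Choice probabilities with action 0 as baseline (its index is fixed at zero)."""
    index = np.column_stack([np.zeros(len(basis)), basis @ gamma])
    index -= index.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(index)
    return weights / weights.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
# Hypothetical states: own work/leisure reputation and knowledge, plus peers' reputations.
states = rng.uniform(size=(5, 6))
basis = polynomial_basis(states)                            # 1 + 6 + 21 = 28 columns
gamma = rng.normal(scale=0.1, size=(basis.shape[1], 63))    # hypothetical coefficients
probs = sieve_logit_probs(basis, gamma)                     # one row of 64 CCPs per observation
```

In an actual application the coefficients would be chosen to maximize the likelihood, and the basis dimension would be allowed to grow with the sample size.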
Remember that $\lambda = (\pi, Q, \Lambda)$, where $Q$ represents the state transition probability, and $\Lambda$
includes all information about the CCP. Once we estimate $\lambda$, the estimation of the CCP and the knowledge
state transition probabilities is complete.
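The matrix form of the likelihood in Equation (A1) can be sketched as a forward recursion. This is a minimal illustration assuming two levels for each knowledge type and hypothetical action probabilities; the function name and the inputs are assumptions for exposition, and the transition matrices reuse the no-activity values from Tables A1-A4.

```python
import numpy as np

def hmm_likelihood(pi, Q_work, Q_leisure, action_probs):
    """
    pi:           initial distribution over joint states, length kw * kl
    Q_work:       list of (kw x kw) transition matrices, one per step t -> t+1
    Q_leisure:    list of (kl x kl) transition matrices, one per step t -> t+1
    action_probs: array (T, kw * kl); entry [t, s] is the probability of the
                  observed action in period t given joint state s
    """
    alpha = pi * action_probs[0]                        # pi * Lambda(1)
    for t in range(1, len(action_probs)):
        Q_joint = np.kron(Q_work[t - 1], Q_leisure[t - 1])
        alpha = (alpha @ Q_joint) * action_probs[t]     # ... Q(t-1, t) * Lambda(t)
    return alpha.sum()                                  # product with the vector of ones

pi = np.full(4, 0.25)                                   # hypothetical initial distribution
Q_work = [np.array([[0.931, 0.069], [0.148, 0.852]])] * 2
Q_leisure = [np.array([[0.880, 0.120], [0.120, 0.880]])] * 2
action_probs = np.array([[0.10, 0.12, 0.20, 0.25],      # hypothetical CCP values
                         [0.05, 0.06, 0.07, 0.08],
                         [0.30, 0.28, 0.22, 0.20]])
likelihood = hmm_likelihood(pi, Q_work, Q_leisure, action_probs)
```

The Kronecker product orders joint states as (work, leisure) pairs with the leisure index varying fastest, which matches the ordering of the diagonal of the action probability matrix. For long sequences the recursion would normally be rescaled or computed in logs to avoid underflow.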
TECHNICAL APPENDIX 2
First Stage: Estimate the Value Functions $V_i(s; \sigma, \theta)$
$V_i(s; \sigma, \theta)$ denotes the value function of individual i at state s, given that individual i and his/her peers
follow the same strategy $\sigma$. By definition,
$$V_i(s; \sigma, \theta) = E\left[\sum_{t=0}^{\infty} \beta^{t}\, U_i\big(\sigma(s_t, \varepsilon_t), s_t, \varepsilon_{it}; \theta\big) \,\middle|\, s_0 = s; \theta\right].$$
Given the first stage estimates of the CCP and the state transition probabilities, we can use the forward
simulation method to estimate $V_i(s; \sigma, \theta)$. First, we start from state $s_0$ (the state of all individuals) and draw
private shocks $\varepsilon_{it}(a)$ from the i.i.d. type I extreme value distribution for each individual i. With this
notation, action $a_{it}$ is employee i's optimal choice when
$$v_i(a_{it}, s) + \varepsilon_{it}(a_{it}) \geq v_i(a', s) + \varepsilon_{it}(a'), \quad \forall a' \in A_i,$$
where
$$v_i(a', s) - v_i(a_{it}, s) = \ln \Pr(a' \mid s) - \ln \Pr(a_{it} \mid s)$$
as given in Hotz and Miller (1993). Since we already know $\Pr(a \mid s)$ from previous steps, we can
obtain $v_i(a', s) - v_i(a_{it}, s)$, and then determine the individual's decision in each period ($a_{it}$).
Then, we draw the new state $s_1$ using the state transition probabilities estimated before. We repeat the
whole process for T periods; when T is sufficiently large, we are able to approximate
$V_i(s; \sigma, \theta)$.
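The forward simulation step can be sketched as follows. This is a deliberately simplified single-agent illustration with a small discrete state space, not the multi-agent procedure used in the paper: the function name, the discount factor, the flow utilities, and the example inputs are all assumptions. Actions are chosen by maximizing the log CCP plus a type I extreme value draw, which is the Hotz and Miller (1993) inversion described above.

```python
import numpy as np

def forward_simulate(ccp, transition, flow_utility, s0, beta=0.95,
                     horizon=150, n_paths=300, seed=0):
    """
    ccp:          array (S, A), estimated choice probabilities Pr(a | s)
    transition:   array (S, A, S), estimated Pr(s' | s, a)
    flow_utility: array (S, A), deterministic per-period utility
    Returns the simulated mean discounted utility starting from state s0.
    """
    rng = np.random.default_rng(seed)
    log_ccp = np.log(ccp)
    n_states, n_actions = ccp.shape
    total = 0.0
    for _ in range(n_paths):
        s, value = s0, 0.0
        for t in range(horizon):
            eps = rng.gumbel(size=n_actions)          # type I extreme value shocks
            a = int(np.argmax(log_ccp[s] + eps))      # choice implied by the CCPs
            value += beta ** t * (flow_utility[s, a] + eps[a])
            s = int(rng.choice(n_states, p=transition[s, a]))
        total += value
    return total / n_paths

# Hypothetical two-state, two-action example.
ccp = np.array([[0.7, 0.3], [0.4, 0.6]])
transition = np.array([[[0.9, 0.1], [0.6, 0.4]],
                       [[0.3, 0.7], [0.1, 0.9]]])
flow_utility = np.array([[0.0, -0.5], [1.0, 0.5]])
value_low = forward_simulate(ccp, transition, flow_utility, s0=0, n_paths=50, horizon=50)
```

Truncating at a finite horizon introduces an error of order beta raised to the horizon, so the horizon should be long enough for that term to be negligible.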