Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Country differences in measuring attitude towards immigration Josine Verhagen and Jean-Paul Fox 1 Introduction In the ESS study, the cross-national measurement and analysis of a construct like attitude towards immigration are complex problems. It is not possible to apply common measurement techniques in a straightforward way due to cross-national differences. The item-based measurement instrument is known to act differently over countries. Previously cross-national differences in item characteristics have been found for items measuring the attitude towards immigration (Meuleman, Davidov & Billiet, 2009; Welkenhuysen-Gybbels, Billiet & Cambré, 2003). When the characteristics of an item (strength of relationship with the underlying attitude, endorsement level) differ between countries, scores on the item cannot be directly compared. The traditional methods for dealing with such items are based on the presence of anchor items, which have equal item characteristics across countries (e.g. Thissen, Steinberg, & Wainer, 1993, see for an overview f.e. Teresi, 2006; Vandenbergh & Lance, 2000). The anchor items are needed to establish a common latent scale for all countries such that meaningful comparisons can be made. However, detecting anchor items in a cross-national survey can be a complex and timeconsuming process. We propose a new method using random country-specific item characteristics (De Jong, Steenkamp & Fox, 2007; Fox & Verhagen, 2010; Fox, 2010) to measure the attitude towards immigration on a common international latent scale. In this way it is possible to explore cross-national differences in both item characteristics and attitude without using anchor items. It also provides the opportunity to test invariance for all item parameters simultaneously, without the necessity to estimate multiple models and compare the results. Furthermore, covariates can be used to explain variance in item parameters. 2 Illustration of measurement non-invariance We will illustrate the problem of measurement non-invariance with items that measure attitude towards immigration in the ESS. In the present analysis, eight dichotomized items concerning the perceived consequences and allowance of immigration were used, and all 22 countries participating in ESS round 1 (2002). In Table 1, the percentage of respondents in Greece, Norway, Poland and Sweden that agreed with three of the statements are shown. There are large differences between countries in these percentages. These differences are partly due to country mean differences in the attitude, but this can not explain why in Norway and Poland a similar percentage of respondents agrees with the first statement, while in Poland a much higher percentage of respondents agrees with the statement about immigrants taking away jobs. Table 1: Percentage of respondents agreeing with three statements 1 3 Modeling and testing measurement invariance The responses are modeled with the two parameter normal ogive IRT model. In IRT models the probability that a person agrees with the statement in an item depends on both the attitude of the person and on the characteristics of the item. Two commonly modeled item characteristics are the threshold and the discrimination parameters. The threshold parameter indicates the attitude value at which the probability of agreeing with the statement in the item reaches .5; the higher the threshold parameter the stronger the attitude before a respondent agrees with the statement in the item. The discrimination parameter indicates how well the item discriminates between respondents with low and high scores on the attitude, which can be interpreted as how relevant the item is for measuring the attitude. A multilevel group structure can be added on the latent variable, resulting in a multilevel IRT model (e.g., Fox, 2007; Fox and Glas, 2001) with different latent means and variances for each country. This is similar to multilevel modeling of attitude scores in different countries, where the person scores are modeled as random deviations from their country means. To account for differences in item parameters between countries, different threshold and discrimination parameters can be specified for each country. To keep the estimated attitude values for each country on the same scale, instead of using anchor items, the country-specific threshold and discrimination parameters are modeled as random deviations from the general "international" item parameters for each item. This is a form of multilevel modeling, where the country-specific item parameters are the lower level and the general item parameters for each item are the higher level. In addition, when all items are endorsed less by a country, this is represented by a lower mean on the construct, while differences in the within country orders of items are represented by differences in country-specific item parameters. To test whether the variance in item characteristics over countries is substantial, a test based on the Bayes Factor was developed which compares the likelihood of invariance with the likelihood of non-invariance. As a general rule, when the Bayes factor is larger than 3 and therefore the likelihood of invariance is three times more likely than non-invariance, an item is indicated as invariant. The advantage of this test is that with running only one analysis, the invariance of the parameters of all items can be tested simultaneously. 4 Results In Table 2, the estimated general item parameters for each item are shown, and the estimated variance of this parameter over countries. In general, the statement about immigrants causing a worse crime rate has the lowest threshold (-.95) and discrimination (.76) parameters. The statement about immigrants undermining the country’s culture has the highest threshold (.67), the statement about immigrants being bad for the economy the highest discrimination parameter. A Bayes Factor test of the variance component indicated for each parameter whether there are item parameter differences between countries. No "anchor" items with equal item characteristics in all countries were found, as no item had a Bayes factor higher than 3 for both item characteristics. Table 2: Estimated item parameters and variances and Bayes factors 2 The estimated means and threshold parameters for the example items and countries are in Table 3. The different country means explain a large part of the differences between Greece and Sweden. The differences on the third item between Norway and Poland, with a similar mean attitude, are explained by different thresholds for the item. In Poland this item has a much lower threshold, which means that people with a relatively low anti-immigrant attitude will agree with this item, while in Norway the anti-immigrant attitude has to be relatively high for respondents to agree with this statement. Table 3: Estimated item parameters and variances and Bayes factors Background information was used to explore possible causes of differences in item characteristics. Following the suggestion of Welkenhuysen-Gybbels, Billiet and Cambre (2003) about explanations of differences in item parameters, the following variables were included in the analysis: The percentage of immigrants (% Immigrants) and the percentage of unemployment (% Unemployed) in the country at the time of the survey, and the gross domestic product (GDP) per capita (a measure of a country’s overall economic output). These explanatory variables are also frequently used as possible predictors for country-level differences in attitude towards immigrants (Card, Dustmann & Preston, 2005; Malchow-Moller, Munch, Schroll & Saksen , 2009; Meuleman, Davidov & Billiet, 2009; Sides & Citrin ,2007). Significant effects were found of GDP on the item parameters of the item about immigrants taking away jobs. In Figure 1 the item characteristic curves are shown for countries with low, average and high GDP. Countries with low overall economic output (GDP) had a lower threshold and a steeper slope on the item about immigrants taking away jobs than countries with a higher GDP, which indicates that in those countries this item was both more relevant to the attitude and endorsed more frequently given attitude level. Figure 1: Country-specific item characteristic curves of item six for low and high GDP countries 5 Conclusion It was shown that a multilevel IRT model with random item effects can be useful to detect variance in item parameters for all items simultaneously and without using anchor items. Useful results can be provided regarding which items are invariant and possible explanations for this invariance can be investigated using covariates on the item parameters. Scores can be acquired on a common international latent scale taking invariance in item characteristics into account. References Billiet, J. & Welkenhuysen-Gybels, J. (2004). Assessing cross-national construct equivalence in the ESS: The case of six immigration items. Paper presented at the Sixth International conference on Social Science Methodology, Amsterdam. Card, D., Dustmann, C., & Preston, I. (2005). Understanding attitudes to immigration: The migration and minority module of the first European Social Survey. Center for Research and Analysis of Migration Discussion Paper. Davidov, E., Meuleman, B., Billiet, J., & Schmidt, P. (2008). Values and support for immigration: A cross-country comparison. European Sociological Review, 24, 583–599. 3 De Jong, M. G., Steenkamp, J. B. E. M., & Fox, J.-P. (2007). Relaxing measurement invariance in cross-national consumer research using a hierarchical IRT model. Journal of Consumer Research, 34, 260–278. Fox, J.-P. (2007). Multilevel IRT modeling in practice with the package mlirt. Journal of Statistical Software, 20, 1–16. Fox, J.-P. (2010). Bayesian item response modeling: Theory and applications. New York: Springer. Fox, J.-P. & Glas, C. A. W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271–288. Fox, J.-P. & Verhagen, A. J. (2010). Random item effects modeling for cross-national survey data. In E. Davidov, P. Schmidt, & J. Billiet (Eds.), Cross-cultural analysis: Methods and applications (pp. 467–488). London: Routeledge Academic. Meuleman, B., Davidov, E., & Billiet, J. (2009). Changing attitudes toward immigration in Europe, 2002–2007: A dynamic group conflict theory approach. Social Science Research, 38, 352–365. Sides, J. & Citrin, J. (2007). European opinion about immigration: The role of identities, interests and information. British Journal of Political Science, 37, 477–504. Teresi, J. A. (2006). Overview of quantitative measurement methods - Equivalence, invariance, and differential item functioning in health applications. Medical Care, 44, S39–S49. Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of differential item functioning using the parameters of item response models. In P. W. Holland & H. Wainer (Eds.),Differential item functioning (pp. 67–113). Hillsdale, NJ: Lawrence Erlbaum. Vandenberg, R. J. & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4. Welkenhuysen-Gybels, J., Billiet, J., & Cambre, B. (2003). Adjustment for acquiescence in the assessment of the construct equivalence of Likert-type score items. Journal of Cross-Cultural Psychology, 34, 702–722. 4