3 Partial invariance

Partially invariant models are model specifications that allows for some item indicators parameters to vary freely, while still having enough common parameters in the measurement model among groups. These models offer a way to generalize findings based on the invariant parameters while excluding those that exhibit non-invariance in the results generalization (Meredith, 1993; Millsap, 2011). Hence, the name partially invariant.

These models allow researchers to identify group differences or treatment effects reliably for responses to items that remain consistent across groups, while keeping non invariant indicators in the measurement model. For instance, in case of causal inference exercises such as experiments and intervention studies, where control and treated groups are compared, if a treatment effect holds equally for 10 out of 12 items within a scale, assertions between treated and non-treated can be made safely for these 10 items, while not making such claims onto the two non-invariant indicators (e.g., Gilbert, 2024).

However, if the number of non-invariant parameters in the measurement models is too large, then is less credible the ability of making claims that are applicable to all compared groups across all items. As a point reference, Muthén, & Asparouhov (2014) suggest that if 75% of the parameters between groups are held common (while 25% of the response parameters are non-invariant), latent means comparisons between groups are of good enough quality. Yet, such a threshold can be put to the test with Monte Carlo studies for the speficities of a measurement model, while taking into account the number of groups being compared (Muhen & Asparouhov, 2014, p3), and the research purpose.

There are few caveats to consider regarding partially invariant models when group comparisons are of interest. If non-invariant items are excluded, this selective exclusion can alter the meaning of the generated score. Exclusion of non-invariant items can narrow the scope of the attribute of interest. For example, if one has three groups, and twelve items, is possible to have a scenario in which the measurement model is invariant for two of the three groups. And simultaneously, the measurement equivalence could hold with only four out of the twelve items for the three groups. As such, researchers can have the dilemma of narrowing the amount of indicators at the cost of reliability and narrowing the meaning of the generated score; and compare the three groups. Or, to do comparison across all the items for only two of the three groups. Yet, with partially invariant model while is allowed to keep all indicators in the measurement model, the meaning of the group differences would partially generalize to the responses where the measurement model parameters are held equal. Restricting the comparison to only the common items among the three groups can restrict the scope of the intended interpretations. This a central problem for cross-cultural studies, where including more diverse groups can augment the chances of non-comparability (Van De Vijver & Matsumoto, 2011). In essence, partially invariant models can pose interpretive challenges. The treatment effects or group differences identified in these models may not fully represent the intended construct’s complexity. Researchers must exercise caution in claiming generalizability across indicators, ensuring transparency in reporting the extent of invariance and acknowledging the limitations of their results (Fischer et al., 2019; Van de Schoot et al., 2012).

In practical terms, the process of implementing partially invariant models is time-consuming and often requires significant manual intervention (Svetina et al., 2020; Robitzsch & Lüdtke, 2023). Arranging and adjusting the measurement model to exclude non-invariant items, or to freely estimate measurement model indicators between partially comparable groups, demands meticulous attention to detail and an iterative testing process. As a whole, is a procedure which can hinder efficiency. This labor-intensive aspect of the method underscores the need for more streamlined analytical tools or automated procedures to facilitate its application in large-scale studies.

Alignment methods (Muthén & Asparouhov, 2014; Asparouhov & Muthén, 2014; Asparouhov & Muthén, 2022) are a collection of procedures which are helpful in finding the least discrepant solution among groups for a given measurement model. This method searches for an optimal solution where the number of discrepant parameters (i.e., non-invariant) is minimized. Alignment is an approach that’s becoming a popular method for invariance studies in large-scale assessment research literature, with applications in different studies including Trends in International Mathematics and Science Study (TIMSS) (e.g., Yiğiter, 2024), International Civic and Citizenship Education Study (ICCS) (e.g., Ziemes, 2024), Program for International Student Assessment (PISA) (e.g., Wurster, 2022), Teaching and Learning International Survey (TALIS) (e.g., Fang et al., 2025), to name a few. Interested readers can consult Sandoval-Hernandez et al. (2025) for a scoping review on the topic.

Although is a procedure which helps to search partially invariant solutions, the resulting partially invariant solutions are conditional to the selection algorithm (Pokropek, Lüdtke, & Robitzsch, 2020). Thus, is not a method which would yield non debatable partially invariant solutions, but plausible partially invariant solutions. As such, is the researcher who would need to make a judgment call regarding if the reached solution is a useful model specification for their purposes, considering its limitations.

In conclusion, while partially invariant models offer a practical approach to addressing measurement invariance challenges, their limitations highlight the importance of careful interpretation and methodological rigor. Aligment methods offer an interesting tool to search for partially invariant model specification in cases where strict and scalar invariance is not held. Apart from aligment methods, there are other alternatives that are aim at addressing the challenges of comparing many groups such as bayesian aproximate invariance, measurement invariance via multilevel models, mixture multigroup factor analysis among others (see Leitgöb et al., 2023). These other alternatives, besides aignment methods are out of the scope of the present guidelines

In the following section (section 3), we describe what is in the library(rd3c3), and how it can help to fit model-based measurement invariance onto graded response models, and how it helps to fit alignment method optimization onto the same measurement models.

3.1 References

Asparouhov, T., & Muthén, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling: A Multidisciplinary Journal, 21(4), 495–508. https://doi.org/10.1080/10705511.2014.919210

Asparouhov, T., & Muthén, B. (2022). Multiple group alignment for exploratory and structural equation models. Structural Equation Modeling: A Multidisciplinary Journal, 1–23. https://doi.org/10.1080/10705511.2022.2127100

Fang, G., Teo, T., & Chan, P. W. K. (2025). Testing for approximate measurement invariance of instructional quality in the Teaching and Learning International Survey (TALIS) 2018. Humanities and Social Sciences Communications, 12(1), 520. https://doi.org/10.1057/s41599-025-04870-4

Fischer, J., Praetorius, A. K., & Klieme, E. (2019). The impact of linguistic similarity on cross-cultural comparability of students’ perceptions of teaching quality. Educational Assessment, Evaluation and Accountability, 31, 201–220.

Gilbert, J. B. (2024). Modeling item-level heterogeneous treatment effects: A tutorial with the glmer function from the lme4 package in R. Behavior Research Methods, 56(5), 5055–5067. https://doi.org/10.3758/s13428-023-02245-8

Leitgöb, H., Seddig, D., Asparouhov, T., Behr, D., Davidov, E., De Roover, K., Jak, S., Meitinger, K., Menold, N., Muthén, B., Rudnev, M., Schmidt, P., & van de Schoot, R. (2023). Measurement invariance in the social sciences: Historical development, methodological challenges, state of the art, and future perspectives. Social Science Research, 110(October 2022). https://doi.org/10.1016/j.ssresearch.2022.102805

Meredith, W. (1993). Measurement invariance, factor analysis, and factorial invariance. Psychometrika, 58(4), 525–543. https://doi.org/10.1007/BF02294825

Millsap, R. E. (2011). Statistical Approaches to Measurement Invariance. New York, NY: Routledge.

Muthén, B., & Asparouhov, T. (2014). IRT studies of many groups: the alignment method. Supplemental Material. Frontiers in Psychology, 5, 23–31. https://doi.org/10.3389/fpsyg.2014.00978

Pokropek, A., Lüdtke, O., & Robitzsch, A. (2020). An extension of the invariance alignment method for scale linking. Psychological Test and Assessment Modeling, 62(2), 305–334.

Robitzsch, A., & Lüdtke, O. (2023). Why full, partial, or approximate measurement invariance are not a prerequisite for meaningful and valid group comparisons. Structural Equation Modeling, 30(2), 190–204. https://doi.org/10.1080/10705511.2023.2191292

Svetina, D., Rutkowski, L., & Rutkowski, D. (2020). Multiple-group invariance with categorical outcomes using updated guidelines: An illustration using Mplus and the lavaan/semTools packages. Structural Equation Modeling: A Multidisciplinary Journal, 27(1), 111–130. https://doi.org/10.1080/10705511.2019.1602776

Van de Schoot, R., Lugtig, P., & Hox, J. (2012). A checklist for testing measurement invariance. European Journal of Developmental Psychology, 9(4), 486–492. https://doi.org/10.1080/17405629.2012.686740

Van De Vijver, F.J.R., Matsumoto, D. (2011) Introduction to the methodological issues associated with cross-cultural research. In: Matsumoto, D., van de Vijver, F.J.R. (Eds.), Cross-Cultural Research Methods in Psychology, 1st ed.,. Cambridge University Press, pp. 1–14.

Wurster, S. (2022). Measurement invariance of non-cognitive measures in TIMSS across countries and across time. An application and comparison of Multigroup Confirmatory Factor Analysis, Bayesian approximate measurement invariance and alignment optimization approach. Studies in Educational Evaluation, 73, 101143. https://doi.org/10.1016/j.stueduc.2022.101143

Yiğiter, M. S. (2024). Cross-National Measurement of Mathematics Intrinsic Motivation: An Investigate of Measurement Invariance with MG-CFA and Aligment Method Across Fourteen Countries. Journal of Theoretical Educational Science, 17(1), 1–27. https://doi.org/10.30831/akukeg.1207350

Ziemes, J. F. (2024). Political trust among European youth: Evaluating multi -dimensionality and cross-national measurement comparability. Studies in Educational Evaluation, 73, 101321. https://doi.org/10.1016/j.stueduc.2023.101321