Analyzing within-group change in an experimental context, where the same group of people is measured before and after some event, can be fraught with statistical problems and issues of causal inference. Still, these designs are common, from political science to developmental neuropsychology to economics. With cognitive data, it has long been known that a second administration, with no treatment or an ineffective manipulation between testings, leads to increased scores at time 2 without any increase in the underlying latent ability. We investigate several analytic approaches, involving both manifest and latent variable modeling, to see which methods can accurately model manifest score changes in the absence of latent change. Using data from 760 schoolchildren given an intelligence test twice, with no intervention in between, we show that analyzing manifest test scores, either directly or through a univariate latent change score model, falsely leads one to believe an underlying increase has occurred. Second-order latent change score models likewise show a spurious significant effect on the underlying latent ability. Longitudinal structural equation modeling correctly shows no change at the latent level once measurement invariance is tested and imposed and model fit is evaluated. When analyzing within-group change in an experiment, analyses must occur at the latent level, measurement invariance must be tested, and change parameters must be explicitly tested. Otherwise, one may see change where none exists.
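To make the contrast concrete, here is a sketch in generic notation (not reproduced from the article) of the kind of latent-level model the abstract recommends. Let $x_{jt}$ denote observed subtest $j$ for a person at occasion $t \in \{1, 2\}$. A two-occasion factor model with strong measurement invariance can be written as

$$
x_{jt} = \tau_j + \lambda_j \eta_t + \varepsilon_{jt}, \qquad j = 1, \dots, p,\ t = 1, 2,
$$

where the intercepts $\tau_j$ and loadings $\lambda_j$ carry no time subscript because they are constrained equal across occasions, and $\mathrm{E}[\eta_1] = 0$ is fixed for identification. The claim of interest is then a test of the latent change parameter,

$$
H_0\colon \mathrm{E}[\Delta\eta] = \mathrm{E}[\eta_2 - \eta_1] = 0,
$$

evaluated only after the invariance constraints have been shown to fit. Retest gains that occur at the level of the indicators rather than the ability then tend to surface as misfit of the equal-intercept constraints rather than as a nonzero $\mathrm{E}[\Delta\eta]$. By contrast, a paired t-test or a univariate latent change score model on the composite asks only whether observed means differ between occasions, a comparison that mixes indicator-level retest gains with genuine latent change, which is the spurious increase the abstract describes.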
Published in: Science Journal of Applied Mathematics and Statistics, Volume 13, Issue 2
DOI: 10.11648/j.sjams.20251302.12
Page(s): 34-44
License: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: © The Author(s), 2025. Published by Science Publishing Group.
Keywords: Pre-post Change, Statistical Methods, Model Comparison, Latent Variable Modelling
APA Style
Protzko, J., Nijenhuis, J. T., Ziada, K. E., Metwaly, H. A. M., Bakhiet, S. F., & Maki, Y. B. B. (2025). Analyzing Within-Group Changes in an Experiment: To Deal with Retest Effects, You Have to Go Latent But Not All Latents Are Equal. Science Journal of Applied Mathematics and Statistics, 13(2), 34-44. https://doi.org/10.11648/j.sjams.20251302.12