This study investigated the nature of number processing (including both basic steps, such as recognizing numbers and understanding their magnitude, and late steps, such as performing calculations) and its timing in relation to text processing within solvable and non-solvable arithmetic word problems. We explored whether text and number processing occur in sequential steps or whether they interact, and whether numbers are processed as early as the text. To delve into this question, we examined not only behavioral measures but also eye-tracking measures. Specifically, we analyzed both static eye-tracking measures (i.e., fixations and regressions) and dynamic eye movements (i.e., transitions) between text and numbers. Importantly, in this study, both solvable and non-solvable arithmetic word problems were used, with their level of numerical difficulty being manipulated by the need for a carry/borrow operation.
Interaction of text and number processing in word problem solving models
The disordinal interaction between operation and difficulty in arithmetic word problems reflects that numbers start to be processed before the text is fully processed and before it is understood whether the problem is solvable or non-solvable. This suggests an integration of number and text processing, and it is possible that even more processes are involved simultaneously in word-problem solving. Our findings are supported by Zhou et al. (
2018), who reported neural evidence that the dissociation of text and number processing shown in neuroimaging studies does not imply a complete separation of number processing from text processing, owing to the involvement of the semantic system. Nevertheless, these studies did not investigate the neural correlates of the carry/borrow effect within word problems. Our results provide compelling evidence for a magnitude-based mental representation, as supported by Bergqvist and Österholm (
2010), whose model incorporates a cyclic component that enables it to update mental representations. Moreover, our findings are in line with conclusions drawn from pure text comprehension studies (without numerical information and mathematical operations to be performed), claiming that mental representations are built up in an incremental manner, allowing for regular updates during the reading process (Gernsbacher et al.,
1998). This suggests that the mental representation of word problems is delicate, using an economical and effective parallel method for managing cognitive resources. This finding provides evidence against the sequential model proposed by Kintsch and Greeno (
1985), who argue for a separation of text and number processing. According to their sequential model, a mental representation of a word problem is built by processing the surface and understanding the text semantics, and, in the next step, extending this text model by a problem model to select a fitting schema, fill it in with numbers, and finally carry out the calculation. According to this model, simply reading a non-solvable word problem should be sufficient to recognize that it is non-solvable, and a calculation should not even be initiated if the text of a non-solvable word problem is correctly understood. Notably, the significant interaction between difficulty and operation in non-solvable problems highlights that number manipulation influenced the solving process, which would not be the case if participants first read the text before initiating a calculation. Note that to rule out potential influences of the exact text and the story used, all problems were standardized for word count, letter count, and word frequency (Roth,
2021). Thus, although the exact text and the story used were counterbalanced within solvability and operation but not within difficulty (see Supplementary Materials A and I), it is highly unlikely that the observed interaction is due to textual differences.
Other models which try to explain how word problems are solved account for numbers in various ways but do not fully clarify what occurs in certain scenarios. The schema model and the situation model suggest that numbers are not necessarily processed from the beginning of the solving process. The SECO model examines how numbers function within word problems, attributing importance to the role and meaning of numbers as they semantically relate to the mental representation in word problems. The semantic relatedness was intentionally omitted from our study because the research aimed to explore number processing in a more abstract sense, specifically how arithmetic operations like carrying or borrowing influence problem-solving, without considering the contextual or semantic significance of the numbers involved. Our results suggest a more generalized role for numbers in problem-solving, indicating that numbers may play an early role in cognitive processing, even when they are not directly connected to semantic meanings.
To explain the disordinality of the observed interaction, we propose that the solving process is relatively adaptable and can shift based on the perceived or actual difficulty of the problems. Following the model proposed by Bergqvist and Österholm (
2010), the initial reading (possibly skimming) of a non-solvable problem leads to a mental representation of the content. Based on this mental representation, a suitable strategy is chosen: re-reading, calculating, or ceasing the solving process. In some cases, individuals use the initial mental representation to label the problem as “solvable” and calculate or as “non-solvable” and cease the solving process, without re-reading the text. Such processes are supported by the fact that some behavioral carry or borrow effects are weaker than in previous studies. However, in other cases, individuals may opt to re-read the text, which might result in a modification of their initial mental representation (i.e., from a solvable to a non-solvable representation). In such cases, an extra step is necessary to correctly construct a mental representation and identify simple non-solvable problems as “non-solvable.” This additional step necessitates resources, which manifest as fixations, regressions, and transitions. Crucially, this process seems to be moderated by the problem’s perceived difficulty, aligning with the findings by Doz et al. (
2023). This speaks against mutually exclusive strategies and instead suggests that calculation already happens during reading and re-reading, most likely depending on the perceived difficulty of the problems.
The above reasoning explains well why there might be a borrow effect (as reflected by more fixations, regressions, and transitions) in non-solvable subtraction word problems. Namely, when a problem is non-solvable and involves borrowing, individuals might need more cognitive resources to confirm its non-solvability. Following the same reasoning, one would expect a similar pattern in addition problems, specifically a carry effect in non-solvable addition word problems. However, the observed crossover interaction reveals a descriptively reversed trend in addition, such that more attention is drawn to simple (non-carry) than to difficult (carry) non-solvable addition problems. One possible explanation is that individuals employ a direct translation strategy (e.g., Hegarty et al.,
1995) when they perceive both the text and numbers as easy and initiate a calculation as it requires minimal cognitive resources (i.e., in the case of the non-carry addition problems, representing the easiest of the four problem types in the current study). They might then revise their problem representation or abandon it, depending on the complexity of the problem, which is defined as the combination of factors resulting from carry/non-carry in combination with addition/subtraction, along with other factors such as solvability and non-solvability. Nonetheless, even in the case of abandonment of the calculation, a mismatch with the initial mental representation is recognized, indicating at least partial integration of number processing and text processing. The decision of whether to calculate may be influenced not only by motivational aspects but also by the simple act of “giving up” (Bergqvist & Österholm,
2010). Specifically, on a response (decision) level, participants may lower their threshold for answering that the presented word problem is non-solvable if the potential calculation is deemed relatively complicated. In this case, participants would be more likely, and quicker, to abandon the calculation and respond that the problem is non-solvable. This would lead to fewer eye movements on the numbers in difficult than in easy non-solvable word problems. This notion is supported by the fact that the need for carrying is usually recognized relatively early during the encoding of the problem (Moeller et al.,
2011b), so individuals could choose to capitulate relatively early.
While this explanation holds well for the eye-tracking data from addition word problems and the carry effect, it does not account for all findings of the current study. Namely, it does not account for the results of subtraction word problems and the borrow effect, where the more difficult arithmetic condition leads to more eye movements and no tendency to abandon the calculation or “give up” can be observed. While both explanations (i.e., accounting for the addition findings and the carry effect on the one hand, and accounting for the subtraction findings and the borrow effect on the other hand) are possible and in line with some previous theoretical assumptions, we cannot convincingly explain why one account seems to hold for addition and the other for subtraction.
A different explanation for the reversed carry effect in static measures (i.e., fixations and regressions) for addition problems, as reflected by the significant interactions, might be that addition is the default operation for many participants. That is, when faced with two numbers and before knowing what operation is actually required, participants might be more likely to perform an addition than a subtraction or another arithmetic operation. This assumption can be derived from, on the one hand, participants getting more involved in simple addition word problems than in complex ones (as reflected by more fixations and regressions in non-carry problems than in carry problems), but, on the other hand, no stronger involvement in simple subtraction word problems than in complex ones. Importantly, if an addition were performed with the two numbers in the subtraction problems, half of them would be carry problems and half of them would be non-carry problems. Thus, addition as a default operation would explain why there is on average no difference between borrow problems and non-borrow problems.
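The point that one operand pair can fall into different difficulty classes depending on the operation applied can be illustrated with a minimal sketch. It assumes the standard column-wise definitions (a carry occurs when the digits in a column sum to 10 or more; a borrow occurs when a minuend digit is smaller than the corresponding subtrahend digit). The function names `needs_carry` and `needs_borrow` are hypothetical and are not taken from the study's materials.

```python
def needs_carry(a: int, b: int) -> bool:
    """Return True if column-wise addition a + b requires a carry."""
    while a > 0 and b > 0:
        # A carry arises as soon as one column's digits sum to 10 or more.
        if a % 10 + b % 10 >= 10:
            return True
        a //= 10
        b //= 10
    return False


def needs_borrow(minuend: int, subtrahend: int) -> bool:
    """Return True if column-wise subtraction minuend - subtrahend borrows.

    Assumes minuend >= subtrahend >= 0.
    """
    while subtrahend > 0:
        # A borrow arises when a minuend digit is smaller than the
        # subtrahend digit in the same column.
        if minuend % 10 < subtrahend % 10:
            return True
        minuend //= 10
        subtrahend //= 10
    return False


# Operand pairs quoted in the example problems of this paper.
print(needs_carry(41, 27))   # units: 1 + 7 = 8, tens: 4 + 2 = 6
print(needs_carry(24, 18))   # units: 4 + 8 = 12
# The same pair can differ between operations.
print(needs_borrow(41, 27), needs_carry(41, 27))
```

For the pair 41 and 27, subtraction requires a borrow (1 < 7) whereas addition requires no carry, which shows how the borrow status of a subtraction problem and the carry status of a default addition on the same numbers can dissociate.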
Importantly, it also remains unclear whether difficulty had an effect apart from its interaction with operation. As the interaction was significant and describes the eye movement patterns more precisely, the main effects of difficulty and operation were not tested within the LMMs for any of the four measures. The descriptive pattern of results is consistent with the explanatory approach developed above. Another argument that supports this idea is that the average RT for solvable problems, where calculation is necessary to provide a correct solution, is much longer than for non-solvable problems, where mere reading comprehension is enough to answer correctly. Overall, it can be concluded that there is evidence for interactive processing of text and numbers and for different strategies in addition and subtraction word problems. Indeed, numbers play a significant role in word problem solving and are processed even when not required. These arguments are supported by Orrantia and Múñez (2013), who found evidence suggesting that an automatic, analog magnitude-based mental representation is routinely activated during word problem solving. Our research not only supports these findings but provides even stronger evidence, as we also manipulated the calculation within the word problems.
Limitations and future research
Although the current study could clearly demonstrate that number processing takes place at an early stage in word problem solving, there are some limitations and future research questions to be considered. Firstly, the study employed a 2 (solvability) × 2 (operation) × 2 (difficulty) design, leading to eight word-problem types. Importantly, we decided to use each story for no more than four word problems. The reason for this was that if each story had been repeated eight times, participants might have ceased to engage with the text thoroughly, potentially skipping most of the reading at some point due to familiarity, learning effects, and efficiency concerns. Thus, the design was not fully counterbalanced across difficulty (carry/borrow), but only across solvability and operation. To mitigate the effects of this unbalanced design, we controlled for linguistic difficulty across the word problems. Therefore, we can argue that any differences due to linguistic difficulty arising from mean word frequencies, average sentence length, number of words, and unequal use of stories are absent or negligible. However, there might be other, not yet considered (or even unknown), factors that make word problems more difficult in certain conditions than in others.
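The factorial structure described above can be made explicit with a short sketch that enumerates the eight problem types. The level labels are illustrative assumptions, and the sketch does not reproduce the actual assignment of stories to cells, which is documented in the Supplementary Materials.

```python
from itertools import product

# Factor levels; the labels are illustrative, not the study's coding scheme.
SOLVABILITY = ("solvable", "non-solvable")
OPERATION = ("addition", "subtraction")
DIFFICULTY = ("simple", "complex")  # i.e., without vs. with carry/borrow

# Full crossing of the three two-level factors.
problem_types = list(product(SOLVABILITY, OPERATION, DIFFICULTY))

# Cap on how often one story was reused, as stated in the text.
MAX_USES_PER_STORY = 4

for solvability, operation, difficulty in problem_types:
    print(f"{solvability:13s} {operation:12s} {difficulty}")

print(len(problem_types), "problem types;",
      "each story covers at most", MAX_USES_PER_STORY, "of them")
```

Because a single story can cover at most four of the eight cells, no story appears in every combination of the three factors, which is why the design cannot be fully counterbalanced.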
Secondly, relational terms such as “more” or “less” should not be overlooked, as they are keywords for the arithmetic operations addition and subtraction, respectively. The evidence of interactive processing underscores the importance of closely examining integrative attention between text and numbers. Moreover, in several word problems, the second sentence already pointed towards an arithmetic operation (e.g., the words “in total” in the simple solvable “Museum” problem or in the complex solvable “Gardening” problem pointed towards subtraction), whereas in most other problems, this cue appeared only in the third sentence. Future studies should ensure the placement of keywords pointing towards a specific mathematical operation is consistent across word problems or is manipulated as a factor.
Thirdly, some of the word problems used in this study might have seemed ambiguous in terms of whether they were solvable or non-solvable. For instance, the simple non-solvable “Cinema” problem contained the information that two classes of 41 and 27 children and their teachers went to the cinema and asked about how many seats they needed. Since the number of teachers joining the trip to the cinema was not mentioned, the word problem was non-solvable, but some participants might have assumed that one teacher joined per class. This might explain why only 7.14% of the two (simple and complex) non-solvable versions of the “Cinema” word problems were identified as non-solvable. Moreover, in the non-solvable versions of the “Tennis,” “Museum,” “Busses,” and “Party” problems, some participants might have executed a calculation to find a minimum or maximum as an answer. For instance, if Sophie invites 24 and Mia invites 18 friends to their party, assuming that none of their friends declines the invitation, the maximum of 42 friends would come. The “Marbles,” “Gardening,” and “Fruits” problems, in contrast, might have been less ambiguous and thus more suitably constructed for the present investigation focusing on non-solvable word problems. In addition, cultural context might impact the perceived solvability of problems. Participants’ approaches to problem-solving and their assumptions about a problem’s solvability can be shaped by cultural norms and experiences, which influence their cognitive processing strategies (e.g., Rhodes et al.,
2024). Therefore, we recommend that word problems should be specifically selected and tailored to fit cultural contexts.
Fourthly, we should note that fixations and regressions within AOIs, as well as transitions between parts of the text and numbers, do not capture the full calculation process. Rottmann and Schipper (2002) demonstrated that children tend to shift their gaze upwards or to an imaginary point when concentrating, which is potentially not even on the computer screen and therefore not recorded by the eye tracker. It is plausible that fixations, regressions, and transitions outside of AOIs also indicate important cognitive processes in adults. Thus, a subsequent study could capture eye movements that target not the word problem but points located off the computer screen.
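To make concrete what an AOI-based transition measure does and does not register, the following minimal sketch counts switches between text and number AOIs in a sequence of fixations. It is one plausible operationalization and not necessarily the one used in this study: the labels `"text"` and `"number"`, the use of `None` for fixations outside any AOI, and the function name `count_transitions` are all assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence


def count_transitions(aoi_sequence: Sequence[Optional[str]]) -> Dict[str, int]:
    """Count switches between AOIs and fixations that hit no AOI.

    Each element is the AOI label of one fixation, or None if the
    fixation landed outside every AOI. Off-AOI fixations are skipped
    when counting transitions (an assumption of this sketch).
    """
    transitions = 0
    off_aoi = 0
    previous = None  # label of the most recent on-AOI fixation
    for label in aoi_sequence:
        if label is None:
            off_aoi += 1
            continue
        if previous is not None and label != previous:
            transitions += 1
        previous = label
    return {"transitions": transitions, "off_aoi": off_aoi}


# Hypothetical fixation sequence for one trial.
trial = ["text", "text", "number", None, "text", "number", "number"]
print(count_transitions(trial))  # {'transitions': 3, 'off_aoi': 1}
```

In this sketch, gaze directed away from the AOIs contributes only to the `off_aoi` count, and gaze that leaves the screen entirely would not appear in the sequence at all, which illustrates why such measures cannot capture the full calculation process.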
Lastly, we acknowledge that whether the question is fully read and whether attention is paid to the question early or late during the solving process may indeed play a crucial role in the word problem-solving process. However, to fully explore this aspect, we recommend conducting experiments specifically designed to investigate the unique role of the question within the processing of word problems. In general, our study does not allow us to locate exactly when and how calculation and other processes take place and interact. We encourage other researchers to explore this question further by designing studies that can better isolate and identify processes and their interactions. Additionally, we hope that such future work will refine the existing theoretical framework.