The Foreign Language Classroom Anxiety Scale and Academic Achievement: An Overview of the Prevailing Literature and a Meta-analysis

Foreign language learners experience a unique type of anxiety during the language learning process: Foreign Language Classroom Anxiety (FLCA). This situation-specific anxiety is frequently examined alongside academic achievement in foreign language courses. The present meta-analysis examined the relationship between FLCA measured through the Foreign Language Classroom Anxiety Scale (FLCAS) and five forms of academic achievement: general academic achievement and four competency-specific outcome scores (reading-, writing-, listening-, and speaking academic achievement). A total of k = 99 effect sizes were analysed with an overall sample size of N = 14,128 in a random-effects model with Pearson correlation coefficients. A moderate negative correlation was found between FLCA and all categories of academic achievement (e.g., general academic achievement: r = -.39; k = 59; N = 12,585). The results of this meta-analysis confirm the negative association between FLCA and academic achievement in foreign language courses.


to classroom language learning arising from a uniqueness of the language learning process" (Horwitz, Horwitz, & Cope, 1986, p. 128).
FLCA is thus seen as a unique form of state anxiety that learners experience when they participate in learning and/or using a language (Horwitz, 2017), where the language learner is limited in their communicative ability in the language being learned (Horwitz, 2001; Horwitz, Horwitz, & Cope, 1986). As the construct of FLCA is intrinsically tied to classroom learning, the relationship between FLCA and academic achievement is an oft-researched topic (Teimouri, Goetze, & Plonsky, 2019). Research findings on this relationship have been somewhat consistent, with Horwitz (2001) attributing this relative consistency to the uniform measure used to conceptualise FLCA. Indeed, before the introduction of the Foreign Language Classroom Anxiety Scale (FLCAS) in 1986, cross-comparisons of research on anxiety and its effect in the foreign language classroom were nearly impossible due to the divergent measures and definitions used, a period labelled the "Confounded Approach" by MacIntyre (2017). The publication of the FLCAS heralded the start of the "Specialised Approach", in which the use of the same instrument allowed comparability across studies (MacIntyre, 2017). Horwitz (2001) noted that consistent moderate negative correlations were found between FLCA and academic achievement. This may be a prevalent trend; however, large negative correlations (Vo, Samoilova, & Wilang, 2017), non-significant results (Alidoost, Mirchenari, & Mehr, 2013), and positive correlations (Jee, 2014) have also been reported in recent years. In addition, even though a uniform measure of FLCA has been used across the majority of research in the field, inconsistency still occurs in the use of achievement measures. General academic achievement measures popularly used in the literature include grade point average scores (Aida, 1994) and test or exam scores (Dordinejad & Ahmadabad, 2014).
In addition, several studies use achievement measures pertaining to a specific competence in language learning, such as reading (Jee, 2016), writing (Abu-Rabia, 2004; Khodadady & Khajavy, 2013), listening (Elkhafaifi, 2005; Legac, 2007), and speaking (Phillips, 1992; Satar & Özdener, 2008).
A meta-analysis assesses the strength of the evidence with regard to the relationship between two variables and identifies a common effect across all studies. Thus, in order to fully investigate the strength of the evidence of a moderate negative relationship between FLCA and academic achievement, a meta-analysis on the topic is conducted in this paper. In addition, in order to provide clarity on the achievement measures used, the analyses distinguish between the achievement measures and report a composite effect size with FLCA for each.

Literature Review
FLCA can be seen as a distinct form of anxiety in the language learning process that may affect the outcome of language learning itself, as the "propensity to reach one's full potential as a language learner is partially determined by affective variables such as anxiety" (MacIntyre, 1995a, p. 96). FLCA refers to the specific construct designed and developed by Horwitz et al. (1986) and measured through the FLCAS, although the construct has also been referred to throughout the literature by the shortened "foreign language anxiety" or just "language anxiety".
Throughout this study, these terms will be used interchangeably although still referring to the specific conceptualisation of the construct as determined by Horwitz et al. (1986).
FLCA has been dubbed a situation-specific anxiety and can be discriminated from trait or other state forms of anxiety. Trait anxiety has been compared to a 'personality style' or a habitual anxiety, whereas state anxiety is a psychological and/or physiological reaction to a specific adverse situation and is momentary (Roos et al., 2015). FLCA in turn is both habitual, as it occurs whenever the learner is confronted with language learning, and momentary, as it only pertains to specific instances. Indeed, in a series of studies, MacIntyre and Gardner (1989, 1991) distinguished general forms of anxiety from a language anxiety factor, commenting that "situation-specific constructs can be seen as trait anxiety measures limited to a given context" (Gardner & MacIntyre, 1991, p. 90).
FLCA can additionally be likened to other forms of situation-specific anxieties, such as social anxiety, communicative apprehension, fear of negative evaluation and test anxiety. Indeed, MacIntyre and Gardner (1989) argue that language anxiety may be seen as a form of social anxiety as it specifically focuses on socialising in the language being learned. In addition, Horwitz et al. (1986) further describe communication apprehension, test anxiety, and fear of negative evaluation as "useful conceptual building blocks" (p. 128) for the development of FLCA. Horwitz (2017) warns that reducing FLCA to these three building blocks is a misreading of the original paper and a "false premise" (p. 38). Indeed, "the emotions evoked by Language Anxiety go much deeper than a simple combination of anxieties" (p. 41). The main cause is the distress people experience at being unable "to be themselves and to connect authentically with other people through the limitation of the new language" (p. 41).

Measurement of FLCA
FLCA is measured by the 33-item, 5-point Likert scale questionnaire introduced by Horwitz et al. (1986). Since its introduction in 1986, the scale has gained traction and remains extremely popular, with peer-reviewed research utilising it published regularly. The FLCAS has been adapted or shortened in several studies (see Dewaele & MacIntyre, 2014; Liu & Huang, 2011) and translated into numerous languages, including Hungarian (Tóth, 2008), Persian (Alidoost et al., 2013), Arabic (Dewaele & Al-Saraj, 2015) and Thai (Tanielian, 2014). The scale measures the conceptualisation of language anxiety specific to FLCA, with 20 of the items focusing on speaking and listening in the target language in particular (Rodríguez & Abreu, 2003). The items of the FLCAS are not without criticism, as Sparks and Ganschow (2007) noted that "items seem to be tapping students' perceptions and attitudes about language as well as their feelings about anxiety" (p. 261).
The psychometric evidence regarding the reliability and validity of the FLCAS supports the measure, although not without flaw. Horwitz et al. (1986) reported an internal consistency of α = .93, with numerous studies also reporting high internal consistencies of α > .90 (Aida, 1994; Elkhafaifi, 2005; Gocer, 2014). In addition, Horwitz et al. (1986) reported an acceptable test-retest reliability (rtt = .83). Tóth (2008) found evidence of response validity in think-aloud exercises of the Hungarian translation of the FLCAS. In addition, construct validity of the scale has received considerable research attention.
Differing factor structures underlying FLCA have been found in validation studies, with some shorter measures proposing a unidimensional structure (Dewaele & MacIntyre, 2014) and others a multidimensional structure (Aida, 1994; Liu & Jackson, 2008; Tóth, 2008). A possible reason for the variation in factor analysis results of the FLCAS is offered by Park and French (2013), who noted that the factor structure may differ across sample groups depending on proficiency levels and learning contexts. Another possible reason for the variation in factors may lie in the statistical methods used, as the factors garnered in exploratory factor analyses can be an artefact of the estimation and rotation methods used by the researchers (Field, 2005).

Although the FLCAS is a highly popular measure, several other attempts have been made to define and design measures of language anxiety. Other measures include, but are not limited to: French Class Anxiety Scale (Gardner & Smythe, 1975); French Use Anxiety Scale (Gardner et al., 1997); English Use Anxiety Scale (Clément, Gardner, & Smythe, 1977); Second Language Speaking Anxiety Scale (Woodrow, 2006). In addition, several scales have been developed to measure anxiety in specific language competencies, such as the Foreign Language Listening Anxiety Scale (Elkhafaifi, 2005), Foreign Language Reading Anxiety Scale (Saito, Garza, & Horwitz, 1999), and the Second Language Writing Anxiety Inventory (Cheng, 2004). These scales tend to measure a similarly broad construct of language anxiety, although they are often targeted towards a specific language learning domain or skill. In order to avoid confusion linked to different operationalisations of language anxiety and in the hope of deriving a definitive answer regarding effect size in the meta-analysis, a decision was made to only include studies utilising the FLCAS, whether in its original, shortened, or translated form. This limitation does not unduly restrict the number of studies included in the meta-analysis, as the FLCAS has been widely accepted by the research community and as such is used in the majority of self-reported language anxiety studies.

Academic achievement and FLCA
Since the inception of FLCA, its relation to achievement in language learning has been under scrutiny (Horwitz, 1986). A research trend has emerged with the majority of studies reporting significant moderate negative correlations (Horwitz, 2000); however, as previously noted, deviations from this pattern still occur.
The debate regarding the directionality of the relationship between FLCA and academic achievement has been contentious. A strand of research led by the work of Sparks and Ganschow (see Sparks & Ganschow, 1995; Sparks, Patton, Ganschow, & Humbach, 2009) questions the existence of FLCA as a construct independent of aptitude and contends that anxiety in language learning is the natural result of learning difficulties, particularly those arising out of first language deficits. In a series of response papers (see Horwitz, 2000; MacIntyre, 1995a, 1995b), FLCA is defended, with the argument made that FLCA is an independent construct, distinct from aptitude, but which may influence or be influenced by the performance of the language learner.
FLCA is therefore likened to other domain-specific forms of learning anxiety, such as mathematics anxiety (see Frenzel, Pekrun, & Goetz, 2007; Zan, Brown, Evans, & Hannula, 2006), which may have a detrimental effect on learning above and beyond the natural ability of the student (MacIntyre, 1995b). In addition, FLCA is argued to be a separate construct from first language learning deficits and subtle language learning disabilities, as several studies on the topic have found students who experience high levels of anxiety but are still successful language learners (MacIntyre, 1995a, 1995b). However, it should be noted that as this meta-analysis will utilise correlation coefficients in order to examine effect sizes across studies, no conclusion regarding the directionality can be made. Nevertheless, in this study, the construct of FLCA is seen as a variable distinct from achievement and aptitude, with the strength of the relationship between FLCA and academic achievement being under scrutiny. Beyond the directionality debate, research in the field has also examined the relationship between FLCA and specific language competencies. Particular attention has been paid to the relationship between FLCA and oral classroom activities, which may indicate that listening to and speaking in the target foreign language are particularly anxiety-filled activities for foreign language learners. In particular, the relationship between FLCA and academic achievement in speaking activities has been of interest (Phillips, 1992; Satar & Özdener, 2008). Horwitz (personal communication) included an item on the paralysing effects of anxiety in the FLCAS after students told her about "freezing up" during speaking activities. This association between FLCA and oral achievement may be attributable to the fact that anxiety has been found to interfere with the grammatical precision and interpretive ability of the language learner (Gardner & MacIntyre, 1991).
In addition, language anxiety not only interferes with speaking activities, but also affects the ability of the language learner to receive and decipher messages in listening activities (Kim, 2000). Indeed, several studies have found significant correlations between FLCA and listening academic achievement (Legac, 2007; Tóth, 2007). In particular, highly anxious students may have difficulties in discriminating sounds in listening activities (Horwitz, 1986), with Kim (2000) pointing out that the delivery speed of the activity and the level of vocabulary in particular are sources of contention for anxious students. In addition to speaking and listening, FLCA has also been found to be significantly related to reading- and writing academic achievement (Jee, 2016; Khodadady & Khajavy, 2013). Saito et al. (1999) postulated that anxiety intervenes in the decoding and processing of text, with Sellers (2000) finding that students with higher levels of language anxiety recalled significantly fewer details from a reading text.
Thus, FLCA has been associated with general academic achievement in the target language, as well as academic achievement in the language competencies of speaking, listening, reading, and writing. In order to do justice to the prevalent research, within this meta-analysis academic achievement as an outcome measure will therefore be examined and coded into five categories (general-, reading-, writing-, listening-, and speaking academic achievement), with effect sizes calculated separately for each category of academic achievement.
It should be noted that a meta-analysis was recently conducted by Teimouri, Goetze, and Plonsky (2019) on second language anxiety and achievement, which found an overall effect size of r = -.36 (k = 105; N = 19,933) between FLCA and academic achievement. Although the results of the previous meta-analysis could be compared and contrasted with the current study, it is important to note that the two studies differ in methodology and database. Teimouri et al. (2019) included studies utilising numerous different scales measuring language anxiety, including the Foreign Language Reading Anxiety Scale (Saito et al., 1999), the French Class Anxiety Scale (Gardner, 1985), and the FLCAS. In contrast, the current meta-analysis specifically limits the inclusion criteria to studies utilising the FLCAS as designed and developed by Horwitz et al. (1986). This decision was made for two reasons. Firstly, the FLCAS is the only scale, to the authors' knowledge, to be validated across numerous contexts. More specifically, the FLCAS has been utilised in studies across different countries, differing educational contexts, age groups, and language groups. Secondly, the comparability between studies is ensured in that variables labelled as 'Foreign Language Classroom Anxiety' were indeed defined and captured in the same way in order to ensure an 'apples to apples' comparison.
Measures such as the Foreign Language Reading Anxiety Scale were for example excluded, as the scope and definition of the language anxiety measured cannot be said to be synonymous with FLCA.
Thus, the results presented by Teimouri et al. (2019) can to a certain extent be compared and contrasted with the current study, whilst remaining cognizant of the narrower scope and stringent definitions utilised within the current article. Furthermore, additional insight may be provided by this current meta-analysis, as it may verify some findings made by Teimouri et al. (2019) and raise yet more questions regarding the relationship between language anxiety and academic achievement.

Possible moderators
Research regarding moderators possibly influencing the direction or strength of the relationship between FLCA and academic achievement is scarce. Several mean-level differences have been found with regard to FLCAS scores and/or academic achievement scores on the basis of demographic factors, cultural differences and learning contexts. However, it should be noted that mean-level differences across groups do not necessarily imply a moderator effect, which occurs when the relationship between two variables depends on a third variable (Field, 2005).
Gender has been researched in FLCA and academic achievement studies; however, results vary from study to study. Females have been found to have higher levels of FLCA in some research papers (Abu-Rabia, 2004; Dordinejad & Ahmadabad, 2014; Park & French, 2013), with others reporting no significant difference (Aida, 1994), and others still finding males to have higher levels of FLCA (Alidoost et al., 2013). However, no study could be found where the relationship between FLCA and academic achievement was moderated by gender in the sense that the size of the relation between FLCA and academic achievement differed between boys and girls. Thus, an exploratory stance will be taken within this meta-analysis by examining the female proportion of participants as a possible moderator in the relationship between FLCA and academic achievement.
Some studies have found a significant relationship between FLCA and age, with older participants reporting higher levels of language anxiety in Onwuegbuzie et al. (1999)  were found in other research papers (Dewaele, Petrides, & Furnham, 2008; MacIntyre, Baker, Clément, & Donovan, 2002). However, Dewaele (2007) found additional complexity in the relationship between age and FLCA, depending on the conversation partner and number of languages known. Younger learners indicated less anxiety when communicating with strangers in their second and third languages, as compared to older language learners (Dewaele, 2007). Further complicating matters is the possible interaction effect between age and gender, with Samimy and Tabuse (1992) finding that gender plays a more important role in FLCA scores at a younger age. Although no studies could be found specifically examining the moderating effect of age in the relationship between FLCA and academic achievement, the mean differences reported by previous studies do provide justification for examining the moderating effect of age on an exploratory basis. Therefore, as with gender, the average age of participants will be included as a possible moderator in this meta-analysis in an exploratory fashion.
The variability of language learning courses has been found to affect FLCA and academic achievement (Kim, 2009; Onwuegbuzie et al., 1999). More intensive language learning courses have been found to lower mean anxiety levels of students learning a foreign language (Baker & MacIntyre, 2000; MacIntyre et al., 2003). Increased grade levels have also been associated with strengthening language anxiety as a predictor of achievement (Gardner, Smythe, Clément, & Gliksman, 1976); however, non-significant results have also been found for grade level as a predictor of the level of language anxiety in foreign language students (MacIntyre et al., 2002). In their meta-analysis on second language anxiety and academic achievement, Teimouri et al. (2019) found effect size differences between educational levels as well as study contexts.
Research results in terms of language learning experiences are therefore highly contradictory and nearly impossible to compare from study to study, as descriptions of grade levels, intensity levels, and language learning experiences differ across educational settings, countries and cultures. As such, it was not possible to consider or code classroom context moderators beyond the educational setting of secondary school classes, university language courses and private language institute courses, which will be examined in an exploratory manner.
Due to the limitations of cross-comparisons and coding, only a handful of possible moderators could therefore be included in the meta-analysis of FLCA and academic achievement: average age of participants, the female proportion of the sample, and the type of language institution at which the language is being learned.

Search strategy
A comprehensive literature search was conducted in September 2018, using four online databases: PsycINFO, PsycARTICLES, ERIC and Google Scholar. An additional hand-search of three relevant peer-reviewed journals was carried out in January 2019. Articles published in English in peer-reviewed journals, conference proceedings, and dissertations submitted for doctoral degrees were examined. A two by four search grid was used in this study, with two keywords aimed at finding Foreign Language Classroom Anxiety (FLCA) measures ("Foreign Language Classroom Anxiety" OR "Language Anxiety") and four keywords aimed at finding a measure of the language learner's academic achievement in the language being learned ("Achievement" OR "Performance" OR "Grades" OR "Scores").  result of the search, of which 66 were included in the meta-analysis. An additional call for unpublished research was made to limit the effect of the 'file-drawer problem', which refers to the bias in publication regarding non-significant results (Rosenberg, 2005). Unpublished data also increase the amount of grey literature in the dataset. Grey literature is an umbrella term for conference proceedings, unpublished dissertations, and unpublished data, and as such is usually not subjected to peer review. The inclusion of grey literature has been found to negate the effects of publication bias in meta-analyses (Conn, Valentine, Cooper, & Rantz, 2003). The call for unpublished research resulted in one study being added to the meta-analysis to form a total of 67. The selection process used for the meta-analysis is summarised in Figure 1.
It should be noted that of the 67 studies included in the meta-analysis, 14 studies (15%) can be considered 'grey literature'.

Review Strategy
From the 67 studies generated by the literature search, a total of 99 effect sizes were included in the meta-analysis. The following inclusion criteria were applied:
1. Quantitative Data Requirements: Only quantitative studies were included in the meta-analysis. In addition, only studies that reported correlation coefficients between FLCA and academic achievement were included. In cases in which this information was not available, an attempt was made to reach out to authors if contact details were readily available. One response was garnered and added to the analysis.
3. Measurement of academic achievement: General academic achievement was recorded either through grade point average scores or test/exam scores of the foreign language being learned. In addition, separate data entries for specific achievement measures on reading-, writing-, listening-, and speaking competency in the language being learned were made if available. These competencies were measured through tests, assignments and/or course grades. A summary of the academic achievement measures can be found in Figure 2.
4. Study Designs: No specific designs were excluded from the study. However, certain guidelines were followed with regard to coding experiments and group-difference studies. In the experimental studies included in the meta-analysis, only pre-intervention data were recorded. In group-difference studies, the group total data were recorded if available; if not, separate groups were entered into the dataset and this was specifically noted.

Coding Strategy
The study recorded numerous publication, demographic and descriptive characteristics, as well as data relating to the effect sizes and possible moderator variables (Zessin et al., 2015).
With regard to publication characteristics, the authors, year of publication, full title, and publication medium were recorded. The demographic characteristics recorded included sample size, gender distribution, average age, and country where the sample was gathered. Specific descriptive characteristics regarding language learning were also recorded, namely, language being learned, first language of sample, and whether the language course was undertaken through a school, university or private language institute. The measurement characteristics coded included extensive information on the specific version of the FLCAS that was used: whether the original or shortened version was utilised and whether the measure was translated. Means, standard deviations, and internal reliability as measured by Cronbach's alpha (α) of the FLCAS were recorded for each study. Lastly, all possible measures of academic achievement were coded and recorded as outlined in Figure 2. Means, standard deviations and Pearson's correlation coefficient (r) for each measure of achievement were recorded. Though an attempt was made to include and code several moderators in this meta-analysis, only three moderators were sufficiently represented in the 67 included studies to warrant further analyses. Firstly, the proportion of female participants in each study was calculated when a gender ratio of the sample was provided. Secondly, the average age of participants in each study was noted. If no average age was given, but the sample age range spanned two or three years (e.g. 16-17 years, see Satar & Özdener, 2008), the mid-point of the age range was used. However, studies where the age range exceeded three years and no average age was provided were not included in the moderator analyses. Lastly, the institution type where the language learning took place was noted.
The institution types were coded as School Language Course (n = 19), University Language Course (n = 43), or Private Language School Course (n = 5), and treated as a categorical variable. Due to the low number of private language school data entries, the category was not included in the final moderator analyses, and as such only School Language Course and University Language Course will be analysed as a possible categorical moderator.
An agreement rate (AR) was calculated in order to ensure the reliability of the coding process, with the proportion of exact matching codes and values calculated between two coders.
Twenty of the 64 studies were coded by two coders, with the proportion of the number of observations agreed upon at an acceptable level of AR = 90% (Bayerl & Paul, 2011).
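The agreement rate described above can be illustrated with a minimal sketch, assuming that each coded value (for example institution type, sample size, or FLCAS version) counts as one observation and that agreement means an exact match. The function name `agreement_rate` and the example codes are hypothetical and are not taken from the study's dataset.

```python
def agreement_rate(codes_a, codes_b):
    """Proportion of coded observations on which two coders match exactly."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same observations")
    if not codes_a:
        raise ValueError("no observations to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes for ten observations (illustrative values only).
coder_1 = ["university", "school", 0.62, 33, "original",
           "school", 0.55, 33, "translated", "university"]
coder_2 = ["university", "school", 0.62, 33, "original",
           "university", 0.55, 33, "translated", "university"]

ar = agreement_rate(coder_1, coder_2)
print(f"AR = {ar:.0%}")
```

An exact-match proportion is simple to report, but it does not correct for chance agreement; a chance-corrected index would need a different calculation.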

Data-Analysis Strategy
The hypothesised relationship between FLCA and academic achievement was examined via the 99 correlation coefficients collected from 67 studies. All analyses in the meta-analysis were conducted in R utilising the metafor package (Viechtbauer, 2010), with models estimated using restricted maximum likelihood estimation. All correlation coefficients were converted to Fisher's z scale in order to stabilise the variance of the results, and the pooled Fisher's z value was transformed back to a summary correlation coefficient for each form of academic achievement that was measured (Hedges & Olkin, 1985). A random-effects model was used for the 99 effect sizes, as heterogeneity is assumed across studies. The Hedges and Olkin (1985) method was chosen as the Fisher z transformation corrects for the skew in the sampling distribution of the correlation coefficient, which grows as the correlation coefficient value increases in the population (see Field, 2001). This skew is especially prevalent in studies with smaller samples, and the reading-, writing-, and listening academic achievement measures include only a small number of studies (k < 10).
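The pooling procedure described above can be sketched as follows. This is a simplified illustration, not the study's analysis: the paper estimated its models with restricted maximum likelihood in metafor, whereas the sketch swaps in the closed-form DerSimonian-Laird estimator of the between-study variance because it needs no iterative fitting. The function name `pool_random_effects` and the input correlations and sample sizes are hypothetical.

```python
import math


def pool_random_effects(rs, ns):
    """Pool correlations on Fisher's z scale with a random-effects model.

    Uses the DerSimonian-Laird estimate of tau^2 (an assumption of this
    sketch; the paper itself used REML). Every sample size must exceed 3.
    """
    zs = [math.atanh(r) for r in rs]        # Fisher's z transformation
    vs = [1.0 / (n - 3) for n in ns]        # sampling variance of each z
    w = [1.0 / v for v in vs]               # fixed-effect weights

    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    df = len(zs) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    w_star = [1.0 / (v + tau2) for v in vs]  # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_star, zs)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    lower, upper = z_re - 1.96 * se, z_re + 1.96 * se

    # Back-transform the summary estimate and its interval to the r scale.
    return {
        "r": math.tanh(z_re),
        "ci": (math.tanh(lower), math.tanh(upper)),
        "tau2": tau2,
        "q": q,
    }


# Hypothetical study-level correlations and sample sizes.
result = pool_random_effects(rs=[-0.25, -0.40, -0.55, -0.35], ns=[80, 150, 60, 220])
print(f"pooled r = {result['r']:.3f}, tau^2 = {result['tau2']:.3f}")
```

Because the weights are computed on the z scale, the confidence interval is symmetric around the pooled z value but becomes slightly asymmetric once it is back-transformed to r.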
The effects of the three moderator variables (average age, female proportion, and institution type) on the effect sizes were analysed utilising a random-effects meta-regression with a restricted maximum likelihood estimator in the Jamovi interface of R (Love et al., 2018). The effects of the moderator variables were analysed for each category of academic achievement, with the exception of listening academic achievement and average age, as too few studies on listening academic achievement and FLCA reported average age for a moderator analysis to be viable.
As publication bias occurs with positive results being more likely to be published (Conn et al., 2003), the possibility of publication bias affecting the result of the meta-analysis was also investigated. A funnel plot was utilised to display the relation between effect size and sample size in order to subjectively interpret the possibility of missing findings (Borenstein et al., 2011). In addition, a trim-and-fill analysis of the funnel plot was provided, in which an iterative procedure is used to recalculate the effect size as if the supposed publication bias were removed (Duval & Tweedie, 2000).

FLCA and general academic achievement
The relationship between FLCA, as measured by the FLCAS, and general academic achievement was examined via the 59 effect sizes measuring either a grade point average or a general test/exam score of language learners. The correlation coefficients of each study were Therefore, it can be concluded that FLCA and general academic achievement have a moderately negative relationship and as such higher anxiety individuals are more likely to have a lower general achievement score in a language learning course.

FLCA and reading academic achievement
Reading academic achievement and FLCA showed a moderate negative correlation of r = -.342 (k = 10; N = 995), with a 95% confidence interval of r = -.405 to r = -.278. The results are statistically significant (Z = -10.6; p < .001). In addition, the results indicate that a small amount of heterogeneity exists across the 10 effect sizes (Q(9) = 11.34; p = .25; I² = 15.87%), with almost no variance across studies (τ² < .001). However, it should be noted that the relatively low I² value is influenced by the small number of studies included in the analysis. The purported lack of heterogeneity across the 10 effect sizes does provide further confidence in the proposed conclusion that language learners with higher levels of FLCA are more likely to also have lower levels of reading academic achievement.

FLCA and writing academic achievement
The meta-analysis of 7 effect sizes between FLCA and writing academic achievement results in a moderate negative correlation of r = -.436 (k = 7; N = 1098), with a 95% confidence interval of r = -.569 to r = -.302. The results are statistically significant (Z = -6.41; p < .001). In addition, a moderate amount of heterogeneity is present in the studies included in this meta-analysis (Q(6) = 24.60; p < .001; I² = 75.41%), with a small amount of variance across studies (τ² = .02). FLCA and writing academic achievement do therefore share a moderate negative correlation; however, the test of heterogeneity does indicate some inconsistencies across effect sizes.

FLCA and listening academic achievement
The analysis of the 7 effect sizes between FLCA and listening academic achievement yielded a moderately large negative correlation of r = -.525 (k = 7; N = 986), with a 95% confidence interval of r = -.716 to r = -.333. The results are statistically significant (Z = -5.38; p < .001). In addition, a large amount of heterogeneity is present in the studies included in this meta-analysis (Q(6) = 46.24; p < .001; I² = 87.86), with a large amount of variance across studies (τ² = .06). The high heterogeneity is likely due to the dispersion of the data and the low precision of some data entries in the analysis (Borenstein et al., 2011). While some variation in true effect sizes across the studies is possible, the negative relationship between FLCA and listening academic achievement is clear. [...] (Higgins et al., 2003).
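The I² statistic expresses the share of total variation across effect sizes that is attributable to heterogeneity rather than sampling error. A minimal sketch of the commonly used formula, I² = max(0, (Q - df) / Q) × 100, is given below with the Q values reported above. That the software used in this meta-analysis computed I² in exactly this way is an assumption.

```python
def i_squared(q, df):
    """I-squared (in percent) from Cochran's Q and its degrees of freedom."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100

# Q statistics and degrees of freedom as reported in the text.
reported_q = {
    "reading": (11.34, 9),
    "writing": (24.60, 6),
    "listening": (46.24, 6),
}

i2_values = {name: i_squared(q, df) for name, (q, df) in reported_q.items()}
for name, value in i2_values.items():
    print(f"{name}: I^2 = {value:.2f}%")
```

The values obtained in this way need not match the reported I² values (15.87, 75.41, and 87.86) exactly, as the software and rounding behind the reported figures are not specified here. The ordering of the three categories, from low to high heterogeneity, is the same.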

Moderator analyses
The potential moderators of average age, female proportion of the sample, and institution type were examined for each form of academic achievement (see Table 1). The exception was listening academic achievement and FLCA as moderated by average age, because too few studies reported an average age to make an analysis viable.
No statistically significant moderating effects were found for any category of academic achievement, with the exception of listening academic achievement. The moderator analysis indicates that institution type had a moderating effect on the relationship between listening academic achievement and FLCA (slope = .294, Z = 2.68, p = .007). Thus, students learning a language at university showed a stronger relationship between FLCA and listening academic achievement than students in a language learning course in school. However, it should be noted that only a small number of effect sizes were included in this analysis, which therefore did not meet the recommended minimum of k = 10 (Green et al., 2008). The possibility that the statistically significant result of the moderator analysis of listening academic achievement and FLCA is a Type I error therefore ought to be considered.
Thus, the moderators coded and analysed do not address or alleviate the large amounts of heterogeneity found in the initial meta-analysis results.
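The p-value reported for the institution type moderator can be reproduced from its Z statistic. The sketch below assumes a two-sided test against the standard normal distribution, and additionally recovers the standard error implied by the reported slope and Z value.

```python
import math

def two_sided_p(z):
    """Two-sided p-value of a Z statistic under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Slope and Z statistic as reported for institution type (listening achievement).
slope, z = 0.294, 2.68
implied_se = slope / z
p_value = two_sided_p(z)
print(f"implied SE = {implied_se:.3f}, two-sided p = {p_value:.3f}")
```

Under this assumption the p-value rounds to the reported p = .007.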

Publication bias analysis
In order to assess the possibility of publication bias, a funnel plot was generated for each of the categories of academic achievement. The funnel plot for general academic achievement is presented in Figure 3 (for all other categories, see the Supplementary Materials). In addition, the trim-and-fill method (Duval & Tweedie, 2000) in a random-effects model was applied to each category of academic achievement and FLCA (see Table 2). From the trim-and-fill results in Table 2 [...]

Summary of results
The meta-analysis indicates moderate to large negative effect sizes for all categories of academic achievement and FLCA, with wide confidence intervals indicating no statistically significant difference between categories (see Table 3). Thus, students experiencing higher levels of FLCA are more likely to have a lower achievement score in all categories of academic achievement coded in this study.
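The claim that the categories do not differ significantly can be illustrated with a simple Z test on the difference between two pooled correlations. This is a sketch and not the procedure used in the present study: it assumes that the standard errors can be recovered from the reported 95% confidence intervals (critical value 1.96) and that the two estimates are independent, which may not hold where samples contribute to more than one category. Only the three categories whose intervals are reported in the text are compared.

```python
import math
from itertools import combinations

def se_from_ci(lower, upper, crit=1.96):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * crit)

def difference_test(est_a, est_b):
    """Z test for the difference between two independent pooled estimates."""
    (r_a, lo_a, hi_a), (r_b, lo_b, hi_b) = est_a, est_b
    se_diff = math.sqrt(se_from_ci(lo_a, hi_a) ** 2 + se_from_ci(lo_b, hi_b) ** 2)
    z = (r_a - r_b) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return z, p

# Pooled correlations and 95% CI bounds as reported in the text.
estimates = {
    "reading": (-0.342, -0.405, -0.278),
    "writing": (-0.436, -0.569, -0.302),
    "listening": (-0.525, -0.716, -0.333),
}

comparisons = {}
for a, b in combinations(estimates, 2):
    comparisons[(a, b)] = difference_test(estimates[a], estimates[b])
    z, p = comparisons[(a, b)]
    print(f"{a} vs {b}: Z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions none of the three pairwise differences reaches significance at the .05 level, which is in line with the summary above.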

Discussion
The results of this meta-analysis confirmed the negative relationship between FLCA and academic achievement. General academic achievement as measured by grade point averages or test scores shared a moderate negative correlation with FLCA (r = -.39; k = 59; N = 12,858), confirming the observation made by Horwitz (2001) in her review of FLCA. In addition, the overall effect size found for general academic achievement was markedly similar to the one found in the meta-analysis of Teimouri et al. (2019) of r = -.36 (k = 105; N = 19,933). It should be noted, however, that the meta-analysis conducted by Teimouri et al. (2019) was a broader study on general second language anxiety and academic achievement, and as such only 33 of the 67 studies included in this meta-analysis were also captured by Teimouri et al. (2019).
Furthermore, the results indicate that individual categories of competence are negatively related to FLCA. Reading-, writing-, listening-, and speaking academic achievement have each been separately and negatively linked to FLCA (see Table 3) [...] (Phillips, 1992). However, the analysis of speaking academic achievement and FLCA did indicate a large amount of heterogeneity, and the relationship may thus be exacerbated or impeded by other factors such as a general public speaking anxiety. Indeed, all categories of academic achievement and FLCA, with the exception of reading academic achievement, showed large amounts of heterogeneity (see Table 3), indicating the relationships to be complex and most likely influenced by moderators.
The moderators coded in this analysis did not provide additional insight into the complexity of FLCA and academic achievement. The average age, the female proportion of the sample, and the institution type were not found to significantly moderate the relationships between FLCA and the different categories of academic achievement, with the exception of FLCA and listening academic achievement as moderated by institution type (slope = .294; Z = 2.68; p < .01).
However, as the number of effect sizes included in that analysis was very small (k = 5) and no comparable result could be found in any other category of academic achievement, it is highly likely that this significant result is a Type I error, that is, a false positive finding.
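The risk of a false positive can be made concrete with the family-wise error rate across the moderator tests. The sketch below assumes 14 tests (three moderators for each of the five forms of academic achievement, minus the excluded analysis of average age for listening), a per-test α of .05, and independence between tests. The count of 14 and the independence assumption are illustrative and are not stated in the text.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error across independent tests."""
    return 1 - (1 - alpha) ** n_tests

alpha = 0.05
n_tests = 14  # assumed: 3 moderators x 5 achievement categories, minus 1 excluded

fwer = familywise_error(alpha, n_tests)
bonferroni_alpha = alpha / n_tests

# p-value reported for institution type as a moderator of listening achievement.
reported_p = 0.007
survives_bonferroni = reported_p < bonferroni_alpha

print(f"family-wise error rate = {fwer:.3f}")
print(f"Bonferroni-adjusted alpha = {bonferroni_alpha:.4f}")
print(f"p = {reported_p} survives correction: {survives_bonferroni}")
```

Under these assumptions the chance of at least one false positive across the moderator tests is roughly one in two, and the single significant result would not survive a Bonferroni correction.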
On the other hand, the majority of moderator analyses, as well as the primary analyses of writing and listening academic achievement, can perhaps be considered underpowered due to the small number of studies included in the analyses. This increases the probability of not only Type I errors, but also Type II errors, with the possibility of a true finding being dismissed. The low sample sizes and the small number of studies in the areas under scrutiny, such as the specific academic competencies, therefore create errors that may permeate the findings of the meta-analysis.
Future research efforts are therefore needed in order to establish variables that moderate the relationship between foreign language anxiety and academic achievement. In addition, as age, gender and instruction context have all been identified as possible moderators in the literature (Abu-Rabia, 2004; Dewaele, 2007; Kim, 2009), future research regarding these variables and their complex and possibly dynamic relation to FLCA and academic achievement is also recommended. Indeed, more may be understood regarding individual differences and academic achievement in foreign language learning by expanding the scope of the meta-analysis to include other variables in the FLCA nomological network such as self-perceived competence, willingness to communicate and foreign language enjoyment. A meta-analytic structural equation model on such variables is therefore highly recommended.
Future research efforts should also extend to further examining the directionality in the relationship between FLCA and academic achievement. The relationship between the two variables has been described as a "vicious circle" (Cheng, Horwitz, & Schallert, 1999, p. 437), and future studies examining causality between FLCA and academic achievement would make a valuable contribution.
Beyond directionality, the question of malleability ought to be further addressed. This meta-analysis established that students with higher levels of FLCA are placed at a disadvantage, as they are more likely to have lower achievement scores than their peers with lower levels of FLCA.
This can have a detrimental effect on the success of a student in a high-stakes test environment, where admission to schools or programs is dependent on grades or exam scores. Encouraging findings have been made throughout the years, with both teacher and learner strategies developed to reduce the presence of FLCA in learners (Oxford, 2017). Thus, continued research on the reduction and management of FLCA, as well as the acknowledgement of its presence in the foreign language learning process, should remain a focal point in the pedagogy of individual differences in language learning.

Limitations
The study has several limitations that ought to be considered. Firstly, in the methods of the meta-analysis, several studies were excluded due to a lack of necessary statistical data or because no full text could be located. Efforts were made to contact authors in such cases, but only one reply was fruitful. In addition, non-English publications were excluded.
Given the popularity of the topic in applied linguistics, it is also highly likely that unpublished data exist that could have been added to the meta-analysis. The studies included in the meta-analysis may therefore not represent the entirety of the research on the subject; however, we are confident that the meta-analysis captured a substantial portion of existing studies.
Secondly, the current meta-analysis examined language anxiety solely through the lens of the Foreign Language Classroom Anxiety Scale (FLCAS), one specific measurement instrument.
The findings of the meta-analysis are therefore limited to foreign language classroom anxiety as defined and operationalised by Horwitz et al. (1986). A broader meta-analysis encompassing all possible measures of language anxiety has been conducted and can inform interested readers on the relation between FLCA as measured through different instruments and academic achievement (see Teimouri et al., 2019). In addition, translated and shortened versions of the FLCAS were included in the analyses, and these may not necessarily capture the broad context of the original 33-item FLCAS.
Thirdly, the large amount of heterogeneity identified in the analyses indicates more complexity in the relationship between FLCA and academic achievement than can be captured by correlation coefficients. In addition, the moderators coded in the analyses did not alleviate unexplained variance. Future studies examining moderators in terms of individual differences variables (foreign language enjoyment, willingness to communicate, and self-perceived competence), as well as sample characteristics (level of multilingualism, SES, culture), are therefore needed.
The small number of studies in the specific competency-based achievement measures (reading-, writing-, listening-, and speaking academic achievement) implies that caution is needed in the interpretation of results. The publication bias results of these analyses further imply that the small number of studies did impact the strength of correlations. This caution should also be extended to the moderator analyses, where small numbers of effect sizes undoubtedly created instability in the results.

Conclusion
It is clear that FLCA, as measured by the FLCAS, is as prevalent in language learning today as it was at the inception of the variable in 1986. The negative relationship between FLCA and academic achievement found in this meta-analysis confirms that anxiety and low achievement tend to occur together in the language learning classroom. With this result in mind, we urge researchers to further examine the directionality of the relationship between language anxiety and academic achievement, and subsequently investigate methods of reducing or managing anxiety in the language learning process. Efforts to minimise its negative impact ought to be made and, subsequently, potential effects on achievement should be investigated. This is especially a concern as low achievement scores can result in real-world negative consequences for language learners. We therefore hope that this meta-analysis can provide a useful evidence-based guide for language instructors, designers of language learning courses and materials, and researchers on the importance of individual differences such as FLCA and how they relate to achievement in language learning.