Investigating standards in GCSE French, German and Spanish through the lens of the CEFR

RESEARCH AND ANALYSIS

Milja Curcin and Beth Black

Acknowledgements

We would like to thank the many people without whose work or advice this study

would not have been possible:

• all our participants, who devoted a lot of their time and enthusiasm to work on

this study and share their expertise and opinions,

• colleagues at Ofqual who have helped in different ways (with IT support,

admin and paper shuffling, analytical support and advice, and various ad hoc

and last-minute requests for help) – in particular, Nadir Zanini, Joe Colombi,

Robin Smith, Ben Laurens, Matthew Stratford, Richard Coles and Jonathan

Clewes,

• Jane Lloyd, Alastair Pollitt, Stuart Shaw and Neil Jones, for invaluable insights

and advice.

Contents 
Acknowledgements
List of tables
List of figures
Executive summary
Introduction
  Why look at GCSE performance and assessment standards in relation to grading severity using CEFR descriptors
  Why CEFR can be considered appropriate for use in the context of GCSE MFLs in England
Method
  Overview
  Specifications
  Participants
  Familiarisation and training
  Content mapping
  Rank ordering of written and spoken performances
  Standard linking of reading and listening comprehension assessments
  Data analysis
  Limitations
Results
  Content mapping
  Rank ordering of written and spoken performances to map to the CEFR
  Standard linking of reading and listening comprehension assessments
  Qualitative results
Discussion
References
 

List of tables 
Table 1 GCSE to CEFR mapping for Spanish
Table 2 GCSE to CEFR mapping for German
Table 3 GCSE to CEFR mapping for French
Table 4 Maximum mark for specifications and papers
Table 5 Breakdown of panellist background/role by panel
Table 6 Breakdown of A level teacher school type and CEFR familiarity by panel
Table 7 Key features of the judging allocation design (identical for each component and language)
Table 8 CEFR levels and sub-levels used in standard linking
Table 9 Numerical rating scale categories – CEFR sub-levels
Table 10 Numerical rating scale categories – CEFR levels
Table 11 Example frequency table from which cut scores are calculated
Table 12 Content mapping ratings for productive skills
Table 13 Content mapping ratings for receptive skills
Table 14 Overall model fit
Table 15 SSR and separation coefficients
Table 16 Mark/rank order-measure correlations
Table 17 Mark points of writing scripts included in the rank ordering exercise
Table 18 GCSE to CEFR mapping for Spanish writing
Table 19 GCSE to CEFR mapping for German writing
Table 20 GCSE to CEFR mapping for French writing
Table 21 Mark points of speaking scripts included in the rank ordering exercise
Table 22 GCSE to CEFR mapping for Spanish speaking
Table 23 GCSE to CEFR mapping for German speaking
Table 24 GCSE to CEFR mapping for French speaking
Table 25 GCSE to CEFR mapping for Spanish productive skills
Table 26 GCSE to CEFR mapping for German productive skills
Table 27 GCSE to CEFR mapping for French productive skills
Table 28 ICCs based on initial ratings
Table 29 ICCs based on final ratings
Table 30 CEFR level rating frequency and cut scores for reading
Table 31 GCSE to CEFR mapping for Spanish reading comprehension
Table 32 GCSE to CEFR mapping for German reading comprehension
Table 33 GCSE to CEFR mapping for French reading comprehension
Table 34 CEFR level rating frequency and cut scores for listening
Table 35 GCSE to CEFR mapping for Spanish listening comprehension
Table 36 GCSE to CEFR mapping for German listening comprehension
Table 37 GCSE to CEFR mapping for French listening comprehension
Table 38 GCSE to CEFR mapping for Spanish receptive skills
Table 39 GCSE to CEFR mapping for German receptive skills
Table 40 GCSE to CEFR mapping for French receptive skills
Table 41 Percentage of total marks required for each CEFR level
Table 42 Percentage of total marks required for each GCSE grade
Table 43 GCSE to CEFR mapping for Spanish
Table 44 GCSE to CEFR mapping for German
Table 45 GCSE to CEFR mapping for French
Table 46 Indicative linking at qualification level
   

List of figures 
Figure 1 Estimated qualification level mapping for each language and grade
Figure 2 The CEFR global scale
Figure 3 The structure of the CEFR descriptive scheme
Figure 4 Sequence of activities in the linking exercise
Figure 5 Nature of participants’ experience with the CEFR
Figure 6 Participants’ attitudes to the CEFR and its use in understanding GCSE standards
Figure 7 Experience of writing reading/listening comprehension test items and standard setting
Figure 8 Training evaluation – productive skills
Figure 9 Training evaluation – receptive skills
Figure 10 Confidence in understanding the distinction between CEFR levels at the end of the training
Figure 11 Familiarisation ratings distribution of CEFR exemplars – French reading
Figure 12 Familiarisation ratings distribution of CEFR exemplars – French listening
Figure 13 Familiarisation ratings distribution of CEFR exemplars – German reading
Figure 14 Familiarisation ratings distribution of CEFR exemplars – German listening
Figure 15 Familiarisation ratings distribution of CEFR exemplars – Spanish reading
Figure 16 Familiarisation ratings distribution of CEFR exemplars – Spanish listening
Figure 17 “I found rank ordering electronic files (writing or speaking) feasible”
Figure 18 Example of one-mark tasks
Figure 19 Example of a multi-mark task
Figure 20 Spanish writing rank order - individual script measures
Figure 21 German writing rank order - individual script measures
Figure 22 French writing rank order - individual script measures
Figure 23 Spanish writing rank order - average grade boundary script measures
Figure 24 German writing rank order - average grade boundary script measures
Figure 25 French writing rank order - average grade boundary script measures
Figure 26 Spanish speaking rank order - individual script measures
Figure 27 German speaking rank order - individual script measures
Figure 28 French speaking rank order - individual script measures
Figure 29 Spanish speaking rank order - average grade boundary script measures
Figure 30 German speaking rank order - average grade boundary script measures
Figure 31 French speaking rank order - average grade boundary script measures
Figure 32 Spanish reading comprehension - distribution of CEFR sub-levels and levels
Figure 33 German reading comprehension - distribution of CEFR sub-levels and levels
Figure 34 French reading comprehension - distribution of CEFR sub-levels and levels
Figure 35 Spanish reading comprehension – GCSE grade to CEFR mapping
Figure 36 German reading comprehension – GCSE grade to CEFR mapping
Figure 37 French reading comprehension – GCSE grade to CEFR mapping

Figure 38 Spanish listening comprehension - distribution of CEFR sub-levels and levels
Figure 39 German listening comprehension - distribution of CEFR sub-levels and levels
Figure 40 French listening comprehension - distribution of CEFR sub-levels and levels
Figure 41 Spanish listening comprehension – GCSE grade to CEFR mapping
Figure 42 German listening comprehension – GCSE grade to CEFR mapping
Figure 43 French listening comprehension – GCSE grade to CEFR mapping
Figure 44 Estimated qualification level mapping for each language and grade
   

Executive summary

While most stakeholders would agree that modern foreign language (MFL) study is a

valuable part of the curriculum, there is a general decline in numbers of students

taking GCSEs in these subjects. There is a persistent perception that MFL GCSEs

are more difficult than other subjects. This is often cited as a reason for

declining subject take-up at secondary and university level. On the face of it,

consistent patterns in statistical evidence appear to support the notion that MFL

GCSEs are graded more severely than other GCSE subjects. However, while such

statistical analyses may indicate on average lower grade outcomes when controlling

for prior or concurrent attainment, these analyses do not take into account a

multitude of factors related to (perceptions of) difficulty and demand. These could be,

for instance, subject demand, nature of assessment, allocation of teaching time and

other resources, motivation of students, efficiency and effectiveness of teaching and

learning, etc. (Coe, 2008; Newton, 2012; Lockyer and Newton, 2015; Wingate, 2018;

Macaro, 2008; Graham, 2002; Klapper, 2003; etc.).

This study was part of a programme of research carried out by Ofqual to help inform

its policy decision of whether to intervene and adjust grading standards in MFL

GCSE qualifications in French, German and Spanish. The study was designed to

describe the nature of performance and assessment standards in these subjects

using the ‘metalanguage’ of the Common European Framework of Reference for

languages (CEFR), an internationally widely used framework describing language

ability via a common ‘can do’ scale, allowing broad comparisons across languages

and qualifications. The aim was to provide a platform for a more principled

discussion about whether GCSE MFL performance standards and corresponding

grading standards are appropriate for these qualifications, or are indeed too high.

We do not believe that possible discrepancies between the notions of communicative

language competence and language use as described in the CEFR, and the way

communicative language competence and use may be understood, taught, and

assessed at GCSE level, would in themselves invalidate an attempt to describe GCSE

MFLs in terms of CEFR descriptors. We would argue that, as long as the broad

intention of the MFL GCSE curriculum and pedagogy is reasonably aligned to the

CEFR – and this would appear to be the case as, for instance, MFL GCSEs should

“develop [learners’] ability to communicate confidently and coherently with native

speakers in speech and writing, conveying what they want to say with increasing

accuracy” (DFE, 2015: 3) – a description in terms of the CEFR may not only be

appropriate, but also helpful.

However, we do believe that it is important to be aware of the specific context of the

MFL GCSEs, as it may account for occasional mismatches between CEFR descriptors

and GCSE assessments/performances that are observed in the linking. In addition,

an awareness of these discrepancies could be helpful for improving both current

language pedagogy and assessment methods where appropriate, helping learners to

achieve the goal of communicative language competence at the level appropriate for

their current phase of education.

Because this study was designed as a piece of research to answer a specific

research question, rather than as a full-blown linking study, it consequently has

some potential limitations in scope and generalisability. This is, to our knowledge,

the first explicit attempt to link GCSE MFL qualifications to the CEFR using

recommended methodology, and so we consider this study primarily exploratory.

Involvement and endorsement of other relevant stakeholders (e.g. Department for

Education, exam boards), greater resources, further refinement of some aspects of

the methodology and linking of specifications from other exam boards would be

necessary to conduct a linking study where the results might be considered to

represent an “official” linking. Therefore, the findings need to be treated as

essentially descriptive and indicative. Having said this, we have made every effort to

conduct this linking study according to best practice in the field, and in this sense,

the results should be reasonably robust for those specifications on which the linking

was performed.

In this study, key grades (grades 9, 7 and 4) in GCSE French, German and Spanish

on the summer 2018 tests were notionally linked to the CEFR scale. Initially, content

mapping (i.e., relating the construct and content coverage of the GCSE to the CEFR)

was carried out for each subject by a CEFR expert and a GCSE subject expert.

Subsequently, panels of 13 experts (including CEFR experts, Higher Education and

subject experts, A level teachers and exam board representatives) carried out the

following activities for each subject:

• For writing and speaking, they rank ordered, in terms of overall quality, series

of GCSE performances (at grades 9, 7 and 4) interspersed with performances

previously independently benchmarked on the CEFR scale. This created an

overall performance quality scale on which the relative position of the GCSE

and CEFR performances was determined, and CEFR-related performance

standards at grades 9, 7 and 4 extrapolated from this.

• For reading and listening comprehension, they conducted a ‘standard linking’

exercise using the ‘Basket Method’ to rate each mark point on the tests in

terms of the CEFR levels. CEFR level cut scores were derived from these

ratings and grades 9, 7 and 4 related to these in terms of proportions of marks

on the test needed to achieve each.

• The linking results at component level were averaged to get a

qualification-level estimate of the mapping of each grade to the CEFR level.
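
As a minimal illustration of the cut-score step in the second bullet above, the sketch below assumes one common formulation of the Basket Method: each judge places every mark point in the 'basket' of the lowest CEFR level at which a learner could be expected to earn it, and the score associated with a level is the number of mark points placed at or below that level, averaged across judges. The level list, function names and ratings are hypothetical and are not taken from the study data; the study's own procedure may differ in detail.

```python
from collections import Counter
from statistics import mean

# Hypothetical ordered subset of CEFR levels used by the judges.
LEVELS = ["A1", "A2", "B1", "B2"]


def judge_cut_scores(ratings):
    """Cumulative count of mark points rated at or below each level (one judge)."""
    counts = Counter(ratings)
    cuts = {}
    running = 0
    for level in LEVELS:
        running += counts.get(level, 0)
        cuts[level] = running
    return cuts


def panel_cut_scores(panel):
    """Average each level's cut score across all judges on the panel."""
    per_judge = [judge_cut_scores(ratings) for ratings in panel]
    return {level: mean(c[level] for c in per_judge) for level in LEVELS}


def percent_of_marks(cuts, max_mark):
    """Express cut scores as a percentage of the paper's maximum mark."""
    return {level: 100 * score / max_mark for level, score in cuts.items()}


# Invented example: two judges rate the 10 mark points of a short paper.
panel = [
    ["A1"] * 3 + ["A2"] * 4 + ["B1"] * 3,
    ["A1"] * 5 + ["A2"] * 3 + ["B1"] * 2,
]
cuts = panel_cut_scores(panel)
print(percent_of_marks(cuts, max_mark=10))
# {'A1': 40.0, 'A2': 75.0, 'B1': 100.0, 'B2': 100.0}
```

A GCSE grade boundary expressed as a percentage of the maximum mark can then be compared with these percentages to see which CEFR level it falls within.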

The results of the linking at component level are shown in Tables 1 to 3. The linking

of GCSE grades to the CEFR levels across components within Spanish and German

is very consistent, with productive skills being at a lower CEFR level than the

receptive skills. French mapping is less consistent, but this may be partly due to the

issues with the CEFR exemplars for productive skills, and apparent issues with the

listening comprehension paper (described in the Results section). Therefore, we

would suggest that the linking for French is more tentative than for the other two

languages. The patterns are broadly consistent across the 3 languages, with the

notable exception of grade 7 for productive skills (lowest standard in Spanish), and

grade 4 for receptive skills (highest standard in Spanish).

Figure 1 shows indicative linking at qualification level for each grade, based on

averaging across the CEFR sub-levels of components. It appears that performance

standards between the 3 languages are reasonably aligned at qualification level

despite some component-level inconsistencies. The results suggest that grade 4 is

around high A1 level for Spanish and mid A1 level for German and French. Grade 7

is around mid A2 level and grade 9 around low B1 for all languages. This result

accords with the results of the content mapping, which suggested that each of the 3

GCSE MFL specifications assessed most of the skills up to A2+ (i.e. high A2) level,

with some aspects of language competence assessed up to low B1 level. While a

degree of consistency across languages is perhaps to be expected given that these

assessments are supposed to be developed based on specifications that should be

reasonably aligned in terms of content and implicit demand, there is no particular

reason why we should expect the performance standards for different grades to be

perfectly aligned across languages. This reminds us that considering standards

between even quite related subjects involves considerable nuance and

interpretation.
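
The averaging across component sub-levels described above can be sketched as follows. This is only an illustration under stated assumptions: sub-levels are coded as equally spaced integers (low A1 = 1 up to high B1 = 9), the four components are weighted equally, and the component values in the example are invented rather than taken from Tables 1 to 3. Ranges such as "high A1-low A2" are not handled.

```python
import math
from statistics import mean

# Hypothetical equal-interval coding: low A1 = 1, mid A1 = 2, ..., high B1 = 9.
SUB_LEVELS = [
    f"{band} {level}"
    for level in ("A1", "A2", "B1")
    for band in ("low", "mid", "high")
]
CODE = {name: i for i, name in enumerate(SUB_LEVELS, start=1)}


def qualification_level(component_levels):
    """Average the coded component sub-levels and map back to the nearest one."""
    avg = mean(CODE[sub_level] for sub_level in component_levels.values())
    nearest = math.floor(avg + 0.5)  # round half up, avoiding banker's rounding
    return avg, SUB_LEVELS[nearest - 1]


# Invented component results for a single grade in a single language.
example = {
    "writing": "mid A2",
    "speaking": "mid A2",
    "reading": "high A2",
    "listening": "low B1",
}
print(qualification_level(example))  # (5.75, 'high A2')
```

Treating sub-levels as equally spaced is itself a simplification, which is one reason a qualification-level figure of this kind is best read as indicative.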

However, in addition to the limitations discussed in the Limitations section, an

important “health warning” regarding the interpretation of this linking is in order. It

should be borne in mind that the limitations of assessments highlighted in both

content mapping and in discussion with panellists, particularly with respect to

assessment of interaction and integrated skills, would to some extent limit the

interpretation based on these assessments that candidates are fully at A2 or B1

level. This is because the assessments themselves provide little evidence of some of

the skills essential for communicative language competence, such as ability to

engage in meaningful interaction. In a sense, it may be more appropriate to say that,

overall, candidates achieving each of the GCSE grades possess most, but not all, of

the skills and knowledge required of the CEFR level assigned in this linking exercise.

While this is also true of A2 level to some extent, most of the caveats and

discrepancies relate to where assessments appear to be targeting B1 level, as in

many cases assessments were patchy in the extent to which they allowed for all of

the skills relevant for B1 level to be demonstrated. This would mean that the levels

assigned to different grades could be seen as overestimates to some extent,

particularly for B1 level, but also to some extent for A2. This should be borne in mind

in any discussions about whether A2 or B1 level may be appropriate for different

GCSE grades.

This linking study dealt with describing the content/construct of GCSE MFL

specifications and tests, as well as performances, in terms of the CEFR, and relating

the current GCSE grading standards to the CEFR. The results essentially give an

indication of where GCSE assessments are pitched and which performance

standards are represented by different GCSE grades, using the language of the

CEFR descriptors. Therefore, this linking is not a statement of what the GCSE

standard should be, but an approximate description of what the performance and

assessment/grading standard currently appears to be, using the language and

descriptors of the CEFR.

The GCSE MFL assessments reviewed in this study do not appear to elicit sufficient

evidence of certain linguistic skills that may be considered by some to be a crucial

part of communicative language competence. It would seem important to investigate

these issues further and explore ways in which the assessments might be made

more effective in assessing these important skills. As far as GCSE MFLs should

enable learners to act in real-life situations, expressing themselves and

accomplishing tasks of different natures, it would make sense that, like the CEFR,

they put the co-construction of meaning (through interaction) at the centre of the

learning and assessment process.

The results are offered to stakeholders for consideration as to whether the content

and performance standards and assessment demands associated with the key

GCSE grades are appropriate given the purpose of GCSE qualifications, the spirit

and nature of the curriculum, and the current context of GCSE MFL learning and

teaching. For instance, if the relevant stakeholders were to conclude that, generally

speaking, a mid A2 level of performance is too high for GCSE grade 7, this could

provide a rationale to support a change to grading standards. However, in this case,

this rationale would not be based on statistical evidence or any notions of

comparable ‘value-added’ between different subjects, but based on an

understanding of what an appropriate performance standard, in terms of what

students can do, is or should be for each grade within MFLs themselves.

We would suggest, however, in the spirit of the CEFR, that discussions around the

appropriateness of language performance and assessment standards should

consider important aspects of the context of language teaching in schools. The

CEFR (Council of Europe, 2018: 28) suggests planning backwards from learners’

real life communicative needs, with consequent alignment between curriculum,

teaching and assessment. As North (2007a) points out, educational standards must

always take account of the needs and abilities of the learners in the context

concerned. Norms of performance need to be definitions of performance that can

realistically be expected, rather than relating standards to “some neat and tidy

intuitive ideal” (Clark, 1987: 46). This posits an empirical basis to the definition of

standards. If used appropriately, the CEFR could aid this endeavour in the context of

GCSE MFLs in England.

Table 1 GCSE to CEFR mapping for Spanish

Columns: GCSE grade, followed by CEFR sub-level and CEFR level for each of Writing, Speaking, Reading and Listening.

Cell values recovered, in order of appearance: Mid-high; Low-mid; Mid-high; Low-mid.

Table 2 GCSE to CEFR mapping for German

Columns: GCSE grade, followed by CEFR sub-level and CEFR level for each of Writing, Speaking, Reading and Listening.

Cell values recovered, in order of appearance: Low-mid; Mid A1; High A1-low A2; A1/A2; High A1-low A2; A1/A2; Mid-high; High A2; Mid-high; Low-mid; Low B1; Low-mid.

Table 3 GCSE to CEFR mapping for French

Columns: GCSE grade, followed by CEFR sub-level and CEFR level for each of Writing, Speaking, Reading and Listening.

Cell values recovered, in order of appearance: High A1-low A2; A1/A2; Low-mid; High A1-low A2; A1/A2; Low-mid; High A2-low B1; A2/B1; Mid-high; High A1-low A2; A1/A2; Low-mid; Mid-high; Low-mid; High A2-low B1; A2/B1.

PROFICIENT USER

C2: Can understand with ease virtually everything heard or

read. Can summarise information from different spoken

and written sources, reconstructing arguments and

accounts in a coherent presentation. Can express

him/herself spontaneously, very fluently and precisely,

differentiating finer shades of meaning even in more

complex situations.

C1: Can understand a wide range of demanding, longer

texts, and recognise implicit meaning. Can express

him/herself fluently and spontaneously without much

obvious searching for expressions. Can use language

flexibly and effectively for social, academic and

professional purposes. Can produce clear, well-

structured, detailed text on complex subjects, showing

controlled use of organisational patterns, connectors

and cohesive devices.

INDEPENDENT USER

B2: Can understand the main ideas of complex text on both

concrete and abstract topics, including technical

discussions in his/her field of specialisation. Can

interact with a degree of fluency and spontaneity that

makes regular interaction with native speakers quite

possible without strain for either party. Can produce

clear, detailed text on a wide range of subjects and

explain a viewpoint on a topical issue giving the

advantages and disadvantages of various options.

B1: Can understand the main points of clear standard input

on familiar matters regularly encountered in work,

school, leisure, etc. Can deal with most situations likely

to arise whilst travelling in an area where the language

is spoken. Can produce simple connected text on topics

which are familiar or of personal interest. Can describe

experiences and events, dreams, hopes and ambitions

and briefly give reasons and explanations for opinions

and plans.

BASIC USER

A2: Can understand sentences and frequently used

expressions related to areas of

most immediate relevance (e.g. very basic personal

and family information, shopping, local geography,

employment). Can communicate in simple and routine

tasks requiring a simple and direct exchange of

information on familiar and routine matters. Can

describe in simple terms aspects of his/her background,

immediate environment and matters in areas of

immediate need.

A1: Can understand and use familiar everyday expressions

and very basic phrases aimed at the satisfaction of

needs of a concrete type. Can introduce him/herself

and others and can ask and answer questions about

personal details such as where he/she lives, people

he/she knows and things he/she has. Can interact in a

simple way provided the other person talks slowly and

clearly and is prepared to help.

Figure 1 Estimated qualification level mapping for each language and grade

Markers shown against the scale (language initial and GCSE grade): S 4, S 7, G 4, G 7, F 4, F 7, S 9, G 9, F 9.

Introduction

While most stakeholders would agree that modern foreign language (MFL) study is a

valuable part of the curriculum, there is a general decline in numbers of students

taking GCSEs in these subjects. There is a persistent perception that MFL GCSEs

are more difficult than other subjects. This is often cited as a reason for

declining subject take-up at secondary and university level.

On the face of it, consistent patterns in statistical evidence appear to support the

notion that MFL GCSEs are graded more severely than other GCSE subjects.

However, while statistical analyses may indicate on average lower grade outcomes

when controlling for prior or concurrent attainment, these analyses do not take into

account a multitude of factors related to (perceptions of) difficulty and demand.

These could be, for instance, subject demand, nature of assessment, allocation of

teaching time and other resources, motivation of students, efficiency and

effectiveness of teaching and learning, etc. (Coe, 2008; Newton, 2012; Lockyer and

Newton, 2015; Cuff, 2017; Wingate, 2018; Macaro, 2008; Graham, 2002; Klapper,

2003; etc.).