Flipping large university courses: medium-term effects of active learning


In a flipped learning setting, most content delivery takes place outside the classroom, and class time is instead used to engage students in collaborative, hands-on activities. Over the past decades, this pedagogical approach has gained considerable popularity, and a large body of research supports its benefits. Implementing flipped learning, however, is not straightforward: it depends on many factors related to the local learning and teaching culture, the existing assessment regulations, the curricular boundary conditions and, most importantly, scalability. Flipping a class of 30 students may be feasible, but flipping a lecture with 300 students is considerably more challenging and may require substantial investments, such as room reconfiguration and additional teaching staff. Before a department or university adopts flipped learning in a given local context, it needs to identify the likely benefits and drawbacks. For this reason, we conducted a pilot study in a physics lecture class of 370 students at a major Swiss research university.


During the spring semester of 2017, we divided an undergraduate cohort of non-physics students into two parallel teaching settings, one focusing on skill development (SCALE-UP) and one focusing on content delivery (LECTURE).

In the SCALE-UP setting, students had to prepare the content prior to coming to class (flipped classroom).

Photos of the lecture hall and of the SCALE-UP classroom


In order to conduct a comparative study of the two different pedagogical settings, we recorded the performance of the complete student cohort (both SCALE-UP and LECTURE) at two different points:

  • Physics mid-term exam: in the 10th week of the intervention
  • Physics final exam: 8 months after the intervention

The physics mid-term and final exams included both conceptual and numerical questions. In the mid-term exam, conceptual multiple-choice questions accounted for 50% of the points; in the final exam, they accounted for 40%. We were therefore able to split the overall achievement into conceptual and numerical performance components. Conceptual questions assess students' understanding of the underlying phenomena rather than the application of the physics material within a mathematical framework. Our study thus allows us to distinguish clearly between conceptual understanding and its numerical transfer.

Furthermore, the physics final exam was split into two parts: one (Phys1) covering the topics that were introduced during the flipped classroom intervention in spring, and another (Phys2) covering the topics that were taught in autumn without a parallel setting. This distinction allows us to draw conclusions about longitudinal effects (Phys1) and about how well the learning achievements of the flipped class transfer to new topics (Phys2).

Throughout the performance analysis, we consider only students who took part in all assessments. This reduced the analyzed population to 35 students in the SCALE-UP setting and 133 students in the LECTURE setting. The data are still sufficient to run statistical tests, even though the design is unbalanced.
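As an illustration of this complete-case selection, here is a minimal Python sketch. The record layout, the assessment names (`midterm`, `phys1`, `phys2`) and the scores are hypothetical and made up for the example; they are not taken from the study data.

```python
# Hypothetical assessment names; the text only states that students
# had to take part in all assessments to be included.
REQUIRED = ("midterm", "phys1", "phys2")


def complete_cases(records):
    """Keep only students who have a score for every required assessment."""
    return [
        r for r in records
        if all(r["scores"].get(name) is not None for name in REQUIRED)
    ]


def group_sizes(records):
    """Count the remaining students per teaching setting."""
    sizes = {}
    for r in records:
        sizes[r["setting"]] = sizes.get(r["setting"], 0) + 1
    return sizes


# Made-up example records (not real data).
records = [
    {"setting": "SCALE-UP", "scores": {"midterm": 72, "phys1": 65, "phys2": 70}},
    {"setting": "SCALE-UP", "scores": {"midterm": 80, "phys1": None, "phys2": 75}},
    {"setting": "LECTURE", "scores": {"midterm": 60, "phys1": 58, "phys2": 62}},
    {"setting": "LECTURE", "scores": {"midterm": 55, "phys2": 50}},
]

kept = complete_cases(records)
sizes = group_sizes(kept)
```

Dropping incomplete records in this way typically leaves groups of unequal size, which is the unbalanced design mentioned above.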

Performance Results


Performance gains of the SCALE-UP students: To compare the performance of students in the SCALE-UP setting with that of students in the LECTURE setting, we conducted a series of independent t-tests. The gain is the difference in the means, G = M(SCALE-UP) – M(LECTURE).
Error bars correspond to the 95% confidence intervals. An effect size of d = 0.2 is considered small, d = 0.5 medium and d = 0.8 large.
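The quantities named in this caption can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis code of the study: the scores are made up, Cohen's d is computed with the pooled standard deviation, and the test statistic uses Welch's unequal-variance form, which is one common choice for unbalanced groups. The text does not say which variant of the independent t-test was used.

```python
import math
import statistics


def gain_and_effect_size(group_a, group_b):
    """Return (gain, cohens_d, welch_t) for two independent samples."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # Sample variances (n - 1 in the denominator).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    # Gain as defined in the caption: G = M(SCALE-UP) - M(LECTURE).
    gain = mean_a - mean_b

    # Cohen's d with the pooled standard deviation (assumed variant).
    pooled_sd = math.sqrt(
        ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    )
    cohens_d = gain / pooled_sd

    # Welch's t statistic (assumed variant; does not require equal variances).
    welch_t = gain / math.sqrt(var_a / n_a + var_b / n_b)
    return gain, cohens_d, welch_t


# Made-up exam scores in percent (not real data).
scale_up = [70, 75, 80, 85, 90]
lecture = [60, 65, 70, 75, 80, 85]

gain, cohens_d, welch_t = gain_and_effect_size(scale_up, lecture)
```

A p-value and a 95% confidence interval would additionally require the t distribution, which the Python standard library does not provide; a statistics package would be needed for that step.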


Medium-term performance effects: We can directly compare the performance recorded in the mid-term exam with the performance in Phys1 by running a series of dependent t-tests. The mean difference is calculated as M(Phys1) – M(Midterm).
Error bars correspond to the 95% confidence intervals. An effect size of d = 0.2 is considered small, d = 0.5 medium and d = 0.8 large.
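A dependent (paired) t-test works on the per-student differences between the two exams. The following Python sketch shows the idea with made-up scores. The effect size is computed as the mean difference divided by the standard deviation of the differences, which is an assumption; the text does not state which definition of d was used for the dependent tests.

```python
import math
import statistics


def paired_comparison(midterm, phys1):
    """Return (mean_diff, t_stat, effect_size) for paired scores."""
    if len(midterm) != len(phys1):
        raise ValueError("paired samples must have the same length")

    # Per-student difference, matching M(Phys1) - M(Midterm).
    diffs = [later - earlier for earlier, later in zip(midterm, phys1)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation

    # Paired t statistic with n - 1 degrees of freedom.
    t_stat = mean_diff / (sd_diff / math.sqrt(n))

    # Effect size on the differences (assumed definition).
    effect_size = mean_diff / sd_diff
    return mean_diff, t_stat, effect_size


# Made-up scores for the same four students (not real data).
midterm_scores = [60, 70, 80, 90]
phys1_scores = [55, 68, 70, 87]

mean_diff, t_stat, effect_size = paired_comparison(midterm_scores, phys1_scores)
```

A negative mean difference in this sketch means that the same students scored lower in Phys1 than in the mid-term exam.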

  • During the intervention period, students from the flipped SCALE-UP group outperformed students from the LECTURE setting. This performance gain, however, was substantially reduced when evaluated over the medium term.
  • For those students who participated in the 14-week flipped SCALE-UP group, we could not identify any transfer or modification of learning behavior that would induce better performance outside of a dedicated flipped learning setting.


  • A single active learning intervention of one semester (14 weeks) is too short to sustain substantial performance gains.
  • Even though students enjoyed the flipped class very much, their performance gains were much lower than those reported in the (mainly U.S.) literature.
  • Curricular constraints such as contact hours and assessment conditions should be considered and adapted when shifting to a flipped class setting.

The full paper, including further results, was presented at EDULEARN18.
