Artificial Intelligence (AI) is the term on everyone’s lips right now, whether it’s an entrepreneur looking to cash in on their next innovation or an analyst working to determine its potential future impacts on humanity. The launch of ChatGPT in late 2022 brought with it a mix of reactions, ranging from excitement to trepidation and even panic in some quarters.

The education sector is no different. While excited about opportunities to increase efficiency, students and academics are also proceeding with caution amid concerns about academic integrity. At the start of 2023, the authors of a new paper were flooded with communication related to ChatGPT, from emails from university leaders to invitations to attend information sessions. Yet the sessions included very little information on the known impact of ChatGPT on current assessment practices, and the advice was generally the same: “proceed with caution.”

A new paper co-authored by Dr Peter Neal and Dr Sarah Grundy of the UNSW School of Chemical Engineering, along with seven other authors from six other Australian universities, sheds more light on ChatGPT by examining how it may affect assessment in engineering education specifically. By exploring ChatGPT’s responses to existing assessment prompts from ten engineering courses across seven Australian universities, the paper exposes the strengths and weaknesses of current assessment practice and discusses opportunities for ChatGPT to facilitate learning.

Could ChatGPT get a university engineering degree?

Recognising that ChatGPT is continually advancing and evolving, the paper aimed to establish a ‘benchmark of ChatGPT’s performance in a diverse range of assessment tasks during the first quarter of 2023’. And acknowledging that uptake of ChatGPT and AI will only continue to grow, the study also explored potential changes to engineering assessment processes that could help those affected embrace, rather than renounce, the technology.

The different assessment categories explored included online quizzes, numerical assignments and exams, code submission, oral assessments, visual assessments, and four different types of written assessments (experimentation-based, project-based, research-based, and reflective and critical thinking-based). The specific subjects included Maths; Engineering Physics; Introductory Programming; Manufacturing Technology; Engineering Laboratory; Sustainable Product Engineering & Design; Renewable Energy & Electrical Power; Sustainable, Environmental & Social Impacts of Technology; Workplace Practice & Communication; and Engineering Research.

The research found that ChatGPT did pass some courses and even excelled at some assessment types, particularly online quizzes, and exams whose weighting skews the risk/reward calculation towards cheating. ChatGPT failed three assessment types, and two were too close to call. In a course-by-course analysis, ChatGPT passed three courses and failed five, with two too close to call. With this in mind, the paper’s discussion further explored the types of assessment used in each subject, and a series of recommendations is provided.

Are changes needed to engineering assessments?

The paper’s findings suggest that changes to current practice are needed. And as future versions of these tools are trained on larger datasets and become more capable, the need for change will only become more urgent.

At present, Dr Sarah Grundy advises teaching colleagues to “use ChatGPT as a student learning resource, informing students of the pros and cons, in particular the need to do their own research and validate the responses.”

When discussing ChatGPT with students, Dr Grundy often quotes colleague and Deputy Head of the School of Chemical Engineering Associate Professor Pierre Le-Clech, who leaves students with the following advice:

Treat ChatGPT as that sixth unreliable member of your team.
A/Prof Pierre Le-Clech
Deputy Head of the School of Chemical Engineering

ChatGPT versus engineering education assessment: A multidisciplinary and multi-institutional benchmarking and analysis of this generative artificial intelligence tool to investigate assessment integrity. By Sasha Nikolic (UOW), Scott Daniel (UTS), Rezwanul Haque (USC), Marina Belkina (WSU), Ghulam M. Hassan (UWA), Sarah Grundy (UNSW), Sarah Lyden (UTAS), Peter Neal (UNSW) & Caz Sandison (UOW)