PGCert in Academic Practice Essays

The abstract and two essays below formed part of my submission for a PGCert in Academic Practice from the University of Exeter in 2023. If desired, please reach out for a .pdf version of the content, which may be easier to read.


Abstract

Curriculum design forms the basis for both essays, with each taking a different approach to the integration of critical topics, in accordance with their individual nuances. The first tackles the role of theoretical Computer Science within the Data Science curriculum. This essay argues for changes to the current Data Science courses at the University of Exeter to improve long-term graduate outcomes and the standing of the future alum community, aiming to provide a course that produces the industry leaders of tomorrow. The second essay discusses integrating the climate crisis into Computer and Data Science education. I discuss the lessons learned and provide recommendations for future education practitioners on the best approach to integrating climate crisis education into a curriculum where it would not usually be tackled. This supports the University of Exeter’s strategic goal of integrating the climate crisis into all curricula across the university.


Essay 1 – Integrating Theoretical Computer Science into the Data Science Curriculum

Computer Science is a rapidly growing field. Data Science more so. Computer and Data Science have considerable overlap in the skills needed to succeed within the fields. Data Science mixes Mathematics, Domain Knowledge, and Computer Science [1]. The rapid expansion of these fields within recent years has blurred the line on what exactly is within the remit of Data Science, Computer Science, or both. At first, what should be taken from Computer Science education and included within Data Science might appear clear; to date, Data Science education has focused primarily on teaching the practical aspects of Computer Science [2]. However, this essay argues that crucial components within theoretical Computer Science have yet to be included in the standard Data Science curriculum and presents an argument for their inclusion.

 

Even though Computer Science is a mature field, it is still growing considerably, with projected employment growth of 15% from 2021 to 2031 [3]. However, this pales in comparison to Data Science, which has projected employment growth of 36% [4]. Given this growth, the supporting infrastructure must also expand to ensure the skillsets needed by industry are present in the workforce.


The scope of Data Science courses available across the higher education sector is vast, ranging from comprehensive Data Science BSc courses [5], to specialised conversion courses for other STEM disciplines at the MSc level [6], to advanced PhD-level courses [7]. Interestingly for Data Science, online asynchronous courses provide a particularly robust method of upskilling within the discipline [8]. Data Science has made extensive use of this form of teaching, most likely due to the mismatch between the number of individuals wishing to study Data Science and the number of individuals available to train others [9]. This mismatch is indicative of the broader problem in Data Science education: the speed at which the field has expanded has resulted in only the most pressing issues being tackled. Organisations such as the ACM (Association for Computing Machinery, an international learned society for computing) have established a task force to align the range of different Data Science curricula [10], since there is still considerable heterogeneity in course content, despite nearly every university offering a Data Science course [11]. This has manifested in curriculum design considering only the most overt Computer Science concepts, such as programming, rather than crucial but less visible elements, such as computationally efficient programming. More subtle topics within theoretical Computer Science that are not acknowledged by Data Science practitioners can have delayed, indirect consequences on a Data Science project’s implementation. For example, computationally inefficient code will only manifest problems when the dataset becomes sufficiently large. While this may not be encountered in all cases, when it does occur, work may cease entirely, with a complete rewrite of the code potentially required before the analysis can be completed.
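As a minimal sketch of how such a delayed problem can arise, assume a hypothetical task of checking a dataset for duplicate records. The function names below are illustrative only and are not taken from any course material. Both functions return the same answer, but the first compares every pair of records, while the second passes over the data once and remembers what it has seen, at the cost of some additional memory.

```python
def has_duplicates_pairwise(values):
    """Compare every pair of values; the work grows with the square of the input size."""
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False


def has_duplicates_single_pass(values):
    """Remember values already seen in a set; one pass over the data.

    Assumes the values are hashable (e.g. numbers or strings).
    """
    seen = set()
    for value in values:
        if value in seen:
            return True
        seen.add(value)
    return False


def worst_case_pairwise_comparisons(n):
    """Comparisons made by the pairwise version when no duplicates exist."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # The gap is invisible on small inputs and prohibitive on large ones.
    for n in (1_000, 1_000_000):
        print(f"n = {n}: up to {worst_case_pairwise_comparisons(n)} comparisons pairwise, "
              f"about {n} set lookups single pass")
```

On a thousand records, either version finishes almost instantly, so the difference is unlikely to be noticed in a teaching setting. On a million records, the pairwise version may need roughly half a trillion comparisons, which is the point at which a rewrite becomes unavoidable.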


The insidious issue presented by this oversight within Data Science education is that the problems only become evident later in an individual’s Data Science career, when the problem being solved requires sizeable datasets. This may be years after their education, at which point the individual may not possess the skills required to succeed. While some courses attempt to include big data analysis in their content, this still takes place within a controlled and preplanned environment due to the prohibitively expensive nature of the data and computation. Therefore, students do not experience the problem as they would in the real world, where they must navigate it for themselves and may require some insight into subtle theoretical Computer Science concepts to overcome it. By tackling this issue from a theoretical standpoint, it is possible to circumvent both the data and computation costs associated with practical big data courses.


While there are many critical theoretical Computer Science topics, the two fundamental concepts are time and space complexity. Time complexity is a form of computational complexity that denotes the amount of computer time a given algorithm takes to run. The method used to describe the time complexity of an algorithm involves a theoretical machine, the Random Access Machine (RAM). The RAM is defined as taking a constant amount of time to run an elementary operation, such as addition or subtraction. Because running time is counted in elementary operations rather than seconds, this method allows algorithms to be compared independently of the computational power of the machine they run on, such as a computer from the 1970s and a laptop from the 2010s.
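To make the idea of counting elementary operations concrete, the sketch below sums a list while keeping a tally of the operations performed. It is a simplification under stated assumptions: only the initial assignment and each addition are counted as unit-cost operations, and the function name is hypothetical.

```python
def sum_with_operation_count(values):
    """Sum a list and count unit-cost operations under a simplified RAM model.

    Assumption: only the initial assignment and each addition are counted;
    loop bookkeeping is ignored to keep the sketch short.
    """
    total = 0
    operations = 1  # the initial assignment
    for value in values:
        total += value
        operations += 1  # one addition per element
    return total, operations


if __name__ == "__main__":
    total, operations = sum_with_operation_count([5, 1, 4])
    print(total, operations)
```

Under these assumptions, an input of n values costs n + 1 operations on any machine, which is what allows the count to be written as a function of n alone.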

 

Take the example below, where T(n) is the computational time for the algorithm given an input of size n, in the form of a function:

  • Algorithm one has a time complexity of T(n) = 4n + 3.
  • Algorithm two has a time complexity of T(n) = 10n.

If the algorithms were to run on data of size n = 10, then algorithm one would take 43 (4*10 + 3) units to run, and algorithm two would take 100 (10*10) units to run. We could then determine that, in this circumstance, algorithm one is the faster of the two for data of size 10. Space complexity is the same concept as time complexity but is applied to memory consumption.
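The arithmetic in this example can be checked with a short Python sketch. The two functions simply encode the cost functions stated above; their names, and the larger input sizes tried, are illustrative assumptions.

```python
def cost_algorithm_one(n):
    """Cost function of algorithm one from the example: T(n) = 4n + 3."""
    return 4 * n + 3


def cost_algorithm_two(n):
    """Cost function of algorithm two from the example: T(n) = 10n."""
    return 10 * n


if __name__ == "__main__":
    # n = 10 is the case worked through in the text; the others are extra.
    for n in (10, 1_000, 1_000_000):
        one, two = cost_algorithm_one(n), cost_algorithm_two(n)
        faster = "one" if one < two else "two"
        print(f"n = {n}: algorithm one = {one}, algorithm two = {two}, faster = {faster}")
```

For n = 10 this reproduces the 43 and 100 units given above. Since 4n + 3 is smaller than 10n for every whole number n of at least 1, algorithm one remains the cheaper of the two at any realistic data size.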

Inefficient code has a range of consequences. When running analysis on cutting-edge problems, it is possible for the analysis to take tens of thousands of computation hours to run [12], even with efficient code. Inefficient code can make some experiments impossible to run, as their running times become prohibitively long. This issue worsens annually as society produces ever more data and the datasets to be analysed continue to grow [13]. The current approach of applying more computing power to the problem may no longer be possible due to limitations in pursuing Moore’s law [14]. Moore’s law describes the rate at which computational power grows, and this rate is currently slowing. The combination of data growth and the slowdown in computational power results in a requirement to make better use of currently available resources by producing computationally efficient code.


Across a range of different Data Science curricula, there is an evident lack of time and space complexity and of broader theoretical Computer Science elements [2, 10, 15]. I saw these issues firsthand in Data Science education when I transitioned from Computer Science and Software Engineering to Environmental Data Science research for my PhD. Big data analytics has been a common theme in my research due to the high spatial and temporal resolution of the datasets commonly used in Climate Science, which can span decades and cover the entire globe. The PhD programme I am a part of is an interdisciplinary Centre for Doctoral Training (CDT) [16], where researchers from various backgrounds, ranging from traditional STEM disciplines to the Humanities and Social Sciences, use Data Science to tackle pressing issues within the environmental domain. The concepts I find most beneficial within my research were taught during my undergraduate Computer Science education; however, they were not covered during the Data Science postgraduate education I received at Exeter. These gaps are highlighted when working in interdisciplinary teams, as some problems are considerably easier to tackle for someone with knowledge of these fundamental theoretical Computer Science concepts. This disparity has become more evident as the problems my colleagues and I engage with have become more complex over the course of the PhD.


This issue parallels the earlier discussion, where theoretical Computer Science’s importance only becomes evident later in a career. Due to the extent of the problem, I developed and delivered a bespoke remedial course for the CDT students at the request of the CDT management. Being in a position where bespoke remedial courses can be offered is unusual and unlikely for most graduates, highlighting the need for this content to be included in the core Data Science curriculum.


The first step in designing the content of any course is having a clear understanding of its goals and objectives. In its current form, there is an argument that a key goal of Data Science education is for students to secure a full-time position in industry. This goal is evident in the number of statistics on course pages regarding the employment prospects of alums, with the University of Exeter being no exception, as evidenced by its Data Science course page [5]. This is further incentivised by the university’s desire to be high in rankings, which are partly based on the course’s graduate prospects [17]. The goal is, therefore, to prepare students to excel in interviews and secure employment, helping to bolster statistics that get fed into the rankings of the university and course. Including time and space complexity within the Data Science curriculum does not actively contribute to pursuing this goal, since this knowledge is unlikely to improve the outcome of initial employment seeking. Namely, no easy employment screening test, such as an interview or take-home coding exercise, can critically assess a candidate’s prowess in the concepts of time and space complexity concerning a novel real-world problem. However, the importance of these concepts to furthering a career cannot be overstated. These concepts will allow alums of courses to tackle the most significant problems within Data Science and become industry leaders rather than encountering a roadblock at the midpoint in their careers when they begin to encounter the problems discussed, as happened in the Environmental Intelligence CDT. Industry leaders are of pivotal importance to a given university course, considerably impacting perceived prestige and reputation. For example, PPE at Oxford garners considerable media attention given that numerous industry leaders graduated from the same course [18], differentiating PPE at Oxford from other similar university courses. Including these topics would not necessarily improve the graduate outcomes in the short term. However, it would help to cultivate graduates able to tackle real-world problems, empowering them to become industry leaders and notable alums.


A clear road map could be followed for integrating theoretical Computer Science content into the curriculum of Data Science courses. In the case of Exeter, the willingness to teach the skills used to solve these problems has already been shown by the offering of the module ECMM461 High-Performance Computing [19]. However, two issues are present with the current framework. Firstly, it is an optional module, which does not align with an industry where big data and high-performance analytics are becoming the norm [20]. Secondly, the course content focuses on the problem’s high-level practical elements rather than the low-level theoretical aspects that this essay argues are critical. The first and simplest step in tackling this is to make ECMM461 High-Performance Computing a compulsory module, signalling the importance of these areas of study to students. A more desirable approach, however, would be to rename High-Performance Computing to Applied High-Performance Computing, keeping it as an optional module, and to introduce a compulsory module that teaches the fundamental theoretical aspects of High-Performance Computing. The new module could then cover topics such as time and space complexity. A starting point for this new module, which could be named Foundations of Efficient Computing, could be the remedial course given to CDT students, leading into the Applied High-Performance Computing module mentioned above. This new teaching framework is reflected in the name change, with Applied High-Performance Computing no longer focusing on delivering content through lectures but moving to project-based learning. Project-based learning has a range of pedagogical benefits [21] and aligns more closely with the experiences of the industry workforce, where no two high-performance computing problems are the same. The main barrier to implementation is the computing resources for the applied elements. However, strides are being made at the university level [22] to provision these resources, while stopgap resources from the private sector are available [23]. A further recommendation is to include the time and space complexity of practical aspects of the course at all stages. No matter the particular focus of a module, the time and space complexity of the code being discussed should be mentioned. While there is no need to assess it within the module content, the simple discussion of computational complexity across the course reflects an industry paradigm where the concept is rarely directly discussed but is always present and influencing decisions. A further problem is the lack of instructor availability, but developing an open-source educational resource would help solve this issue. With this in mind, I have made the course I designed for the CDT available online [24].


The recommendations discussed in this essay would, I believe, help to foster industry leaders through practical experience in solving real-world problems, preparing them for any challenges they may face in industry and helping to bolster the standing of the Exeter alum community within the field of Data Science. They would also ensure that Exeter is a sector leader in its Data Science course offering, helping to establish the standard for what a Data Science curriculum should look like, a discussion that is still ongoing across the field.


Essay 2 – Integrating the Climate Crisis into Computer and Data Science Education

The climate crisis has quickly become a part of everyday life, with its role in everyone’s life only set to increase no matter where on the spectrum a person’s views on climate change lie. Across the UK Higher Education sector, calls for the climate crisis to be integrated into the curricula are emerging [25], with the University of Exeter being no exception, declaring an Environment and Climate Emergency [26]. The university has expressed a desire to incorporate the climate crisis into the university curricula as a whole, rather than only those classically at the forefront of the climate crisis, such as the physical sciences [27].


Within the academic year of 2022/23, this was felt in both the Computer and Data Science courses, with changes to two project-based modules to reflect the climate crisis: COM1012 Data Science Group Project and ECM2434 Group Software Engineering Project. For COM1012, I was the sole module lead, and for ECM2434, I was a co-module lead, leading the module’s sustainability theme. This essay presents my experience of using project-based learning [21] and research-focused learning [28] to integrate the climate crisis into Data and Computer Science education in these two modules. It examines the lessons learned and provides my recommendations for other educational practitioners wishing to do the same in their courses. The Data Science module COM1012 focused on providing insight and intelligence to tackle the climate crisis by providing evidence for policy interventions. The Computer Science module ECM2434 focused on user participation in a student-produced app to help drive sustainable actions. Both module framings support the university’s strategic aims and its role in the climate crisis [29].


The approach taken in the COM1012 Data Science Group Project was to base the problem definition on the United Nations (UN) Sustainable Development Goals (SDGs) [30], a set of goals the UN set for the world to work towards for a more just and prosperous future. The set-up allowed students to choose a UN SDG they felt was important to them and their group, and then to design and conduct a semester-long project tackling that goal. The purpose of this was to allow the students to pick something they were personally interested in while tackling a problem of worldwide importance using project-based learning. The intention was to encourage the students to engage with the content critically rather than being directly presented with a narrative of the current situation, making use of experiential learning [31]. The approach taken within ECM2434 Group Software Engineering Project was for students to take a more independent approach to the given problem. The assignment brief was for the students to create a game that promoted sustainable actions on campus through gamification, directly related to the sustainability goals set out by the University of Exeter [32]. As it was project-based, the students proposed the sustainable actions rather than me prescribing them, leaning again on experiential learning.


The most critical insight from my experience integrating the climate crisis into the curriculum was the importance of aligning interests. My initial approach assumed that students would care about sustainability for the sake of sustainability. While it is true that some students cared about sustainability outright, they were not the majority and, broadly, were not the students who needed convincing of the importance of the climate crisis. Instead, the students within the course exhibited a range of perspectives, aligning with results from studies on climate change sentiments [33]. I found that the best way to engage students on the importance of the climate crisis was to look at the problem through their lens and attempt to align with their interests. As the climate crisis is such a complex and wicked problem, it affects many facets of life, from wildlife welfare and the ecological crisis [34], to inequality and the cost of living crisis [35], alongside the more commonly discussed aspects such as extreme weather [36]. Even further removed from these problems, some students discussed being more environmentally conscious in the hope of improving customer engagement with their app. This mirrors research being done in the business disciplines concerning sustainability and customer engagement [37]. These experiences highlighted to me the range of interests that any individual could take in the climate crisis, and the ability to leverage those interests to foster engagement with the topic.


Integrating the climate crisis into university education is essential to ensure students are exposed to various viewpoints on such an important topic; this ensures that echo chambers are not formed, as is common in other mediums when discussing climate change [38]. Echo chambers can limit a person’s perspective on an issue, leading to radicalisation of the individual’s view on a particular problem and causing them to become more insular and inward-focused [39], the complete opposite of the University of Exeter’s goal of developing global citizens. The university curriculum has the potential to foster critical skills in its students, helping them to discuss and tackle the climate crisis, whatever their perspective, an opportunity not present in many forums.

In summary, COM1012 focused on indirectly tackling the climate crisis, and ECM2434 focused on direct action. The integration of the climate crisis into both curricula was a success. In the case of COM1012, the student evaluation for the module improved considerably from the 2021/22 academic year. The previous iteration of the module focused on a contrived problem set by the module leader; its overall score was 3.6/5, with 3.7/5 achieved for the particular question “The module was interesting and intellectually challenging”. In the 2022/23 academic year, the module achieved an overall score of 4.4/5 and 4.5/5 for the same question, an increase of 0.8 in both scores. The improvement in module feedback highlights that student engagement with, and reception of, the content improved when the module was structured around the climate crisis. In ECM2434, the sustainability aspect of the module took a lesser role but was still the domain in which the software engineering was applied. It was well received, with the module being recommended on a university-wide initiative called “Responsible Futures”, a university-supported programme forming part of the transformative education initiative.


Across all sectors, there is a growing discussion of the climate crisis, with more businesses than ever helping to tackle it [40], more research than ever going into it [41], and policy interventions being designed and implemented that affect our personal lives, such as the London Ultra Low Emission Zone (ULEZ) [42]. Integrating the climate crisis into the Computer and Data Science curriculum reflects this importance. At Exeter, the integration of the climate crisis has been a success for the students and the staff partaking in the modules. The outcome of the modules described has helped foster critical thinking and collaboration between student teams to solve real-world problems, helping to prepare them for the world they will soon be entering.

 

References

[1] W. Finzer, “The data science education dilemma,” Technology Innovations in Statistics Education, vol. 7, no. 2, 2013.

[2] R. D. De Veaux, M. Agarwal, M. Averett, B. S. Baumer, A. Bray, T. C. Bressoud, L. Bryant, L. Z. Cheng, A. Francis, R. Gould et al., “Curriculum guidelines for undergraduate programs in data science,” Annual Review of Statistics and Its Application, vol. 4, pp. 15–30, 2017.

[3] US Bureau of Labor Statistics, “Computer and Information Technology Occupations: Occupational Outlook Handbook [Accessed 31/05/2023],” Sep 2022. [Online]. Available: https://www.bls.gov/ooh/computer-and-information-technology/home.htm

[4] ——, “Data scientists: Occupational outlook handbook [Accessed 31/05/2023],” May 2023. [Online]. Available: https://www.bls.gov/ooh/math/data-scientists.htm

[5] University of Exeter, “Data Science BSc Undergraduate Study University of Exeter [Accessed 31/05/2023].” [Online]. Available: https://www.exeter.ac.uk/study/undergraduate/courses/datascience/databsc/

[6] Office for Students, “Postgraduate conversion courses in Data Science and Artificial Intelligence [Accessed 31/05/2023],” Aug 2019. [Online]. Available: https://www.officeforstudents.org.uk/advice-and-guidance/skills-and-employment/postgraduate-conversion-courses-in-data-science-and-artificial-intelligence/

[7] UKRI, “How we work in artificial intelligence – UKRI artificial intelligence Centres for Doctoral Training [Accessed 31/05/2023].” [Online]. Available: https://www.ukri.org/what-we-offer/how-we-work-in-ai/ukri-artificial-intelligence-centres-for-doctoral-training/

[8] C. Romero and S. Ventura, “Educational data science in massive open online courses,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 7, no. 1, p. e1187, 2017.

[9] M. Almgerbi, A. De Mauro, A. Kahlawi, and V. Poggioni, “A systematic review of data analytics job requirements and online-courses,” Journal of Computer Information Systems, vol. 62, no. 2, pp. 422–434, 2022.

[10] I. Bile Hassan, T. Ghanem, D. Jacobson, S. Jin, K. Johnson, D. Sulieman, and W. Wei, “Data science curriculum design: A case study,” in Proceedings of the 52nd ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’21. New York, NY, USA: Association for Computing Machinery, 2021, pp. 529–534. [Online]. Available: https://doi.org/10.1145/3408877.3432443

[11] D. Donoho, “50 years of Data Science,” 2015. [Online]. Available: http://courses.csail.mit.edu/18.337/2015

[12] R. Schwartz, J. Dodge, N. A. Smith, and O. Etzioni, “Green AI,” Communications of the ACM, vol. 63, no. 12, pp. 54–63, 2020.

[13] B. Berisha, E. Mëziu, and I. Shabani, “Big data analytics in Cloud computing: an overview,” Journal of Cloud Computing, vol. 11, no. 1, p. 24, 2022.

[14] T. N. Theis and H.-S. P. Wong, “The end of Moore’s law: A new beginning for information technology,” Computing in Science and Engineering, vol. 19, no. 2, pp. 41–50, 2017.

[15] P. Anderson, J. Bowring, R. McCauley, G. Pothering, and C. Starr, “An undergraduate degree in data science: Curriculum and a decade of implementation experience,” in Proceedings of the 45th ACM Technical Symposium on Computer Science Education, ser. SIGCSE ’14. New York, NY, USA: Association for Computing Machinery, 2014, pp. 145–150. [Online]. Available: https://doi.org/10.1145/2538862.2538936

[16] University of Exeter, “PhD Environmental Intelligence Postgraduate Study – PhD and Research Degrees University of Exeter [Accessed 31/05/2023].” [Online]. Available: https://www.exeter.ac.uk/study/pg-research/degrees/climatechange/environmental-intelligence/

[17] J. Johnes, “University rankings: What do they really show?” Scientometrics, vol. 115, no. 1, pp. 585–606, 2018.

[18] A. Beckett, “PPE: The Oxford degree that runs Britain [Accessed 31/05/2023],” Feb 2017. [Online]. Available: https://www.theguardian.com/education/2017/feb/23/ppe-oxford-university-degree-that-rules-britain

[19] University of Exeter, “High Performance Computing Module Description Page [Accessed 31/05/2023].” [Online]. Available: http://emps.exeter.ac.uk/modules/ECMM461

[20] I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani, and S. U. Khan, “The rise of “big data” on cloud computing: Review and open research issues,” Information systems, vol. 47, pp. 98–115, 2015.

[21] D. Kokotsaki, V. Menzies, and A. Wiggins, “Project-based learning: A review of the literature,” Improving schools, vol. 19, no. 3, pp. 267–277, 2016.

[22] HPC-UK, “HPC Facilities – HPC facilities available to UK users [Accessed 31/05/2023].” [Online]. Available: https://www.hpc-uk.ac.uk/facilities/

[23] P. Psilo, “AWS Educate – Impact Stories [Accessed 31/05/2023].” [Online]. Available: https://pages.awscloud.com/rs/112-TZM-766/images/AWS%20Educate%20Impact%20Stories.pdf

[24] L. Berrisford, “Berrli/conceptsforefficientprogramming: Repo for the resources used to deliver the “Concepts for efficient programming” course to EI CDT students in 2023. [Accessed 31/05/2023].” [Online]. Available: https://github.com/berrli/ConceptsForEfficientProgramming

[25] B. Latter and S. Capstick, “Climate emergency: UK universities’ declarations and their role in responding to climate change,” Frontiers in Sustainability, vol. 2, p. 660596, 2021.

[26] University of Exeter, “Featured news – University declares an environment and climate emergency – University of Exeter [Accessed 31/05/2023],” May 2019. [Online]. Available: https://news-archive.exeter.ac.uk/featurednews/title_717135_en.html

[27] ——, “GreenFutures [Accessed 31/05/2023],” Mar 2023. [Online]. Available: https://greenfutures.exeter.ac.uk/

[28] M. Healey and A. Jenkins, Developing undergraduate research and inquiry. Higher Education Academy York, 2009.

[29] University of Exeter, “Transformative education – Sustainability [Accessed 31/05/2023].” [Online]. Available: https://www.exeter.ac.uk/about/vision/successforall/transformativeeducation/#a2

[30] United Nations, Department of Economic and Social Affairs – Sustainable Development, “Transforming our world: the 2030 agenda for sustainable development,” p. 16301, 2015. [Online]. Available: https://sdgs.un.org/2030agenda

[31] J. A. Moon, A handbook of reflective and experiential learning: Theory and practice. Routledge, 2013.

[32] University of Exeter, “Sustainability At the University of Exeter [Accessed 31/05/2023].” [Online]. Available: https://www.exeter.ac.uk/about/sustainability/

[33] B. Dahal, S. A. Kumar, and Z. Li, “Topic modeling and sentiment analysis of global climate change tweets,” Social network analysis and mining, vol. 9, pp. 1–20, 2019.

[34] B. O’Connor, S. Bojinski, C. Röösli, and M. E. Schaepman, “Monitoring global changes in biodiversity and climate essential as ecological crisis intensifies,” Ecological Informatics, vol. 55, p. 101033, 2020.

[35] N. Islam and J. Winkel, “Climate change and social inequality,” 2017.

[36] P. Stott, “How climate change affects extreme weather events,” Science, vol. 352, no. 6293, pp. 1517–1518, 2016.

[37] X. Chen, X. Sun, D. Yan, and D. Wen, “Perceived sustainability and customer engagement in the online shopping environment: The rational and emotional perspectives,” Sustainability, vol. 12, no. 7, p. 2674, 2020.

[38] H. T. Williams, J. R. McMurray, T. Kurz, and F. H. Lambert, “Network analysis reveals open forums and echo chambers in social media discussions of climate change,” Global environmental change, vol. 32, pp. 126–138, 2015.

[39] K. O’Hara and D. Stevens, “Echo chambers and online radicalism: Assessing the internet’s complicity in violent extremism,” Policy & Internet, vol. 7, no. 4, pp. 401–422, 2015.

[40] K. Yanosek and D. G. Victor, “How big business is taking the lead on Climate Change [Accessed 31/05/2023],” 2022. [Online]. Available: https://www.mckinsey.com/capabilities/sustainability/our-insights/sustainability-blog/how-big-business-is-taking-the-lead-on-climate-change

[41] UKRI, “Responding to climate change [Accessed 31/05/2023].” [Online]. Available: https://www.ukri.org/news-and-events/responding-to-climate-change/

[42] Transport for London, “Ultra Low Emission Zone [Accessed 31/05/2023].” [Online]. Available: https://tfl.gov.uk/modes/driving/ultra-low-emission-zone