What is doxing, and how can you protect yourself?

The blog group loved this article from The Conversation, originally published on 13 February 2024. We hope you do too!

Rob Cover, RMIT University

The Australian government has brought forward plans to criminalise doxing, bringing nationwide attention to the harms of releasing people’s private information to the wider public.

The government response comes after the public release of almost 600 names and private chat logs of a WhatsApp group of Australian Jewish creative artists discussing the Israel-Hamas war.

As a result, some of the people whose details were leaked claim they were harassed, received death threats and even had to go into hiding.

While we wait for new penalties for doxers under the federal Privacy Act review, understanding doxing and its harms can help. And there are also steps we can all take to minimise the risk.

What is doxing?

Doxing (or doxxing) is releasing someone’s private information — or “docs”, short for documents — online to the wider public without their consent. This includes information that may put people at risk of harm, especially names, addresses, employment details, medical or financial records, and names of family members.

The Australian government currently defines doxing as the “malicious release” of people’s private information without their consent.

Doxing began as a form of unmasking anonymous users, trolls and those using hate speech while hiding behind a pseudonym. Recently, it has become a weapon for online abuse, harassment, hate speech and adversarial politics. It is often the outcome of online arguments or polarised public views.

It is also becoming more common. Although there is no data for Australia yet, according to media company SafeHome.org, about 4% of Americans report having been doxed, with about half saying their private emails or home addresses have been made public.

Doxing is a crime in some countries such as the Netherlands and South Korea. In other places, including Australia, privacy laws haven’t yet caught up.

Why is doxing harmful?

In the context of the Israel-Hamas war, doxing has affected both Jewish and pro-Palestinian communities and activists in Australia and abroad.

Doxing is harmful because it treats a user as an object and takes away their agency to decide what, and how much, personal information they want shared with the wider public.

This puts people at very real risk of physical threats and violence, particularly when public disagreement becomes heated. From a broader perspective, doxing also damages the digital ecology, reducing people’s ability to freely participate in public or even private debate through social media.

Although doxing is sometimes just inconvenient, it is often used to publicly shame or humiliate someone for their private views. This can take a toll on a person’s mental health and wellbeing.

It can also affect a person’s employment, especially for people whose employers require them to keep their attitudes, politics, affiliations and views to themselves.

Studies have shown doxing particularly impacts women, including those using dating apps or experiencing family violence. In some cases, children and family members have been threatened because a high-profile relative has been doxed.

Doxing is also harmful because it oversimplifies a person’s affiliations or attitudes. For example, releasing the names of people who have joined a private online community to navigate complex views can represent them as only like-minded stereotypes or as participants in a group conspiracy.

A person using a laptop and smartphone simultaneously
There are steps you can take online to protect yourself from doxing without having to completely withdraw. Engin Akyurt/Pexels

What can you do to protect yourself from doxing?

Stronger laws and better platform intervention are necessary to reduce doxing. Some experts believe that the fear of punishment can help shape better online behaviours.

These punishments may include criminal penalties for perpetrators and deactivating the social media accounts of repeat offenders. But better education about the risks and harms is often the best remedy.

And you can also protect yourself without needing to entirely withdraw from social media:

  1. never share a home or workplace address, phone number or location, even within a private online group or forum of trusted people
  2. restrict your geo-location settings
  3. avoid giving details of workplaces, roles or employment on public sites not related to your work
  4. avoid adding friends or connections on social media services of people you do not know
  5. if you suspect you risk being doxed due to a heated online argument, temporarily shut down or lock any public profiles
  6. avoid making yourself a target by pursuing haters past a certain point. Professional and courteous engagement can help avoid angering those who disagree and might try to harm you.

Additionally, hosts of private online groups must be very vigilant about who joins a group. They should avoid the trap of accepting members just to increase the group’s size, and appropriately check new members (for example, with a short survey or key questions that keep out people who may be there to gather information for malicious purposes).

Employers who require their staff to have online profiles or engage with the public should provide information and strategies for doing so safely. They should also provide immediate support for staff who have been doxed.

Rob Cover, Professor of Digital Communication and Co-Director of the RMIT Digital Ethnography Research Centre, RMIT University

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Data poisoning: how artists are sabotaging AI to take revenge on image generators

Over the break we read and loved this article from The Conversation, originally published on 18 December 2023. We hope you do too!

T.J. Thomson, Author provided

T.J. Thomson, RMIT University and Daniel Angus, Queensland University of Technology

Imagine this. You need an image of a balloon for a work presentation and turn to a text-to-image generator, like Midjourney or DALL-E, to create a suitable image.

You enter the prompt “red balloon against a blue sky”, but the generator returns an image of an egg instead. You try again, but this time the generator shows an image of a watermelon.

What’s going on?

The generator you’re using may have been “poisoned”.

What is ‘data poisoning’?

Text-to-image generators work by being trained on large datasets that include millions or billions of images. Some generators, like those offered by Adobe or Getty, are only trained with images the generator’s maker owns or has a licence to use.

But other generators have been trained by indiscriminately scraping online images, many of which may be under copyright. This has led to a slew of copyright infringement cases where artists have accused big tech companies of stealing and profiting from their work.

This is also where the idea of “poison” comes in. Researchers who want to empower individual artists have recently created a tool named “Nightshade” to fight back against unauthorised image scraping.

The tool works by subtly altering an image’s pixels in a way that wreaks havoc on computer vision but leaves the image looking unaltered to human eyes.

If an organisation then scrapes one of these images to train a future AI model, its data pool becomes “poisoned”. This can result in the algorithm mistakenly learning to classify an image as something a human would visually know to be untrue. As a result, the generator can start returning unpredictable and unintended results.
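
To make the mechanism concrete, here is a toy sketch in Python, assuming NumPy and Pillow are installed. It is not Nightshade’s actual algorithm (which computes carefully targeted perturbations rather than random noise); it only illustrates how changes too small for a viewer to notice still alter every value a training pipeline reads.

```python
import numpy as np
from PIL import Image

def perturb(path_in: str, path_out: str, strength: int = 2, seed: int = 42) -> None:
    """Add a tiny random shift to every pixel (illustration only; save to e.g. out.png)."""
    pixels = np.asarray(Image.open(path_in).convert("RGB"), dtype=np.int16)
    rng = np.random.default_rng(seed)
    noise = rng.integers(-strength, strength + 1, size=pixels.shape)
    # A shift of +/-2 out of 255 is invisible to most viewers, yet every
    # number a computer-vision model trains on has now changed.
    Image.fromarray(np.clip(pixels + noise, 0, 255).astype(np.uint8)).save(path_out)
```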

Symptoms of poisoning

As in our earlier example, a balloon might become an egg. A request for an image in the style of Monet might instead return an image in the style of Picasso.

Some of the issues with earlier AI models, such as trouble accurately rendering hands, for example, could return. The models could also introduce other odd and illogical features to images – think six-legged dogs or deformed couches.

The higher the number of “poisoned” images in the training data, the greater the disruption. Because of how generative AI works, the damage from “poisoned” images also affects related prompt keywords.

For example, if a “poisoned” image of a Ferrari is used in training data, prompt results for other car brands and for other related terms, such as vehicle and automobile, can also be affected.

Nightshade’s developer hopes the tool will make big tech companies more respectful of copyright, but it’s also possible users could abuse the tool and intentionally upload “poisoned” images to generators to try and disrupt their services.

Is there an antidote?

In response, stakeholders have proposed a range of technological and human solutions. The most obvious is paying greater attention to where input data are coming from and how they can be used. Doing so would result in less indiscriminate data harvesting.

This approach does challenge a common belief among computer scientists: that data found online can be used for any purpose they see fit.

Other technological fixes include the use of “ensemble modeling”, where different models are trained on many different subsets of data and compared to locate specific outliers. This approach can be used not only for training but also to detect and discard suspected “poisoned” images.
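
As a rough sketch of how ensemble-based detection might look, assuming scikit-learn, numeric feature vectors X with labels y, and a model choice (logistic regression) that is purely illustrative rather than anything the article prescribes:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def flag_suspects(X: np.ndarray, y: np.ndarray, n_models: int = 5, seed: int = 0) -> np.ndarray:
    """Return indices of samples whose labels most ensemble members reject."""
    rng = np.random.default_rng(seed)
    votes = np.zeros((n_models, len(y)))
    for m in range(n_models):
        # Each model trains on a different random half of the data.
        idx = rng.choice(len(y), size=len(y) // 2, replace=False)
        votes[m] = LogisticRegression(max_iter=1000).fit(X[idx], y[idx]).predict(X)
    # Samples where most models disagree with the given label are candidate "poison".
    agreement = (votes == y).mean(axis=0)
    return np.where(agreement < 0.5)[0]
```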

Audits are another option. One audit approach involves developing a “test battery” – a small, highly curated, and well-labelled dataset – using “hold-out” data that are never used for training. This dataset can then be used to examine the model’s accuracy.
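
A minimal sketch of such an audit, again assuming a scikit-learn-style model and a small curated hold-out set that never touched the training pipeline; the accuracy threshold is an assumption for illustration:

```python
from sklearn.metrics import accuracy_score

def audit(model, X_holdout, y_holdout, threshold: float = 0.9) -> bool:
    """Flag the model if accuracy on the curated 'test battery' drops too far."""
    accuracy = accuracy_score(y_holdout, model.predict(X_holdout))
    return accuracy >= threshold  # False suggests possible poisoning
```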

Strategies against technology

So-called “adversarial approaches” (those that degrade, deny, deceive, or manipulate AI systems), including data poisoning, are nothing new. They have also historically included using make-up and costumes to circumvent facial recognition systems.

Human rights activists, for example, have been concerned for some time about the indiscriminate use of machine vision in wider society. This concern is particularly acute in relation to facial recognition.

Systems like Clearview AI, which hosts a massive searchable database of faces scraped from the internet, are used by law enforcement and government agencies worldwide. In 2021, Australia’s government determined Clearview AI breached the privacy of Australians.

In response to facial recognition systems being used to profile specific individuals, including legitimate protesters, artists devised adversarial make-up patterns of jagged lines and asymmetric curves that prevent surveillance systems from accurately identifying them.

There is a clear connection between these cases and the issue of data poisoning, as both relate to larger questions around technological governance.

Many technology vendors will consider data poisoning a pesky issue to be fixed with technological solutions. However, it may be better to see data poisoning as an innovative solution to an intrusion on the fundamental moral rights of artists and users.

T.J. Thomson, Senior Lecturer in Visual Communication & Digital Media, RMIT University and Daniel Angus, Professor of Digital Communication, Queensland University of Technology

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Highlights reel for 2023 

by the members of the Digital Dexterity Blog Group

Emma Chapman, Auckland University of Technology | Te Wānanga Aronui o Tāmaki Makau Rau:

AI has sure been big on the agenda this year. I’ve gone through all stages of excitement and grief with this topic. I think there was a time mid-year when AI-fatigue set in. But, the latest post on prompt engineering really re-ignites interest for me – as does the development of new, improved AI models. Sadly, as most of these are paid, I think an AI-digital divide could be the next thing we see. Meantime, I’ll keep working on trying to craft killer prompts (and keep trying to make GIFs that do not make me seasick). Merry holidays and a peaceful new year to all.  

Kristy Newton, University of Wollongong:

I’m not sure if we can refer to the year that has been 2023 without also saying the phrase “Generative AI”, and libraries (like everyone else) scrambled to understand how we could use these tools, whether we could use them ethically, and what this all meant for critical literacies. It’s been both exciting and fatiguing as others noted, but an absolute game changer. The blog has been a great space to facilitate discussions, share opinions, and learn from each other about this and about all things digital dexterity. 

Krista Yuen, University of Waikato | Te Whare Wānanga o Waikato:

I only joined the DigiDex community and blog group about halfway through 2023, and I honestly think I’m still finding my feet. That said, getting to know the fellow blog group has no doubt been a highlight for me. Coupled with the upward trend of Generative AI and navigating a new world of literacies in libraries and education, it has certainly made for a very interesting time to be involved with DigiDex. It has been a real honour to partake in and witness all the discussions we’ve had around the use of AI and how to best support and embrace these advancements. I’m definitely looking forward to seeing what 2024 will bring! 

Sara Davidsson, CAVAL:

The diversity of topics and voices in the blog has been my highlight for 2023. We have been able to deliver posts from our extended DigiDex community, written especially for the blog, as well as showcasing interesting articles from far and near. I am so happy to see that our readers keep returning every month for more posts! 

Danielle Degiorgio, Edith Cowan University:

It’s been an absolutely fascinating year diving into the world of AI. I’m genuinely thrilled by how these technologies are revolutionising the way we work in libraries and education as a whole. I’ve particularly enjoyed exploring how generative AI tools can support and foster creativity and innovative learning. It’s an exciting time to be in the field, and I’m looking forward to seeing what the future holds! 

Also, a big shoutout to our DigiDex blog group for their amazing work this year. They’ve done a stellar job in capturing these advancements and discussions around AI in libraries. It’s been inspiring to see their dedication and creativity in action. Kudos to the team for their exceptional work. 

Marianne Sato, University of Queensland:

I love reading the new Digital Dexterity blog posts each month. And being part of the blog group, I often get a sneak preview! The posts about different aspects of AI, finding and creating inclusive OER, and how websites work have been highlights for me this year. The blog posts always have so many great ideas or innovative solutions that I can apply to my work. AI definitely had a big impact this year and I suspect every year from now on. I look forward to reading more great posts in 2024! 

From all of us, we wish our loyal readers a happy and peaceful holiday season and all the best for 2024! We will return with a new blog post on 29 January.

Decorative image of tree branches laid out in a festive way
Photo by Annie Spratt on Unsplash

Being Prompt with Prompt Engineering

Krista Yuen, The University of Waikato
Danielle Degiorgio, Edith Cowan University

Warning – ChatGPT and DALL-E were used in the making of this post.

Experienced AI users have been experimenting with the art of prompt engineering to get the most useful and accurate responses from generative AI systems, and along the way they have created and synthesised techniques for getting the best output from them. Crafting an effective prompt, also known as prompt engineering, is arguably a skill that information seekers will increasingly need as the use of AI continues to grow.

Whilst AI continues to improve, and many systems now encourage more precise prompting from their users, AI is still only as good as the prompts it is given. Essentially, if you want quality content, you must use quality prompts. A solid prompt requires critical thinking and reflection in its design, as well as in how you interact with the output. While there are many ways to structure a prompt, these are the three most important things to remember when constructing one (a short sketch after the lists shows one way to combine them):

Context

  • Provide background information
  • Set the scene
  • Use exact keywords
  • Specify audience
  • You could also give the AI tool a role to play, e.g. “Act as an expert community organiser!”

Task

  • Clearly define tasks
  • Be as specific as possible about exactly what you want the AI tool to do
  • Break down the steps involved if needed
  • Put in any extra detail, information or text that the AI tool needs

Output

  • Specify desired format, style, and tone
  • Specify inclusions and exclusions
  • Tell it how you would like the results formatted, e.g. as a table, a bullet-point list, or even in HTML or CSS.
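
As a minimal sketch of how the three parts can be combined, here is a small Python template; the function and field names are our own illustration, not a standard API:

```python
def build_prompt(context: str, task: str, output: str) -> str:
    """Combine the three parts of a solid prompt into one string."""
    return f"Context: {context}\nTask: {task}\nOutput: {output}"

prompt = build_prompt(
    context="Act as an expert community organiser writing for new volunteers.",
    task="Draft a one-page welcome guide covering meeting etiquette and safety.",
    output="A friendly bullet-point list of no more than ten items.",
)
print(prompt)  # paste the result into your preferred generative AI tool
```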

Example prompt for text generation (e.g. ChatGPT)

You are an expert marketing and communications advisor working on a project for dolphin conservation and need to create a comprehensive marketing proposal. The goal is to raise awareness and promote actions that contribute to the protection of dolphins and their habitats. The target audience includes environmental activists and the general public who might be interested in marine conservation.

The proposal should highlight the current challenges faced by dolphins, including threats like pollution, overfishing, and habitat destruction. It should emphasise the importance of dolphins to marine ecosystems and their appeal to people due to their intelligence and playful nature. It should include five bullet points for each area: campaign objectives, target audience, key messages, marketing channels, content ideas, partnerships, budget estimation, timeline, and evaluation metrics.

Please structure it in a format that is easy to present to stakeholders, such as a PowerPoint presentation or a detailed report. It should be professionally written, persuasive, and visually appealing with suggestions for imagery and design elements that align with the theme of dolphin conservation.

Example prompt for image generation (e.g. DALL·E)

Create a captivating and colourful image for a marketing campaign focused on dolphin conservation. The setting is a serene, crystal-clear ocean under a bright blue sky with soft, fluffy clouds. In the foreground, a group of three playful dolphins is leaping gracefully out of the water. These dolphins should appear joyful and full of life, symbolising the beauty and intelligence of marine life.

The central dolphin, a majestic bottlenose, is at the peak of its jump, with water droplets sparkling around it like diamonds under the sunlight. On the left, a smaller, younger dolphin, mirrors its movement, adding a sense of playfulness and family. To the right, another dolphin is partially submerged, preparing to leap. In the background, a distant, unspoiled coastline with lush greenery and a few palm trees provides a natural, pristine environment. This idyllic scene should evoke a sense of peace and the importance of preserving such beautiful natural habitats.

This image was created with DALL·E 2 via ChatGPT 4 (November 22 Version).

Not getting the results you want?

If your first response has not given you exactly what you need, remember you can try and try again! You may need to add more guidelines to your prompt:

  • Try adding more words or ideas that might be needed. What kind of instructions might get more out of your prompt?
  • Provide some more context, like “I’m not an expert and I need this explained to me in simpler terms.”
  • Do you need more detailed information that will make your response more relevant and useful?

Want to learn more?

There are a few places you can go to learn more about developing good prompts for your generative AI tool:

LinkedIn Learning: How to write an effective prompt for AI

Learn Prompting: Prompt Engineering Guide

What does the word “community” mean to you in the context of teaching and research?

We loved this Open Access Australasia blog post by Richard White, Chair of the Open Access Australasia OA Week 2023 organising group, originally published on 12 July 2023.

Richard is also a member of the International OA Week organising committee and the Manager, Copyright & Open Access at University of Otago.

I vividly remember a senior researcher telling me a few years ago, as we were talking about making versions of our work available openly in repositories, that they didn’t need to worry about that because everyone who needed to had access to their publications.* Frankly I was flabbergasted at such a statement of privilege and assumption. I am afraid I didn’t come up with a counter argument to convince that person that there was no way they could possibly know who might be interested in their work. Still, that conversation has stayed with me, and this year’s Open Access Week theme resonates with how I felt about it.

Community over Commercialization. Open Access Week

Let’s ask that question again, then: what does the word “community” mean to you in the context of teaching and research? It’s true that many of us will first think about the disciplinary or professional communities we work with. Increasingly, though, we’re broadening our thinking. It might be professionals, teachers, policy makers, businesses and innovators, non-public-sector research organisations, citizen scientists, not to mention all the institutions, researchers and students around the world that cannot afford subscription access. Could it even mean the people or local communities who have contributed to our work or the people who might benefit from our work? If we tell people – especially those we’re writing about or working for – about our work in ways that require payment, we should ask ourselves the question: are we doing research for us or for them?

In broadening the communities we want to engage with, however, we have a problem; we’re hindered by the systems we have built. Checking the COKI Open Access Dashboard, we can see that only about 40% of research publications by authors from Aotearoa and Australia from the past 20 years are free to read.** I say “we” have built them because we cannot absolve ourselves of the responsibility for these systems, even though we might complain that “Big Publishing” made them for us. 

The theme for this year’s OA week, running from October 23 to 29, is community over commercialisation. This theme was chosen by SPARC’s international OA week advisory committee to encourage conversation about the approaches to open scholarship that prioritise the interests of the public and the academic community.

The UNESCO Recommendation on Open Science, adopted by its 193 members, highlights the need to prioritise community over commercialisation. It calls on members to ensure that science does not involve the “unfair and/or inequitable extraction of profit from publicly funded scientific activities” and to support “non-commercial publishing models and collaborative publishing models with no article processing charges.”

All this should not be reduced to: commercialism = bad. Many of the institutions we work for explicitly encourage commercialism and, naturally, commercial entities are constantly developing innovative ways of doing things. The distinction to make, perhaps, is that ideally investment should serve the needs of the community in sustainable ways.

For Australasian OA week this year, we’re planning a series of topics and discussions, with a star-studded cast of speakers, panellists and experts, that we hope will provoke discussion and debate. Naturally, our focus will be on our corner of the world, examining questions like these: 

  • What would community ownership of the scholarly communication ecosystem look like? What about a research system centred on indigenous knowledge?
  • How can we ensure our knowledge is made as widely available as possible in ways that are sustainable? What about book publishing and open educational resources, which often play second-fiddle to journal publications in OA conversations?
  • What safeguards need to be in place to ensure knowledge is used appropriately?
  • What opportunities and challenges does the emergence of generative AI (controlled by huge commercial entities) pose for open knowledge? 

We’re hoping the sessions will not just be food for thought but will also provide some practical opportunities to work together and meet people. We are looking forward to it!

* Having just checked this person’s publications I am sad to report that, even in 2023, only 20 percent are free-to-read, which is much lower than the average for New Zealand researchers (which is about half of publications being open).

** The COKI OA Dashboard shows OA rates for Aotearoa and Australia over the last 20 years as 38% and 42% respectively. [Curtin Open Knowledge Initiative Open Access Dashboard. https://open.coki.ac/ Accessed 4 July 2023]


DIY degree? Why universities should make online educational materials free for all

We loved this article from The Conversation, originally published on 29 May 2023.

Richard F. Heller, University of Newcastle

This article is part of our series on big ideas for the Universities Accord. The federal government is calling for ideas to “reshape and reimagine higher education, and set it up for the next decade and beyond”. A review team is due to finish a draft report in June and a final report in December 2023.

Sam Lion/Pexels


As part of the federal government’s bid to overhaul higher education, the Universities Accord discussion paper is seeking to “widen” opportunities for people to access university. It also wants to “grow a culture” of lifelong learning in Australia. As the review team note, most people in Australia who study at university are under 35.

Lifelong learning can help to ensure that workforce skills are up to date and that jobs in high demand can be filled, as well as enabling people to create new job opportunities through innovation.

These issues need to be approached in many ways, which will inevitably include proposals for shorter forms of learning as well as for addressing the financial cost of attending university.

My proposal – also outlined in this journal article – is that a proportion of educational resources generated by publicly funded universities should be made public and freely available.

This could radically expand opportunity and flexibility and potentially allow students to design their own degrees, by doing multiple different units from different universities.

This idea is not completely new

There is precedent for this idea. The international Plan S initiative is led by a group of national research funding organisations. Since 2018, it has been pushing for publicly funded research to be published in open-access journals or platforms.

Australian chief scientist Cathy Foley similarly wants all Australian research to be “open access, domestically and internationally, and for research conducted overseas to be freely available to read in Australia”.

When it comes to university learning, a 2019 UNESCO report encouraged member states to make educational resources developed with public funds in higher education free and freely available.

In a March 2023 report, the Productivity Commission recommended the federal government require “all universities to provide all lectures online and for free”. The commission said this would increase transparency in teaching performance and encourage online learning.

But this also has the ability to make higher education more accessible.

There is already plenty of international experience sharing educational materials online – including the global Open Educational Resources public digital library. This includes resources from early learning through to adult education.

The Productivity Commission says universities would not lose income by making educational resources open access. This is because universities “sell” credentials, not resources. It is also argued overworked academics can save time by using materials created by others.

A mother works on her computer next to her young son.

But there is resistance from institutions and academics, including a perception that free resources will be poor quality and take a lot of time to create. There is also a lack of technological tools to adapt resources. This may explain why open education has not yet taken off in Australia.

Making resources free will increase access to higher education in Australia. Shutterstock

How would this work?

My plan would require open online sites to host educational materials produced by academics. These would need to be moderated or curated and published under an open access license.

It would include a peer review system for educational materials like the one already used for research publications. Academics could get credit for publishing, updating or reviewing resources, and the publication of educational output would be included in university metrics.

This could also help reverse the current downgrading of teaching in Australian universities in favour of research.

There could be three types of users:

  1. students who access materials through the university that produced them, as per current practice

  2. individual students outside the university that created the materials, who access them for their own learning at whatever stage of life is relevant to them

  3. other organisations, including other universities, that then contextualise and deliver the materials to their students.

What kind of materials are we talking about?

The Productivity Commission has talked about “lectures” being made available for free. But lectures are not a good way of transmitting information, especially online. For one thing, they do not promote critical thinking.

My plan proposes that whole courses, or at least sections of courses with assessments, would be provided. This includes text, videos and software, and can include course planning materials and evaluation tools.

An indication of the academic level to which the course speaks, and the amount of possible credit, should also be provided.

What about accreditation?

Accreditation of learning should be considered as part of this.

The OERu is an international organisation where partner universities (including Penn State in the US and Curtin University in Australia) offer free access to online courses. Students pay reduced fees if they want to submit assignments, which can earn them microcredits towards a degree offered by one of the partners.

A woman in a wheelchair works on a laptop in a cafe.

A more radical option would be to develop a system where students collect microcredits from whatever source they wish and present them to an accrediting body for an academic award rather than enrolling in a particular degree course.

Students could pay a fee if they want accreditation for their work. Marcus Aurelius/Pexels

Suggested recommendations

As it prepares its draft report, the accord review team should recommend:

  • most university-generated educational material should be public and free

  • as an interim goal, within three years, 10% of all public university courses should be freely available online

  • an organisation should be created to develop the infrastructure needed to do this. This includes open repositories, a peer review system for open educational materials, and systems for offering microcredits to students and academic credit to academics who take part.

Why is this a good idea?

The Productivity Commission says making this material public will encourage higher quality teaching, empower students and assist in lifelong learning. On top of this, there is the potential for true reform of the educational landscape.

It provides opportunities for collaboration between universities, rather than a competitive business model. And it would make teaching more important, rather than an “inconvenient task” for those seeking academic advancement through research.

Finally, it would genuinely make learning more accessible and more affordable, no matter who you are or where you live.

Richard F. Heller, Emeritus Professor, University of Newcastle

This article is republished from The Conversation under a Creative Commons license. Read the original article.

Is ChatGPT cheating? The complexities of AI use in tertiary education. 

Craig Wattam, Rachael Richardson-Bullock

Te Mātāpuna Library & Learning Services, Auckland University of Technology

“The university is at the stage of reviewing its rules for misconduct because they really don’t apply as much anymore.” 

– Tom, Student Advocate, on the Noisy Librarian Podcast

Cheating in the tertiary education sector is not new. Generative AI technologies, while presenting enormous opportunity, are the latest threat to academic integrity. AI tools like ChatGPT blur the lines between human-generated and machine-generated content. They present a raft of issues, including ambiguous standards for legitimate and illegitimate use, variations in acceptance and usage across discipline contexts, and little or inadequate evidence of their use. A nuanced response is required.

Fostering academic integrity through AI literacy

Academic integrity research consistently argues that a systematic, multi-stakeholder, networked approach is the best way to foster a culture of academic integrity (Kenny & Eaton, 2022). Fortunately, this is also the way to foster ethical, critically reflective and skilful use of AI tools, in other words, a culture of AI literacy. Ironically, to support integrity, we must shift our attention away from merely preventing cheating and towards ensuring that students learn how to use these tools responsibly. Thus, we can ensure that our focus is on learning and on helping students develop the skills necessary to navigate the digital age ethically and effectively.

Hybrid future 

So, the challenge of AI is both an opportunity and an imperative. As we humans continue to interact with technology in highly complex systems, so the way we approach academic work will continue to develop. Rather than backing away or banning AI technologies from the classroom altogether, forging a hybrid future, where AI tools play a role in setting students up for success, will benefit both staff and students.

Information and academic literacy practitioners, and other educators, will need to be dexterous enough to respond to the eclipsing, revision and constant evolution of some of our most ingrained concepts, such as authorship, originality, plagiarism and acknowledgement.

What do students say? 

This was the topic of discussion in a recent episode of the Noisy Librarian Podcast. Featured guests were an academic and a student – a library Learning Advisor and a Student Advocate. The guests delved into the complexities of academic integrity in today’s digital landscape. Importantly, their discussion underscored the need for organisations to understand and hear from students about how AI is impacting them, how they are using it, and what they might be concerned about. Incorporating the student voice and understanding student perspectives is crucial for developing guidelines and support services that are truly effective and relevant.

Forget supervillains! 

Both podcast guests emphasised that few cases of student misconduct involve serial offenders or supervillains who have made a career out of gaming the system. Rather than stemming from an intent to cheat, misconduct is more often related to a lack of knowledge or skill. Meanwhile, universities are facing challenges: they need to adapt their misconduct rules and provide clear guidelines on the acceptable use of AI tools.

Listen to the Noisy Librarian podcast episode Is ChatGPT cheating? The complexities of AI use in tertiary education

Podbean

Or find us on Google Podcasts, Apple Podcasts or iHeartRadio

Reference:

Kenny, N., & Eaton, S. E. (2022). Academic Integrity Through a SoTL Lens and 4M Framework: An Institutional Self-Study. In Academic Integrity in Canada (pp. 573–592). Springer, Cham. https://doi.org/10.1007/978-3-030-83255-1_30

The power of large language models to augment human learning 

By Fernando Marmolejo-Ramos, Tim Simon and Rhoda Abadia; University of South Australia

In early 2023, OpenAI’s ChatGPT became the buzzword in the Artificial Intelligence (AI) world: a cutting-edge large language model (LLM) that is part of the revolutionary generative AI movement. Google’s Bard and Anthropic’s Claude are other notable LLMs in this league, transforming the way we interact with AI applications. LLMs are super-sized dynamic libraries that can respond to queries, abstract text, and even tackle complex mathematical problems. Ever since ChatGPT’s debut, there has been an overwhelming surge of academic papers and grey literature (including blogs and pre-prints) both praising and critiquing the impact of LLMs. In this discussion, we aim to emphasise the importance of recognising LLMs as technologies that can augment human learning. Through examples, we illustrate how interacting with LLMs can foster AI literacy and augment learning, ultimately boosting innovation and creativity in problem-solving scenarios.

In the field of education, LLMs have emerged as powerful tools with the potential to enhance the learning experience for both students and teachers. They can be used as powerful supplements for reading, research, and personalised tutoring, benefiting students in various ways. 

For students, LLMs offer the convenience of summarising lengthy textbook chapters and locating relevant literature with tools like ChatPDF, ChatDOC, Perplexity, or Consensus. We believe that these tools not only accelerate students’ understanding of the material but also enable a deeper grasp of the subject matter. LLMs can also act as personalised tutors that are readily available to answer students’ queries and provide guided explanations. 

For teachers, LLMs may help in reducing repetitive tasks like grading assignments. By analysing students’ essays and short answers, they can assess coherence, reasoning, and plagiarism, thereby saving valuable time for meaningful teaching. Additionally, LLMs have the potential to suggest personalised feedback and improvements for individual students, enhancing the overall learning experience. The caveat, though, is that human judgement is to be ‘in-the-loop’ as LLMs have limited understanding of teaching methodologies, curriculum, and student needs. UNESCO has recognised this importance and produced a short guide on the use of LLMs in higher education, providing valuable insights for educators (see table on page 10). 

Achieving remarkable results with LLMs is made possible through “prompt engineering” (PE): the art of crafting effective prompts to guide these language models towards informed responses. For instance, a prompt could be as straightforward as “rewrite the following = X,” where X represents the text to be rephrased. Alternatively, a more complex prompt like “explain what Z is in layman’s terms” can help clarify intricate concepts. In Figure 1, we present an example demonstrating how students can use specific prompts to learn statistical concepts while simultaneously gaining familiarity with R coding.

Figure 1.  Example of a prompt given to ChatGPT to create R code. The plot on the right shows the result when the code is run in R. Note how the LLM features good code commenting practices and secures reproducibility via the ‘set.seed( )’ function.
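
The figure itself is not reproduced here, but the two habits its caption highlights (clear code comments and a fixed random seed for reproducibility) look like this in Python. This is our rough analogue of the kind of code described, not the actual R output shown in the figure:

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(123)  # fixing the seed makes the simulated data reproducible
data = np.random.normal(loc=0, scale=1, size=500)  # 500 draws from N(0, 1)

# Plot a histogram of the simulated values, labelled so the result is readable.
plt.hist(data, bins=30, edgecolor="black")
plt.title("Histogram of 500 simulated normal values")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
```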

Additionally, Figure 2 reveals that not all LLMs offer identical responses to the same prompts, highlighting the uniqueness of each model’s output.

Figure 2. Example of how ChatGPT (left) and Claude (right) respond to the same prompt. Claude seemed to give a better response than ChatGPT and provided an explanation of what was done.

However, the most interesting aspect of PE lies in formulating appropriate questions for the LLMs, making it a matter of problem formulation. We believe this crucial element is at the core of effective prompting in educational contexts. Seen this way, it’s clear that good prompts should have context for the question being asked, as context provides reference points for the intended meaning. For example, a teacher or student could design a prompt like: “Given the information in texts A and B, produce a text that discusses concepts a1 and a2 in text A in terms of concepts b1 and b2 in text B”; where A and B are paragraphs or texts given along with the prompt and a1, a2, b1 and b2 are specific aspects from texts A and B. Admittedly, that prompt lacks context. Nonetheless, context-rich prompts could still be conceived (see Figure 3). These examples also hint at the idea that prompts work in a “rubbish prompts in; rubbish responses out” fashion; i.e. the quality of the prompt is directly proportional to the quality of the response.

Figure 3.  Example of a prompt with good context. This prompt was obtained via Bard through the prompt “construct a prompt on the subject of cognitive science and artificial intelligence that provides adequate context for any LLM to generate a meaningful response”.

PE is thus a process that involves engaging in a dialogue with the LLM to discover creative and innovative solutions to problems. One effective approach is “chain-of-thought” (CoT) prompting, which entails eliciting more in-depth responses from the LLM by following up on previously introduced ideas. The example shown in Figure 4 was output by Bard after the prompt “provide an example of a chain of thought prompting to be submitted to a large language model”. The green box contains the initial prompt, the orange box represents three subsequent questions, and the blue box represents a potential answer given by the LLM. CoT prompting can also be achieved by first setting a topic (e.g. “The Role of Artificial Intelligence (AI) in Education”) and then asking questions such as “start by defining Artificial Intelligence (AI) and its relevance in the context of education, including its potential applications in learning, teaching, and educational administration”, “explore how AI can personalise the learning experience for students, catering to individual needs, learning styles, and pace of progress”, “discuss the benefits of AI-powered adaptive learning systems in identifying students’ strengths and weaknesses, providing targeted interventions, and improving overall academic performance”, and “examine the role of AI in automating administrative tasks, such as grading, scheduling, and resource management, to enhance efficiency and reduce the burden on educators”.

Figure 4. Example of a CoT prompt.

Variants of CoT prompting can be considered by generating several CoT reasoning paths (see the articles Tree of Thoughts: Deliberate Problem Solving with Large Language Models and Large Language Models Tree-of-Thoughts). Regardless of the CoT prompting used, the ultimate goal is to solve a problem in original and informative ways.
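
As a sketch of how a CoT sequence can be scripted, assuming a hypothetical send_prompt() wrapper around whichever LLM service you use (the function and its placeholder behaviour are ours, not a real API):

```python
def send_prompt(prompt: str, history: list[str]) -> str:
    # Hypothetical stand-in: replace the body with a call to your LLM service,
    # passing the conversation history so each turn builds on the last.
    return f"[model response to: {prompt[:40]}...]"

topic = "The Role of Artificial Intelligence (AI) in Education"
follow_ups = [
    f"Start by defining AI and its relevance in the context of: {topic}.",
    "Explore how AI can personalise the learning experience for students.",
    "Discuss the benefits of AI-powered adaptive learning systems.",
    "Examine the role of AI in automating administrative tasks such as grading.",
]

history: list[str] = []
for prompt in follow_ups:
    answer = send_prompt(prompt, history)  # each follow-up extends the chain
    history.extend([prompt, answer])
```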

It’s crucial not to overlook AI technologies but rather embrace them, finding the right balance between tasks delegated to AI and those best suited for human involvement. Fine-tuning interactions between humans and AI is key when exchanging information, ensuring a seamless and effective collaboration between the two.

Library strategy and Artificial Intelligence

by Dr Andrew M Cox, Senior Lecturer, the Information School, University of Sheffield.

This post was originally published in the National Centre for AI blog, owned by Jisc. It is re-printed with permission from Jisc and the author.

On April 20th 2023 the Information School, University of Sheffield invited five guest speakers from across the library sectors to debate “Artificial Intelligence: Where does it fit into your library strategy?”

The speakers were:

  1. Nick Poole, CEO of CILIP
  2. Neil Fitzgerald, Head of Digital Research, British Library
  3. Sue Lacey-Bryant, Chief Knowledge Officer; Workforce, Training and Education Directorate of NHS England
  4. Sue Attewell, Head of Edtech, JISC
  5. John Cox, University Librarian, University of Galway

A capacity crowd of 250 people had signed up online, and there was a healthy audience in the room in Sheffield.

Slides from the event can be downloaded here. These included updated results from the pre-event survey, which had 68 responses.

This blog post is a personal response to, and summary of, the event, written by Andrew Cox and Catherine Robinson.

Impact of generative AI

Andrew Cox opened the proceedings by setting the discussion in the context of the fascination with AI in our culture, from ancient Greece and movies from as early as the start of the 20th century through to current headlines in the Daily Star!

Later in the event, John Cox quoted several authors in his talk saying AI promised to produce a profound change to professional work. And it seemed to be agreed amongst all the speakers that we had entered a period of accelerating change, especially with ChatGPT and other generative AI.

These technologies offer many benefits. Sue Lacey-Bryant shared some examples of how colleagues were already experimenting with using ChatGPT in multiple ways: to search, organise content, design web pages, draft tweets and write policies. Sue Attewell mentioned JISC-sponsored AI pilots to accelerate grading, draft assessment tasks, and analyse open-text NSS comments.

And of course wider uses of AI are potentially very powerful. For example, Sue Lacey-Bryant shared the example of how many hours of radiologists’ time AI was saving the NHS. Andrew Cox mentioned how ChatGPT functions would be realised within MS Office as Copilot. Specifically for libraries, from the pre-event survey it seemed that the most developed services currently were library chatbots and Text and Data Mining support; but the emphasis of future plans was “Promoting AI (and data) literacy for users”.

But it did mean uncertainty. Nick Poole compared the situation to the rise of Web 2.0 and suggested that many applications of generative AI were emerging and we didn’t know which might be the winners. User behaviour was changing, so there was a need to study it. As behaviour changed there would be side effects that required us to reflect holistically, Sue Attewell pointed out. For example, if generative AI can write bullet-point notes, how does this impact learning if writing those notes was itself how one learned? She suggested that the new technology cannot be banned. It may also not be detectable. There was no choice but to “embrace” it.

Ethics

The ethics of AI is a key concern. In the pre-event survey, ethics were the most frequently identified key challenge. Nick Poole talked about several of the novel challenges from generative AI, such as: what are its implications for intellectual freedom? What should be preserved from generative AI (especially as it answers differently with each iteration of a question)? Nick identified that professional ethics have to be:

  • “Inclusive – adopting an informed approach to counter bias
  • Informed & evidence-based – geared towards helping information users to navigate the hype cycle
  • Critical & reflective – understanding our own biases and their impact
  • Accountable – focused on trust, referencing and replicability
  • Creative – helping information users to maximise the positive benefits of AI augmented services
  • Adaptive – enabling us to refresh our skills and expertise to navigate change”

Competencies

In terms of professional competencies for an AI world, Nick said that there was now wider recognition that critical thinking and empathy were key skills. He pointed out that the CILIP Professional Knowledge and Skills Base (PKSB) had been updated to reflect the needs of an AI world for example by including data stewardship and algorithmic literacy. Andrew Cox referred to some evidence that the key skills needed are social and influencing skills not just digital ones. Skills that respondents to the pre-event survey thought that libraries needed were:

  • General understanding of AI
  • How to get the best results from AI
  • Open-mindedness and willingness to learn
  • Knowledge of user behaviour and need
  • Copyright
  • Professional ethics and having a vision of benefits

Strategy

John Cox pointed to evidence that most academic library strategies were not yet encompassing AI. He thought this was because of anxiety, hesitancy, ethics concerns, and inward-looking and linear thinking. But Neil Fitzgerald explained how the British Library is developing a strategy. The process was challenging, akin to “flying a plane while building it”. Sue Attewell emphasised the need for the whole sector to develop a view. The pre-event survey suggested that the most likely strategic responses were: to upskill existing staff, study sector best practice and collaborate with other libraries.

Andrew Cox suggested that some key issues for the profession were:

  • How do we scope the issue: as being about data/AI, or about a wider digital transformation?
    • How does AI fit into our existing strategies – especially given the context of institutional alignment?
    • What constitutes a strategic response to AI? How does this differ between information sectors?
  • How do we meet the workforce challenge?
    • What new skills do we need to develop in the workforce?
    • How might AI impact equality and diversity in the profession?

Workshop discussions

Following the presentations from the speakers, those attending the event in person were given the opportunity to further discuss in groups the professional competencies needed for AI. Those attending online were asked to put any comments they had regarding this in the chat box. Some of the key discussion points were:

  • The need for professionals to rapidly upskill themselves in AI. This includes understanding what AI is and the concepts and applications of AI in individual settings (e.g. healthcare, HE etc.), along with understanding our role in supporting appropriate use. However, it was believed this should go beyond a general understanding to a knowledge of how AI algorithms work, how to use AI and actively adopting AI in our own professional roles in order to grow confidence in this area.
  • Horizon scanning and continuous learning – AI is a fast-paced area where technology is rapidly evolving. Professionals not only need to stay up-to-date with the latest developments, but also be aware of potential future developments to remain effective and ensure we are proactive, rather than reactive.
  • Upskilling should not just focus on professional staff, but all levels of library staff will require some level of upskilling in the area of AI (e.g. library assistants).
  • Importance of information literacy and critical thinking skills in order to assess the quality and relevance of AI outputs. AI should therefore be built into professional training around these skills.
  • Collaboration skills – As one group stated, this should be more ‘about people, not data’. AI requires collaboration with:
    • Information professionals across the sector to establish a consistent approach; 
    • Users (health professionals, students, researchers, public etc.) to establish how they are using AI and what for;
    • Other professionals (e.g. data scientists).
  • Recruitment problems were also discussed, with it noted that for some there had been a drop in people applying for library roles. This was impacting not only on the ability to bring new skillsets into the library (e.g. data scientists), but also on the ability to allow existing staff the time to upskill in the area of AI. It was suggested that there was a need to promote to applicants the lifestyle and wellbeing advantages of working in libraries.

Other issues that came up in the workshop discussions centered around how AI will impact on the overall library service, with the following points made:

  • There is the need to expand library services around AI, as well as embed it in current services;
  • Need to focus on where the library can add value in the area of AI (i.e. USP);
  • Libraries need to make a clear statement to their institution regarding their position on AI;
  • AI increases the importance of and further incentivises open access, open licencing and digitisation of resources;
  • Questions over whether there is a need to rebrand the library.

The attendees also identified that the following would be useful to help prepare the sector for AI:

  • Sharing of job descriptions to learn about what AI means in practice and help with workforce planning. It was noted that the RL (Research Libraries) Position Description Bank already contains almost 4,000 position descriptions from research libraries, primarily from North America, although there are many examples from RLUK members;
  • A reading list and resource bank to help professionals upskill in AI;
  • Work shadowing;
  • Sharing of workshops delivered by professionals to users around the use of AI;
  • AI mailing lists (e.g. JISCmail);
  • Establishment of a Community of Practice to promote collaboration, although it was noted that AI would probably change different areas of library practice (such as collecting or information literacy) and so was likely to be discussed within the professional communities that already exist in these areas.

Workshop outcome

Following the workshop, Andrew Cox and Catherine Robinson worked on a draft working paper, which we invite you to comment on at Draft for comment: Developing a library strategic response to Artificial Intelligence: Working paper.

Getting comfortable with data: Ideas for explaining some basic data types

By Leah Gustafson

Leah is working for the Languages Data Commons of Australia project, a University of Queensland and ARDC co-funded project building digital infrastructure for preserving language data. The project provides support to researchers and the broader community around making language data FAIR with CARE.

(This post is in part a summary of an article that first appeared in The Living Book of Digital Skills).

It seems that every which way we turn, the mysterious concept of data is ever present and lurking in the background of our everyday lives. In the professional setting of the library, data is not a foreign concept – we are surrounded by books and journals and often help students navigate the world of information. But being in such close and constant proximity to data can lead to elements of expert bias creeping in despite the best efforts to keep them at bay! This can make it difficult to explain data concepts in simple terms.

João Batista Neto, CC BY 3.0 https://creativecommons.org/licenses/by/3.0, via Wikimedia Commons

So, what are some ideas for demystifying the concept of data for a wide audience? Particularly those who might be exposed to many different types (and potentially without even realising)…

First, the word itself. Remember that in its purest form data isn’t just digital! Wikipedia defines data as a “unit of information” about a person or object that could be a fact, statistic, or other item of information.

And then to progress to modern contexts (as so much of the data we deal with is digital), the Oxford English Dictionary entry states that it can be “quantities, characters, or symbols” in the form of electrical signals that can be used, stored, or transmitted with computer equipment. 

Once there is an understanding of what data is, trying to explain it further can suddenly become wildly more complicated! It may be helpful to explain that data can be structured, meaning that it is ready for analysis. Otherwise it may be unstructured, perhaps because it has just been collected, or because multiple data sources are being combined to create a larger dataset. A dataset is just a collection of data points that belong together – maybe they came from the same source or maybe they are about the same topic.

Terms that many people will be familiar with are qualitative and quantitative. Qualitative data is an opinion or generalisation about something – a user gives a rating of 5 out of 5 for their experience watching a film. This type of data can be descriptive, be true or false, or give a rank. On the other hand, quantitative data is an objective measurement of something and is generally numerical – for instance, the piece of string is 23 centimetres long. It can also be a count of items or of the number of times something happened: there are 2 dishwashers and 1 cupboard, and the cupboard was opened 46 times today.
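
As a tiny illustration of the distinction, using the examples above (plain Python, nothing else assumed):

```python
# Qualitative values describe or rank; quantitative values measure or count.
qualitative = {"film_rating": "5 out of 5", "experience": "enjoyable"}
quantitative = {"string_length_cm": 23, "dishwashers": 2, "cupboard_opens_today": 46}
```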

Adapted from an image by Koen Leemans, CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0, via Wikimedia Commons

For data to be used in an analysis, it must be structured in such a way that a computer program can interpret it. For example, data that is output from remote sensing equipment is generally already structured, whereas data that is gathered in a survey where someone’s experience was described would be unstructured.
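
A minimal sketch of that difference, assuming the pandas library; the survey sentence is unstructured, while the sensor-style table is structured and ready for a program to analyse:

```python
import pandas as pd

# Unstructured: free text describing an experience, hard to compute over.
survey_response = "The room felt warm all afternoon and the fan was noisy."

# Structured: the same kind of observations arranged as labelled columns.
readings = pd.DataFrame(
    {"time": ["14:00", "15:00"], "temp_c": [26.4, 27.1], "fan_noise_db": [48, 52]}
)
print(readings.describe())  # a program can interpret and summarise this directly
```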

This has been a very brief introduction to the intriguing world of data! More information can be found in the “Types of Data” section of Chapter 2: Information literacy, media literacy and data literacy in The Living Book of Digital Skills. There are also some helpful resources that provide more in-depth details about the fundamental data concepts discussed.

Sharing is caring: comment below on techniques and approaches that you find helpful when needing to explain data concepts!