In 2021, The Guardian took part in the Journalism AI Collab Challenges, a project connecting global newsrooms to understand how artificial intelligence can improve journalism. 

One particular challenge was to answer the question “How might we use modular journalism and AI to assemble new storytelling formats and reach underserved audiences?”

Anna Vissens, Lead Scientist, and Michel Schammel, Senior Data Scientist, at Guardian News & Media, United Kingdom, joined WAN-IFRA’s virtual Newsroom Summit in late April to discuss what they learned from this project.

What are quotes?
The team defined modules as fragments of a story that live independently but can be repurposed, or even replaced, by another fragment. Under this definition, quotes clearly qualify as modules.

Taking Wikipedia as the starting point, here’s how the team defined a quote: 

A quotation is the repetition of a sentence, phrase, or passage from speech or text that someone has said or written. In oral speech, it is the representation of an utterance that is introduced by a quotative marker, such as a verb of saying. For example: John said: “I saw Mary today.” 

In written text, quotations are signaled by quotation marks.
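
To make the definition concrete, here is a minimal rule-based sketch in Python. It is not the team's machine learning approach, only an illustration of the two signals the definition names: quotation marks and a nearby verb of saying. The function name `find_quotes`, the short `SAYING_VERBS` list, and the 20-character context window are all assumptions made for this example.

```python
import re

# Hypothetical, deliberately small list of "verbs of saying" (quotative markers).
SAYING_VERBS = ("said", "says", "told", "added", "asked")

# Matches text inside straight or curly double quotation marks.
QUOTE_RE = re.compile(r'[“"]([^“”"]+)[”"]')


def find_quotes(text, window=20):
    """Return (quoted_text, has_quotative_marker) for each quoted span.

    `window` is an assumed number of characters to inspect on either
    side of the quotation marks when looking for a verb of saying.
    """
    results = []
    for match in QUOTE_RE.finditer(text):
        before = text[max(0, match.start() - window):match.start()]
        after = text[match.end():match.end() + window]
        context = (before + " " + after).lower()
        has_marker = any(
            re.search(rf"\b{verb}\b", context) for verb in SAYING_VERBS
        )
        results.append((match.group(1), has_marker))
    return results


example = 'John said: “I saw Mary today.” The sign read "Closed".'
for quoted_text, has_marker in find_quotes(example):
    print(quoted_text, "| verb of saying nearby:", has_marker)
```

Rules like these find text in quotation marks easily, but they cannot tell speech from song lyrics, poems, or reported thoughts.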

“It looks simple but we wrestled with questions like – what about song lyrics? Or poems? Are they quotes? What if someone doesn’t say it but thinks about it? Do we treat thoughts as we would speech?” said Vissens.

Why are they doing this?
There are several use cas…


The post How The Guardian and AFP used machine learning to understand quotes appeared first on WAN-IFRA.