Better content moderation for better discourse: A conversation with Madhuri Rahman of WeLivedIt.ai
Online discourse has been soured by extremism and hate speech. This is especially the case for marginalised people who experience more abuse online and whose content gets moderated more severely.
Surely there is a way to improve how we debate online? Maybe AI and large language models could be useful in detecting hate speech? Especially if the communities most at risk from toxic online content were able to use their lived experience to help train the AI models.
That’s what WeLivedIt.ai aims to do: use marginalised people’s lived experience to improve content moderation. I spoke to Madhuri Rahman, co-founder of WeLivedIt.ai, about their work, big tech, the current ‘one size fits all’ approach to moderation and the Online Safety Act 2023.
Alastair: What prompted the creation of WeLivedIt.ai, and what gap in moderation and online safety were you aiming to fill?
Madhuri: “My co-founder and I met in the token engineering space. We were looking at voting mechanisms, and that led us to think about lived experience in technology development in general. Hate speech and online toxicity is one of those crucial areas where having lived experience of the problem gives you a unique perspective in designing a solution: if you've experienced certain marginalisation then you're more likely to receive backlash and hate in a particular area, and you're more likely to recognise it when it's more subtle. That is what led us to then look at solutions that are AI-driven.”
What distinguishes WeLivedIt.ai from the existing moderation tools platforms already use?
“At the moment, one-size-fits-all moderation only really benefits big tech platforms, and then with servers like Discord, there's a lot of human moderation going on.
“The first thing [WeLivedIt.ai offers] is entering your context to tailor the search. The second thing is being able to collaboratively define what toxicity looks like in your space by voting on real data examples, and then seeing what the consensus has been on different types of comments and speech. The third thing is using [this data] to train the model. What we also want to do then is to have the model training be as transparent as possible.
“What we're trying to do, which is different, is to make it all friendly for the individual user, because all the content moderation tools out there now are like: ‘Okay, we'll do more context-aware moderation,’ but they're still consultancy firms [where] you have to go through an individual who asks your needs, and the data already comes labelled. So, [there are] biases that exist in these big tech models. We're working towards moving away from that once a community puts in the work of labelling the training data that is filtered out from their online space. So, it's not random data. It's their lived experience, which is why we called it WeLivedIt.ai, because we're trying to connect lived experience to the development of a big tech solution that could help lots of people.”
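To make the workflow Madhuri describes a little more concrete, here is a minimal sketch of how a community's votes on real comments might be turned into consensus-labelled training data. The class names, labels and agreement threshold are hypothetical; WeLivedIt.ai has not published its implementation, so this illustrates the general shape of the idea rather than their actual code.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str
    votes: list = field(default_factory=list)  # one label per community member, e.g. "toxic", "acceptable", "unsure"

    def consensus(self, threshold: float = 0.6):
        """Return the majority label if it clears the agreement threshold, else None."""
        if not self.votes:
            return None
        label, count = Counter(self.votes).most_common(1)[0]
        return label if count / len(self.votes) >= threshold else None


def build_training_set(comments):
    """Keep only comments where the community reached consensus, producing
    (text, label) pairs that could later be used to fine-tune a classifier."""
    return [(c.text, c.consensus()) for c in comments
            if c.consensus() in ("acceptable", "toxic")]


# Example: two comments filtered from a community's own space and voted on
examples = [
    Comment("You people are a danger to everyone", ["toxic", "toxic", "toxic", "unsure"]),
    Comment("I disagree with this policy", ["acceptable", "acceptable", "toxic"]),
]
print(build_training_set(examples))
# [('You people are a danger to everyone', 'toxic'),
#  ('I disagree with this policy', 'acceptable')]
```

The point of the sketch is the order of operations: the community votes first, consensus is computed transparently, and only then does the agreed-upon data feed the model.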
How does the lived experience of marginalised people online inform the tool?
“We’re trying to focus on people who experience hate speech directly to start with. So, you've got death threats, violence, demonising and dehumanising language, and negative characterisation: rhetoric that just calls people crazy or aggressive, for example. These are the different intensity levels, and we're starting with that first one, death threats, the kind of hate speech that's about online safety.
“This is hate speech identification. We're not stopping anyone from posting anything or saying anything, but the person that has lived experience of the problem has a choice as to whether or not they want to see that.
“Then you've got things like ‘make me a sandwich.’ Which of these categories would this fit into? Large language models in general won't pick up the connotations around that kind of thing, because it's really nuanced; it's a trending piece of hate speech. But if you're aware of Andrew Tate and the misogynist rhetoric going around online, then you can submit that [and] the community can vote on whether they want it there or not.”
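Purely as an illustration of the intensity levels and the per-user choice Madhuri mentions, one could imagine something like the sketch below. The tier names, their ordering and the threshold mechanism are assumptions, not WeLivedIt.ai's published design.

```python
from enum import IntEnum


# Illustrative intensity tiers based on the levels described above;
# the names and numeric ordering here are assumptions.
class Intensity(IntEnum):
    NEGATIVE_CHARACTERISATION = 1  # e.g. rhetoric calling people "crazy" or "aggressive"
    DEMONISING_LANGUAGE = 2
    VIOLENT_THREAT = 3             # death threats and calls to violence


def visible_to_user(comment_intensity, user_threshold):
    """The tool flags rather than removes content: each user chooses the highest
    intensity they are willing to see, and anything above that is hidden for them."""
    return comment_intensity is None or comment_intensity <= user_threshold


# A community can also submit nuanced, trending phrases (like the
# "make me a sandwich" example) for members to vote into a tier.
community_submitted = {"make me a sandwich": Intensity.NEGATIVE_CHARACTERISATION}

# Example: a user happy to see everything except violent threats
print(visible_to_user(Intensity.DEMONISING_LANGUAGE, Intensity.DEMONISING_LANGUAGE))  # True
print(visible_to_user(Intensity.VIOLENT_THREAT, Intensity.DEMONISING_LANGUAGE))       # False
```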
How are you making sure you are getting a broad representation of lived experience of all different areas of hate speech?
“That's an ongoing challenge, but our approach differs significantly from existing moderation tools. We're not trying to create one-size-fits-all solutions, which inevitably struggle to represent everyone. That means we don't face the same pressure to build a single, universal dataset that works everywhere. The question isn't ‘have you represented everyone perfectly?’ That's impossible. The question is: ‘When a collective representing a marginalised community joins your platform, can they teach the system about their toxicity patterns better than they could teach Twitter or Meta?’
“We think yes, because:
They control the configuration directly
They provide examples from their lived experience
They can correct the AI when it's wrong
They're not competing with billions of other users for the platform's attention
“We're also realistic: this requires communities to have the capacity to onboard, configure, and maintain their models. That's a significant ask, especially for already-marginalised groups, which is why it's important for us to acknowledge and centre mental wellbeing and digital agency in the work we're doing. People aren't just involved to take on the labour of fixing a system that works against them; they're accessing community and people-power to reclaim online spaces from the toxic minority.
“We're trying to build a system where representation can happen bottom-up rather than top-down. We'd rather launch with transparency about limitations and improve through partnership with communities than claim we've solved a problem we haven't.”
Minority content is often over-moderated. How do you deal with that challenge?
“There's a lot of research which still shows that marginalised communities get over-moderated, which is crazy because they're the ones that are experiencing a lot more of the online toxicity. But because, for example, LGBT content is more likely to be historically seen as sexual and adult, there's been a lot of over-purging of that content online.
“We're starting at the safety level, but when I talk to a lot of people about hate speech, they will then talk about politicians [and] hate speech in terms of divisive speech: the way people talk about others, like specific identities or communities. It's similar rhetoric whenever a new identity becomes the target of the divide-and-rule political strategy; it uses the same kind of language. It's always ‘they're inherently violent’, ‘they're unnatural’, ‘they are a danger to children, to women’. All the same kinds of narratives come out, and that is something that I'm exploring. What [we’re doing] right now is looking at filtering out pure hate to keep people safe.”
Are you seeing new forms of hate speech with generative AI, such as deepfakes or bots?
“We haven’t focused on that yet, but I am having a call with the LGBT Foundation because they recently ran a national campaign called ‘This is what a woman looks like’ and it was a trans positive campaign, and it ended up being derailed. They [the derailers] used certain pictures that were nasty and then were posting them to disrupt the campaign entirely. It was definitely bots.
“I want to talk to them and understand what specific patterns they saw during the campaign, and what they did, how they responded to it, and whether that's something that content moderation can do more to catch and deal with.
“In fact, if you know what your campaign is and what you're going to post, having that context and being able to engage in data training for the model earlier would make it easier. It's just about whether the organisation wants to engage in doing it or not.
“That's definitely a big, I was going to say, upcoming risk factor, but some of the deepfake videos coming out are already really quite nasty. It's something that content moderation definitely should be flagging.”
The UK’s Online Safety Act has just come into effect. How do you see regulation shaping your work?
“Tech companies will now have to assess the risk their platforms pose of disseminating the kind of racist misinformation that fuelled last year’s summer riots. The fact that there is something that now says that tech companies have to do that is good, it just doesn't really solve the issue that these content moderation tools don't work.”
Looking ahead, what’s your vision for the future of WeLivedIt.ai?
“At the moment we're working with journalists, but we want to expand to people who are visible online. That is because people who historically have faced a lot of toxicity are really good at identifying it.
“[We’re] also trying to capture more research and data around what the problem of online toxicity looks like, and how we can improve the way AI is trained to solve that problem. We want to do a lot more.
“With the Online Safety Act, the part that's come into effect at the moment relates to child safety, but by 2027 platforms will also have to report on the risk they pose of disseminating misinformation. So, we would like to be able to contribute to that conversation, whilst also helping people who have lived experience of the problem.”
“Also, we'll have labelled data to train the model on catching [hate speech] better. That's something we can then also work on with platforms and organisations, so they can see what the level of misogyny is on their platform by accessing that data. We want to create an ecosystem and try to contribute to improving regulation. Tech regulation is a whole new minefield anyway, but the ideal would be to completely shift the way tech platforms are regulated while keeping people safe online.”
You can find out more about WeLivedIt.ai by visiting their website or following them on LinkedIn to keep up to date with their mission to protect marginalised communities from hate speech online.
