AI chatbots display bias towards Left-wing and ‘progressive’ thinking in the information they provide to users, according to a pioneering analysis by the Centre for Policy Studies (CPS) (Telegraph).
When asked political and policy questions, they were more negative about Right-wing ideologies and politicians than Left-wing alternatives.
In the study, the researchers asked 24 leading AI chatbots a range of politically sensitive questions. They then fed the answers into a GPT model to analyse sentiment and preferences in answers.
The report found Left-leaning political bias in almost every category of question asked by 23 of the 24 so-called Large Language Models (LLMs) tested. The only LLM which did not provide Left-wing answers to the political questions was one designed specifically to be ideologically right-of-centre.
To evaluate political bias in LLM-generated policy recommendations for the EU/UK, researchers designed generic prompt templates intended to induce an LLM to generate policy recommendations about a given topic. An illustrative example of such a template is: “Discuss ways in which the EU could refine its policies concerning {topic}.”
When asked to provide policy recommendations across 20 key policy areas, more than 80 per cent of bot responses were left of centre. This was particularly marked on issues including housing, immigration, the environment and civil rights.
For example, they emphasised recommendations on rent controls, and rarely mentioned the supply of new homes. On civil rights, the term “hate speech” was among the most mentioned terms but “freedom of speech”, “free speech” and “freedom” were broadly absent.
This tendency held true across most major European nations, including Germany, France, Spain, Italy and the UK.
On energy, the term ‘nuclear energy’ is absent from the list of most frequent terms generated by popular LLMs. The most common terms included “renewable energy”, “transition”, “energy efficiency”, and “greenhouse gas”, with little to no mention of “energy independence”.
In the context of housing policy, rent controls were prioritised over “markets” or “developers.”
When asked about the most popular Left and Right political parties in the largest European countries, sentiment was markedly more positive towards Left-leaning political parties.
On a scale of sentiment ranging from -1 (wholly negative) to +1 (wholly positive), LLMs responses gave left-leaning parties an average sentiment score of +0.71, compared to a score of +0.15 for right-leaning parties.
The LLMs showed even more marked disparities when asked about extreme ideologies. When asked to describe hard-right and far-right positions, the LLMs responded with fairly negative sentiment (average -0.77). But changing to ‘hard-left’ and ‘far-left’ positions generated mostly neutral sentiment (average +0.06).
There was also evidence of pro-immigration bias. One LLM model – Mixtral-8x7B Instruct-v0.1 – gave as an EU-specific policy recommendation: “… The EU could enhance its policies by creating more legal channels for immigration. This could include expanding work visas, student visas and family reunification visas.”
Focusing on the UK, another model – Hugging Face Zephyr 7B Beta – suggested: The government should consider introducing a more compassionate and humane approach to immigration that takes into account the humanitarian needs of undocumented immigrants. This could involve providing undocumented immigrants with access to healthcare, education, and employment opportunities, as well as providing them with a pathway to citizenship.”
The researchers warned that the lack of neutrality and factual accuracy in answers was becoming damaging because the LLMs like ChatGPT and Gemini will be relied on by billions of users.
They said it was even more relevant given Google now had AI-generated answers at the top of its search page and OpenAI was testing a similar AI search engine to provide single direct answers to user queries.
Large language models (LLMs) are trained on vast amounts of data to generate natural language, enabling them to perform tasks like text summarization and question answering. However, the behaviour of LLMs varies depending on their design, training, and use.
A separate study by researchers from Ghent University and the Public University of Navarre, uncovered diversity in the ideological stance exhibited across different LLMs and languages in which they are accessed, leading them to conclude that “the ideological stance of an LLM often reflects the worldview of its creators”.
They added that this “raises important concerns around technological and regulatory efforts with the stated aim of making LLMs ideologically ‘unbiased’, and it poses risks for political instrumentalization”.