Creativity, Style and the Flattening Threat in Large Language Models

The debate on creativity has intensified with the rise of generative AI, especially large language models (LLMs). Recent research shows that these systems can produce work that rivals, and in some cases exceeds, human creativity (Guzik et al., 2023; Bohren et al., 2024). At the same time, their use raises serious concerns about value, authenticity, and the long-term safeguarding of human creative practices (Mei et al., 2025; Messer, 2024). This tension highlights what might be called the “flattening threat”: the perceived risk that even as LLMs make it easier to generate ideas and boost productivity, they may also diminish the diversity, style and authenticity that enrich human creativity.

Rethinking Creativity

Defining creativity has long been a challenge in research. A commonly accepted definition, going back to Stein (1953) and Barron (1955) and supported by later studies, holds that creativity involves both novelty (originality, unusualness) and usefulness (value, adaptability, appropriateness), along with related dimensions such as elaboration and aesthetic quality, though these play a secondary role (Acar et al., 2017). Research also shows that the two core aspects do not carry the same weight: originality consistently stands out as a stronger predictor of creativity than usefulness (Acar et al., 2017; Diedrich et al., 2015). Additional perspectives further enrich this definition: the U.S. Patent Office, for example, includes “non-obviousness”, which Simonton (2012) rephrases as “surprise”, and Bruner (1962) similarly emphasized “effective surprise” as key to novelty.

A further distinction must be made between creativity and innovation: creativity concerns the generation of new and valuable ideas, whereas innovation requires those ideas to be implemented and adopted in practice. A product can be novel and useful but still not count as innovative if it fails to gain acceptance; yet in both creativity and innovation, originality remains central (Acar et al., 2017).

LLMs and Creative Power 

Recent research highlights the impressive, though nuanced, creative abilities of LLMs, particularly the more advanced models. Studies show that in several respects generative AI not only matches but often exceeds human performance. Guzik et al. (2023) report that GPT-4 scored in the top 1% for originality and fluency on the Torrance Tests of Creative Thinking, significantly outpacing human performance in original thinking. Similarly, Bohren et al. (2024) found that ChatGPT’s creative ideas were consistently rated higher than those produced by humans, with eight of the seventeen top-rated answers in their study generated by ChatGPT. GPT-4, taken here as a representative of the more advanced models, shows clear strengths in certain creative tasks: in exercises requiring the imagination of unlikely scenarios, such as the “Just Suppose” tasks, 95% of its responses were judged original compared to just 24% from the human control group (Guzik et al., 2023). It also performed well in activities such as asking questions, guessing causes and imagining consequences. At the same time, its creativity has limits: although its overall flexibility scores were high, it scored lower on flexibility in specific tasks such as “guessing causes”, “guessing consequences” and “product improvement”, possibly due to sensitivity to prompts or limits in flexible reasoning (Guzik et al., 2023). Bohren et al. (2024) also noted that while AI matches humans in idea diversity overall, humans still excel at generating the most unique ideas. Yet concerns that AI merely recycles existing thoughts appear overstated: Mei et al. (2025) found that AI can contribute to collective creativity without diminishing intellectual diversity.

Style, Bias and the Flattening Threat

So, if LLMs can produce highly original ideas, why are their outputs often judged as less creative? Research consistently points to human bias as the key reason. People tend to undervalue AI-generated works even when the quality is high, a phenomenon known as algorithm aversion (Bohren et al., 2024). Hattori et al. (2024) found that participants rated creativity significantly lower when they were told an idea had been produced by AI rather than by a human. Yet, interestingly, raters struggle to tell the difference between the two: they correctly identified human ideas only 63% of the time, compared to 61% for ChatGPT and just 37% for Bard (Bohren et al., 2024). This bias stems from the effort heuristic: people often assume AI expends less effort, and, since perceived effort is a common shortcut for judging quality, AI outputs receive lower ratings (Magni et al., 2024; Messer, 2024). The bias is not uniform, however: it varies with domain and context, and is stronger for aesthetic and artistic products, where authenticity and originality matter more (Magni et al., 2024; Messer, 2024). Authenticity indeed plays a major role: in art, people often see AI-assisted works as less authentic, especially when AI is involved in creating rather than only in the initial idea stage (Messer, 2024). In contrast, in commercial settings such as advertising or startup ideas, the bias lessens and AI outputs are judged more fairly (Magni et al., 2024). The bias is also stronger for outputs that are very useful but not novel, while novel products trigger less bias (Hattori et al., 2024).

The flattening threat comes into play here. While studies such as Mei et al. (2025) suggest that worries about AI reducing diversity may be exaggerated, there are real risks: using ChatGPT (again taken as a paradigm of the more advanced models) can lower the effort and difficulty of creative writing, but it can also reduce enjoyment of the task and decrease the perceived value of creativity (Mei et al., 2025). Furthermore, competition with AI has been shown to risk slightly reducing human creativity (Bohren et al., 2024). These findings indicate that even as LLMs enhance human performance, they may also turn creative work into something less valued, less enjoyable and more uniform.

Recap, Implications and Open Questions

We have seen how, on one hand, empirical evidence suggests that the originality scores of LLMs such as GPT-4 can exceed those of humans (Guzik et al., 2023; Bohren et al., 2024), while, on the other hand, human evaluators often rate AI-generated works as less creative when they believe they are ‘machine-made’. This perception arises mainly from the belief that such works involve less effort and lack authenticity (Magni et al., 2024; Hattori et al., 2024; Messer, 2024). This paradox is central to current discussions about creativity, style and the possible ‘flattening effects’ of LLMs.

The topic matters greatly, as the rise of LLMs such as ChatGPT has major implications for both academia and industry. In academic settings, Mei et al. (2025) point out that AI can improve creativity and writing performance, particularly for non-native English speakers, by reducing language barriers. However, they warn that relying on AI may decrease students’ enjoyment of creative tasks and the value they place on writing skills. To counterbalance this, educators need to promote deeper cognitive engagement so that learning remains intellectually fulfilling (Mei et al., 2025). When generative AI is employed in creative fields, highlighting human involvement, fostering intrinsic motivation and emphasizing effort can help restore perceptions of authenticity and value (Messer, 2024; Magni et al., 2024). Looking ahead, current research points to several unresolved questions. Long-term studies are needed to explore how AI assistance affects creative skills over time (Mei et al., 2025). The subtleties of effort and authenticity require further investigation, along with how AI’s ability to represent or mimic humans influences evaluations (Magni et al., 2024). Expanding research to include professionals, underrepresented groups and diverse cultural contexts will also be key (Bohren et al., 2024; Mei et al., 2025).

In the end, the question might be less about what AI can do and more about how we decide to live with, learn from and reshape ourselves around it.

References:

Acar, S., Burnett, C., & Cabra, J. F. (2017). Ingredients of creativity: Originality and more. Creativity Research Journal, 29(2), 133–144.

Barron, F. (1955). The disposition toward originality. The Journal of Abnormal and Social Psychology, 51(3), 478–485.

Bohren, N., Hakimov, R., & Lalive, R. (2024). Creative and strategic capabilities of generative AI: Evidence from large-scale experiments. IZA – Institute of Labor Economics.

Bruner, J. S. (1962). The conditions of creativity. In Contemporary Approaches to Creative Thinking (Symposium, University of Colorado). Atherton Press.

Diedrich, J., Benedek, M., Jauk, E., & Neubauer, A. C. (2015). Are creative ideas novel and useful? Psychology of Aesthetics, Creativity, and the Arts, 9(1), 35–40.

Guzik, E. E., Byrge, C., & Gilde, C. (2023). The originality of machines: AI takes the Torrance Test. Journal of Creativity, 33(3), 100065.

Hattori, E. A., Yamakawa, M., & Miwa, K. (2024). Human bias in evaluating AI product creativity. Journal of Creativity, 34(2), 100087.

Magni, F., Park, J., & Chao, M. M. (2024). Humans as creativity gatekeepers: Are we biased against AI creativity? Journal of Business and Psychology, 39(3), 643–656.

Mei, P., Brewis, D. N., Nwaiwu, F., Sumanathilaka, D., Alva-Manchego, F., & Demaree-Cotton, J. (2025). If ChatGPT can do it, where is my creativity? Generative AI boosts performance but diminishes experience in creative writing. Computers in Human Behavior: Artificial Humans, 4, 100140.

Messer, U. (2024). Co-creating art with generative artificial intelligence: Implications for artworks and artists. Computers in Human Behavior: Artificial Humans, 2(1), 100056.

Simonton, D. K. (2012). Taking the US Patent Office criteria seriously: A quantitative three-criterion creativity definition and its implications. Creativity Research Journal, 24(2–3), 97–106.

Stein, M. I. (1953). Creativity and culture. The Journal of Psychology, 36(2), 311–322.

Further reading/watching/listening:

Books & Articles:

Bridle, J. (2022). Ways of being: Animals, plants, machines: The search for a planetary intelligence. Penguin UK.

Damaševičius, R. (2025). Disruptive creativity with generative AI: Case studies from science, technology and education. Cambridge Scholars Publishing.

Du Sautoy, M. (2020). The creativity code: Art and innovation in the age of AI. Belknap Press.

Franceschelli, G., & Musolesi, M. (2024). On the creativity of large language models. AI & Society, 1–11.

Videos & Podcasts:

Artificial Creativity – Aalto LASER Talk
https://www.youtube.com/watch?v=9IU33Iquhf4

AI & Creativity | Dialogues on Technology and Society | Ep. 9 | will.i.am and James Manyika
https://www.youtube.com/watch?v=Eh_ksvdoUt0

Image Attribution

Generated by: Midjourney

Date: 22/08/2025

Prompt: “A human artist and a humanoid robot sitting face to face. The human paints the robot with expressive, vibrant, glowing colours. The background is filled with precise, geometric, polygonal patterns. But as the human paints the robot, the colours appear on his own face, reflecting the interchange between humans and technology. The different styles and colours merge seamlessly where they meet, symbolizing the blending of human creativity and artificial intelligence. Futuristic, surreal, highly detailed, polygonal textures, cinematic lighting, digital art style.”
