Metaphor has been a much-debated trope ever since Aristotle and Cicero. Its study received a major boost with the publication of Andrew Ortony’s Metaphor and Thought (1979) and George Lakoff and Mark Johnson’s Metaphors We Live By (1980). Lakoff and Johnson claim that we not only use metaphor in language but actually think metaphorically. For instance, the expressions “she attacked me during the discussion,” “when he responded to me, his defence was weak,” and “Obama won the debate” all reveal the underlying conceptual metaphor ARGUMENT IS WAR, in which “argument” is the metaphor’s “target” and “war” its “source.” For recent overviews of current thinking about metaphor, see Gibbs (2008) and Kövecses (2010).
Unsurprisingly, given the claim that metaphor is primarily “a matter of thought and action, and only derivatively a matter of language” (Lakoff and Johnson 1980: 153), scholars gradually developed an interest in non-verbal and multimodal manifestations of metaphor. So far, two strands can be distinguished. The first focuses on the contribution of gestures, in combination with spoken language, to metaphorical expression (see e.g., Cienki and Müller 2008, Müller 2008). The second strand, which is discussed in more detail here, concerns pictorial (or: visual) metaphor and multimodal metaphor involving pictures (see e.g., Forceville 1996, Forceville and Urios-Aparisi 2009). To discuss multimodal metaphor, it is necessary to decide what counts as a “mode.” This is a thorny issue. For practical reasons Forceville refrains from defining “mode” and simply postulates the following modes: spoken language, written language, visuals, gestures, music, sound, smell, taste, touch (see Forceville 2006 for more discussion). Multimodal metaphors, then, are “metaphors whose target and source are each represented exclusively or predominantly in different modes” (Forceville 2006: 384), while monomodal metaphors are “metaphors whose target and source are exclusively or predominantly rendered in one mode” (Forceville 2006: 383).
Forceville (1996) distinguished three types of pictorial metaphor, which he nowadays labels “contextual metaphor,” “hybrid metaphor,” and “simile,” depending on whether the visual context metaphorizes an object (contextual metaphor), whether target and source are physically conflated (hybrid metaphor), or whether they are juxtaposed (simile); see figures 1-3. The fourth type introduced in Forceville (1996), “verbo-pictorial metaphor” (see figure 4), was later recognized as a subtype of multimodal metaphor. Moreover, another type was introduced for three-dimensional metaphorical objects (see Van Rompay 2005, Cila 2013). This type has been labelled both “integrated metaphor” (by Forceville) and “product metaphor” (by Cila, see figure 5).
Whereas pictorial metaphors – which in pure form are specimens of the monomodal variety – now receive a fair amount of scholarly attention, research on multimodal metaphor is still in its infancy. In principle, any combination of modes can give rise to multimodal metaphor, but in practice, only the following have been regularly examined: visuals + written language; visuals + sound/music; spoken language + gestures (see Forceville and Urios-Aparisi 2009 for leads).
An important spin-off of Lakoff and Johnson’s conceptual metaphor theory (CMT) is Fauconnier and Turner’s “Blending Theory” (also called “Conceptual Integration Theory”; see Fauconnier and Turner 2002). The basic idea is that there are many hybrids that (1) are not metaphorical and (2) have more than two “input spaces” (whereas metaphors always have exactly two: target and source). This theory is now also branching out into the pictorial and multimodal realm (see Forceville 2013 for an example).
Here are some of the areas in the subdiscipline of pictorial and multimodal metaphor that deserve to be investigated more systematically. Which permutations of modes allow for multimodal metaphors? What role do (sub)cultural knowledge and ideology play in their construal and interpretation? How do multimodal metaphors differ per genre (advertising, art, manual ...) and medium (static picture, film, comics, Facebook, Instagram ...)? Which other tropes (metonymy, symbol, hyperbole, irony, oxymoron ...) have visual and multimodal counterparts, and how can they be distinguished from one another? What are the differences between “creative metaphor” (Black 1979) and “structural metaphor” (Lakoff and Johnson 1980)? For more discussion of all of these issues, and for references to other sources, please see Forceville (in prep.) and my online Course on Pictorial and Multimodal Metaphor.