That's green, well maybe more blueish. You mean Grue?

Terrence

November 08, 2012

During my senior year at Savannah College of Art and Design, I took Language, Culture and Society with Désiré Houngues. Two cultural insights about language stuck with me. The first was that in some societies men and women speak with entirely different vocabularies but still communicate verbally with one another. The second was that some languages only have two words for color, white and black (light and dark); if a language includes a third color, it is always red. This led me to research by Brent Berlin, an anthropologist, and Paul Kay, a linguist. They made the first hypothesis that color terms enter a language in a certain order. Later, I came across the World Color Survey (WCS), which was established in an effort to continue research into Berlin and Kay's hypothesis. The WCS makes its data available to the public, and I found that this was exactly what I needed to help answer my many questions. The result of the WCS data exploration is below, where about 800,000 individual color chips are grouped by the terms used to describe them.

The WCS collected data from 2696 native speakers, representing 110 languages, asking each of them to identify 330 colors. With Processing, I wrote code to read the survey data and explore different ways to categorize and group it. These sketches developed into the final image, where results for each language are shown as a series of blocks that extend from the center, in order of the most frequently used term to the least. For instance, terms used for a greenish-blue color are most prevalent, followed by terms for what we might perceive as red, black, white, etc. The speakers of one language used only three color terms to describe the color spectrum, while others used over sixty. Organizing the languages by geographic location highlights regional similarities in the number of unique color terms. Languages are also grouped by family within each geographic location.
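To make the ordering concrete, here is a minimal sketch in Python (not the original Processing code) of counting how often each term is used within a language and sorting from most to least frequent. The record layout, the language names, and the terms are made up for illustration; the real WCS files have their own format.

```python
from collections import Counter, defaultdict

# Hypothetical records: (language, speaker_id, chip_id, term).
# This shape is an assumption, not the actual WCS file layout.
responses = [
    ("LangA", 1, 1, "grue"),
    ("LangA", 1, 2, "grue"),
    ("LangA", 1, 3, "red"),
    ("LangA", 2, 1, "grue"),
    ("LangA", 2, 2, "red"),
    ("LangA", 2, 3, "red"),
    ("LangA", 2, 4, "grue"),
    ("LangB", 1, 1, "dark"),
    ("LangB", 1, 2, "light"),
    ("LangB", 1, 3, "dark"),
]


def terms_by_frequency(records):
    """Return {language: [(term, count), ...]}, most-used term first."""
    counts = defaultdict(Counter)
    for language, _speaker, _chip, term in records:
        counts[language][term] += 1
    # Counter.most_common() sorts by count, highest first.
    return {language: c.most_common() for language, c in counts.items()}


ranked = terms_by_frequency(responses)
for language, terms in ranked.items():
    print(language, terms)
```

Each language's list could then be drawn as a row of blocks, starting at the center with the first entry.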

Above is a group of languages from the Ivory Coast in Africa. They used as few as three color terms. As a result, the colors are ambiguous and could be identified in English as reddish orange or dark bluish green. It is a surprise to see the more pure blue in a language family and region that, for the most part, does not identify it.

This is a detail of some languages in the Oto-Manguean family, located in present day Mexico. They used over sixty color terms, many with recognizable names like 'cafe' (a coffee-colored brown) and 'rosa' (a reddish-pink). The color blocks are more saturated because they are grouped into many classifications, compared to the muddled color blocks for languages that have fewer terms.

The Chiquitano language is unique because it is the only language where the most-used term had a purple hue. The language is spoken in Bolivia mostly by children and young adults. Further below the purple block, Karajá is an example of a dialect (like those described in the introduction above) where men and women use different vocabulary. The contrast between the two languages is striking, because colors are identified in such a dissimilar way, even though they're both part of the same language family.

Where did this all start?

In 1969, Berlin and Kay published their book “Basic Color Terms: Their Universality and Evolution”. They made two hypotheses about basic color terms and the categorization of the color spectrum. First they claimed that “there is a restricted universal inventory of such categories”; second, that “a language adds basic color terms in a constrained order, interpreted as an evolutionary sequence.” Below is the diagram they created to categorize languages into six stages.

(From “Basic Color Terms: Their Universality and Evolution” by Berlin & Kay)

Berlin and Kay surveyed 20 native speakers representing 20 languages and asked them each to identify 330 colors. They administered a preliminary interview to establish a set of “basic color terms”. Below, the languages are sorted by number of terms used rather than by region and language family. It is useful to compare both visualizations because it shows how different these studies were in both size and results.

The debate between "relativism" and "universalism" considers how language and thought affect each other. The relativist Benjamin Lee Whorf stated that language terms are arbitrary, i.e. personal experiences shape our understanding of words, which in turn shapes our world view. Universalists suggest that languages share the same framework from which we derive cognitive understanding. Both these arguments might reveal why certain geographic locations and language families tend to perceive color in similar ways.

Berlin & Kay concluded that their universal inventory hypothesis was the more conclusive of the two, because all the languages they surveyed identified colors within their hypothesized categorization diagram. These findings attracted a lot of criticism, most significantly due to their small number of participants and limited range of languages. Participants identified only a small subset of color chips, possibly as a result of the preliminary interview. In addition, the syntax of languages like Japanese does not support the definition of a “basic color term.”

The World Color Survey was established to address these criticisms. They began collecting data from non-industrialized societies without writing systems. Surveying languages without writing systems eliminated the need to define basic color terms.

Color naming has been a useful tool for understanding the relativist and universalist views because it is ubiquitous. It has practical applications across languages because it can be used to identify food, objects, places, and even feelings, e.g. the blues. Cultures celebrate the world in many ways, but finding connections that might demonstrate how we see the world in similar ways is an exciting and worthwhile exploration.

Berlin and Kay's hypothesis suggested that every language must have a term for white and black. I took the lightest and darkest result from each language, piecing them together in the hopes of depicting variations of how white and black are perceived. Whether or not that's successful, I think this shows how Berlin & Kay's color term classifications are relative to how we name colors.
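One way to pick a "lightest" and "darkest" term per language is sketched below in Python. It assumes each term has already been reduced to a single averaged RGB color, and it ranks terms with a rough weighted brightness. The languages, terms, colors, and the brightness formula are all illustrative assumptions, not necessarily how the image above was built.

```python
# Hypothetical averaged colors per term, per language (0-255 RGB).
term_colors = {
    "LangA": {
        "term1": (230, 228, 220),
        "term2": (40, 38, 45),
        "term3": (170, 60, 50),
    },
    "LangB": {
        "term1": (250, 250, 245),
        "term2": (25, 30, 28),
    },
}


def brightness(rgb):
    """Rough perceived brightness: green counts most, blue least.

    The weights are the Rec. 709 luma coefficients, applied directly to
    8-bit values as an approximation.
    """
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def lightest_and_darkest(colors_by_term):
    """Return (lightest_term, darkest_term) for one language."""
    lightest = max(colors_by_term, key=lambda t: brightness(colors_by_term[t]))
    darkest = min(colors_by_term, key=lambda t: brightness(colors_by_term[t]))
    return lightest, darkest


extremes = {
    language: lightest_and_darkest(colors)
    for language, colors in term_colors.items()
}
print(extremes)
```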

I should also explain a little bit about how we see color. At this very moment, you are being exposed to a continuous (and therefore infinite) color spectrum, but our eyes are only able to discern millions (or perhaps billions) of unique colors. This is why screens need only display "true color" to be adequate for the human eye. True color or 24-bit is based on the RGB color model, which combines different amounts of red, green, and blue light. Each channel is 8-bit, which means 2 to the eighth power, or 256 values each. So 256 to the third power (red, green, blue) equals 16,777,216 colors. The trained eye of a photographer or photo retoucher may be able to perceive the difference in 36-bit color, which displays about 68.72 billion colors, still nowhere near infinity.
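The arithmetic is easy to check. The short Python sketch below reproduces the counts and shows how three 8-bit channels pack into one 24-bit number; the names are mine, chosen for illustration.

```python
BITS_PER_CHANNEL = 8
CHANNELS = 3  # red, green, blue

levels_per_channel = 2 ** BITS_PER_CHANNEL        # values per channel
true_color_count = levels_per_channel ** CHANNELS  # 24-bit color
deep_color_count = 2 ** 36                         # 36-bit color


def pack_rgb(r, g, b):
    """Pack three 8-bit channel values into a single 24-bit integer."""
    return (r << 16) | (g << 8) | b


print(f"{levels_per_channel:,} levels per channel")
print(f"{true_color_count:,} colors at 24 bits")
print(f"{deep_color_count:,} colors at 36 bits")
print(hex(pack_rgb(255, 255, 255)))
```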

Our eyes work somewhat similarly to the RGB model, as our eyes respond to red, green and blue. There are about 6.5 million cones in a human eye: 64% of them are “red,” 32% are “green,” and 2% are “blue”. Even though we have fewer blue cones, they are highly sensitive. Some women have been found to possess a fourth cone—a condition known as tetrachromacy—which allows for better color differentiation. It does not necessarily mean one gains perception of an additional color. When light collides with an object, specific cones activate and tell our brains what color we are seeing. For example, let's say light hits a berry. Some of that light is absorbed and some bounces off at a specific frequency or wavelength. We are able to see wavelengths between 400-700 nanometers. If the wavelength is 450nm, more of the blue cones in our eyes will activate, causing our brains to tell us that this is a blueberry. On the other hand, if the wavelength is closer to 700nm, this is likely a strawberry or raspberry.

It's always a mixture of the red, green, and blue cones that activate. You might think that if green and red cones activate together you would get brown, but RGB is an additive color space (red, green, and blue combine to make white). If you were to project a red, green, and blue spotlight, the intersection of all three would produce white, while each pair of lights would intersect to produce yellow, cyan, or magenta.
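Additive mixing can be sketched in a few lines of Python: overlapping lights add channel by channel, with each channel capped at the maximum value. This is a simplified model of idealized spotlights on 8-bit RGB values, not a simulation of the eye.

```python
RED = (255, 0, 0)
GREEN = (0, 255, 0)
BLUE = (0, 0, 255)


def add_light(*colors):
    """Additively mix RGB colors, clamping each channel at 255."""
    return tuple(min(255, sum(channel)) for channel in zip(*colors))


yellow = add_light(RED, GREEN)
cyan = add_light(GREEN, BLUE)
magenta = add_light(RED, BLUE)
white = add_light(RED, GREEN, BLUE)

print(yellow, cyan, magenta, white)
```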

From “Eye, Brain & Vision” by David H. Hubel
