Category Archives: Object Recognition

Fetch’s Anamorphic Projection

This may be the nerdiest blog post of the year, because it deals with a really cool detail in a video game called Infamous: Second Son. In one mission, you search for clues about a rogue character named Fetch, who has special powers, because you want to find her and absorb those powers for your own use. Her power is the ability to manipulate neon energy, which she uses as a weapon and as a way to move quickly through the city. The clue search takes you to a hideout inside an advertisement sign, where you take various photos that reveal details of Fetch’s personal life. She accidentally killed her brother Brent over drug use and wants to avenge him by killing drug dealers, so one of the clues is a neon image of his face that she created. However, this image is not an ordinary picture but an anamorphic projection.

An accidental viewpoint is a viewing position that produces some regularity in the visual image that is not present in the world. Because of the way the scene lines up on the retina from that one position, the image appears in a way that it does not at any other angle. This comes back to the theme that an infinite number of arrangements in space can produce the same image on our retina, and therefore the same percept in our brain. Now, what is seen in Infamous may not be a true anamorphic projection, since the view is seen from above and does not necessarily use linear perspective as a monocular cue. Still, this is an accidental viewpoint, and instead of the image looking normal at all angles but one, it looks strange at all angles but one. The picture is split into 3 separate pieces in different parts of the scene. One is located right where the person is standing, and the other two are located down below on a rooftop and on the street. When the main character takes the picture for the clue, he needs to move to a position where the 3 pieces align perfectly to see the normal image.
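
The idea that many different arrangements in space can produce the same retinal image can be sketched with a simple pinhole projection. This is only an illustration under stated assumptions: the `project` function, the focal length, and the coordinates are hypothetical and say nothing about how the game actually renders the scene.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z) onto an image plane.

    The eye sits at the origin looking along +z; z must be positive.
    """
    x, y, z = point
    return (focal_length * x / z, focal_length * y / z)


# Hypothetical piece of the picture close to the viewer (e.g. on the sign).
near_piece = (1.0, 2.0, 4.0)
# The same point pushed three times farther along the same line of sight
# (e.g. drawn much larger on a rooftop below).
far_piece = tuple(3.0 * c for c in near_piece)

# Both points land on the same spot in the image, so from this one
# viewpoint the eye cannot tell them apart.
print(project(near_piece), project(far_piece))
```

Moving the eye away from the origin breaks the alignment, which is why the separate pieces only form a face from a single spot.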

It was awesome to see this the first time I played, because I immediately recognized it as an accidental viewpoint relating to the ideas we talked about in class. It was very interesting that video game developers would use an idea like this in a video game, although it is probably much easier to produce art like this in a video game than in real life.

Images from YouTube Video: https://www.youtube.com/watch?v=BHTTw36VQOo&list=PLs1-UdHIwbo6msTIJm0OtY_GKjt5r92AK

 

Train Travel and Motion Parallax

When I was in high school I used to travel to other states, mostly by train. Every time I was on the train, I listened to music and watched the landscape outside. The view was usually countryside, where most of the land was undeveloped or farmland with small houses. There were usually mountains in the distance and sometimes domestic animals such as cows and horses. I definitely noticed that as the train moved faster, objects moved past and disappeared faster. However, I did not think about why and how objects outside move faster as the train’s speed goes up. Once I learned about motion parallax, I started to have a different point of view when I am on the train looking outside. I also started to observe how motion parallax actually works in our daily life.

Motion parallax is one of many monocular depth cues that help people judge the distance between objects. It provides information about differences in distance and motion, which contributes to depth perception. According to our lecture notes, images closer to the observer appear to move faster across the visual field than images farther away. This means that objects that appear to move faster are closer to the observer, but there is more to it than what is in the lecture notes. Motion parallax is similar to relative height, which is also a monocular depth cue. According to the relative height cue, when objects are below the horizon, objects higher in the visual field appear to be farther away, and when objects are above the horizon, objects lower in the visual field appear to be farther away. Motion parallax is similar to relative height because, when no obstacles block the more distant view, objects above the horizon appear to move in the opposite direction from objects below the horizon. Not only that, objects above the horizon appear to move in the same direction as the observer, while objects lower in the visual field, which are closer to the observer, move faster than those higher in the visual field.

Source for photo: http://www.rhsmpsychology.com/Handouts/monocular_cues_IV.htm
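
The lecture-note claim that closer objects sweep across the visual field faster can be sketched with a little geometry. The snippet below is a rough illustration, not part of the course material: it assumes a stationary object at the moment it is directly to the side of the observer, where its angular speed is simply the train's speed divided by the object's distance. The function name, the train speed, and the distances are all made-up values.

```python
import math


def angular_speed_deg_per_s(train_speed_m_s, distance_m):
    """Angular speed of a stationary object directly to the side of the observer.

    Simplifying assumption: the object is at its closest point to the track,
    so the angular speed is v / d radians per second.
    """
    return math.degrees(train_speed_m_s / distance_m)


train_speed = 30.0  # metres per second (hypothetical train)

# Hypothetical objects at increasing distances from the track.
for label, distance in [("fence post", 10.0), ("farmhouse", 200.0), ("mountain", 5000.0)]:
    speed = angular_speed_deg_per_s(train_speed, distance)
    print(f"{label:>10}: {speed:8.3f} degrees per second")
```

Under these assumptions the nearby fence post races past while the mountain barely moves, and doubling the train's speed doubles every value, which matches the experience of things disappearing faster as the train speeds up.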

Surprise Color Blindness

After learning about the different types of color blindness and what causes them, I discovered that one of my friends was color blind. During the weekend he and I were talking about how people see different colors, and as I was telling him how the system works, we began analyzing different objects in the room and stating what color we thought they were. I initially made the mistake of asking him what colors he could not see, to which he quickly responded, “How can I tell you which colors I can’t see?” After he said that, I began to think about how the color opponent theory works and how I could apply it to tell him what colors I was seeing. At one point we were arguing about the color of a candy wrapper that was sitting on his table. He looked at it and said it was green. I then asked him what type of green it was. He was baffled by the question. He could not tell the difference between different types of green because he only knew what green was based on what he had been told throughout his life. The green color we were looking at was actually much closer to yellow than green. Just to confirm his inability to differentiate between shades of green, I asked him how he would mix paints to make the color we were looking at. He told me that he would have no idea how to make that color if he tried. This made me wonder which color was missing in his perception. Since he could see that the object was green but could also tell there was yellow, I concluded that he was unable to see blue. The most interesting part of the experience was that he knew the color was green but he didn’t know why. In his eyes the color was clearly different from blue or yellow, but he was unable to distinguish between different shades of green. Although I am not a doctor and have no training beyond what I have learned in this class, I told my friend Kevin that he might in fact be a tritanope.
I explained to him that a tritanope is someone who lacks S-cones. I also explained what the other types of color blindness were in case he disagreed with my thoughts. Because he is a stubborn person he disagreed instantly, so we proceeded to do online color blindness tests. He worked through the tests, but when we got to the one that shows a 42 made of red dots on a background of different green dots, he was unable to see the number in the middle. After he failed this test I realized that he was not a tritanope. According to what we found in our internet search, people who are unable to see the red 42 on the green background are deuteranopes. This meant that his deficient cones were in fact the M-cones. This conclusion was consistent with his inability to properly perceive bright greens mixed with yellow. Being color blind has never affected my friend’s life, other than his inability to dress himself and match colors (which truly doesn’t matter). I think it is rather interesting how he has been able to adapt his whole life to the colors he has been taught without ever noticing that he could not see certain colors at all.

Drawing Geons

One of the concepts we’ve learned about that relates to a lot of my experiences is the concept of geons. Geons are part of a theory about how we recognize objects. The Recognition by Components theory, developed by Biederman in 1987, builds on structural description theory and says that all objects are made up of a set of 36 three-dimensional shapes. These shapes are called geometric ions, or geons (or primitives). The idea that all objects are made up of geons is very similar to the basic process of learning how to draw. I started drawing when I was really young. Like most kids I started doodling as soon as I was big enough to hold a crayon. But the hobby stuck with me and developed over the years. I was self-taught for almost my entire life and only took an actual art class when I entered high school. It was difficult at first to unlearn the ways I was used to drawing and relearn some of the basics of sketching. Some techniques didn’t help improve my art at all, so I didn’t use them as much. But the one important skill I learned that I’ve taken with me throughout the rest of my life was doing the initial sketch using what are, essentially, geons. Visually, everything, including human figures, is composed of basic two- and three-dimensional shapes like squares, circles, triangles, and cylinders. Once you can visualize how this works, drawing becomes much easier. Take a human figure: the head is a circle, the shoulders and all the joints are circles, the arms and legs are rectangles or cylinders, the torso is an upside-down triangle, the pelvic bone is an upright triangle, and the feet and hands are ovals with thin rectangles protruding from them. Although a theory about how we recognize objects is obviously different from a skill used for drawing, the similarities made it easier for me to understand Recognition by Components theory because, in a way, I’d been practicing a rudimentary version of it for years.

Object Recognition

I have a second cousin who is three, going on four years old. Since her vocabulary is limited, she sometimes has to rely on superordinate-level categories and the feature theory of object recognition to name objects. Superordinate-level categorization is when someone names an object they see with a more general term. For instance, instead of calling an object by its specific name, someone identifies it by the category it belongs to. The feature theory of object recognition holds that people recognize objects by remembering their features and parts, and when they see the same features again in another object, they assume the two objects are the same thing.
Toddlers and children tend to do this when they do not yet have enough words and experience stored in their heads. When my second cousin sees something she has never seen before, she sometimes uses feature theory and gives the object the wrong name. For instance, when she first saw a tiger, she called it a big cat because tigers and cats have basically the same features, except tigers are bigger. Although tigers do belong to the cat family, she called it a cat only because it looked like a bigger cat to her.
Another example is when she was drawing a picture on newspaper once and asked me to hand her the “pens.” She was pointing to her crayons at the time. This is not only an example of feature theory, because crayons and pens do look alike, but also an example of structural description theory: she called the crayons pens because she knew that they not only have features similar to pens but also serve a similar function.
Both examples, misnaming the tiger and the crayons, are also examples of superordinate-level categorization. Because my second cousin names tigers by their general category, cats, and names crayons by what she treats as their general category, pens, she is using the concept of superordinate-level object recognition.

Chromatic Adaptation

Adaptation is the ability of our perceptual system to adjust to the surroundings.  Our response to sustained stimulation decreases, indicating that our perceptual system is dynamic and depends on change to elicit a reaction.  Therefore, the longer we are surrounded by the same stimulus, the less we notice, and it essentially becomes the new normal.

Our visual system is very adept at two particular types of adaptation:  light and chromatic.  Light adaptation allows us to adjust to changes in the level of illumination in our surroundings.  For example, two surfaces whose illuminance differs by a factor of ten would look drastically different when viewed side by side.  However, if you were taken to a classroom with only electric lighting and then to a different classroom with both electric lighting and daylighting, you would probably classify both spaces as having suitable lighting conditions, even though the illuminance in the daylit room could be more than ten times greater in certain areas than in the other classroom.  This is also a demonstration of Stevens’ power law, with brightness being a stimulus with an “n” value less than one, so the perceived increase is less than the actual increase.  The other type of visual adaptation is chromatic adaptation, which involves our perception of color.  Every light source, including the sun, has a distinct spectral power distribution that defines the way colors appear when illuminated by that source.  Chromatic adaptation is the ability of our visual system to let us see objects and colors similarly under various light sources, and this is the concept behind my real-world example.
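
To put a rough number on the power law mentioned above, here is a small sketch. It uses the general form of the law (perceived magnitude proportional to stimulus intensity raised to the power n) and an exponent of 0.33 for brightness, which is a commonly quoted approximate value and an assumption here, not a figure from my lighting class. The function name is made up for illustration.

```python
def perceived_ratio(physical_ratio, exponent):
    """Ratio of perceived magnitudes under a power law S = k * I**n.

    The constant k cancels out when two stimuli are compared,
    leaving (I2 / I1) ** n.
    """
    return physical_ratio ** exponent


# Assumed exponent for brightness; the exact value depends on viewing conditions.
BRIGHTNESS_EXPONENT = 0.33

# A tenfold increase in illuminance, as in the two-classroom example.
ratio = perceived_ratio(10.0, BRIGHTNESS_EXPONENT)
print(f"10x the light is perceived as about {ratio:.1f}x as bright")
```

With an exponent below one, a tenfold physical difference shrinks to roughly a doubling in perceived brightness, which helps explain why both classrooms can seem suitably lit.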

To demonstrate the visual system’s ability to adapt, my lighting professor took my class to the lighting laboratory in the engineering units.  The room is equipped with different spaces containing various light sources for experimentation.  We entered the room and our professor had a discussion with us, telling us to look around at each other and our clothing.  We then moved to one of the experimental areas that was dimly lit and shielded from the primary room by a black curtain, and we learned about the type of work that was being conducted there.  Once finished, we returned to the first room.  Upon returning to the space, I looked around once again and noticed nothing out of the ordinary.  I looked at my skin, at my clothes, and at the people around me.  Only because we had been instructed to look around initially did I notice that the color red that one of my classmates was wearing looked slightly different than before.

Our professor broke the news to us that we were now sitting under completely different light sources than the ones we saw previously.  The TA had switched to an alternative set of luminaires after we had left the space.  While we had been observing the other space under dim lighting, our visual system had adapted to those lighting conditions.  Therefore, upon returning to the primary space, we once again adapted to the new, brighter environment.  At no point had we been exposed to the two lighting conditions in the primary space in succession.  The most interesting part of the experiment was when the lights that were originally illuminating the room were switched back on.  The differences in color rendition were incredible.  Everything was so much more vibrant under the original stimulus, and as the professor switched back and forth, my skin looked dull, almost sickly under the second stimulus.  It was hard to believe that the conditions under the second stimulus had appeared normal to me.  Yet our eyes had quickly adapted to the new lighting conditions, and through chromatic adaptation, the colors and objects around us became familiar and appeared normal.  In a side by side comparison, however, the two lighting environments were drastically different.  It was a great example of adaptation to the surroundings by our perceptual system and how expectations play a role in defining our sensory experiences.

Where at THON is Jesse?

Every February, Penn State’s Dance Marathon, or “THON,” packs thousands of people into the Bryce Jordan Center to dance for children with pediatric cancer, and this year I was on a committee called DAR, or Donor Alumni Relations. Our committee mostly gives tours, and we often have a lot of down time to simply go into the stands and watch THON casually together. Each tour crosses the floor twice, once at the beginning of the tour and once near the end. People on tours can easily go astray in the crowd, and it can be very hard to locate lost donors, parents, and, in our case, committee members. Unfortunately, one girl was lost (we will call her “Jesse” to protect her privacy), and one tour guide mentioned that she hadn’t returned to the concourse and that no one had been able to find her on the floor.

During downtime, many of the committee members were in the upper bowl of the BJC watching THON, and we were trying to locate her if possible. One of my friends called our search “Where at THON is Jesse” in honor of “Where’s Waldo?”. It was very fitting, because in the sea of hundreds of dancers, moralers, and pass list holders we were looking for one specific person. Be that as it may, the moment my friend mentioned trying to find her, I saw her in the upper right corner of the floor standing with what seemed to be two other dancers. I told everyone the approximate location where I saw her with some context clues, and they immediately began to see her as well. How is it that humans can pick out a person who is so far away and mixed into a large crowd?

This phenomenon is weird, because in reality the brain only sees detail from a very small portion of the retina called the fovea. The retina is located in the back of the eye and contains photoreceptors that convert light into electrical signals that are sent to the brain. You see less and less detail the farther an object’s image falls from the fovea. If you extend your arm to full length and look at the width of your thumb, that is roughly the portion of your visual field that falls on your fovea. This is the only portion of your surroundings that you can see in detail at any particular time. The human eye moves constantly, which is why it appears we have full detail of everything all the time. Additionally, there are theories that humans recognize objects by a list of features and shapes stored in memory. Jesse was wearing a blue THON 2014 DAR committee shirt, she has brown skin, and she has very long black hair. Using these features, and the fact that many people on the floor were far enough away that their whole image was about the size of our thumb and so fell entirely on our fovea, we were actually able to recognize people very far away in the large crowd of dancers. Later, one of our captains confirmed that it was indeed Jesse on the floor whom we had spotted.
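
The thumb comparison can be checked with a quick visual angle calculation. This is only a back-of-the-envelope sketch: the thumb width, arm length, Jesse's height, and the distance from the upper bowl to the floor are all guesses I made up for illustration, and the function name is hypothetical.

```python
import math


def visual_angle_deg(size_m, distance_m):
    """Visual angle (in degrees) subtended by an object of a given size
    viewed straight on from a given distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))


# Guessed values: a 2 cm wide thumb held 60 cm from the eye.
thumb_angle = visual_angle_deg(0.02, 0.60)

# Guessed values: a 1.7 m tall person seen from 60 m away.
person_angle = visual_angle_deg(1.7, 60.0)

print(f"thumb at arm's length: {thumb_angle:.1f} degrees")
print(f"person on the floor:   {person_angle:.1f} degrees")
```

With these guesses the thumb covers just under 2 degrees and the distant person a little less, so the whole person would fit inside the patch of the visual field that the thumb covers.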

“Eye” Spy Storage Wars

Where’s Waldo? and I Spy are two book series that provide hours of entertainment to children as they attempt to find hidden objects and people. To adults, however, these books do not have enough action or excitement. Several years ago, I began watching a television show that interested me because it contained components similar to I Spy, yet with the excitement and drama of a reality TV show. It appears that the adult equivalent of Where’s Waldo? and I Spy is the show Storage Wars, in which people view storage units from the outside and then bid on the contents. The bidders must identify hidden objects that are blocked by other items in order to assess the value of the unit as accurately as possible before placing a bid. It is important to understand the role of middle vision and how it makes the premise of Storage Wars possible for the human brain.

Middle vision is responsible for processing and perceiving an object as a unique and entire item. Understanding the broader concept of object recognition helps clarify the function of middle vision, because middle vision is fundamental to perceiving an “unknown” item. Object recognition allows people to identify something they have never personally seen before, recognize partially occluded items, and recognize an object from any visual perspective, an ability called viewpoint invariance. Using the tools of middle vision and basic feature extraction, object recognition becomes an automatic and rapid process.

Some of the main components of middle vision include the detection of edges and contours, and grouping. Taken together, these characteristics exemplify the Gestalt psychology principle that “the whole is greater than the sum of its parts.” For example, the image below is a unit from Storage Wars and contains various items that are not positioned in their normal context. By applying the Gestalt grouping rules of similarity and parallelism, it becomes apparent that the five oblong pieces of wood protruding forward on the right side of the picture are all legs connected to a desk. Even though the sixth leg is cut off in the picture and this is an unconventional viewpoint of a desk, a person should still be able to identify it with the help of middle vision and personal experience. This is just one of the many instances in which middle vision allows people to differentiate items from one another even if the whole object is not in view.

www.aetv.com

After spending several minutes studying this picture, you will be surprised at the number of objects you are able to identify without previously seeing this image or any of the exact items. Although many of the show’s treasures are hidden deep within the vast pile of junk, there are occasions when a bidder’s middle vision manages to identify the “wow factor” from a six-by-six inch exposure of the object. However, that really makes you wonder whether the object was truly spotted or if that is just reality TV at its finest.

Middle Vision and the Panda

            The example of perception that I have noticed is the World Wildlife Fund (WWF) symbol. To many people, it simply looks like a panda. When I first saw this image on television, I also immediately perceived it as a panda. However, many people do not realize that this is an example of an illusion that they see in their everyday lives. Since the panda’s black patches mark out most of its shape, the missing lines are not needed to complete the outline of the panda. The visual system automatically fills in the rest of the panda, even though there is no black line at the top of the panda, only blank space.

            This is an example of middle vision, which is a stage of visual processing that comes after basic feature extraction and before object recognition and scene understanding. With the panda, the main black features have already been extracted. However, middle vision is required to determine what the complete object is, since the panda is not completely drawn. Middle vision also involves the perception of edges and surfaces. This is what is happening with the panda. The panda is seen as a whole object because we perceive edges for it, even where none are drawn.

            Middle vision also helps to determine which regions of an image should be grouped together into objects. For example, the black parts of the panda are immediately grouped together. These black parts then help to determine which other regions should be grouped with them to make the entire symbol. The white parts of the panda are actually just blank space, not drawn parts of the panda. Middle vision plays a part in our perception of that blank space as part of the panda, even though it technically is not.

A Flash-Bang Phenomenon

Waking up in the morning, for many, involves turning on a light after turning off an alarm.  Although some people learn to navigate their way to the shower in the dark, most of us have to deal with a dramatic change in luminance, opening our eyes and immediately being overwhelmed by light.  Whether this occurs because your parents are trying to get you up and moving, or because you want to see what you are doing in the bathroom, similar biological processes seem to be at work.  Throughout the night while we sleep, most of us with our eyelids sealed shut, we adapt to a world of darkness.  When you experience such a sudden and drastic change in luminance, as when someone shines a flashlight in your eyes, you begin to see blobs of undefined figures circling around in your visual field.  When this happens to me, I usually see what I would compare to a slightly transparent blue amoeba.  Regardless, my vision becomes impaired to the extent that it is much harder to make out what is directly in front of me.  I do not know exactly why this happens, but I believe it is primarily due to the bleaching of photopigments.

Retinal ganglion cells respond to changes in light, not the overall amount of light covering their receptive field.  The change from dark to light is what causes visual impairment in the prior situations, not how bright either light source is.  Whenever someone shines a light in your eye, a lot of photopigment is used up as the eye tries to adapt to the change in luminance.  This leaves less pigment available to process whatever other information is in one’s environment.  Furthermore, I believe the more of one’s visual field is encompassed by a sudden burst of light, the more neurons fire, because light is coming at them from many more orientations/angles than if such a change occurred farther away.  The more cells that fire, the more information striate cortex neurons have to filter out.  Perhaps when there is too much information to process efficiently, this filtering sequence is slowed and disrupts the feed forward process.  If so, would one’s mind start to group objects/stimuli that it normally would not combine?  In essence, the more photopigment is depleted, the more acuity is impaired, and the less detail an individual can make out.

Although cone photoreceptors regenerate more quickly than rods, they are also exhausted at a faster rate.  Since so many cones are used up when one perceives a drastic change in luminance, (I think) the whole visual processing system falls into a chain reaction of information overload.  The parvocellular pathway, which deals with details of stationary objects, simply tries to pass on too much information.  Because there are now fewer photopigments available to process the current environment, and the filtering process in the striate cortex is busy with information it has just received, middle vision processes start to receive jumbled information.  My guess is that this leads us to perceive similarities and differences among stimuli in our environment when such relationships do not exist.  Furthermore, the feed forward process allows our brain to draw conclusions about what it perceives without needing a later processing stage to communicate with earlier ones, which I believe can lead to uncontrollable perceptual errors.  In other words, it is an error that causes us to see bright blue blobs where they don’t exist.  Perhaps law enforcement personnel use flash-bang grenades for the same reason; they have found a practical use for our visual system’s sensitivity to abrupt changes in luminance.