Photos play a prominent role on Facebook and, until now, visually impaired users have missed out on a lot of updates from their friends. Facebook’s engineers have incorporated artificial intelligence in an attempt to describe these pictures to blind or partially blind users.
Facebook is calling the system “automatic alternative text” and it’s based on a neural network, a type of software loosely modeled on the human brain that draws on vast, complex databases.
The AI software doesn’t actually “see” the picture, but it can compare the objects in it with its vast internal database of similar photos and make an educated guess about what’s being shown. Part of the challenge, Facebook says, is in getting computers to recognize what’s most important in an image, whether that’s the people, the background or the “action.”
For each image, the AI system returns a confidence score indicating how sure it is that it can identify what’s in the picture. If this is above 80 percent, an automatically generated caption appears. According to the engineers behind the system, that target is already being hit for half of all the pictures on the social network, and the underlying technology is getting better all the time.
When objects and people have been identified, Facebook’s software constructs a sentence to describe the picture. If there’s some doubt about the picture then the sentence starts with “image may contain” to express that uncertainty.
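To make the thresholding and sentence-building steps concrete, here is a minimal Python sketch. It is not Facebook’s code: the function names, the per-tag score dictionary and the exact caption wording are all assumptions made for illustration. Only the 80 percent cutoff and the “image may contain” opening come from the description above.

```python
from typing import Dict, List, Optional

# The 80 percent cutoff mentioned in the article, expressed as a fraction.
CONFIDENCE_THRESHOLD = 0.80


def confident_tags(scores: Dict[str, float],
                   threshold: float = CONFIDENCE_THRESHOLD) -> List[str]:
    """Return the tags scoring above the threshold, most confident first.

    `scores` is a hypothetical mapping of tag -> confidence in [0, 1].
    """
    kept = [(tag, score) for tag, score in scores.items() if score > threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in kept]


def build_caption(scores: Dict[str, float]) -> Optional[str]:
    """Build a hedged caption, or return None if nothing clears the cutoff."""
    tags = confident_tags(scores)
    if not tags:
        return None  # no caption is better than a wrong one
    if len(tags) == 1:
        listing = tags[0]
    else:
        listing = ", ".join(tags[:-1]) + " and " + tags[-1]
    # Assumed wording: always hedge, since the tags are educated guesses.
    return f"Image may contain: {listing}."


if __name__ == "__main__":
    example = {"dog": 0.93, "tree": 0.85, "car": 0.40}
    print(build_caption(example))
```

In this sketch a low-scoring guess such as “car” is simply dropped rather than read aloud, which mirrors the trade-off the engineers describe: fewer captions, but ones a screen reader user can reasonably trust.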
The feature is live now in the Facebook iOS app, as long as your language is set to English. Facebook says it hopes to roll out the service to more platforms, languages and markets in the near future. It works with any screen reader software: on iOS, for example, you can enable it via the VoiceOver tool in the Accessibility section of Settings (under General).
Twitter has also started experimenting with a similar feature, though in this instance captions are added manually. Users on iOS and Android are being encouraged to add their own alt text captions for the benefit of the visually impaired. Letting humans do the work means more accuracy in the description, but it does depend on people putting in the time and effort to explain what they’re posting.