Author: Kalev Leetaru / Source: Forbes
The intense media coverage this past week of the so-called “Facebook killer” once again drew attention to the horrific ways in which social media platforms can provide a global audience to people who wish to do themselves or others grievous harm. It also raises the question of whether, in the absence of such instant fame, at least some of these acts might have been prevented.
In the immediate aftermath of the Steve Stephens video, Facebook issued a statement condemning the video and noting that it had been removed from the platform, but only after it had garnered global attention.
Perhaps the greatest challenge to stopping violent imagery from finding an audience on social media is simply the sheer volume of material being posted every second. No army of human moderators could hope to manually review every uploaded image or video in a reasonable amount of time, and the rise of live streaming brings new time pressures, since a video may begin and end before a human moderator even knows it has aired. While one could argue over whether social media companies bear an ethical or moral responsibility to hire the staff needed to do a far better job of reviewing content, one could also argue that as public companies they owe a fiduciary duty to their shareholders to minimize their expenditures and take a calculated risk that a certain volume of violent imagery will be immortalized by their network.
Yet, automated technologies, while far from perfect, offer a powerful and compelling opportunity to both flag the most egregious content and prevent banned content from being reposted to the network in a giant game of whack-a-mole.
After processing more than a quarter billion global news images through Google’s Cloud Vision API last year, I’ve found that Google’s deep learning algorithms are extraordinarily adept at identifying violence in myriad contexts, spotting even situations that a typical human would likely miss without looking very carefully. From a person holding a gun to another’s face to blood pools on the pavement to any number of other situations, Google’s API has been able to recognize an incredible diversity of imagery that one might characterize as depicting violence of some fashion. In short, deep learning has reached a point where it is able to recognize many classes of “violence” simply by looking at a photograph and understanding the objects and activities it depicts, all within a fraction of a second and at a scale no human team could match.
Imagine taking such a tool and using it to filter every single image uploaded to Facebook in real time. If the algorithm flags an image as being potentially violent, the user would receive a warning message and would be asked to provide a textual explanation of why the image should be permitted, such as that the warning is an error or that the image legitimately depicts violence but that its publication is in the public good (such as documenting police violence). A simple linguistic AI algorithm would evaluate the description for detail and linguistic complexity and ask the user to provide additional detail as needed. The final report would then be provided to a human reviewer for a final decision.
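To make the proposed workflow concrete, here is a minimal sketch of the routing logic in Python. It is an illustration only, not a description of any system Facebook or Google actually runs. The violence score is assumed to come from some image classifier and to fall between 0 and 1; the threshold values, the function names, and the crude word-count check standing in for the "linguistic AI algorithm" are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    PUBLISH = "publish"
    REQUEST_EXPLANATION = "request_explanation"
    REQUEST_MORE_DETAIL = "request_more_detail"
    HUMAN_REVIEW = "human_review"


# Hypothetical thresholds, chosen only for illustration.
VIOLENCE_THRESHOLD = 0.8   # scores at or above this are treated as flagged
MIN_WORDS = 12             # shortest explanation accepted as "detailed"
MIN_DISTINCT_RATIO = 0.5   # share of distinct words, a crude complexity proxy


def explanation_is_detailed(text: str) -> bool:
    """Crude stand-in for the linguistic check: length plus word variety."""
    words = text.lower().split()
    if len(words) < MIN_WORDS:
        return False
    return len(set(words)) / len(words) >= MIN_DISTINCT_RATIO


def triage_upload(violence_score: float,
                  explanation: Optional[str] = None) -> Decision:
    """Route one upload, given an assumed classifier score in [0, 1]."""
    if violence_score < VIOLENCE_THRESHOLD:
        return Decision.PUBLISH                # not flagged, goes straight out
    if explanation is None:
        return Decision.REQUEST_EXPLANATION    # flagged, warn and ask the user why
    if not explanation_is_detailed(explanation):
        return Decision.REQUEST_MORE_DETAIL    # explanation too thin, ask again
    return Decision.HUMAN_REVIEW               # hand the report to a reviewer
```

Note that in this sketch a flagged image is never published automatically: the best outcome for a flagged upload is a place in the human review queue, which mirrors the proposal that a person makes the final decision.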
No algorithm is perfect and such an approach would still allow some level of violent imagery onto Facebook (which could still be flagged through the traditional review process), but it would at the very least filter out the most egregious and graphic violence. False positives would result only in a slight delay as a human reviewer confirms whether the image was in fact violent or overrides the algorithm in the case of a mistake.
Moreover, every time the algorithm misses an image or yields a false positive, that outcome can be fed back on an ongoing basis to retrain the algorithm, meaning the system should become more accurate over time.
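The feedback loop could be as simple as logging every case where the human reviewer and the algorithm disagree, then using those cases as labeled examples the next time the model is retrained. The sketch below assumes the same hypothetical 0-to-1 violence score and threshold; the class and field names are invented for illustration, and the retraining step itself is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class FeedbackLog:
    """Collects reviewer corrections to use as future training examples."""
    threshold: float = 0.8  # hypothetical flagging threshold
    examples: List[Tuple[str, bool, str]] = field(default_factory=list)

    def record(self, image_id: str, model_score: float,
               human_says_violent: bool) -> Optional[str]:
        """Store the case only if the reviewer disagreed with the model."""
        model_says_violent = model_score >= self.threshold
        if model_says_violent and not human_says_violent:
            kind = "false_positive"   # flagged, but the reviewer cleared it
        elif human_says_violent and not model_says_violent:
            kind = "false_negative"   # missed, but later judged violent
        else:
            return None               # model and reviewer agreed
        # The reviewer's verdict becomes the corrected label.
        self.examples.append((image_id, human_says_violent, kind))
        return kind


log = FeedbackLog()
log.record("img-001", 0.93, human_says_violent=False)  # logged as false positive
log.record("img-002", 0.12, human_says_violent=True)   # logged as false negative
log.record("img-003", 0.97, human_says_violent=True)   # agreement, not logged
```

Missed images would enter this log through the traditional user-flagging and review process, since the algorithm by definition did not catch them itself.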
Such a system would completely invert Facebook’s current…