The strength of Google’s Cloud Vision API has been put to the test by a trio of researchers.
In a paper entitled ‘Google’s Cloud Vision API Is Not Robust To Noise’, researchers Hossein Hosseini, Baicen Xiao and Radha Poovendran from the University of Washington (Seattle)’s Network Security Lab have discovered that adding a small amount of noise can effectively “blind” the Cloud Vision API to the images it analyses. The trio commented in the paper: “In essence, we found that by adding noise, we can always force the API to output wrong labels or to fail to detect any face or text within the image.”
The paper explained what the API can do – which is to quickly classify images into thousands of categories, detect individual objects and faces within images, and to find and read printed words contained within images. The API can also be used to “detect different types of inappropriate content from adult to violent content.”
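As a rough illustration of how such a request is made, the sketch below builds the JSON body for the API’s v1 `images:annotate` REST endpoint. This is a minimal sketch, not the researchers’ code; the helper name `build_annotate_request` and the sample byte string are illustrative assumptions.

```python
import base64

def build_annotate_request(image_bytes, feature_types):
    """Build the JSON body for a Cloud Vision v1 `images:annotate` call.

    Each feature type (e.g. LABEL_DETECTION, FACE_DETECTION,
    TEXT_DETECTION, SAFE_SEARCH_DETECTION) requests one kind of
    analysis; the image itself is sent base64-encoded.
    """
    return {
        "requests": [{
            "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
            "features": [{"type": t} for t in feature_types],
        }]
    }

# Illustrative placeholder bytes; a real call would read an image file
# and POST this body to https://vision.googleapis.com/v1/images:annotate
body = build_annotate_request(b"\x89PNG...", ["LABEL_DETECTION", "FACE_DETECTION"])
```

A single request can thus ask for labels, faces and text at once, which is why one noisy image can defeat several of these analyses simultaneously.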
However, the researchers carried out a number of tests to see what difference image noise would make to the images analysed by the API. Notably, very little noise was required to affect the results: on average, only 14.25% impulse noise.
In the report, Hosseini, Xiao and Poovendran showed what happened when two pictures of faces selected from the Faces94 dataset were treated with noise. With 20% and 30% impulse noise respectively, the API could no longer detect the faces. The same happened when a picture containing text was treated with 35% impulse noise: the API failed to recognise the text.
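Impulse (salt-and-pepper) noise of the kind used here flips a given fraction of pixels to pure black or pure white. A minimal sketch, assuming a grayscale uint8 NumPy array; the function name and parameters are illustrative, not the paper’s exact procedure:

```python
import numpy as np

def add_impulse_noise(image, density, rng=None):
    """Corrupt a fraction `density` of pixels with impulse noise:
    each corrupted pixel is set to 0 or 255 with equal probability."""
    rng = np.random.default_rng(rng)
    noisy = image.copy()
    mask = rng.random(image.shape) < density   # which pixels to corrupt
    salt = rng.random(image.shape) < 0.5       # half white, half black
    noisy[mask & salt] = 255
    noisy[mask & ~salt] = 0
    return noisy

# Example at 14.25% noise, the average level the paper reports
img = np.full((64, 64), 128, dtype=np.uint8)
noisy = add_impulse_noise(img, 0.1425, rng=0)
```

At these densities most of the image survives intact, which is consistent with the researchers’ observation that humans still read the corrupted images easily.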
In another experiment, the noise also changed the labels returned by the API. The paper presents three pictures that the API originally scanned and labelled correctly. After the noise had been added, the API scanned these pictures again and returned completely different labels: an airplane became a ‘bird’, a teapot became ‘biology’, and a property became an ‘ecosystem’.
As The Register report pointed out, this could present problems. The authors claimed that deliberately added noise could serve as an attack vector, since “an adversary can easily bypass an image filtering system, by adding noise to an image with inappropriate content.”
Law enforcement offers a specific example: if photographs of criminals on record were corrupted with noise, lawbreakers caught on CCTV cameras might not be recognised when matched against those records.
Ironically, according to the researchers, the images that fooled the API did not fool human eyes.