Researchers from the University of São Paulo have created an algorithm which they claim is 95% effective at identifying fake reviews that are intended to damage the reputation of new companies, and that are likely to have originated with the companies’ competitors.
ORFEL: super-fast detection of defamation or illegitimate promotion in online recommendation [PDF], by Gabriel Gimenes, Jose F Rodrigues Jr and Robson L F Cordeiro, presents a system entitled Online-Recommendation Fraud ExcLuder (ORFEL). Though not the first work of its type undertaken in this field, ORFEL is unusual in that it seeks to identify both negative and positive types of fake reviews. Successfully picking lies out of the daily chatter of user-submitted reviews is genuinely a Big Data problem, and the report observes that ‘catching up these kinds of attacks is a challenging task, especially when there are millions of users and millions of evaluated products defining billion-scale interaction daily. In such attacks, multiple fake users interact with multiple products at random moments…in a way that their behavior is camouflaged in the middle of million-per-second legitimate interaction.’
With false reviews, both positive and negative, gradually eroding community trust in peer opinion at sites such as Amazon and TripAdvisor, reputation management is being forced to develop new technologies to identify commercial insincerity. Fake reviews on the TripAdvisor travel review site have intermittently got the company into trouble in recent years, with a scandal in 2011 and a prosecution and fine from the Italian government over PR mendacities masquerading as public opinion.
Gimenes et al concentrate on identifying lockstep behaviour, identifiable when groups of users coordinate their efforts and begin to interact with products at the same time. It’s a difficult vector to identify when an attack is made upon a newly-launched product which is likely in any case to be the subject of unusual review activity and general online attention. The primary method of individuating lockstep attacks begins with examining the patterns in this kind of extremely short-term cluster.
ORFEL uses a vertex-centric algorithm which can individuate lockstep traces in web-scale graphs by use of parallel processing, and is among the first of newly-developed algorithms to employ this method. Since scalability and accuracy are essential to make meaningful identifications in such an avalanche of data, ORFEL tasks itself with equalling the efficiency of the cluster-based techniques that have hallmarked previous approaches. Since ORFEL’s signature has broader scope and ambit than other projects, it’s able to extend its accuracy of results to both ‘damaging’ and ‘boosting’ reviews.
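To make the notion of lockstep behaviour concrete, here is a toy Python sketch. It is not ORFEL's vertex-centric algorithm, which the paper describes as parallel and web-scale; it is a naive pairwise check that only illustrates the idea of users hitting the same products at roughly the same time. The function name `find_lockstep_pairs`, the `(user, product, timestamp)` event format, and the `window` and `min_products` parameters are all assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def find_lockstep_pairs(events, window, min_products):
    """Return pairs of users who reviewed at least `min_products`
    distinct products within the same fixed time bucket.

    events: iterable of (user, product, timestamp) tuples (hypothetical format).
    window: bucket width, in the same units as the timestamps.
    """
    # Group users by product and by coarse time bucket. Fixed buckets are a
    # simplification: two reviews just either side of a bucket boundary
    # are treated as unrelated.
    buckets = defaultdict(set)
    for user, product, ts in events:
        buckets[(product, ts // window)].add(user)

    # For every pair of users seen together in a bucket, record the product.
    shared = defaultdict(set)
    for (product, _bucket), users in buckets.items():
        for a, b in combinations(sorted(users), 2):
            shared[(a, b)].add(product)

    # Keep only pairs that coincided on enough distinct products.
    return {pair: prods for pair, prods in shared.items()
            if len(prods) >= min_products}


# Made-up data: u1, u2 and u3 hit products p1 and p2 within seconds of
# each other, while u4 and u5 act independently.
events = [
    ("u1", "p1", 100), ("u2", "p1", 105), ("u3", "p1", 110),
    ("u1", "p2", 500), ("u2", "p2", 510), ("u3", "p2", 505),
    ("u4", "p1", 9000), ("u5", "p2", 20000), ("u4", "p3", 300),
]
suspicious = find_lockstep_pairs(events, window=60, min_products=2)
for pair, products in sorted(suspicious.items()):
    print(pair, sorted(products))
```

The pairwise comparison grows quadratically with the number of users in each bucket, which is exactly the kind of cost that makes the billion-scale setting described in the paper hard, and why a parallel graph-based approach is attractive there.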
Two of the datasets tested in the research were sampled from Amazon. The timing is apposite, since the online retailing giant recently launched proceedings against over 1,000 fake online reviewers that it deemed to be damaging the integrity of its peer review community. The metrics go beyond ratings at obvious target communities such as Amazon, to take in collateral lockstep attacks via social networks such as Google+ and Facebook, and the attack methods observed involve the stealing of credentials via malware, social engineering and the systematic creation of fake users.
In June the UK government’s Competition and Markets Authority published a report condemning the practice of companies using online marketing companies to place ‘arranged’ positive reviews, promising unlimited fines or even imprisonment for those who continue to breach the regulations.