A new paper from researchers in India and Australia highlights one of the strangest, and ironically most humorous, problems in machine learning: humour.
Automatic Sarcasm Detection: A Survey outlines ten years of research by groups interested in detecting sarcasm in online sources. The problem is not an abstract one, nor does it centre on the need for computers to entertain or amuse humans. Rather, it concerns the need to recognise that sarcasm in online comments, tweets and other internet material should not be interpreted as sincere opinion.
The need arises both when AIs assess archive material or interpret existing datasets, and in the field of sentiment analysis, where a neural network or other AI model seeks to interpret publicly posted web material.
Attempts have been made to ring-fence sarcastic data by using hashtags such as #not on Twitter, or by noting the authors who have posted material identified as sarcastic, in order to apply appropriate filters to their future work.
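To make the hashtag and author approaches concrete, here is a minimal Python sketch. It is an illustration only, not a method taken from the survey: the function names, the `(author, text)` post format and the `#sarcasm` tag are assumptions, and `#not` is the only marker the article itself mentions.

```python
# Hypothetical marker set; only "#not" is named in the article.
SARCASM_TAGS = {"#not", "#sarcasm"}


def _is_marker(token, tags):
    """True if a token, ignoring case and trailing punctuation, is a marker tag."""
    return token.lower().rstrip(".,!?") in tags


def label_by_hashtag(text, tags=SARCASM_TAGS):
    """Return (is_sarcastic, cleaned_text), using hashtag markers only."""
    tokens = text.split()
    is_sarcastic = any(_is_marker(tok, tags) for tok in tokens)
    # Drop the marker so a downstream model cannot simply memorise it.
    cleaned = " ".join(tok for tok in tokens if not _is_marker(tok, tags))
    return is_sarcastic, cleaned


def flag_authors(posts, tags=SARCASM_TAGS):
    """Collect authors with at least one hashtag-marked post.

    `posts` is assumed to be an iterable of (author, text) pairs.
    """
    return {author for author, text in posts if label_by_hashtag(text, tags)[0]}


posts = [("alice", "Love queueing in the rain #not"), ("bob", "Nice day out")]
flagged = flag_authors(posts)
```

A filter built this way catches only sarcasm that its author has chosen to label, which is one reason the approach can ring-fence some sarcastic data but cannot find the rest.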
Some research has struggled to quantify sarcasm, since it may not be a discrete property in itself (that is, a reliable sign that the author holds the reverse of the position put forward) but rather part of a wider gamut of data-distorting humour, and may need to be identified as a subset of that humour in order to be found at all.
Most of the dozens of research projects which have addressed the problem of sarcasm as a hindrance to machine comprehension have studied the problem as it relates to the English and Chinese languages, though some work has also been done in identifying sarcasm in Italian-language tweets, whilst another project has explored Dutch sarcasm.
The new report details the ways that academia has approached the sarcasm problem over the last decade, but concludes that the solution is not necessarily one of pattern recognition, but rather a more sophisticated matrix that has some ability to understand context. Any computer which could reliably perform this kind of filtering could be argued to have developed a sense of humour.