A new research paper submitted yesterday brings back into focus one of the fundamental obstacles to the evolution of the self-driving car – how does an autonomous vehicle even know if it is on a road?

Vision-Based Road Detection using Contextual Blocks [PDF] addresses the problems that automated navigational analysis systems face in understanding the lineaments of a road – where the road begins and where it ends. The more popularly-discussed issue of obstacle recognition depends on understanding the context for obstacles – the road itself. A road can be a very indistinct entity in the face of GPS errors or service loss, or of unexpected diversions or other changes which are as yet unregistered with any of the topographical or GPS-based maps that the autonomous navigation system may be relying on.

The paper, by Caio César Teodoro Mendes, Vincent Frémont and Denis Fernando Wolf, proposes a new method of classifying ‘contextual blocks’ in hand-labelled images as one solution to the potential difficulty that Advanced Driver Assistance Systems (ADAS) may experience in distinguishing the characteristics of the road they are on. The problem is one that also plagues current AI and Big Data research in the field of image recognition: context. The paper states:

‘A common limitation of most machine learning methods is that they independently classify each image region or pixel, ignoring the contextual information and are therefore subject to misclassifying areas of similar appearance.’

‘Illustration of the proposed block scheme. The classification block is show in red, the contextual blocks in orange, the possible support block in blue and the road blocks in green.’ - http://arxiv.org/pdf/1509.01122v1 (‘Vision-Based Road Detection using Contextual Blocks‘)

The researchers’ solution is the incorporation of contextual ‘cues’ that put penumbral information into a definite context. The experiments used the KITTI Vision Benchmark Suite and were divided into the three categories ‘urban unmarked’, ‘urban marked’ and ‘urban multiple marked lanes’, though the researchers eschewed the stereo images provided by default in the suite in favour of monocular data. The work was implemented via the Python-based SciPy stack, together with the scikit-image library for feature extraction purposes.
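The idea of combining a classification block with its surrounding contextual blocks can be sketched in a few lines of Python. This is a minimal illustration only, not the authors’ implementation: it assumes a simple 3×3 grid of neighbouring blocks and uses per-channel mean and standard deviation as stand-ins for the paper’s richer colour and texture descriptors; the function and parameter names are hypothetical.

```python
import numpy as np

def block_features(block):
    # Per-channel mean and standard deviation: a simple stand-in
    # for the descriptors extracted in the paper.
    return np.concatenate([block.mean(axis=(0, 1)), block.std(axis=(0, 1))])

def contextual_feature_vector(image, row, col, size=8):
    """Concatenate the features of the classification block at (row, col)
    with those of its eight surrounding 'contextual' blocks
    (a hypothetical layout, not the paper's exact block scheme)."""
    feats = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr * size, col + dc * size
            feats.append(block_features(image[r:r + size, c:c + size]))
    return np.concatenate(feats)

# Usage on a synthetic RGB image: 9 blocks x 6 features = a 54-element vector
img = np.random.rand(64, 64, 3)
vec = contextual_feature_vector(img, 24, 24)
print(vec.shape)
```

The point of the concatenation is that the classifier then sees the appearance of the neighbourhood as well as the block itself, which is what lets it disambiguate regions of similar local appearance.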

In the second image, from the paper, contextual analysis succeeds in increasing the definition of the road edge using the same sampled information.

Throughput speed is one of the key challenges of accurate road definition. The paper notes that other researchers’ use of Convolutional Neural Networks comes at the cost of real-time processing – an absolute requisite in this particular field. Tagging existing image data allows real-time assessment at the cost of currency, but the present choice seems to be between maintaining relatively well-updated ‘baked’ data and considering far more powerful on-board computing systems than are currently anticipated for the commercial autonomous vehicle sector. Nor do ‘mainframe-style’ cloud services provide a viable third solution, even assuming excellent coverage, due to potential (and potentially disastrous) lag.

The commercial market may have to await such trickle-down as the military may eventually permit from its own commitment to developing ‘off-road’ autonomous navigation systems, such as the Marines’ Ground Unmanned Support Surrogate (GUSS) Autonomous navigation system, in trials since 2012.