Johnson Poh, head of data science at the Big Data Analytics Centre of Excellence at DBS Bank, writes on the critical steps to take when developing a big data strategy
To implement an effective big data strategy, businesses must develop themselves in three core areas – data engineering, advanced analytics and visualization.
Developing a core data engineering layer is necessary for setting up a robust data repository. Distributed file systems and data processing technologies are used to build the foundation for an implementation pipeline.
This foundation can then be overlaid with advanced analytics and statistical learning models, which set the stage for predictive analysis, pattern recognition and optimization. Tools for building this secondary layer include open source programming environments such as R and Python.
Finally, to bridge the gap between statistical models and decision analysis, it is essential to provide a visualization interface for users to make sense of the insights from the analytic models. Such tools include open source visualization libraries, as well as enterprise software.
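As an illustrative sketch only, the three layers can be seen in miniature even in plain Python: a small in-memory "repository" standing in for the data engineering layer, an ordinary least-squares trend model standing in for the analytics layer, and a text chart standing in for the visualization interface. The records and layer boundaries below are hypothetical.

```python
# Toy end-to-end stack: data layer -> analytics layer -> visualization layer.

# 1. Data engineering layer: a tiny "repository" of monthly sales records,
#    a stand-in for a distributed file system and processing pipeline.
repository = [(1, 10.0), (2, 12.0), (3, 15.0), (4, 19.0)]  # (month, sales)

# 2. Advanced analytics layer: least-squares fit of sales against month,
#    a stand-in for a statistical learning model built in R or Python.
n = len(repository)
mean_x = sum(x for x, _ in repository) / n
mean_y = sum(y for _, y in repository) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in repository)
         / sum((x - mean_x) ** 2 for x, _ in repository))
intercept = mean_y - slope * mean_x
forecast = slope * 5 + intercept  # predicted sales for month 5

# 3. Visualization layer: render history plus forecast as a text chart,
#    a stand-in for a charting library or enterprise dashboard.
for x, y in repository + [(5, forecast)]:
    print(f"month {x}: {'#' * int(y)} {y:.1f}")
```

In a real deployment each layer would be a separate system, but the dependency order is the same: the model reads from the repository, and the visualization reads from the model.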
The three core areas may also be represented by six phases of the full-scale analytics delivery pipeline, which covers data production, ingestion, storage, processing, analytics and consumption.
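One hypothetical way to picture those six phases is as a chain of functions, each handing its output to the next. The event format and phase boundaries below are illustrative, not prescriptive.

```python
# Each function stands in for one phase of the analytics delivery pipeline.

def produce():               # 1. production: raw events arrive from source systems
    return ["2017-10-11,buy,3", "2017-10-11,buy,5", "2017-10-12,view,1"]

def ingest(raw):             # 2. ingestion: parse lines into structured records
    return [line.split(",") for line in raw]

def store(records):          # 3. storage: persist (here, an in-memory "table")
    return {"events": records}

def process(db):             # 4. processing: filter to the rows of interest
    return [r for r in db["events"] if r[1] == "buy"]

def analyze(rows):           # 5. analytics: aggregate into an insight
    return sum(int(r[2]) for r in rows)

def consume(metric):         # 6. consumption: surface the insight to users
    return f"units bought: {metric}"

report = consume(analyze(process(store(ingest(produce())))))
print(report)  # units bought: 8
```

Mapping skillsets to phases becomes concrete in this framing: data engineers own the first four functions, data scientists the fifth, and software developers the last.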
Understanding big data components
Businesses that have honed their craft in big data technologies have become industry game changers. Amazon, eBay and Lazada have transformed the retail industry through personalized product recommendations and customized consumer content, built on data-focused foundations that support their business strategies.
Similarly, Netflix’s data-driven management has led the paradigm shift in the entertainment industry through on-demand video streaming and predicting content that viewers want to watch. Data-driven businesses such as these are better prepared for surviving the next decade in the new economy.
It is through an in-depth appreciation of this pipeline that the promise of big data can be practically realized: allocating the right skillsets to the relevant phases minimizes mismatches and avoids overstretching business resources.
The value of big data analytics typically comes in the form of a useful data application, the creation of new knowledge and, most importantly, the ability to inspire prescriptive actions and decision-making based on the insights generated, whether descriptive, diagnostic or predictive in nature.
It is pertinent to harness the right mix of tools, techniques and resources to achieve the synergy needed for the successful delivery of the analytics pipeline.
Leveraging new technologies
More generally, three guiding principles underpin the development of big data roadmaps: designing a data repository; investing in the right data architecture; and building an effective big data team in tune with business objectives.
The first two are natural extensions of the big data development pipeline, while the last relates to people as the epicenter of all critical resources.
Businesses should not be afraid to start small by using data they already have to experiment and create a feedback loop. As you experiment, the relevance of new data fields may emerge that you may wish to start collecting.
Some questions that can guide your data design include: What data do we need to meet our business objectives? What existing data do we have? What data do we not have, but would like to start collecting? Is there data that is difficult to collect, for which we need to find proxies?
Businesses that refine their data repository through continuous experimentation will reap the benefits of drawing deeper insights.
Having a good understanding of the big data solutions on the market ensures that businesses spend only on what is necessary. Acquiring just the data capabilities that meet your business needs keeps costs at bay.
Businesses just starting out may want to focus on developing a data science workbench and take a modular approach to building their data architecture. Open source frameworks such as Hadoop and Spark are low-cost options for fulfilling data analysis needs.
As businesses progress along their data-ready roadmap, they may find the need to scale up in capacity and complexity. When business strategies evolve from pure analysis to predictive modelling, growing data repositories require larger storage and more advanced analysis to generate insights effectively. These businesses may find the need to customise software, or go a step further and invest in their own R&D to develop data applications that improve their operations and, ultimately, their productivity.
Choosing the right people to lead the effort is important; a combination of technical know-how and a clear grasp of the business objectives is essential. An effective big data team should be led by a business leader who is well grounded in technical concepts and has robust hands-on experience.
Getting the right composition of data engineers, data scientists and software developers is important. For instance, businesses at the start of their data roadmap will require more data engineers, the implementers and managers of big data platforms and repositories.
As businesses progress along their roadmap, more data scientists will be required to build data analytics and predictive algorithms. Finally, software engineers are necessary for developing front-end dashboards and data visualization tools that help product managers in their day-to-day decision making and operations.
Dealing with legacy
While modern big data stacks can be used to gain insights more efficiently and make swifter business decisions, getting companies to embrace a data-driven approach is often fraught with inertia.
Starting out can be daunting. How do you structure an effective big data team to deliver on business objectives? How much do you invest in infrastructure revamps and manpower reallocation? The truth is that transitioning to a data-ready culture does not have to be a major overhaul of the organization structure, but a well-thought-out roadmap with long-term objectives is necessary to ease the transition.
For a data-driven culture to permeate the whole organisation, every member of an integrated big data science team needs a strong appreciation of the business objectives and the ability to communicate with key stakeholders, rather than acting in a silo.
A commitment to securing stakeholder buy-in is part and parcel of a well-assimilated big data team. Consistent engagement with IT offices, product managers and, perhaps most significantly, consumers is key to building a data-driven culture in the enterprise.
Johnson Poh will be speaking at the forthcoming Big Data World Singapore, 11th and 12th October 2017 at the Marina Bay Sands Expo and Convention Centre. To hear from Poh and other Big Data experts from around the world, register today for your FREE ticket.