by Des Nnochiri
The term DataOps was coined back in 2015 but only really became a significant force in professional circles during the latter part of 2017. But what is this latest tour de force in software development methodology?
Over the last decade, cloud technology has been well and truly embraced by organizations as a method of completing tasks in a more automated, elastic, and on-demand fashion. Cumbersome tasks, which would have once taken weeks of development time, can now be achieved in minutes or even seconds. This efficiency offered by automation and the cloud has led to the proliferation of new development philosophies such as DevOps.
DevOps, which leads to continuous delivery initiatives, allows organizations to push out software in greater quantities, with more frequency, and at a higher quality than at any point in history. This, in turn, is leading to a gold rush among organizations to acquire the latest developments in technology to gain or maintain a lead.
The infrastructure needed to embrace these developments is readily available. However, the one thing holding these organizations back is a lack of maturity in data analysis. And this is the gap into which DataOps inserts itself.
DataOps
DataOps has a strong focus on data cultivation and management practices designed to improve the speed and accuracy of analytics. DataOps methodology encompasses data access, quality control, automation, integration, model deployment and management.
In short, DataOps is about changing the way your organization approaches data. Instead of simply gathering and analyzing data en masse, DataOps has you carry out analytics only with specific goals and objectives in mind. For example, data can be used to reduce customer churn by building recommendation software that promotes bespoke products and so encourages repeat purchases. DataOps makes sure your IT teams have access to the data they need to build these tools and deploy them within your business systems.
The DataOps Manifesto
DataOps has become prevalent enough that there is now a DataOps Manifesto, which lays out 18 key principles that the philosophy operates under:
#1 Continually satisfy your customer
The highest priority is to satisfy the customer through the early and continuous delivery of valuable analytic insights, in anything from a couple of minutes to weeks.
#2 Value working analytics
The primary measure of data analytics performance is the degree to which insightful analytics are delivered, incorporating accurate data, atop robust frameworks and systems.
#3 Embrace change
Welcome evolving customer needs and, in fact, embrace them to generate competitive advantage. The most efficient, effective, and agile method of communication with customers is face-to-face conversation.
#4 It’s a team sport
Analytic teams will always have a variety of roles, skills, favorite tools, and titles.
#5 Daily interactions
Customers, analytic teams, and operations must work together daily throughout all projects.
#6 Self-organize
The best analytic insight, algorithms, architectures, requirements, and designs emerge from self-organizing teams.
#7 Reduce heroism
As the pace and breadth of need for analytic insights ever increases, analytic teams should strive to reduce heroism and create sustainable and scalable data analytic teams and processes.
#8 Reflect
Analytic teams should fine-tune their operational performance by self-reflecting, at regular intervals, on feedback provided by their customers, themselves, and operational statistics.
#9 Analytics is code
Analytic teams use a variety of individual tools to access, integrate, model, and visualize data. Fundamentally, each of these tools generates code and configuration that describe the actions taken upon data to deliver insight.
#10 Orchestrate
The beginning-to-end orchestration of data, tools, code, environments, and the analytic team’s work is a key driver of analytic success.
#11 Make it reproducible
Reproducible results are required and therefore everything must be versioned: data, low-level hardware and software configurations, and the code and configuration specific to each tool in the toolchain.
#12 Disposable environments
It is important to minimize the cost for analytic team members to experiment by giving them easy-to-create, isolated, safe, and disposable technical environments that reflect their production environment.
#13 Simplicity
Continuous attention to technical excellence and good design enhances agility. Likewise, simplicity is essential.
#14 Analytics is manufacturing
Analytic pipelines are analogous to lean manufacturing lines. Therefore, a fundamental concept of DataOps is a focus on process-thinking aimed at achieving continuous efficiencies in the manufacture of analytic insight.
#15 Quality is paramount
Analytic pipelines should be built with a foundation capable of automated detection of abnormalities in code, configuration, and data, and should provide continuous feedback to operators for error avoidance.
#16 Monitor quality and performance
The goal is to have performance and quality measures that are monitored continuously to detect unexpected variation and generate operational statistics.
#17 Reuse
We believe a foundational aspect of analytic insight manufacturing efficiency is to avoid the repetition of previous work by the individual or team.
#18 Improve cycle times
We should strive to minimize the time and effort to turn a customer need into an analytic idea, create it in development, release it as a repeatable production process, and finally refactor and reuse that product.
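To make principles #15 and #16 a little more concrete, here is a minimal Python sketch of what automated detection of abnormalities in data and continuous monitoring for unexpected variation might look like. It is an illustration only, not something taken from the manifesto: the function names (check_no_missing, check_in_range, control_limits, is_unexpected), the sample orders batch, and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


def check_no_missing(rows, column):
    """Principle #15 style test: fail if any row lacks a value for `column`."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(f"no_missing:{column}", missing == 0,
                       f"{missing} missing value(s)")


def check_in_range(rows, column, low, high):
    """Fail if any non-missing value of `column` falls outside [low, high]."""
    bad = [row[column] for row in rows
           if row.get(column) is not None and not (low <= row[column] <= high)]
    return CheckResult(f"in_range:{column}", not bad,
                       f"{len(bad)} out-of-range value(s)")


def control_limits(history, sigmas=3.0):
    """Principle #16 style monitor: mean +/- `sigmas` sample standard
    deviations of a metric observed on previous pipeline runs."""
    centre = mean(history)
    spread = stdev(history)
    return centre - sigmas * spread, centre + sigmas * spread


def is_unexpected(value, history, sigmas=3.0):
    """True when `value` lies outside the control limits of `history`."""
    low, high = control_limits(history, sigmas)
    return not (low <= value <= high)


# Hypothetical batch arriving at one pipeline step (made-up data).
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": 40.5},
    {"order_id": 3, "amount": None},   # missing value
    {"order_id": 4, "amount": -5.0},   # negative amount
]

results = [
    check_no_missing(orders, "amount"),
    check_in_range(orders, "amount", 0.0, 10_000.0),
]
for result in results:
    status = "PASS" if result.passed else "FAIL"
    print(f"{status} {result.name}: {result.detail}")

# Hypothetical row counts from earlier runs, used as the baseline.
row_count_history = [1000, 1020, 990, 1010, 1005]
for todays_count in (1015, 400):
    flag = "unexpected" if is_unexpected(todays_count, row_count_history) else "normal"
    print(f"row count {todays_count}: {flag}")
```

In this sketch both data checks fail for the sample batch, a row count of 1015 is treated as normal, and a row count of 400 is flagged as unexpected. In a real pipeline, results like these would typically be fed back to operators or used to stop a run before bad data reaches production; the exact thresholds and checks would depend on the data in question.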
Final Thoughts
DataOps has the potential to change the way organizations analyze and process the information they gather during day-to-day DevOps operations. With a sharp focus on goals and company mission statements, DataOps has the power to revolutionize the software development cycle.