Data is continuously evolving, and with it we're seeing rapid changes in streaming architecture, data warehousing, platform clustering and more. Exponential growth in the volume and velocity of available data has made centralized organization and management extremely difficult, with bottlenecks emerging in transforming and governing data workloads for near real-time applications across multiple business domains. Every organization today wants to be data-driven, and with good reason: data holds the key to advanced insights, trend analysis, business transformation and personalization. But evolving business requirements call for a contemporary, dynamic architecture that can scale with those data objectives.
A data mesh is designed to address these scalability and complexity issues with a domain-driven approach to data management and analytics. It shifts the responsibility for data integration, retrieval and analytics from a centralized data team to the respective domains, leading to large-scale decentralization and the scalability needed to meet enterprise BI (Business Intelligence) needs.
At a conceptual level, a data mesh is very similar to a microservices architecture. It enables individual domains to approach data as the primary product, allowing them to connect to other domains and perform cross-domain queries and analytics, much like how APIs (application programming interfaces) allow services and applications to communicate with each other.
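To make the analogy concrete, here is a minimal Python sketch (all class and field names are hypothetical) of two domains exposing their data behind API-like query interfaces, with a consumer composing a cross-domain result without knowing how either domain stores its data:

```python
from typing import Protocol

class DataProduct(Protocol):
    """The 'public API' a domain publishes for its data."""
    def query(self, **filters) -> list[dict]: ...

class OrdersDataProduct:
    """Owned by the sales domain; storage details stay hidden behind query()."""
    _rows = [{"order_id": 1, "customer_id": 42, "total": 99.0}]
    def query(self, **filters) -> list[dict]:
        return [r for r in self._rows
                if all(r.get(k) == v for k, v in filters.items())]

class CustomersDataProduct:
    """Owned by the customer domain, with its own independent query interface."""
    _rows = [{"customer_id": 42, "segment": "enterprise"}]
    def query(self, **filters) -> list[dict]:
        return [r for r in self._rows
                if all(r.get(k) == v for k, v in filters.items())]

# A cross-domain query: join orders to customer segments, much like one
# microservice calling another service's API.
orders, customers = OrdersDataProduct(), CustomersDataProduct()
for order in orders.query():
    segment = customers.query(customer_id=order["customer_id"])[0]["segment"]
    print(order["order_id"], segment)  # -> 1 enterprise
```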
In her groundbreaking publication, How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh, Zhamak Dehghani outlined four primary data mesh principles for a distributed data architecture:
Domain ownership:
While domain ownership is a concept most businesses are familiar with at an operational level, data and pipeline ownership is very often centralized under monolithic platforms. A data mesh federates data ownership to individual domains, making each domain accountable for how it collects, transforms and distributes its data as a product. While the domains collectively share a standardized set of capabilities to manage the entire data pipeline, including ingestion, cleansing, transformation and distribution, they are individually responsible for aligning the data parameters to business standards.
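As a rough illustration (not a prescribed implementation), the sketch below shows a standardized pipeline skeleton shared by all domains, with one domain owning its own ingestion and transformation rules; every name here is hypothetical:

```python
class DomainPipeline:
    """Standardized capabilities every domain inherits: the shared skeleton."""
    def ingest(self) -> list[dict]:
        raise NotImplementedError  # each domain owns its own sources
    def cleanse(self, rows: list[dict]) -> list[dict]:
        return [r for r in rows if r is not None]  # shared baseline rule
    def transform(self, rows: list[dict]) -> list[dict]:
        return rows
    def distribute(self, rows: list[dict]) -> None:
        print(f"publishing {len(rows)} record(s) as a data product")
    def run(self) -> None:
        self.distribute(self.transform(self.cleanse(self.ingest())))

class BillingPipeline(DomainPipeline):
    """Owned by the billing domain, which aligns fields to business standards."""
    def ingest(self) -> list[dict]:
        return [{"invoice": "INV-1", "amount": "100.50"}, None]
    def transform(self, rows: list[dict]) -> list[dict]:
        return [{**r, "amount": float(r["amount"])} for r in rows]

BillingPipeline().run()  # -> publishing 1 record(s) as a data product
```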
Data as a product:
Democratizing data ownership makes data 'bite-sized' and more manageable, which leads to the next principle: treating the data itself as a product. This builds awareness that the data is consumed and applied outside the domain, encouraging data owners to improve the quality and interoperability of their 'product'. Much as with public APIs, projecting a product mindset onto data encourages the responsible domains to make it discoverable, addressable, trustworthy, accurate, accessible, interoperable and secure, adding value throughout the entire pipeline.
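One way to picture this is a machine-readable product descriptor that makes those qualities explicit; the field names below are illustrative assumptions, not a standard:

```python
from dataclasses import dataclass, field

@dataclass
class DataProductDescriptor:
    name: str                    # discoverable: registered in a shared catalog
    address: str                 # addressable: a stable endpoint or table URI
    owner: str                   # accountable domain team
    schema: dict                 # interoperable: a published, versioned schema
    freshness_sla_minutes: int   # trustworthy: an agreed quality guarantee
    tags: list[str] = field(default_factory=list)

catalog: dict[str, DataProductDescriptor] = {}

def register(product: DataProductDescriptor) -> None:
    """Publishing to the catalog is what makes the product discoverable."""
    catalog[product.name] = product

register(DataProductDescriptor(
    name="orders.daily",
    address="warehouse://sales/orders_daily",
    owner="sales-domain-team",
    schema={"order_id": "int", "total": "float"},
    freshness_sla_minutes=60,
    tags=["sales", "finance"],
))
print(catalog["orders.daily"].address)  # -> warehouse://sales/orders_daily
```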
Self-service infrastructure platform:
A self-service infrastructure is key to securing stakeholder engagement in a data mesh architecture. Individual business domains need domain-agnostic tools, resources and systems to develop a product mindset and to create and consume data products. A self-serve platform goes a long way towards making analytical data products accessible to general developers.
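A hypothetical sketch of what 'self-serve' can look like in practice: a domain team provisions everything a new data product needs through one declarative platform call, with no ticket to a central team. The function and its return values are invented for illustration:

```python
def provision_data_product(domain: str, name: str,
                           storage: str = "object-store",
                           schedule: str = "hourly") -> dict:
    """Simulates the platform creating storage, a pipeline and monitoring."""
    return {
        "dataset_uri": f"{storage}://{domain}/{name}",
        "pipeline_id": f"{domain}-{name}-{schedule}",
        "dashboard_url": f"https://observability.example.com/{domain}/{name}",
    }

handles = provision_data_product("marketing", "campaign_attribution")
print(handles["dataset_uri"])  # -> object-store://marketing/campaign_attribution
```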
Federated governance:
Implementing a data mesh is a significant cultural shift for any organization, and securing stakeholder buy-in will be a real challenge. Domain heads may not be comfortable with the additional workload and processes required to streamline a distributed data architecture. A federated governance framework ensures there is a centralized set of policies and standards to align data processes with business rules and industry norms, while giving individual domains the flexibility to modify certain parameters without losing interoperability. These parameters can be defined and mutually agreed upon through SLAs (service level agreements).
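The sketch below illustrates the balance federated governance strikes: a small set of non-negotiable global policies, plus per-domain parameters negotiated through SLAs. The policies and thresholds are made-up examples:

```python
GLOBAL_POLICIES = {
    "pii_fields_encrypted": True,   # non-negotiable, organization-wide
    "schema_published": True,       # required for interoperability
}

DOMAIN_SLAS = {                     # flexible, agreed per domain
    "billing": {"max_freshness_minutes": 15},
    "marketing": {"max_freshness_minutes": 240},
}

def check_compliance(domain: str, product: dict) -> list[str]:
    """Returns the list of governance violations for a data product."""
    violations = [f"global policy failed: {policy}"
                  for policy, required in GLOBAL_POLICIES.items()
                  if product.get(policy) != required]
    if product["freshness_minutes"] > DOMAIN_SLAS[domain]["max_freshness_minutes"]:
        violations.append("freshness outside the agreed SLA")
    return violations

print(check_compliance("billing", {
    "pii_fields_encrypted": True,
    "schema_published": True,
    "freshness_minutes": 30,
}))  # -> ['freshness outside the agreed SLA']
```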
While the general approaches to these data mesh principles will differ from organization to organization based on maturity and business goals, any basic data platform with mesh capabilities will have functions for operating the technology stack and creating, accessing, and discovering data products. Assigning data owners and engineers to individual domains can secure developer engagement towards adopting a distributed data mesh system.
A traditional data platform covers all aspects of the data pipeline, from ingestion and transformation to consumption and reporting. In most instances, data engineering teams are responsible for designing ETL (Extract, Transform and Load) pipelines, running reports, evaluating data quality, loading data into data warehouses and online analytical processing (OLAP) databases, and more.
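For contrast with the mesh model described below, here is a toy sketch of that centralized pattern: one team-owned ETL job where extraction, quality checks and loading all queue behind the same pipeline (names are illustrative):

```python
def extract() -> list[dict]:
    """Pulls raw records from a source system (simulated)."""
    return [{"id": 1, "amount": "10.0"}, {"id": 2, "amount": "bad"}]

def transform(rows: list[dict]) -> list[dict]:
    """Central team's quality rules: drop rows that fail type conversion."""
    out = []
    for r in rows:
        try:
            out.append({"id": r["id"], "amount": float(r["amount"])})
        except ValueError:
            pass  # quality decisions live with the central team, not the domain
    return out

def load(rows: list[dict], table: str) -> None:
    print(f"loaded {len(rows)} row(s) into {table}")

# Every domain's reporting request funnels through this one pipeline.
load(transform(extract()), "warehouse.sales")  # -> loaded 1 row(s) into warehouse.sales
```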
Since the architecture is primarily monolithic, with centralized data lakes for storing and transforming streaming data, most of the data ultimately ends up combined in one place. And because all queries go through the central data team, that team owns most of the reports and data products, creating bottlenecks for downstream applications. In some instances, shadow IT teams develop and manage their own demarcated data platforms to meet functional and technical requirements quickly, without explicit IT approval.
Data mesh, on the other hand, is built around a self-service infrastructure model that prioritizes building data infrastructure components which domain teams can then use to create and serve their own data products (hence, self-serve). Since data mesh depends on business-domain ownership, the platform itself can function with just the metadata of the data products. This simplifies integration at the metadata level, where reporting, warehousing, analytics and other uses all become downstream applications of the data product.
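A minimal sketch of that metadata-level integration, assuming a simple registry: the mesh layer stores only where and how each product can be read, while the data itself stays inside the owning domain (endpoints and names are invented):

```python
# The mesh layer holds only metadata; data stays inside each domain.
mesh_metadata = {
    "sales.orders": {"endpoint": "https://sales.example.com/orders",
                     "format": "parquet"},
    "ops.tickets": {"endpoint": "https://ops.example.com/tickets",
                    "format": "json"},
}

def resolve(product: str) -> str:
    """Reporting, warehousing and analytics all integrate through this lookup."""
    meta = mesh_metadata[product]
    return f"read {meta['format']} from {meta['endpoint']}"

print(resolve("sales.orders"))  # -> read parquet from https://sales.example.com/orders
```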
In addition to that, a distributed data infrastructure platform should have tools and technologies for data observability, policy and compliance automation, monitoring functions and educational resources for training domain teams on accessing and exposing analytical data products. The federated computational governance framework should give the domains enough flexibility to maneuver around the standardized data components while sticking to relevant product roadmaps.
Depending on the data requirements within an organization, the self-serve platform can be configured to serve different purposes, including data provisioning, lifecycle management and supervision. This model narrows the gap between operational and analytical data while respecting the autonomy of individual domains. The domain-centric approach allows businesses to scale their data architecture along with their business needs and maintain data quality and interoperability within a federated governance structure.
Nokia needed to implement a data framework that would provide real-time insights into the performance of business-critical domains to support advanced 5G services. However, traditional data lakes can't effectively manage distributed data computation, data streaming from disparate sources and fragmented data handling all at the same time.
To realize their vision of a data mesh, Nokia used their existing Nokia Open Analytics (NOA) framework to create a low-footprint, unified analytical architecture providing complete visibility across multiple application use cases and data products.
Figure: Data products under Nokia's data mesh (Source: Importance of Data Mesh Architecture for Telcom Operators | by Adnan Khan | Medium)
Netflix began experimenting with data mesh to explore alternate avenues for increasing visibility into studio data at scale. In the entertainment industry, it’s even more important to be able to react to market changes and adapt production parameters accordingly; Netflix needed near real-time visibility into activities taking place under various business domains.
With a data mesh platform, Netflix tried to overcome existing challenges with latency in ETL pipelines, stale data, inconsistent security controls, broken entity onboarding and several other issues. By using a data mesh to streamline data movement and consume, transform and retrieve CDC (change data capture) events, they have managed to successfully address most of the pressing pain points.
Netflix has used reusable processors (units of data processing logic for CDC events) and schema evolution to keep reporting pipelines up to date with upstream data changes. With Apache Iceberg as the data warehouse sink, they have given data workers the ability to build analytical views directly on top of data tables. The data mesh platform has built-in metrics and dashboards at both the processor and the pipeline level to maintain data quality, and trackers deliver near real-time reports through downstream applications, maximizing business impact for data workers and improving data observability and change management.
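Netflix's platform runs on managed stream processing rather than Python, so the following is only a toy sketch of the 'reusable processor' idea: small, composable units of logic applied to CDC events before they land in a warehouse sink. All names are hypothetical:

```python
from typing import Callable

CdcEvent = dict
Processor = Callable[[CdcEvent], CdcEvent]

def filter_deletes(event: CdcEvent) -> CdcEvent:
    """A reusable processor: mark delete events so they never reach the sink."""
    if event["op"] == "delete":
        event["skip"] = True
    return event

def project_columns(event: CdcEvent) -> CdcEvent:
    """Another reusable processor: keep only the columns a report needs."""
    event["row"] = {k: v for k, v in event["row"].items() if k in {"id", "title"}}
    return event

def run_pipeline(events: list[CdcEvent], processors: list[Processor]) -> list[CdcEvent]:
    out = []
    for event in events:
        for p in processors:
            event = p(event)
        if not event.get("skip"):
            out.append(event)  # in the real system this would land in an Iceberg sink
    return out

events = [
    {"op": "insert", "row": {"id": 1, "title": "Show A", "internal": "x"}},
    {"op": "delete", "row": {"id": 2, "title": "Show B"}},
]
print(run_pipeline(events, [filter_deletes, project_columns]))
# -> [{'op': 'insert', 'row': {'id': 1, 'title': 'Show A'}}]
```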
Since their initial successes with data mesh, Netflix has further increased its scope with a growing number of applications. You can read more about it here.
While the role of data mesh in shaping the future of data architecture is still unclear, it is well positioned to become the data management framework of choice for businesses working with enormous amounts of data. Successful implementation, however, requires considerable expertise and a large data engineering team. For organizations that don't need to reckon with extensive data volumes for day-to-day operational reporting, data integration and data virtualization are preferable frameworks in terms of both complexity and cost.
When it comes to data, it's difficult for most businesses to evaluate their maturity level, especially if the existing data architecture is siloed. Large-scale procedural and structural changes require a solid foundation and a comprehensive roadmap, so it's essential to understand where you currently stand.
While implementing and following data mesh best practices will eventually give you more ownership of your data and make you more self-sufficient, you should consider a data lake or data warehouse for your OLAP systems if:
However, you should consider upgrading to a data mesh if:
In our experience, most businesses can solve the majority of their immediate problems with an outcome-oriented approach to data integration. Data mesh strategy and execution require a longer timeline, stakeholder buy-in and comprehensive resources. In most cases, data virtualization can bring speed and scale to business needs while giving organizations time to prepare a roadmap towards a distributed data architecture in the long run.
It's a good idea to consult a data specialist to figure out where your business falls within the data management spectrum. The clearer your picture, the easier it will be to ascertain what kind of solutions would be a good fit for your business data needs.