Data Platform: replication and integration as a service
Background
The data platform is at the center of the organization's data-centric thinking model.
Data platform and the development costs
The data platform and other related services must be able to provide extensive support for development.
However, development is not free. What do you actually spend money on in development?
Infrastructure costs money, whether it is run on-premises or assembled from a collection of cloud services. Cloud services typically make the necessary set of services available quickly and add flexibility at the same time. They also turn capital investments into operating expenses, so a new way of thinking is needed to monitor costs.
Usually, however, the larger cost is the development work itself, whether it is purchased as a service or paid as labor costs. How could the cost of development services be reduced, or partly converted into infrastructure costs?
Two interesting possibilities are:
- Replication
- Integration as a service
Replication
Replication is simply the copying of data from a source to a destination. A few mechanisms are listed below:
- Copy and upsert
  - Small data loads
- Incremental load
  - Change-indicator selection
- Log replication / CDC
  - Utilisation of the database transaction log
- Provider – subscriber
  - PostgreSQL Londiste (Skype)
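As a rough illustration of the first two mechanisms combined, the sketch below selects rows by a change indicator and upserts them into the destination. It is a minimal example under stated assumptions, not a description of any particular product: the `customer` table, its `updated_at` column, and the function `replicate_incremental` are hypothetical, and two in-memory SQLite databases (SQLite 3.24 or newer, for `ON CONFLICT`) stand in for the real source and destination.

```python
import sqlite3

def make_db():
    # Hypothetical table; updated_at plays the role of the change indicator.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER)"
    )
    return conn

def replicate_incremental(src, dst, last_seen):
    """Copy rows changed after `last_seen` from src to dst; return the new watermark."""
    rows = src.execute(
        "SELECT id, name, updated_at FROM customer "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_seen,),
    ).fetchall()
    # Upsert: insert new keys, overwrite rows whose key already exists.
    dst.executemany(
        "INSERT INTO customer (id, name, updated_at) VALUES (?, ?, ?) "
        "ON CONFLICT(id) DO UPDATE SET "
        "name = excluded.name, updated_at = excluded.updated_at",
        rows,
    )
    dst.commit()
    # The highest change indicator seen becomes the starting point of the next run.
    return rows[-1][2] if rows else last_seen

source, destination = make_db(), make_db()
source.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)", [(1, "Alice", 10), (2, "Bob", 20)]
)
watermark = replicate_incremental(source, destination, 0)  # first run copies both rows

source.execute("UPDATE customer SET name = 'Bobby', updated_at = 30 WHERE id = 2")
watermark = replicate_incremental(source, destination, watermark)  # copies only id 2
```

Storing the returned watermark between runs is what keeps each load small. One caveat of this simple form is that a strict `>` comparison can skip rows that share the watermark value but are written later, so real implementations often re-read a small overlap window.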
The benefits are often that data volumes can be smaller than in traditional integration, near-real-time delivery can be achieved, and the CDC mechanism can place only a light load on the source database.
Things that cause problems or costs include:
- May require special measures from the DBA team or ICT
- If synchronization fails, it has to be restarted and you have to find out what was missed
- Does one product cover a sufficient number of use cases?
- In the case of file integrations, batch processing may be a simpler solution
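Finding out what was missed after a failed synchronization can be sketched as a comparison of keys and change indicators between the two sides. This is only an illustrative sketch: the function `find_missed` and its inputs are hypothetical, and it assumes that both systems can export a mapping from primary key to a version number or timestamp.

```python
def find_missed(source_rows, dest_rows):
    """Return (missing, stale) primary keys in the destination.

    Both arguments map primary key -> change indicator
    (for example a version number or a timestamp).
    """
    # Keys that never arrived at the destination.
    missing = sorted(k for k in source_rows if k not in dest_rows)
    # Keys that arrived, but in an older version than the source now holds.
    stale = sorted(
        k for k, version in source_rows.items()
        if k in dest_rows and dest_rows[k] < version
    )
    return missing, stale

missing, stale = find_missed({1: 10, 2: 30, 3: 40}, {1: 10, 2: 20})
```

Here key 3 is reported as missing and key 2 as stale. For large tables a full comparison like this is expensive, which is part of the cost that the point about failed synchronization refers to.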
Traditionally, replication products for on-premises environments have been expensive. In addition, the built-in replication features of individual database products have not necessarily supported replicating data from one type of system to another.
Integration as a service
Using integration as a service means that the cost of your own development work is replaced by service fees. This can be reasonable considering the salary level of developers and other factors.
At best, integration as a service means that the organization takes a set of connectors, i.e. connections to data sources, into use in a plug-and-play style. The connectors only need to be configured, after which the service makes sure that the data flows from the data sources to the destination.
Integration as a service saves time: the organization can immediately combine, for example, its marketing data source with the transaction data from its payment service. It is then possible to examine right away which campaigns produced the best returns for the money spent.
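Once both data sets are in the same place, the campaign comparison is a small calculation. The sketch below is a hypothetical example with made-up figures: the function `return_per_campaign` and the row layouts are assumptions, and it presumes that each payment can already be attributed to a campaign.

```python
from collections import defaultdict

def return_per_campaign(spend_rows, payment_rows):
    """Return revenue divided by spend for each campaign.

    spend_rows:   iterable of (campaign, amount_spent) from the marketing source
    payment_rows: iterable of (campaign, amount_paid) from the payment service
    """
    revenue = defaultdict(float)
    for campaign, amount in payment_rows:
        revenue[campaign] += amount
    # Campaigns without spend are skipped to avoid division by zero.
    return {c: revenue[c] / spend for c, spend in spend_rows if spend > 0}

roas = return_per_campaign(
    [("spring", 100.0), ("summer", 200.0)],
    [("spring", 250.0), ("summer", 300.0), ("spring", 50.0)],
)
```

With these illustrative numbers the spring campaign returns 3.0 per unit spent and the summer campaign 1.5.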
In addition to being a Microsoft partner, Ready Solutions Oy is a partner of the integration service Fivetran, so we can offer our customers the opportunity to use such a service.
In the next part of this series on data platforms, I will write a little more about the benefits Fivetran offers.
About the author
Asko Kauppinen is a Principal Consultant at Ready Solutions Oy and has years of experience with various data platforms.