Effective Information Management using a Modern Data Stack
The value promised by data warehousing projects is often lost to these common factors:
- Planning the required hardware from estimated load and usage is often a thumb-suck exercise, and it requires significant upfront capital expenditure. All too often, data requirements are curbed to reduce the storage and computing power needed. The opposite approach is also common: high-spec servers are acquired to handle the heavy load that occurs on a few days of the month, only to sit idle for the remainder of the time.
- Finding and retaining the database administration skills needed to ensure the data is properly indexed and readily available when needed is challenging and costly.
- Developing the processes to extract, transform and load the data into a data warehouse can be costly and extremely time-consuming.
- Maintaining these processes also requires a lot of effort, especially when critical processes go wrong due to changes to the source systems or hardware limitations.
- The impressive visualisations and insights that were sold to you when you purchased your BI tool of choice become a distant memory when you have to wait three to six months (if you’re lucky) to see the results, and even then you can only hope the results are what you were expecting.
With a modern data stack, companies can quickly realise value from their data initiatives. In a modern data warehousing approach, the traditional ETL (extract, transform and load) process gives way to a faster, more agile ELT (extract, load and transform) approach: all data is loaded first, and the relevant data is then transformed into useful information inside the warehouse.
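As a rough illustration of the difference in ordering, the sketch below runs both approaches against the same small data set. It is a minimal example under stated assumptions, not how Fivetran or any particular warehouse works: an in-memory SQLite database stands in for the warehouse, and the order records, table names and function names are all hypothetical.

```python
import sqlite3

# Hypothetical source records; the field names are illustrative only.
SOURCE_ROWS = [
    {"order_id": 1, "customer": "acme", "amount_cents": 1250, "status": "paid"},
    {"order_id": 2, "customer": "acme", "amount_cents": 800, "status": "refunded"},
    {"order_id": 3, "customer": "globex", "amount_cents": 4000, "status": "paid"},
]


def run_etl(conn, rows):
    """ETL: transform outside the warehouse, then load only the result."""
    totals = {}
    for row in rows:
        if row["status"] == "paid":
            totals[row["customer"]] = totals.get(row["customer"], 0) + row["amount_cents"]
    conn.execute("CREATE TABLE etl_revenue (customer TEXT, revenue_cents INTEGER)")
    conn.executemany("INSERT INTO etl_revenue VALUES (?, ?)", sorted(totals.items()))


def run_elt(conn, rows):
    """ELT: load every raw row first, then transform inside the warehouse with SQL."""
    conn.execute(
        "CREATE TABLE raw_orders "
        "(order_id INTEGER, customer TEXT, amount_cents INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_orders VALUES (:order_id, :customer, :amount_cents, :status)",
        rows,
    )
    # The transformation is a view over the raw data, so it can be changed
    # later without reloading anything from the source system.
    conn.execute(
        "CREATE VIEW elt_revenue AS "
        "SELECT customer, SUM(amount_cents) AS revenue_cents "
        "FROM raw_orders WHERE status = 'paid' GROUP BY customer"
    )


def fetch_revenue(conn, table):
    """Read the per-customer revenue from one of the two result tables."""
    query = f"SELECT customer, revenue_cents FROM {table} ORDER BY customer"
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # stand-in for a cloud data warehouse
run_etl(conn, SOURCE_ROWS)
run_elt(conn, SOURCE_ROWS)

etl_result = fetch_revenue(conn, "etl_revenue")
elt_result = fetch_revenue(conn, "elt_revenue")
raw_count = conn.execute("SELECT COUNT(*) FROM raw_orders").fetchone()[0]

print(etl_result)
print(elt_result)
print(raw_count)
```

Both paths produce the same revenue figures here. The difference is that the ELT path also keeps the raw rows in the warehouse, including the refunded order that the ETL path discarded before loading, so a new question about refunds could be answered with another query rather than a new extraction.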
The diagram below illustrates the difference between the two approaches (courtesy of Fivetran).
Tools such as Fivetran save companies time and money by significantly reducing the time it takes to load data. Companies can now focus on data analysis rather than on data preparation and loading. At the same time, high-performance cloud data warehouses such as Snowflake and BigQuery have significantly reduced the cost of storage, removing the limitation of storing only the data you are currently using.
Achieving true data-driven decision making has just become easier.