We’ve only scratched the surface of the full potential for the data warehouse
Why I think the data warehouse will become the control centre for modern companies
It may feel like we’re at the peak for the data warehouse. Data teams are approaching 50% of engineering team size in some companies, Snowflake’s revenue has grown more than 100% over the last year and the modern data stack is now a commonly used term.
I think we’ve only just scratched the surface.
Sure, market researchers project the data warehouse market to grow 34% each year until it reaches $39b in 2026, but I think the long-term potential is way, way bigger.
The data warehouse will be the control centre for companies in the future. It will expand from analytics and become the core of sales, operations, finance and much more.
The data warehouse will transition from being thought of as a tool for business intelligence to being at the core of everything companies do. Here’s how I think it will play out (in fact, it’s already starting to happen):
Business intelligence: The data warehouse is used to help companies have one place to analyse their data.
Operational tools: Data will be sent to all operational tools and new internal tools will get built directly on top of the data warehouse.
Sales & marketing: With the rise of product-led growth, the data warehouse will be right at the centre for go-to-market teams.
Finance: The data warehouse will be trusted for data that has to be 100% accurate, and finance and accounting teams will become some of its most active users.
Everything: When all these pieces come together the data warehouse will be the centrepiece of any modern company.
Let’s look into how this will happen.
Five phases of the data warehouse
1. Business intelligence - the data warehouse is mainly used for company dashboards and analytics
This is the era we still mostly live in. Most companies have been convinced of the value of data (that is why you are constantly being asked to do a “quick data pull”).
Coincidentally, this is also how big companies brand their data warehousing products today:
Accelerate your time to insights with fast, easy, and secure cloud data warehousing at scale - AWS Redshift
Power business decisions from data across clouds with a flexible, multicloud analytics solution - Google BigQuery
This is an important purpose and we still have a lot of unsolved problems. However, large pieces of the puzzle are little by little starting to fall into place. Nobody describes these problems better than Benn Stancil in his post The Modern Data Experience:
“…It’s trying to figure out why growth is slowing before tomorrow’s board meeting; it’s getting everyone to agree to the quarterly revenue numbers when different tools and dashboards say different things; it’s sharing product usage data with a customer and them telling you their active user list somehow includes people who left the company six months ago; it’s an angry Slack message from the CEO saying their daily progress report is broken again.”
2. Operational tools - the data warehouse makes it into every part of the company
Led by Census (2018) and Hightouch (2018), this era has begun by making it easy to get data from the data warehouse right into the tools where people work every day, such as Salesforce, Marketo and HubSpot.
This is a good start. If you work in sales you’re much more likely to act on an insight if it’s surfaced directly in Salesforce when you’re on the phone to a customer.
But this is just the beginning.
Companies I’ve spoken to are already building out entire products on top of the data warehouse. Whether it’s running a service to decide which cars to inspect at what time or powering a customer survey engine, this is starting to happen.
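As a rough illustration of what getting data from the warehouse into an operational tool involves, here is a minimal sketch. It is not how Census or Hightouch work internally. SQLite stands in for the warehouse, and the table name, column names, payload shape and the `send` callable are all hypothetical.

```python
import sqlite3

# Stand-in for the data warehouse: an in-memory SQLite database.
# The account_usage table and its columns are made up for this example.
warehouse = sqlite3.connect(":memory:")
warehouse.executescript("""
    CREATE TABLE account_usage (
        account_id TEXT PRIMARY KEY,
        active_users_30d INTEGER,
        last_seen TEXT
    );
    INSERT INTO account_usage VALUES
        ('acme', 42, '2022-01-10'),
        ('globex', 3, '2021-11-02');
""")


def build_crm_updates(conn):
    """Turn warehouse rows into one update payload per account."""
    rows = conn.execute(
        "SELECT account_id, active_users_30d, last_seen "
        "FROM account_usage ORDER BY account_id"
    ).fetchall()
    return [
        {
            "external_id": account_id,
            "fields": {
                "active_users_30d": active_users,
                "last_seen": last_seen,
            },
        }
        for account_id, active_users, last_seen in rows
    ]


def sync_to_crm(updates, send):
    """Push each payload using `send`, a hypothetical CRM client call."""
    for update in updates:
        send(update)
    return len(updates)


# Here `send` just appends to a list; a real sync would call the CRM's API.
updates = build_crm_updates(warehouse)
sent = []
synced_count = sync_to_crm(updates, sent.append)
```

Passing `send` in as a parameter keeps the warehouse query separate from the destination, so the same payloads could in principle go to more than one tool.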
3. Sales & marketing - the data warehouse will be the invisible hand for go-to-market teams
Product-led growth (PLG) is a go-to-market strategy where the main driver for lead generation and selling is the self-serve product. The core component of PLG is the data warehouse, which links product usage data to an understanding of how customers use and get value from the product.
Slack, Atlassian, Dropbox and many more are already bought in. Their sales teams get more qualified leads so they can sell larger contracts, drive more expansion revenue, focus on the right leads and give customers more value.
We’re still in the early days of PLG and this is just one example of how the data warehouse becomes the centrepiece for go-to-market teams.
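To make the link between product usage and qualified leads a bit more concrete, here is a minimal sketch of scoring accounts on usage data that would typically live in the warehouse. The fields, weights and threshold are assumptions chosen purely for illustration, not a recommended scoring model.

```python
from dataclasses import dataclass


@dataclass
class AccountUsage:
    # Hypothetical per-account usage summary, e.g. one row of a warehouse table.
    account: str
    active_users: int
    seats_purchased: int
    weekly_sessions: int


def expansion_score(usage: AccountUsage) -> float:
    """Illustrative score in [0, 1] mixing seat utilisation and engagement."""
    utilisation = min(usage.active_users / max(usage.seats_purchased, 1), 1.0)
    # Assumption: 50 sessions per week counts as fully engaged.
    engagement = min(usage.weekly_sessions / 50, 1.0)
    return round(0.6 * utilisation + 0.4 * engagement, 2)


def qualified_leads(accounts, threshold=0.7):
    """Return (account, score) pairs at or above the threshold, best first."""
    scored = [(a.account, expansion_score(a)) for a in accounts]
    return sorted(
        (s for s in scored if s[1] >= threshold),
        key=lambda s: s[1],
        reverse=True,
    )


accounts = [
    AccountUsage("acme", active_users=9, seats_purchased=10, weekly_sessions=40),
    AccountUsage("globex", active_users=2, seats_purchased=10, weekly_sessions=5),
    AccountUsage("initech", active_users=10, seats_purchased=10, weekly_sessions=60),
]
leads = qualified_leads(accounts)
```

In this made-up data, the two heavily used accounts clear the threshold and the barely used one does not, which is the kind of signal a sales team could act on.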
4. Finance - the data warehouse will become the preferred tool for finance teams
Most companies I’ve spoken to have some data that just has to be accurate. This can be data around how customers are billed, usage patterns or data that is sent to regulators. Today, the data warehouse is rarely trusted for data that needs to be 100% accurate.
It’s not uncommon to see companies building a separate production system for this type of data, spending a lot of developer time and making the same common mistakes.
This will change.
The data warehouse will some day have the right features to help ensure data is reliable and consistent, just as it would be in any other production-grade system. Data teams will be able to encapsulate parts of the data warehouse that should have stricter rules around changing metrics and dimensions.
Finance and accounting teams will become some of the most frequent users of the data warehouse. As a consequence their roles will change drastically and they will be empowered to focus more on insights rather than data input and accuracy.
5. Everything - the data warehouse will become the core of modern companies
The data warehouse already collects data from all sources. It will soon be able to send that data to any tool you want, and operational and sales & marketing teams will use it as the core decision engine for how they work. Finance teams will be able to trust the data and start pushing to move away from spreadsheets to the data warehouse. What happens in this world?
Will companies that don’t invest in data and their data warehouse be unable to compete?
Will 90% of accounting work as it looks today be automated without manual inputs needed from an accountant?
Will SAP just become an interface?
That’s the path I think we’re on.
How do we get there?
There are still some building blocks that need to fall into place before the data warehouse will reach its full potential.
Data teams need to adopt best practices from software engineering
The best data teams have already started to use testing, version control for code changes and documentation of data in data catalogues and dbt. They’ve also started to adopt ideas from the engineering way of working culturally: data people are often embedded in product teams, companies are hiring data product managers and data is thought of less as a service function. More companies need to follow this path.
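As a small example of what testing data can look like, here is a sketch of checks in the spirit of the uniqueness and not-null tests many teams run. Each check is a query that returns the offending rows, so an empty result means the check passes. SQLite stands in for the warehouse, and the `orders` table and its contents are invented for the example.

```python
import sqlite3

# Hypothetical orders table with two deliberate problems:
# a duplicated order_id and a missing customer_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'a', 10.0),
        (2, 'b', 25.5),
        (2, 'b', 25.5),
        (3, NULL, 7.0);
""")

# Each test is a query that selects rows violating the expectation.
TESTS = {
    "order_id is unique":
        "SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1",
    "customer_id is not null":
        "SELECT order_id FROM orders WHERE customer_id IS NULL",
    "amount is non-negative":
        "SELECT order_id FROM orders WHERE amount < 0",
}


def run_tests(connection, tests):
    """Run every test query and map test name to its failing rows."""
    return {name: connection.execute(sql).fetchall() for name, sql in tests.items()}


results = run_tests(conn, TESTS)
failed = [name for name, rows in results.items() if rows]
```

Because the tests are plain text, they can sit in version control next to the transformations they guard and run on every change.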
Metrics need to become first-class citizens
We’ve seen the rise of a few companies in the metric store space but with dbt announcing metrics at Coalesce in December, metrics are on their way to becoming mainstream. This is much needed and tackles some important problems. Metrics will be version controlled and there will be clear governance around how changes are made. They will be snapshotted so if you want to replicate a metric you created two years ago you can do that down to the last decimal. There will be one clear place to define metrics and they will no longer be defined in inconsistent ways all over the data stack. And data teams will be able to expose metrics to all other tools through a simple and consistent API.
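To illustrate the idea of versioned metrics with a single place of definition, here is a minimal sketch of a registry that keeps every version of a metric and can return the definition that was in force on a given date. This is not dbt’s metrics feature or any metric store’s API. The class names, fields and SQL snippets are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MetricVersion:
    name: str
    version: int
    valid_from: date
    sql: str  # the definition, kept as text so it can be version controlled


class MetricRegistry:
    """Single place to define metrics; old versions are never overwritten."""

    def __init__(self):
        self._versions = {}

    def register(self, name, valid_from, sql):
        history = self._versions.setdefault(name, [])
        if history and valid_from <= history[-1].valid_from:
            raise ValueError("a new version must start after the previous one")
        metric = MetricVersion(name, len(history) + 1, valid_from, sql)
        history.append(metric)
        return metric

    def as_of(self, name, when):
        """Return the definition that was in force on the date `when`."""
        candidates = [v for v in self._versions[name] if v.valid_from <= when]
        if not candidates:
            raise LookupError(f"{name} was not defined on {when}")
        return candidates[-1]


registry = MetricRegistry()
registry.register(
    "active_users", date(2020, 1, 1),
    "SELECT COUNT(DISTINCT user_id) FROM events",
)
registry.register(
    "active_users", date(2021, 7, 1),
    "SELECT COUNT(DISTINCT user_id) FROM events WHERE is_internal = 0",
)

old_definition = registry.as_of("active_users", date(2020, 6, 1))
current_definition = registry.as_of("active_users", date(2022, 1, 1))
```

Looking up a definition by date is what makes it possible to recompute a two-year-old number with the logic that applied at the time, provided the underlying data has been snapshotted as well.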
Product teams will need to own the data from systems they own
The data mesh article called out some very real challenges around data ownership. Most product teams are working in decentralised, smaller teams and the push towards having product teams own the data they produce needs to continue. Engineers will up-skill on data and there will be key metrics around data quality and simpler ways to understand the health of the data assets owned by product teams.
Data teams need full visibility into how data flows across all systems
As the data warehouse expands in scope, in all directions and to all teams, there need to be easier and more robust ways of staying on top of everything that goes on: a switchboard that gives just the right level of information from every tool in the data stack.
The role of the analytics leader at the leadership table will change
Once the data warehouse is at the centre of the company the role of the analytics leader will change. This is not just about the person at the top but also how data teams are structured.
“Most data teams aren’t set up for success. For many years, data teams have been buried in the IT function. Like IT functions, those data teams handled getting data out of their systems and presenting them to the stakeholder as CSVs from which the stakeholders could work their magic and come up with conclusions.” - Brian Offut
What excites me about all the things that need to happen is that many of them are already happening. The best companies are inventing their own tools and investing heavily in their data teams, and the data community is excellent and full of friendly people making all this possible.
If you have some ideas, let me know!
From my experience, no. 2 is the most powerful one, and the team that can deliver business value on no. 2 will gain more influence as it delivers. I have a different term for it though: I call it embedded analytics. As for no. 4, my past experience suggests that reconciling the data against the ledger is what will win the finance folks over.