How I Learned to Stop Worrying and Love Large Language Models

As we step into what’s next for data teams, it's not just about welcoming LLMs into our toolkit; it's about recognizing the shift they represent. It's a move towards democratizing data, breaking down barriers, and automating tasks and workflows that can drive tremendous impact.
Last updated
April 11, 2024

As the modern data stack has matured, we’ve evolved from the small kid on the block to a mature ecosystem of tools with funding and lots of users. Companies are making increased investments in data, and with them, the expectations for ROI from data teams have increased. For some data teams, this has provided the clarity needed to focus on the most important work; for others, it has boiled over into an existential crisis as teams struggle to justify their impact.

I’ve watched the modern data stack space evolve over the past few years, and I think a few distinct inflection points are worth discussing. From my perspective, they split the story into three periods: pre-2023, 2023, and post-2023. I’ll walk through my thoughts on each period, how it has impacted data teams, and how it should shape the way we think about the future.

2022 - Our search to find impact in data

Before 2023, when our data stacks were more immature, a hot topic at conferences was the ROI of data teams. Yet, quantifying the impact of data felt like trying to explain why "The Sopranos" is a masterpiece to someone who's never seen the show. It's not that we lacked the stories; it was more about building that bridge of trust with stakeholders, which often felt like trying to high-five a fish.

On the macro level, we were in the era of 0% interest rates and a flood of funding. It sounded like a dream, but the real puzzle was setting up a data stack that delivered real value. The enthusiasm was there, but channeling it into a streamlined, effective data operation proved to be a challenge. Up until mid-to-late 2022, data teams were in a constant uphill battle, trying to solidify their value, foster trust, and make the most of their increased resources. But things changed significantly in 2023, when interest rates jumped and funding became harder to secure. For many emerging modern data teams, it was the first time we had to make hard decisions about our operations.

2023 - Cutting costs and shifting priorities

In 2023, the game changed for data teams. Gone were the days of the 0% interest rate economy. With tighter budgets, proving ROI became even trickier. So, data teams did what anyone does when the going gets tough: they started to cut costs, clear out the unnecessary, and make sure their impact on the P&L was as light as possible. This shift was expected and healthy, but it impeded many teams' ability to think about the bigger picture.

Just as data teams were tightening their belts, the world of Large Language Models (LLMs) hit the gas pedal on innovation. In what felt like a blink of an eye, LLMs evolved from being fancy text predictors to invaluable sidekicks for developers, writers, and motion designers. And this is just the beginning. 

We're on the brink of seeing LLMs take giant leaps in real-world applications. Imagine LLMs that not only understand context but remember your history, kickstart workflows, and even collaborate with other LLMs. The one-year progress from a buggy image generator to models that can create hyperrealistic images has been outstanding. So, while 2023 might have been a year of recalibration for data teams, it also marked the dawn of a thrilling era for data through LLMs. Going into 2024, LLMs should give us a chance to reshape the narrative on the impact data teams have on company goals.

Midjourney 1-year progress

2024 - A new chapter in the modern data stack

As the economic tides begin to turn, data teams should feel a fresh wave of optimism about the future. Historically, data teams have been the unsung heroes, not because they're not hitting home runs but because sometimes the crowd doesn't quite get the game they're playing. 

LLMs have a chance to change that. Imagine a world where your ability to ask the right questions is the only barrier between you and insights, not your proficiency in SQL or coding. Skeptics might say, "But wait, can these LLMs really get SQL queries right?" It's a fair question, but even if only 30% of queries are solved through AI, isn’t that worth the investment? Speaking from my experience using Copilot, it’s okay that AI only gets the right answer 30% of the time, because it provides so much value by speeding up the basic workflows that used to take up more time.
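To make that 30% argument concrete, here is a rough back-of-the-envelope sketch. The function names and every number in it are illustrative assumptions, not measurements: it assumes each AI draft costs a couple of minutes to review, and that a wrong draft simply means writing the query by hand as before.

```python
def expected_minutes_saved(hit_rate, manual_minutes, review_minutes):
    """Expected minutes saved per query when an assistant drafts the SQL first.

    Assumptions (hypothetical): every draft costs `review_minutes` to check;
    with probability `hit_rate` the draft is right and the manual work is
    skipped; otherwise the query is still written by hand.
    """
    with_ai = review_minutes + (1 - hit_rate) * manual_minutes
    return manual_minutes - with_ai


def break_even_hit_rate(manual_minutes, review_minutes):
    """Hit rate above which drafting with AI saves time under the same assumptions."""
    return review_minutes / manual_minutes


# Illustrative numbers only: a 20-minute query, 2 minutes to review a draft.
saved = expected_minutes_saved(hit_rate=0.3, manual_minutes=20, review_minutes=2)
threshold = break_even_hit_rate(manual_minutes=20, review_minutes=2)
print(f"saved per query: {saved:.1f} min, break-even hit rate: {threshold:.0%}")
```

Under these assumed numbers, the assistant only has to be right more often than the review cost divided by the manual cost, which is why a 30% hit rate can still come out ahead.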

So, as we step into what’s next for data teams, it's not just about welcoming LLMs into our toolkit; it's about recognizing the shift they represent. It's a move towards democratizing data, breaking down barriers, and automating tasks and workflows that can drive tremendous impact on the P&L. And for data teams? It's their chance to pivot from being seen as mere number crunchers to becoming central playmakers. By leveraging AI, data teams can start justifying the investment in infrastructure, storage, and compute.

The modern data stack is entering a new era of importance.

The shift of workloads to the cloud has paved the way for more advanced developments, particularly in the areas of AI and large language models (LLMs). The cornerstone of any successful AI project is a robust and well-integrated data platform, which data teams have been working towards for years.

Our seat at the table  

In the grand scheme of things, this isn't just about keeping up; it's about setting the stage for some seriously cool developments down the line. The modern data stack is going to power the next wave of innovation around data, making the foundation we've been building upon worth the time and effort. Now, with the LLM era upon us, all that groundwork is about to pay off in ways we're just beginning to explore. For a team to reach a place where AI can be useful, some steps need to be taken, in large part by the data team. Taylor Brown, the COO of Fivetran, highlighted the following steps in their “AI readiness” roadmap for teams that want to leverage AI. The steps are:

  1. Identify all data sources. This includes everything from databases and cloud storage to emails and documents. Engage with various departments to uncover unique data they might be using.
  2. Organize and categorize your data. Document what types of data you have and classify them based on importance, sensitivity, and regulatory compliance.
  3. Evaluate data quality. It's crucial to assess the accuracy, completeness, and reliability of your data. High-quality data is essential for the success of AI projects.
  4. Document data access and usage. Understanding who accesses the data and for what purpose helps in managing dependencies and avoiding bottlenecks.
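
As a minimal sketch of what the four steps above could look like in practice, here is a small data inventory in Python. The `DataSource` class, its fields, the sensitivity labels, and the sample records are all hypothetical, made up for illustration; they are not part of Fivetran's roadmap or any specific tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataSource:
    name: str          # step 1: the source itself
    department: str    # step 1: the team that surfaced it
    sensitivity: str   # step 2: assumed labels, e.g. "public", "internal", "restricted"
    consumers: list[str] = field(default_factory=list)  # step 4: who uses it, and for what


def completeness(rows, required_fields):
    """Step 3: share of rows where every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) not in (None, "") for f in required_fields) for row in rows
    )
    return complete / len(rows)


# Hypothetical inventory and sample rows, for illustration only.
inventory = [
    DataSource("crm_accounts", "Sales", "restricted", ["churn model", "pipeline report"]),
    DataSource("support_emails", "Support", "internal"),
]
sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": None},
    {"id": 4, "email": "d@example.com"},
]

score = completeness(sample, ["id", "email"])
undocumented = [s.name for s in inventory if not s.consumers]
print(f"completeness: {score:.0%}, sources with no documented usage: {undocumented}")
```

Even a lightweight inventory like this makes the gaps visible: sources nobody has classified, tables with poor completeness, and data with no documented consumers.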

By laying a solid data foundation, we enable accelerated AI development and ensure that our projects are secure, reliable, and impactful for the rest of the business. For data teams struggling to justify ROI, this should be embraced as the holy grail of value that we can provide. It gives data teams another path to a direct impact on the P&L.

At Secoda, we believe that with a well-architected modern data stack, data teams can pave the way for AI initiatives that are not just innovative but also reliable, secure, and immensely impactful on the bottom line. As a function that has struggled so much with its identity over the past few years, we should be welcoming AI with open arms as the catalyst that transforms us from reactive analysts into proactive change-makers with the power to shape the future. Isn’t this what we always wanted?
