Champion The Shift Towards A Collaborative Data Culture

March 1, 2021

I still remember the first time I signed up for Facebook. My friends and I wanted to share photos of a recent week-long trip we had taken and decided to share them through Facebook. Someone created an album of photos and we were all able to comment, share and like the photos that we had individually taken. This was the first time I remember being mesmerized by the power of a collaborative product.

The shift towards a collaborative enterprise

Over the next decade, many similar consumer products emerged and became the expectation. On top of these consumer products, a new trend of similar products began to take over the enterprise. People like myself, who grew up with these social products, began adopting similar products at work. Today, tools like Dropbox, Slack, Notion, GitHub, Jira and Figma have become the norm when working across departments.

In my last role in business operations, I spent the majority of my day jumping between these tools to work with different team members to make decisions. And as the pandemic hit, the amount of time that I spent on these tools only increased as they became a source of truth for decision making, a social hub for different tasks and a team-oriented way to work together.

Different types of team-oriented decision-making require different levels of coordination and information sharing.

Below is a graph outlining how to think about collaboration and what tools are commonly used:

This is not meant to be an exhaustive list of collaboration tools, just a framework to understand how different tools might get used in organizations.

Of the tools presented in the graph above, our team used the following:

  • Notion (Asynchronous, function agnostic tool): We used Notion to document most decisions. Notion became our hub for product roadmaps, memos, how-to’s etc. Teams would write docs in Notion and ask for asynchronous feedback on the decisions.
  • Slack (Synchronous, function agnostic tool): Slack was used to centralize information in an ad hoc way. We had dedicated channels for communicating about different departments or projects.
  • Figma (Asynchronous, design specific tool): Figma was the tool we used to make decisions about website changes and product design. Teams would comment asynchronously after the designer put the work on paper.
  • Mode (Asynchronous, data visualization specific tool): Mode was a place to view dashboards created by the data team. These dashboards were built to measure performance or to answer a question that needed validation. More on this later.

A functional collaboration tool should give all employees who need access to information an easy way to find and understand that information.

The role of analytics in the upcoming decade

It’s generally common knowledge that enterprises are collecting more data than ever before. This is true of even small companies that have hundreds of tables and visualizations stored in data warehouses and visualization tools. Most businesses have started to attempt to derive business insights from their proprietary data sources. Product, marketing, and operations teams are expected to make data-driven decisions that demonstrate business value. This expectation will only increase as the amount of data collected grows and the cost of analyzing data decreases.

To create these insights, organizations rely on employees who can understand the data and extract value from it using SQL. Some organizations consist of one core team of analysts and data scientists who are the driving force for how analytics will be run throughout the rest of the organization.

As an alternative to the centralized approach, more organizations are now adopting a decentralized data team. In these organizations, each department delivers its own projects and functions and is supported by analytics throughout the process. Data analysis is not the responsibility of a single data team. Instead, anyone can self-serve their information to get the specific answers needed.

Data teams have adapted to the requests for decentralized analysis and self-service by building solutions that depend on the technical aptitude of the requester. Below are the types of personas that are found in organizations:

  • Level 1: This type of employee is completely competent in their ability to analyze data through SQL and just needs access to the right information.
  • Level 2: This type of employee is confident in their data analysis abilities on basic queries, but could use some help with more complex queries.
  • Level 3: This type of employee is not confident with their SQL skills, but they know how to use Excel to analyze data.
  • Level 4: This type of employee is not confident with their data analysis skills and just wants to see graphs with final results.

In the upcoming decade, more teams will move towards a decentralized approach to reduce the time it takes to access information about the business.

How do teams collaborate with analytics today?

How much teams collaborate on data depends on the competency and literacy that different employees have with data.

  • Level 1: This type of employee will store old SQL queries in Git / GitHub or their data visualization tool, document their work in Confluence or in ad hoc conversations on Slack, and view different tables in DataGrip or Jupyter.
  • Level 2: This type of employee will use DataGrip or Jupyter Notebook to write basic queries. They will ask the data team for information about tables on Slack or in Zoom meetings and will visualize their work in their data visualization tool.
  • Level 3: This employee will use the data visualization tool to search for different tables and Slack to ask questions about tables or specific numbers. Sometimes, this employee will ask for a way to download a spreadsheet of the data to run their analysis in Google Sheets or use a tool like Google Analytics.
  • Level 4: This employee will use the data visualization tool to search for different tables and Slack to ask questions about tables or specific numbers.

The modern data collaboration stack is scattered across different warehouses, BI tools, SQL queries and reports that live in completely different tools. Additionally, the modern data stack relies heavily on context about tables or visualizations shared through Slack or Zoom meetings. This context is difficult to find months down the line, which usually causes the same questions to resurface.

For example, imagine you’re an employee categorized by Level 2 data competency. Below are the steps you would take to get to a data-driven answer:

From start to finish, it can take up to 30 days to get insights on a question from the data team.

The collaborative part of the process described above traditionally happens through Slack, Zoom and Confluence. These tools weren't built with data collaboration in mind. Because data documentation, data visualization, data discovery and collaboration are all conducted in separate tools, information is lost and teams spend weeks collaborating over a single metric. According to a McKinsey report, employees spend 1.8 hours every day, or 9.3 hours per week on average, searching for and gathering information.

Some tools, like Looker, have made portions of this process a little more efficient through their extensive LookML layer. That being said, there is often still confusion around table names, definitions, common queries or the relevance of different information in these data visualization tools. Confluence documents and ad hoc Slack conversations were not built to be function-specific tools for data understanding and analysis. Relying on them leaves information about tables, visualizations and queries missing or outdated around the organization and creates data debt.

What’s missing from the traditional data collaboration stack?

The one missing piece from today’s analytics stack is a social way for everyone to easily search and understand data. We believe that this tool should contain a repository of all the tables, visualizations, pipelines, raw data and queries across the organization. The ideal interface would make these resources easily searchable through text. Tools like Amundsen and internal data tools at Shopify, Uber, Facebook and Airbnb have all taken a similar approach to data discovery, making data context available in one central place. We believe these data discovery tools are the missing link in the modern data stack.
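To make the idea of a text-searchable repository concrete, here is a minimal sketch in Python. It is not how Secoda, Amundsen or any of the internal tools mentioned above work; the `Resource` class, the `search` function and the sample catalog entries are all hypothetical names invented for illustration. It assumes a resource matches when every search term appears somewhere in its name, description or tags.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """One entry in a hypothetical data catalog (illustrative only)."""
    name: str
    kind: str  # e.g. "table", "dashboard", "query"
    description: str = ""
    tags: set[str] = field(default_factory=set)


def search(catalog: list[Resource], text: str) -> list[Resource]:
    """Return resources whose name, description or tags contain every search term."""
    terms = text.lower().split()
    results = []
    for res in catalog:
        # Combine all searchable text for this resource into one lowercase string.
        haystack = " ".join([res.name, res.description, *sorted(res.tags)]).lower()
        if all(term in haystack for term in terms):
            results.append(res)
    return results


# Made-up sample catalog, not real company data.
catalog = [
    Resource("orders", "table", "One row per customer order", {"sales"}),
    Resource("weekly_revenue", "dashboard", "Revenue by week", {"sales", "finance"}),
    Resource("signup_funnel", "query", "Signup conversion steps", {"growth"}),
]

matches = [r.name for r in search(catalog, "sales")]
print(matches)
```

A real discovery tool would pull this metadata from the warehouse and BI tools automatically and rank results, but even this simple substring match shows why tags and descriptions matter: they are what makes a table findable by someone who does not already know its name.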

We also believe that a good data discovery tool should be a collaboration tool. This would mean that each table can carry the social context that today lives in Confluence docs and Slack conversations. Today, some data discovery tools have commenting, notifications and tagging features, but they haven't embraced the collaborative features that tools like Notion and Figma have built.

We are building Secoda to be a novel automated and collaborative data discovery platform. Employees will be able to use the following features in our tool to create one central knowledge base for all company analytics.

  • Granular Threads: Rather than chatting about a table on Slack, each data resource will have its own thread that allows employees to keep an organized repository of shared information about a table.
  • Resource tagging: Employees will be able to tag tables with specific tags that are searchable through a text-based interface.
  • Employee tagging: Employees will be able to @ other employees who might know the answer to a specific question related to a table.
  • Data dictionary tagging: Admins will be able to highlight and tag plain text to add a data definition that will be viewable whenever someone hovers over the text.
  • Column comments: Each column will get its own comment section that can help clarify any information about the column.
  • Team Spaces: Each team can create a space of commonly used tables and team members. This way, new employees can easily explore the commonly used tables for their roles.
  • SQL verifier: Employees will be able to submit SQL queries in the data discovery tool for checking by a “verified” admin.
  • Submit a request: Employees will be able to submit a data request to their data team through a single space. Once the question is answered, it will become searchable to all employees.
  • Owned by sections: Each employee will get a profile with their commonly used tables, owned tables, commonly used dashboards and owned dashboards. Employees who leave the company will keep their profile so other employees can see who they should contact to get the information needed about a data asset.
  • Verified tables: Admins will be able to verify tables to increase trustworthiness.
  • Favourited tables: All employees can favourite tables and dashboards that appeal to them.
  • Slack Integration: Employees will be able to add comments to tables without ever leaving their Slack workspace.
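As a rough illustration of how per-table threads and employee tagging could fit together, here is a small Python sketch. It is an assumption-laden toy, not Secoda's implementation: the `Comment` and `TableThread` classes, the `@name` mention syntax and the sample usernames are all hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# Assumption: a mention is "@" followed by letters, digits or underscores.
MENTION = re.compile(r"@(\w+)")


@dataclass
class Comment:
    author: str
    text: str

    def mentions(self) -> list[str]:
        """Return the usernames tagged with @ in this comment, in order."""
        return MENTION.findall(self.text)


@dataclass
class TableThread:
    """All discussion about one table, kept next to the table itself."""
    table: str
    comments: list[Comment] = field(default_factory=list)

    def add(self, author: str, text: str) -> list[str]:
        """Store a comment and return the usernames that should be notified."""
        comment = Comment(author, text)
        self.comments.append(comment)
        return comment.mentions()


thread = TableThread("orders")
notified = thread.add("sam", "@alex is the refund column net of tax? cc @priya")
print(notified)
```

The point of the design is that the question and its eventual answer stay attached to the `orders` table, so the next person with the same question finds the thread instead of asking again on Slack.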

How To Champion A Shift Towards A Collaborative Data Culture

Shifting towards a collaborative data-driven culture requires teams to evaluate how their existing tools give every employee the confidence they need to analyze data. Below are some steps teams should take if they are interested in shifting towards a more collaborative data-driven culture. 

  1. Evaluate your existing tools: figure out how many employees can use the existing tools to drive insights. What kind of insights can these employees generate? 
  2. Break down data silos: give employees the confidence to find what they are looking for by adopting good tools for data visualization. Our recommendation is Looker for its functional self-service LookML layer. 
  3. Capture conversations and answers: Capture important documentation about data and conversations around decision making. This could be through a data discovery tool or a Confluence doc. It depends on how much work you want to do. 
  4. Understand where employees are getting stuck: Survey employees about how easily they can find and use data. When Airbnb ran a survey like this, they found that they consistently scored really poorly on the question, “The information I need to do my job is easy to find.” Data was often siloed, inaccessible and lacking context. They adopted a data discovery tool to solve this problem.
  5. Hold workshops and answer questions: Create open spaces for people to ask their questions and adopt a culture of “no dumb questions” to make sure that every employee can ask their questions, no matter how minuscule or simple they seem. 
  6. Empower data champions: No one can do it alone. Bringing on people to champion data for different departments can help take the load off the data team. 
  7. Manage privacy for datasets through a centralized platform: using a tool to manage permissions can help you make certain data available easily. 
  8. Encourage teammates to learn SQL: Teach teammates to use SQL and work with them to figure out how to query certain tables. Store the common queries in a central place that makes it easy to onboard and explore.
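For the last step, a central place for common queries can start out very simple. The sketch below is one hypothetical way to do it in Python, using the standard library's `sqlite3` module with an in-memory database as a stand-in for a real warehouse. The `SAVED_QUERIES` dictionary, the `run_saved` helper, the `orders` table and its rows are all made up for illustration.

```python
import sqlite3

# Hypothetical shared library of named queries: name -> (description, SQL).
SAVED_QUERIES = {
    "orders_per_customer": (
        "Number of orders placed by each customer",
        "SELECT customer, COUNT(*) AS orders "
        "FROM orders GROUP BY customer ORDER BY customer",
    ),
}


def run_saved(conn: sqlite3.Connection, name: str) -> list:
    """Look up a saved query by name and run it on the given connection."""
    description, sql = SAVED_QUERIES[name]
    return conn.execute(sql).fetchall()


# In-memory database with made-up rows, standing in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ben", 5.0), ("ana", 12.5)],
)

rows = run_saved(conn, "orders_per_customer")
print(rows)
```

Keeping a plain-language description next to each query is the part that helps teammates who are still learning SQL: they can find a query by what it answers, run it, and then read the SQL to see how it works.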

What now?

We believe that a centralized, asynchronous, function agnostic data discovery tool can help all employees collaborate on data in a way that hasn’t been achieved by the existing tools. A tool like Secoda is built to break down data silos through collaboration. Our team is excited to welcome any team interested in trying out this new collaboration tool by signing up for our beta at Secoda.co.