RFP Criteria

The State of Data Catalogs 2024

Your evaluation guide to navigating the data catalog ecosystem

Evaluation Criteria

As organizations accumulate ever larger volumes of data, managing that data becomes increasingly challenging. One solution that has emerged in recent years is the data catalog: a centralized repository for metadata that describes the data assets available within an organization. Data catalogs make it easier to find, understand, and use data.

At Secoda, we believe that a data catalog should be more than just a repository of data assets. That’s why we view it as a component within our Data Enablement Platform. Our goal is to provide an easy way for data teams to update and maintain data products, while also being accessible and easy to consume for business stakeholders. Our mission is to reduce the burden on data teams to ensure that business decisions are powered by the most accurate information available.

There is no shortage of data catalogs, so the purpose of this guide is to provide a thorough comparison of the modern data catalog ecosystem and help you find the best solution for your specific needs. The guide compares legacy, open source, and modern solutions against six common considerations when evaluating a data catalog:

  1. Automation
  2. Ease of Use
  3. Data Governance & Security
  4. System Maintenance & Innovation
  5. Customizability
  6. Price

1. Automation

One of the main reasons data catalogs are proving essential to the modern data stack is that they reduce the time spent searching for the correct data.

However, the ability to discover data at this level depends heavily on well-indexed and well-maintained metadata. Some data catalogs are designed in a way that forces users to update metadata manually, meaning they must enter, sort, and clean it by hand before the data is accessible. This time-consuming work takes away from value-adding activities and leads to poor adoption of the tool. Automation, on the other hand, improves the accuracy and completeness of metadata, which helps maintain data quality. Ultimately, automated data cataloging can help organizations make better-informed decisions by ensuring that the right data is accessible to the right people at the right time.
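To make the contrast with manual entry concrete, here is a minimal sketch of automated metadata ingestion. It is not any vendor's implementation: it assumes a SQLite database as a stand-in for a warehouse, and the function name `harvest_metadata` and the example tables are hypothetical. The point is only that table and column metadata can be read from the database itself rather than typed in by hand.

```python
import sqlite3


def harvest_metadata(conn: sqlite3.Connection) -> dict:
    """Return {table_name: [{"column": ..., "type": ...}, ...]} read from the database."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [{"column": c[1], "type": c[2]} for c in columns]
    return catalog


# Hypothetical example schema, created in memory so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")

catalog = harvest_metadata(conn)
for table, columns in catalog.items():
    print(table, [c["column"] for c in columns])
```

Running the harvest on a schedule would keep the catalog in step with schema changes, which is the property the automation criteria in this section are meant to test for.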


| 1. Automation | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Automatically ingest metadata | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Automatically ingest table and column level lineage | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Dedicated data request management workflow | Yes | No | No | No | No | No | No | No |
| Custom workflows via API | Yes | No | No | No | No | No | No | No |
| Ability to automatically generate documentation from metadata | Yes | No | No | No | No | No | No | No |
| Ability to run live queries in documents | Yes | Yes | Yes | No | No | No | No | No |
| Automated impact analysis | Yes | Yes | Yes | Yes | No | No | No | No |

2. Ease of Use

For data catalogs to be valuable, they need to be user-friendly for both technical and non-technical users. If only data teams can access and utilize the catalog easily, the rest of the organization may not appreciate its value. On the other hand, if business teams are unable to understand and use the catalog effectively, they may not be able to make informed decisions based on the available data. Therefore, it is important to design data catalogs with the needs of both groups in mind.

A data catalog’s ability to surface data effectively is largely tied to how powerful and intuitive its search function is. By taking into account the context and meaning of search terms, data catalogs with AI-powered semantic search can return more accurate results than traditional keyword-based search. Semantic search can also help users find information more efficiently by suggesting related search terms and returning more comprehensive results.

This can be particularly useful for large organizations that deal with vast amounts of data, where traditional search methods may not be effective. Overall, intuitive and contextual search can help improve the speed and accuracy of data discovery, making it easier for organizations to make informed decisions based on their data.
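The difference between keyword matching and meaning-aware search can be illustrated with a toy sketch. This is not real semantic search, which typically relies on learned embeddings. Here a small hand-written map of related terms stands in for that learned notion of meaning, and the asset names, descriptions, and `RELATED_TERMS` entries are all hypothetical.

```python
# Hypothetical catalog entries: asset name -> description.
ASSETS = {
    "fct_revenue": "Monthly recognized revenue by customer",
    "dim_customers": "One row per customer with signup date",
    "stg_web_events": "Raw clickstream events from the website",
}

# Hand-written related terms: an assumption standing in for learned embeddings.
RELATED_TERMS = {
    "sales": {"revenue"},
    "clients": {"customer", "customers"},
    "traffic": {"clickstream", "events"},
}


def tokens(text):
    """Lowercase a string and split it into a set of words."""
    return set(text.lower().replace("_", " ").split())


def match(query_terms):
    """Return asset names whose name or description shares a word with the query."""
    return [
        name
        for name, description in ASSETS.items()
        if query_terms & (tokens(name) | tokens(description))
    ]


def keyword_search(query):
    """Match only the literal words the user typed."""
    return match(tokens(query))


def expanded_search(query):
    """Match the typed words plus any related terms known for them."""
    terms = tokens(query)
    for term in list(terms):
        terms |= RELATED_TERMS.get(term, set())
    return match(terms)


print(keyword_search("sales"))
print(expanded_search("sales"))
```

With this data, a literal search for "sales" finds nothing because no asset uses that word, while the expanded search surfaces the revenue table. That gap is what the search criteria in this section are probing.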

| 2. Ease of Use | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LLM-powered search functionality | Yes | No | No | No | No | No | No | No |
| Semantic search | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Search by customizable tags | Yes | No | No | No | No | No | No | No |
| Dedicated business user portal | Yes | No | No | No | No | No | No | No |
| Slack Integration | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Multiplayer editing | Yes | No | No | No | No | No | No | No |
| Dedicated data dictionary component | Yes | No | No | No | No | No | No | Yes |
| Modern UI | Yes | Yes | Yes | Yes | No | No | No | Yes |
| Ability to create nested documents and establish folder hierarchy | Yes | No | No | No | No | No | No | No |
| Ability to immediately implement | Yes | No | Yes | Yes | No | No | No | No |
| Access to live support and training | Yes | No | No | Yes | No | No | No | Yes |
| Advanced editing and markdown support | Yes | No | No | No | No | No | No | No |

3. Data Governance & Security

Data catalogs must be able to authenticate users and restrict access to certain data. Without appropriate controls, it is difficult, if not impossible, to enforce a high level of data governance. The result can be unauthorized data access, which undermines data quality and reliability.

Data catalogs must also support data governance workflows such as version control, version history, publishing workflows, role-based permissions and assignments, and access controls. These help keep data protected and accurate.

Role-based permissions and assignments are essential to ensuring that the right people have access to the right data. They help maintain data privacy and security by keeping sensitive information out of reach of unauthorized users.
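As a rough illustration of role-based permissions, the sketch below checks whether a user may perform an action on a catalog asset. The role names, actions, and the rule that only admins may touch PII-bearing assets are assumptions made for this example, not a description of how any particular catalog implements access control.

```python
from dataclasses import dataclass

# Hypothetical roles and the actions each role may perform.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "edit"},
    "admin": {"read", "edit", "publish", "manage_access"},
}

# Roles allowed to work with assets flagged as containing PII (an assumption).
PII_ROLES = frozenset({"admin"})


@dataclass
class User:
    name: str
    role: str


@dataclass
class Asset:
    name: str
    contains_pii: bool = False


def can(user: User, action: str, asset: Asset) -> bool:
    """Return True if the user's role permits the action on the asset."""
    # Sensitive assets are blocked first, regardless of the requested action.
    if asset.contains_pii and user.role not in PII_ROLES:
        return False
    # Unknown roles get an empty permission set, so they are denied by default.
    return action in ROLE_PERMISSIONS.get(user.role, set())


viewer = User("ana", "viewer")
admin = User("raj", "admin")
revenue = Asset("fct_revenue")
customers = Asset("dim_customers", contains_pii=True)

print(can(viewer, "read", revenue))
print(can(viewer, "read", customers))
print(can(admin, "publish", customers))
```

Denying by default for unknown roles and checking the sensitivity flag before the action keeps the sketch conservative: a misconfigured role results in less access rather than more.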

Data catalogs should also be flexible enough to deploy either on-premises or in the cloud. This lets organizations choose the deployment method that best fits their needs, whether on their own servers or in a cloud-based environment.

| 3. Data Governance & Security | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ability to assign unique roles and access permissions per role | Yes | No | No | No | No | No | No | No |
| Ability to automatically identify and tag PII data | Yes | Yes | No | Yes | No | No | No | No |
| Ability to require documents to be approved before publishing | Yes | No | No | No | No | No | No | No |
| Version history | Yes | No | No | No | No | No | No | Yes |
| Version control with Git | Yes | No | No | No | No | No | No | Yes |
| SOC 2 Compliance | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| SSH Tunnelling | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Included SAML, SSO, and MFA | Yes | Yes | Yes | Yes | Yes | No | No | Yes |
| Self-hosted Deployment | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Private Cloud Deployment | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

4. System Maintenance & Innovation

When evaluating data catalog options, it's important to consider how the software is maintained and how quickly it evolves. Legacy software may cost more to maintain and may ship releases more slowly than modern data catalogs managed by a dedicated team. By contrast, modern data catalogs are often managed by a team of experts focused on keeping the software up to date and relevant to users' needs. These teams often release updates frequently, so the software keeps improving. Furthermore, modern data catalogs are often built on modern technology stacks, which can be more flexible and easier to integrate with other tools and services. Ultimately, a modern data catalog managed by a dedicated team can help your organization make the most of its data assets while keeping maintenance costs and risks low.

When evaluating whether to use open source software, it's important to consider the costs and benefits. On the one hand, open source software can be very cost-effective, as it is often available for free and can be customized to meet specific needs. Additionally, because the source code is open, organizations have greater control over the software and can modify it as needed.

While open source software has its advantages, it also has some serious downsides. Organizations that adopt it often need to invest significant time and resources in maintaining and updating the software, including debugging, troubleshooting, and patching. Additionally, because open source software is often developed by a community of volunteers, there may be concerns about its quality and reliability. To address these concerns, organizations may need to hire dedicated staff or consultants to ensure that the software meets their specific needs and requirements.

| 4. System Maintenance & Innovation | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Weekly feature releases | Yes | No | No | No | No | No | No | No |
| High feature development velocity | Yes | Yes | Yes | No | No | No | No | No |
| Public roadmap | Yes | Yes | No | Yes | No | No | Yes | Yes |
| Ability to implement immediately | Yes | No | Yes | Yes | No | No | No | No |
| Open product feedback cycle | Yes | No | Yes | No | No | No | No | No |

5. Customizability

Every organization has unique data requirements and workflows that may not be accommodated by off-the-shelf data catalogs. A data catalog that can be customized to meet these specific needs can provide significant advantages in terms of efficiency and effectiveness.

Customization can take many forms, such as the ability to integrate with other tools and services, the ability to build on top of existing workflows, and the ability to create notifications and announcements. For example, a data catalog that can be integrated with an organization's existing workflow can help improve efficiency by reducing the need for manual data entry and data cleaning. Similarly, a data catalog that allows users to create notifications and announcements can help ensure that stakeholders are kept up-to-date with the latest information and changes.

Customizability can also help improve adoption rates. If a data catalog is difficult to use or does not fit into an organization's existing workflows, it may not be adopted by users. By contrast, a data catalog that can be customized to meet the specific needs of an organization is more likely to be adopted by users, leading to improved data quality and decision-making.

Finally, customizability can help ensure that a data catalog remains relevant and useful over time. As an organization's data requirements and workflows change, its data catalog must also change to reflect these changes. A customizable data catalog can be adapted to meet these changing needs, ensuring that it remains an essential tool for managing and using data effectively.

| 5. Customizability | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Out of the box integrations with modern cloud warehouses | Yes | Yes | Yes | Yes | No | No | Yes | Yes |
| Out of the box integrations with modern data stack tooling | Yes | Yes | Yes | Yes | No | No | Yes | Yes |
| API access to create custom integrations | Yes | No | No | No | No | No | No | No |
| API access to build data discovery process into existing workflows | Yes | No | Yes | No | No | No | No | No |
| Customizable broadcasts from changes to documents | Yes | No | No | No | No | No | No | No |

6. Price

The prices of data catalogs vary widely depending on the vendor and the specific features included. In general, however, legacy enterprise data catalogs, along with enterprise-priced tools such as Atlan, tend to be more expensive than other modern data catalogs. Modern data catalogs are often offered on a subscription basis, with prices ranging from a few hundred to a few thousand dollars per month. By contrast, tools like Atlan, Alation, and Collibra can cost tens or even hundreds of thousands of dollars just to implement, with ongoing maintenance costs adding to the total cost of ownership. It's important to carefully evaluate the features and costs of different data catalogs to find the best solution for your specific needs and budget.

Legacy data catalogs also structure their pricing around à la carte modules, which can add up to significantly more than you would pay for a modern solution while delivering less functionality.

Data catalogs should be customizable to meet the specific needs of an organization while offering a reasonable price that scales with those needs. Pricing that scales reasonably, without limits on the number of assets, viewers, or integrations, is a key differentiator between point-solution data catalogs and all-in-one solutions.

| 6. Price | Secoda (Modern) | Atlan (Modern) | Select Star (Modern) | Castor (Modern) | Alation (Legacy) | Collibra (Legacy) | Amundsen (Open Source) | Stemma (Open Source) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| All-in-one pricing | Yes | No | Yes | Yes | No | No | Yes | Yes |
| Public, transparent pricing | Yes | No | Yes | Yes | No | No | Yes | Yes |
| Under $1000/month for base plan | Yes | No | Yes | Yes | No | No | Yes | Yes |
| Implementation fees | No | Yes | No | No | Yes | Yes | No | Yes |
| Unlimited viewer roles | Yes | No | No | No | No | No | Yes | No |
| Unlimited data assets | Yes | No | No | No | No | No | Yes | No |

A whole new level of data discovery

Secoda makes data discovery a breeze with simple integrations, search, and documentation that make sharing your data knowledge a piece of cake.