Datafold's 396-line llms.txt shows what thorough AI preparation looks like

Lines: 396 (-72% vs avg)

Sections: 2 (-92% vs avg)

Companies using llms.txt: 742+

Files: llms.txt + llms-full.txt

Key Insights

Focused approach

A streamlined 2-section structure keeps things simple and scannable.

Comprehensive detail

396 lines of thorough documentation for AI systems.

Two-file approach

Uses both llms.txt and llms-full.txt for different AI use cases.
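
To make the structure concrete, here is a minimal sketch of how a consumer might read a file laid out this way: one `#` title, `##` section headings, and `- [title](url): description` link lines. The `parse_llms_txt` function and `Section` class are hypothetical names for illustration, not part of any Datafold tooling, and the sketch ignores multi-line descriptions.

```python
import re
from dataclasses import dataclass, field

# Matches "- [Title](url)" with an optional ": description" tail.
LINK_RE = re.compile(r"^- \[(?P<title>.+?)\]\((?P<url>\S+?)\)(?::\s*(?P<desc>.*))?$")


@dataclass
class Section:
    name: str
    links: list = field(default_factory=list)  # (title, url, description) tuples


def parse_llms_txt(text):
    """Return (site_title, sections) for an llms.txt-style document."""
    title, sections = None, []
    for line in text.splitlines():
        if line.startswith("# ") and title is None:
            title = line[2:].strip()
        elif line.startswith("## "):
            sections.append(Section(line[3:].strip()))
        elif sections:
            m = LINK_RE.match(line)
            if m:
                # Lines without a description get an empty string.
                sections[-1].links.append((m["title"], m["url"], m["desc"] or ""))
    return title, sections


sample = """# Datafold

## Docs

- [Get Audit Logs](https://docs.datafold.com/api-reference/audit-logs/get-audit-logs.md)
- [Sync a BI integration](https://docs.datafold.com/api-reference/bi/sync-a-bi-integration.md): Start an unscheduled synchronization of the integration.
"""

title, sections = parse_llms_txt(sample)
```

Counting `len(sections)` and the links per section gives the same kind of figures shown in the stats above.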

llms.txt Preview

First 100 lines of 396 total

# Datafold

## Docs

- [Get Audit Logs](https://docs.datafold.com/api-reference/audit-logs/get-audit-logs.md)
- [Create a DBT BI integration](https://docs.datafold.com/api-reference/bi/create-a-dbt-bi-integration.md)
- [Create a Hightouch integration](https://docs.datafold.com/api-reference/bi/create-a-hightouch-integration.md)
- [Create a Looker integration](https://docs.datafold.com/api-reference/bi/create-a-looker-integration.md)
- [Create a Mode Analytics integration](https://docs.datafold.com/api-reference/bi/create-a-mode-analytics-integration.md)
- [Create a Power BI integration](https://docs.datafold.com/api-reference/bi/create-a-power-bi-integration.md)
- [Create a Tableau integration](https://docs.datafold.com/api-reference/bi/create-a-tableau-integration.md)
- [Get an integration](https://docs.datafold.com/api-reference/bi/get-an-integration.md): Returns the integration for Mode/Tableau/Looker/Hightouch by its id.
- [List all integrations](https://docs.datafold.com/api-reference/bi/list-all-integrations.md): Returns all integrations for Mode/Tableau/Looker.
- [Remove an integration](https://docs.datafold.com/api-reference/bi/remove-an-integration.md)
- [Rename a Power BI integration](https://docs.datafold.com/api-reference/bi/rename-a-power-bi-integration.md): It can only update the name. Returns the integration with changed fields.
- [Sync a BI integration](https://docs.datafold.com/api-reference/bi/sync-a-bi-integration.md): Start an unscheduled synchronization of the integration.
- [Update a DBT BI integration](https://docs.datafold.com/api-reference/bi/update-a-dbt-bi-integration.md): Returns the integration with changed fields.
- [Update a Hightouch integration](https://docs.datafold.com/api-reference/bi/update-a-hightouch-integration.md): It can only update the schedule. Returns the integration with changed fields.
- [Update a Looker integration](https://docs.datafold.com/api-reference/bi/update-a-looker-integration.md): It can only update the schedule. Returns the integration with changed fields.
- [Update a Mode Analytics integration](https://docs.datafold.com/api-reference/bi/update-a-mode-analytics-integration.md): It can only update the schedule. Returns the integration with changed fields.
- [Update a Tableau integration](https://docs.datafold.com/api-reference/bi/update-a-tableau-integration.md): It can only update the schedule. Returns the integration with changed fields.
- [List CI runs](https://docs.datafold.com/api-reference/ci/list-ci-runs.md)
- [Trigger a PR/MR run](https://docs.datafold.com/api-reference/ci/trigger-a-prmr-run.md)
- [Upload PR/MR changes](https://docs.datafold.com/api-reference/ci/upload-prmr-changes.md)
- [Create a data diff](https://docs.datafold.com/api-reference/data-diffs/create-a-data-diff.md): Launches a new data diff to compare two datasets (tables or queries).

A data diff identifies differences between two datasets by comparing:
- Row-level changes (added, removed, modified rows)
- Schema differences
- Column-level statistics

The diff runs asynchronously. Use the returned diff ID to poll for status and retrieve results.
- [Get a data diff](https://docs.datafold.com/api-reference/data-diffs/get-a-data-diff.md)
- [Get a data diff summary](https://docs.datafold.com/api-reference/data-diffs/get-a-data-diff-summary.md)
- [Get a human-readable summary of a DataDiff comparison](https://docs.datafold.com/api-reference/data-diffs/get-a-human-readable-summary-of-a-datadiff-comparison.md): Retrieves a comprehensive, human-readable summary of a completed data diff.

This endpoint provides the most useful information for understanding diff results:
- Overall status and result (success/failure)
- Human-readable feedback explaining the differences found
- Key statistics (row counts, differences, match rates)
- Configuration details (tables compared, primary keys used)
- Error messages if the diff failed

Use this after a diff completes to get actionable insights. For diffs still running,
check status with get_datadiff first.
- [List data diffs](https://docs.datafold.com/api-reference/data-diffs/list-data-diffs.md): All fields accept multiple items, separated by commas.
Date fields also support ranges using the following syntax:

- ``<DATETIME`` = before DATETIME
- ``>DATETIME`` = after DATETIME
- ``DATETIME`` = between DATETIME and DATETIME + 1 MINUTE
- ``DATE`` = start of that DATE until DATE + 1 DAY
- ``DATETIME1<<DATETIME2`` = between DATETIME1 and DATETIME2
- ``DATE1<<DATE2`` = between DATE1 and DATE2
- [Update a data diff](https://docs.datafold.com/api-reference/data-diffs/update-a-data-diff.md)
- [Create a data source](https://docs.datafold.com/api-reference/data-sources/create-a-data-source.md)
- [Execute a SQL query against a data source](https://docs.datafold.com/api-reference/data-sources/execute-a-sql-query-against-a-data-source.md): Executes a SQL query against the specified data source and returns the results.

This endpoint allows you to run ad-hoc SQL queries for data exploration, validation, or analysis.
The query is executed using the data source's native query runner with the appropriate credentials.

**Streaming mode**: Use query parameter `?stream=true` or set `X-Stream-Response: true` header.
Streaming is only supported for certain data sources (e.g., Databricks).
When streaming, results are sent incrementally as valid JSON for memory efficiency.

Returns:
- Query results as rows with column metadata (name, type, description)
- Limited to a reasonable number of rows for performance
- [Get a data source](https://docs.datafold.com/api-reference/data-sources/get-a-data-source.md)
- [Get a data source summary](https://docs.datafold.com/api-reference/data-sources/get-a-data-source-summary.md)
- [Get data source testing results](https://docs.datafold.com/api-reference/data-sources/get-data-source-testing-results.md)
- [List data source types](https://docs.datafold.com/api-reference/data-sources/list-data-source-types.md)
- [List data sources](https://docs.datafold.com/api-reference/data-sources/list-data-sources.md): Retrieves all data sources accessible to the authenticated user.

Returns active data sources (not deleted, hidden, or draft) that the user has permission to access.
For non-admin users, only data sources belonging to their assigned groups are returned.
- [Test a data source connection](https://docs.datafold.com/api-reference/data-sources/test-a-data-source-connection.md)
- [Datafold API](https://docs.datafold.com/api-reference/datafold-api.md)
- [Datafold SDK](https://docs.datafold.com/api-reference/datafold-sdk.md)
- [Get translation projects](https://docs.datafold.com/api-reference/dma/get-translation-projects.md): Get all translation projects for an organization.
This is used for DMA v1 and v2, since TranslationProject is a SQLAlchemy model.
Version is used to track if it's a DMA v1 or v2 project.
- [Check status of a DMA translation job](https://docs.datafold.com/api-reference/dma_v2/check-status-of-a-dma-translation-job.md): Get the current status and results of a DMA translation job.

Poll this endpoint to monitor translation progress and retrieve results when complete.
Translation jobs can run for several minutes to hours depending on project size.
- [Get translation summaries for all transforms in a project](https://docs.datafold.com/api-reference/dma_v2/get-translation-summaries-for-all-transforms-in-a-project.md): Get translation summaries for all transforms in a project.

Returns a list of transform summaries including transform group metadata,
validation status, and execution results. Use this to monitor translation
progress and identify failed transforms.
- [Start a DMA translation job](https://docs.datafold.com/api-reference/dma_v2/start-a-dma-translation-job.md): Start a translation job for a DMA project.

Executes the DMA translation pipeline to convert source SQL code to target dialect.
The pipeline processes code through multiple stages (file operations, reference extraction,
template creation, SQL translation, validation, and bundling).

This endpoint launches a long-running background workflow and returns immediately with
a job_id. Use the get_translation_status endpoint to poll for progress and results.
- [Get column downstreams](https://docs.datafold.com/api-reference/explore/get-column-downstreams.md): Retrieve a list of columns or tables which depend on the given column.
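
Several entries in the preview (creating a data diff, checking a DMA translation job) describe the same pattern: start the work, keep the returned ID, and poll until it finishes. A minimal sketch of that loop follows. It deliberately takes the status call as a parameter rather than guessing at Datafold's request or response shapes; `poll_until_done` and the status strings are assumptions made for illustration.

```python
import time


def poll_until_done(fetch_status, is_done, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Call fetch_status() until is_done(result) is true or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch_status()
        if is_done(result):
            return result
        # Give up if another wait would push us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)


# Stand-in for a real status call; these status values are illustrative only.
fake_responses = iter(["running", "running", "done"])
final = poll_until_done(lambda: next(fake_responses), lambda s: s == "done", interval=0.0)
```

In real use, `fetch_status` would wrap whichever documented endpoint returns the job or diff status, and `interval` would be sized to the job: the preview notes that translation jobs can run from minutes to hours.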

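
The date-range syntax listed under "List data diffs" in the preview is compact enough to express as a small parser. The sketch below turns one expression into a `(start, end)` pair. It assumes ISO 8601 strings such as `2024-01-15` or `2024-01-15T10:30:00`, which the preview does not specify, and it treats the end of a `DATE1<<DATE2` range as the start of `DATE2`, which is also an assumption. `parse_date_filter` is a hypothetical helper, not a Datafold function.

```python
import re
from datetime import datetime, timedelta

DATE_ONLY = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def _parse(value):
    """Return (moment, is_date_only) for an ISO 8601 date or datetime string."""
    value = value.strip()
    return datetime.fromisoformat(value), bool(DATE_ONLY.match(value))


def parse_date_filter(expr):
    """Translate one filter expression into (start, end); None means unbounded."""
    expr = expr.strip()
    if "<<" in expr:  # A<<B: between A and B
        left, right = expr.split("<<", 1)
        return _parse(left)[0], _parse(right)[0]
    if expr.startswith("<"):  # <X: before X
        return None, _parse(expr[1:])[0]
    if expr.startswith(">"):  # >X: after X
        return _parse(expr[1:])[0], None
    # Bare value: a DATE spans one day, a DATETIME spans one minute.
    moment, is_date = _parse(expr)
    width = timedelta(days=1) if is_date else timedelta(minutes=1)
    return moment, moment + width


day_range = parse_date_filter("2024-01-15")
minute_range = parse_date_filter("2024-01-15T10:30:00")
```

Since the preview says every field accepts comma-separated items, a caller would split the raw parameter on commas first and pass each piece to `parse_date_filter`.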
