Setting up Knowledge Graph data connectors
Connectors make it easy to keep your company's Knowledge Graph up-to-date and relevant. By syncing Knowledge Graph with your internal databases, you can ensure your team is working from the right source documents at all times.
In this article:
- Setting up OAuth app permissions
- Creating a data connector
- Managing data connectors
- Frequently asked questions
Setting up OAuth app permissions
Writer uses OAuth apps to manage access to your data sources. Here's the workflow:
- An org admin decides which OAuth app management strategy works best for their company.
- An authentication app is created for a type of connector. Depending on the strategy selected, either the org admin authenticates at a global level, or team admins can authenticate using their own credentials.
- Team admins can then create connectors from their team's Knowledge Graph(s) to any data source that has OAuth authentication set up.
To begin, an org admin should navigate to Admin > OAuth apps.
Before you can begin connecting your data sources to Writer, you need to choose how to manage your OAuth apps:
- Writer-managed apps (faster but less flexible): A Writer org admin selects this option, and Writer team admins can then create connectors using their own credentials. Writer-managed apps use preset permissions that cannot be modified. This option is faster and easier for your team, but it doesn't give org admins global control over data connectors: you can't specify exact permissions, and you can't revoke access to a data source globally. If you decide that, say, your company should no longer use Google Drive as a data source for Knowledge Graph, you won't be able to turn off connectors that team admins created with their own credentials. Instead, each team admin will need to delete the sync from Knowledge Graph settings.
- Self-managed apps (longer setup but more granular control): A Writer org admin selects this option and completes the authentication process for each data connector. This may require support from the IT admin of that data source. The org admin specifies permissions at the data connector level (including which drives are accessible); these permissions apply everywhere the data connector is used throughout the org. Setup takes more time, but this option offers more granular control over permissions and the ability to revoke access to a source globally. For example, org admins can turn off all syncing with Google Drive at once, or ensure that no team admin ever syncs the Top-Secret Executive Drive, even if they have access to it individually.
Authenticating with a data source
Once you've selected an OAuth app preference, select Set up a new app.
Select the data source you'd like to connect to your Knowledge Graph. Currently, we offer data connectors for Confluence, Google Drive, SharePoint, and Notion.
Enter authentication credentials for the data connector you've selected. You may need help from the IT administrator of that tool. For more information about finding authentication credentials, see our developer documentation below:
- Configuring authentication values for Confluence
- Configuring authentication values for SharePoint
- Configuring authentication values for Google Drive
- Configuring authentication values for Notion
Once you've authenticated with the data source, your team admins will be able to create data connectors between the source and their Knowledge Graph(s).
What do these options look like for team admins?
If you've selected Writer-managed apps, your team admins see a message on the Knowledge Graph settings page, prompting them to create their own app using their own credentials.
If you've selected Self-managed apps, your team admins see a message prompting them to reach out to an org admin to create an app.
Creating a data connector
Once an org admin has configured an OAuth app strategy, team admins can create connectors between a data source and their Knowledge Graph(s).
To begin, navigate to Setup > Knowledge Graph. Select the Knowledge Graph you'd like to connect to a data source.
Connecting Google Drive to a Knowledge Graph
Managing data connectors
When a team admin creates a data connector for a Knowledge Graph, Writer will display a summary of the Knowledge Graph's status, including file count and any syncing errors. Select reindex to crawl the data source again.
Writer will display a list of the files within your Knowledge Graph, including the file type, the file status, which user added the file, and when the file was added.
Which file types are supported?
Knowledge Graph supports pdf, txt, doc/docx, ppt/pptx, eml, html, srt, csv, and xls/xlsx file types.
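If you'd like to check a folder's contents against this list before syncing, a small script can help. The following is a minimal sketch, not part of Writer: the helper names are hypothetical, and it assumes extensions are matched case-insensitively, which the list above doesn't confirm.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".txt", ".doc", ".docx", ".ppt", ".pptx",
    ".eml", ".html", ".srt", ".csv", ".xls", ".xlsx",
}


def is_supported(filename: str) -> bool:
    """Return True if the file's extension is in the supported list.

    Assumption: matching is case-insensitive (".PDF" is treated like ".pdf").
    """
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def split_by_support(filenames):
    """Split filenames into (supported, unsupported) lists, keeping order."""
    supported = [name for name in filenames if is_supported(name)]
    unsupported = [name for name in filenames if not is_supported(name)]
    return supported, unsupported
```

For example, `split_by_support(["plan.docx", "logo.png"])` separates the Word document from the image, so you can see which files fall outside the supported list before you set up a sync.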
What do the different file statuses mean?
Status | Meaning |
Live in graph | Indexed and available in the graph |
Syncing | File sync from the connector is in progress |
Indexing | File is being indexed into the graph |
Sync error | An error occurred while syncing the file |
Indexing error | An error occurred during graph indexing |
How often are data connectors synced?
We automatically resync every 24 hours, at midnight UTC.
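To work out when the next sync falls in your own time zone, you can compute the next midnight UTC from the current time. This is a minimal sketch based only on the schedule stated above; the function name is hypothetical and it isn't a Writer API.

```python
from datetime import datetime, timedelta, timezone


def next_sync_time(now: datetime) -> datetime:
    """Return the next midnight UTC strictly after `now`.

    `now` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    midnight_today = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight_today + timedelta(days=1)


# Usage: show the next sync in the machine's local time zone.
if __name__ == "__main__":
    upcoming = next_sync_time(datetime.now(timezone.utc))
    print("Next sync (UTC):  ", upcoming.isoformat())
    print("Next sync (local):", upcoming.astimezone().isoformat())
```

A file updated shortly after midnight UTC would therefore wait close to 24 hours for the next automatic sync.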
Under which conditions will Writer resync a file or a folder?
If a file or a folder gets updated or deleted, the next sync will reflect these changes.
We do not re-sync assets which have not been changed.
How can you delete a data connector?
If your org admin has opted for self-managed OAuth apps, your org admin can delete the OAuth app. This deletes every data connector for that app. In other words, deleting the Google Drive OAuth app deletes all data connectors between Google Drive and any Knowledge Graph belonging to any team in your account. The IT admin of the data source can also do this from the data source side.
If your org admin has opted for Writer-managed OAuth apps, a team admin can delete the data connector from the Knowledge Graph setup page. There is no way for an org admin to delete all connections between a data source and the Knowledge Graphs owned by the different teams in your account at once; connectors must be deleted one by one.
What happens when you delete a data connector?
We stop syncing and delete all indexed data. If you want to sync those files again later, you'll need to start over.
What happens when you unsync a file or folder?
We stop syncing and delete indexed data.
Frequently asked questions
Connectors & data sources
Can I connect a Knowledge Graph to more than one data source?
No, you can connect a Knowledge Graph to only one data connector. Let's say your Sales team has two Knowledge Graphs called Sales Demo and Sales Docs. If you connect Sales Demo to Google Drive, you cannot also connect it to Notion. You could still connect Sales Docs to Notion.
Can I connect multiple Knowledge Graphs for my team to the same data source?
Yes! If your Sales team has two Knowledge Graphs called Sales Demo and Sales Docs, you could connect the Sales Demo Knowledge Graph to a series of folders in Google Drive, and connect the Sales Docs Knowledge Graph to a different series of folders in Google Drive.
Can multiple teams within the same organization connect to the same data source?
Yes! If your Sales team and your Marketing team each want to link their Knowledge Graphs to Google Drive, they can.
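The three answers above boil down to one rule: each Knowledge Graph has at most one data connector, while a data source can serve any number of Knowledge Graphs. The sketch below models that rule for illustration only. The `ConnectorRegistry` class is hypothetical and doesn't reflect how Writer implements connectors.

```python
class ConnectorRegistry:
    """Toy model: one data source per Knowledge Graph, many graphs per source."""

    def __init__(self):
        # Maps a Knowledge Graph name to the single data source it is connected to.
        self._source_by_graph = {}

    def connect(self, graph: str, source: str) -> None:
        """Connect a graph to a source; reject a second connector on the same graph."""
        if graph in self._source_by_graph:
            existing = self._source_by_graph[graph]
            raise ValueError(f"{graph} is already connected to {existing}")
        self._source_by_graph[graph] = source

    def graphs_for(self, source: str) -> list:
        """Return, in sorted order, every graph connected to the given source."""
        return sorted(g for g, s in self._source_by_graph.items() if s == source)


registry = ConnectorRegistry()
registry.connect("Sales Demo", "Google Drive")
registry.connect("Sales Docs", "Google Drive")  # allowed: same source, different graph
```

Calling `registry.connect("Sales Demo", "Notion")` at this point raises a `ValueError`, mirroring the Sales Demo example above.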
Is there a limit to the number of files I can sync through a data connector?
You can sync up to 50,000 documents through a single data connector.
Is there a limit to the size of file I can sync through a data connector?
For performance reasons, each file synced through a data connector can be up to 150 MB.
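If you're planning a large sync, you can check your files against both limits ahead of time. This is a minimal sketch with hypothetical names. It assumes that 1 MB means 1024 × 1024 bytes, which this article doesn't specify, and that you already have a mapping of file names to sizes in bytes.

```python
MAX_DOCUMENTS = 50_000                   # per data connector, as stated above
MAX_FILE_SIZE_BYTES = 150 * 1024 * 1024  # 150 MB; assumes 1 MB = 1024 * 1024 bytes


def check_limits(file_sizes: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found.

    `file_sizes` maps a file name to its size in bytes.
    """
    problems = []
    if len(file_sizes) > MAX_DOCUMENTS:
        problems.append(
            f"{len(file_sizes)} documents exceeds the limit of {MAX_DOCUMENTS}"
        )
    for name, size in file_sizes.items():
        if size > MAX_FILE_SIZE_BYTES:
            problems.append(f"{name} is larger than 150 MB ({size} bytes)")
    return problems
```

For example, `check_limits({"handbook.pdf": 2_000_000})` returns an empty list, while a 200 MB file would be reported as too large.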
Security
How is the data in a Knowledge Graph stored?
We don't permanently store the original files synced through a data connector. We store them temporarily during the indexing process. After indexing is complete, we store the indexed data in a graph structure in a multitenant GCP environment and delete the original files.
Indexing a file refers to breaking the file down into text snippets (paragraphs or full pages, determined by the specialized LLM), extracting keywords, metadata, and summaries, and storing the indexed data in a graph structure.
Does customer data get used in LLM training?
No. When you make a request to Writer, we retrieve the relevant pieces of data from Knowledge Graph and then send that data to our LLMs to reason and generate an answer. The indexed data in your Knowledge Graph is stored completely separately from the LLM, and customer data is never used in LLM training or fine-tuning.
Who has access to view synced data in Knowledge Graph?
Once a data connector has been set up with a graph, everyone with permission to access that graph can access the data. Permissions from the data source don't transfer over to Knowledge Graph.