App Connectors: Github

👤 This documentation is intended for Site Administrators and/or Database Administrators.

Connectors are Sisense for Cloud Data Team’s built-in mechanisms for customers to connect to and ingest from popular data sources.

Note: Connectors are supported for customers with Data Engine on your Warehouse or Warehouse infrastructure. Site administrators can contact their Account Manager if interested in enabling Connectors.

<div>
<UL>
<LI><a href="#PreparingaConnector">Preparing to add a Connector</a>
</LI><LI><a href="#AddingaConnector">Adding a Connector</a></LI>
<LI><a href="#SetUpIngestion">Set Up Ingestion</a></LI>
<UL>
<LI><a href="#SetupDatabasePermissions">Setup Database Permissions</a></LI>
<LI><a href="#HowToSetUpIngestion">How To Set Up Ingestion</a></LI>
<LI><a href="#SuspendIngestionforaConnector">Suspend Ingestion for a Connector</a></LI>
</UL>
<LI><a href="#DeleteaConnector">Delete a Connector</a></LI>
<LI><a href="#SupportedConnectors">Supported Connectors</a></LI>
<LI><a href="#GithubTableSchemas">GitHub Table Schemas</a></LI>
</UL>
</div>
<HR>

<a name="PreparingaConnector"></a>

Preparing a Connector

To set up Github in Sisense, make sure to have the following:

  • Access to the projects you want to replicate data from. Sisense will only be able to access the same projects as the user who creates the access token.
  • A Github Access Token with full access to  the repo scope. The repo scope is required because of how GitHub structures permissions. Sisense will only read data from your Github Account

<a name="AddingaConnector"></a>

<a href="#top">Back to top</a>

Adding a Connector

To add a Connector, administrators can first click Settings menu in the bottom left hand corner. Then, click the App Connections option:

In the top right corner, click the green Connect Source button.


From the Data Source Type dropdown, select the 'Github'' option:

Enter the display name for the Connector in the Display Name section of the Connect Source menu. The Display name will become the schema name and cannot be changed later.

In the GitHub Personal Access Token field, paste the access token you created for this integration.

In the GitHub Repository Name(s) field, enter the paths of the repositories you want to track. The path is relative to https://github.com. For example: sisense/adobe-xd-blox-plugin. To track multiple repositories, enter a space delimited list of the repository paths. For example: sisense/adobe-xd-blox-plugin sisense/docs sisense/analytics-toolbox. Then, enter the additional requested information in spaces provided and click 'Add'.


The process of establishing a new connection will take a few minutes as all of the tables that are associated with that source are being discovered. During this time a spinning icon will appear. Available tables from the connected data source will appear as they become available.


In the event that the connection was unsuccessful, a Connection failed screen will appear. Please verify that the information entered for the Github connector is correct.

<a name="SetUpIngestion"></a>

<a href="#top">Back to top</a>

Setup Ingestion

<a name="SetUpDatabasePermissions"></a>

Setup Database Permissions

In order to ingest data into Sisense, certain database permissions will need to be updated the first time a connector is added.

Sisense Managed Warehouse:

Sisense will create a new database user and grant the necessary permissions to load data into the warehouse from the connected Sources. Sisense will also grant the Read User the necessary schema permissions in order to query the sources' data from within Sisense.

Data Engine on Your Warehouse:

In order to ingest data into the Redshift or Snowflake warehouse using Sisense App Connectors, Sisense requires a database user that can perform write operations on the database. Create permissions are required to create the necessary database objects to load and store the data. Read permissions on system tables are required to validate the existence and structure of existing database objects.

Please grant the following privileges to the database user that will be used for the App Connectors function:

Snowflake:

<body>
<blockquote>
<br>GRANT CREATE ON WAREHOUSE warehouse_name TO sisense_ingest;
<br>GRANT CREATE ON DATABASE database_name TO sisense_ingest;
</blockquote>
</body>

Redshift:

<body>
<blockquote>
<br>GRANT CREATE ON DATABASE database_name TO sisense_ingest;
<br>GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO sisense_ingest;
<br>GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO sisense_ingest;
</blockquote>
</body>

The above permissions can be added to the existing Warehouse Admin created for Data Engine, but Sisense recommends providing an additional user to perform the App Connectors function. This will ensure that one service account is not responsible for too many actions.

To query the data imported into the warehouse using the Connectors feature, it is also necessary to grant query permissions on the newly-created schema and its tables to the read user. The name of the schema will be the Display Name of the connected Source. After the first replication job has been completed, run the following commands to grant permissions to query the schema from within Sisense:

Snowflake:

<body>
<blockquote>
<br>GRANT USAGE ON SCHEMA database_name.schema_name TO ROLE sisense_read;
</blockquote>
</body>

Redshift:

<body>
<blockquote>
<br>GRANT USAGE ON SCHEMA schema_name TO sisense_read;
<br>GRANT SELECT ON ALL TABLES IN SCHEMA schema_name TO sisense_read;
</blockquote>
</body>

Note: The following IPs will need to be whitelisted for access to the destination database server:

  • 52.23.137.21/32
  • 52.204.223.208/32
  • 52.204.228.32/32
  • 52.204.230.227/32

<a name="HowToSetUpIngestion"></a>

<a href="#top">Back to top</a>

How to Setup Ingestion

Tables listed for the connector are available for ingestion into the warehouse. To select a table for ingestion, click on the box to the left of the table’s name.

Once a table has been selected, a list of columns that are available for ingestion will appear “Columns Selected for Ingestion”. To select specific columns for ingestion, click on the box to the left of the column’s name.

Once the desired tables and columns have been selected, click Save in the bottom right. To select an Update Interval, use the dropdown menu under Update Interval. Select the desired frequency for which the data should be replicated into the warehouse.

Use the Anchor Time dropdown menu to select an Anchor Time. This is the time that the ingestion job will start. Be sure to allow enough time for loading the data when setting this time. If an ingestion job is not complete before the next scheduled interval, the interval will be skipped.

The Fetch Records Newer Than field indicates the date from which the ingestion of the data into the warehouse should begin.  Select the fetch records date by clicking into the Fetch Records Newer Than box.  

Note: The fetch records date will default to one year in the past if a date is not manually selected. To select an Update Interval, use the dropdown menu under Update Interval. Select the desired frequency for which the data should be ingested into the warehouse. 

‍Note: The selected Update Interval and Fetch Records Newer Than date will apply for all tables and columns within a Connector that have been selected for ingestion and ingestion. Tables and columns within a given Connector cannot have differing Update Intervals and/or Fetch Records Newer Than dates.

Save the Update Interval and Fetch Records Newer Than data by clicking 'Save' in the lower right corner of the section. It will take a few minutes for the data to be ingested into the warehouse. After a few minutes, navigate to the Database Connections tab from the Settings menu.

Select the appropriate warehouse from the dropdown menu and click on the refresh icon next to the schema for the warehouse.

Once the schema has been refreshed, the source name that was entered when the Connector was set up will appear in the schema browser and data will now be queryable.

<a name="SuspendIngestionforaConnector"></a>

<a href="#top">Back to top</a>

Suspend Ingestion for a Connector

In the event that ingestion for a Connector should be suspended, please select the Connector from the dropdown menu within the Ingestion page and toggle on Suspend Ingest from the Update Interval menu. Lastly, please click 'Save' within this menu to save the changes made.

Note: It may be necessary to scroll within the menu to see this option.

<a name="DeleteaConnector"></a>

<a href="#top">Back to top</a>

Delete a Connector

To delete a Connector, navigate to the App Connections page from the Settings menu and select the name of the Connector to be deleted from the dropdown menu.

Click 'Delete' in the bottom left:

In the Delete Source window, follow the instructions provided by typing the name of the Connector to be deleted in the field provided.

Once the name of the Connector has been entered, the Delete button will become available. Click the 'Delete' button to delete the Connector.

Note: This action cannot be undone.

<a href="#top">Back to top</a>
<a name="GithubTableSchemas"></a>

Github Table Schemas

Primary keys noted in bold.

<details>
<summary><b><big><a href="https://developer.github.com/v3/issues/assignees/#list-assignees"> Assignees <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr>
<th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>Integer</td>
<td>The assignee ID.</td>
</tr>
<tr>
<td>login</td>
<td>String</td>
<td>The User's username.</td>
</tr>
<tr>
<td>type</td>
<td>String</td>
<td>The User's type.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The profile URL associated with the user.</td>
</tr>
</table>
</body>
</details>


The assignees table contains info about the available assignees for issues in a repository.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/repos/collaborators/#list-collaborators"> Collaborators <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>Integer</td>
<td>The collaborator's ID.</td>
</tr>
<tr>
<td>login</td>
<td>String</td>
<td>The collaborator's username.</td>
</tr>
<tr>
<td>type</td>
<td>String</td>
<td>The collaborator's type.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The profile URL associated with the collaborator.</td>
</tr>
</table>
</body>
</details>

The collaborators table contains info about the users who contribute to a repository.
For organization-owned repositories, this will include outside collaborators, organization owners, organization members that are direct collaborators, who have access through team memberships, or have access through default organization permissions.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/pulls/comments/#list-comments-on-a-pull-request"> Comments <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>Integer</td>
<td>The comment ID.</td>
</tr>
<tr>
<td>updated_at</td>
<td>Date-Time</td>
<td>The time the comment was last updated.</td>
</tr>
<tr>
<td>body</td>
<td>String</td>
<td>The body of the comment.</td>
</tr>
<tr>
<td>created_at</td>
<td>Date-Time</td>
<td>The time the comment was created.</td>
</tr>
<tr>
<td>home_url</td>
<td>String</td>
<td>The home URL of the comment.</td>
</tr>
<tr>
<td>html_url</td>
<td>String</td>
<td>The HTML URL of the comment.</td>
</tr>
<tr>
<td>issue_url</td>
<td>String</td>
<td>The URL of the issue associated with the comment.</td>
</tr>
<tr>
<td>node_id</td>
<td>String</td>
<td>The node ID.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The GITHub URL of the comment.</td>
</tr>
<tr>
<td>user</td>
<td>Object</td>
<td>Details about the user who created the comment.</td>
</tr>
</table>
</body>
</details>

The comments table contains info about comments made on issues.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/repos/commits/#list-commits-on-a-repository"> Commits <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>sha</b></td>
<td>String</td>
<td>The git commit hash.</td>
</tr>
<tr>
<td>comments_url</td>
<td>String</td>
<td>The URL to the commit's comments page.</td>
</tr>
<tr>
<td>commit</td>
<td>object</td>
<td>Details about the commit.</td>
</tr>
<tr>
<td>html_url</td>
<td>String</td>
<td>The HTL URL to the commit.</td>
</tr>
<tr>
<td>parents</td>
<td>Array</td>
<td>Details about the parent commits.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The URL of the commit.</td>
</tr>
</table>
</body>
</details>

The commits table contains info about repository commits in a project.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/issues/#list-issues-for-a-repository"> Issues <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>String</td>
<td>The issue id.</td>
</tr>
<tr>
<td>updated_at</td>
<td>Date-Time</td>
<td>The last time the issue was updated.</td>
</tr>
</table>
</body>
</details>

The issues table contains info about repository issues.
Note: GitHub’s API considers every pull request an issue, but not every issue may be a pull request. Therefore, this table may contain both issues and pull requests.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/pulls/#list-pull-requests"> Pull_Requests <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>Integer</td>
<td>The pull request ID.</td>
</tr>
<tr>
<td>updated_at</td>
<td>Date-Time</td>
<td>The last time the pull request was updated.</td>
</tr>
<tr>
<td>body</td>
<td>String</td>
<td>The description of the pull request.</td>
</tr>
<tr>
<td>closed_at</td>
<td>Date-Time</td>
<td>The time the pull request was closed.</td>
</tr>
<tr>
<td>created_at</td>
<td>Date-Time</td>
<td>The time the pull request was created.</td>
</tr>
<tr>
<td>merged_at</td>
<td>Date-Time</td>
<td>The time the pull request was merged.</td>
</tr>
<tr>
<td>number</td>
<td>Integer</td>
<td>The number of hte pull request in the repository.</td>
</tr>
<tr>
<td>state</td>
<td>String</td>
<td>The current status of the pull request. For example: open.</td>
</tr>
<tr>
<td>title</td>
<td>String</td>
<td>The title of the pull request.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The URL of the pull request.</td>
</tr>
<tr>
<td>user</td>
<td>Object</td>
<td>Details about the user who created the pull request.</td>
</tr>
</table>
</body>
</details>

The pull_requests table contains info about pull requests made against the repository.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/pulls/comments/#list-comments-on-a-pull-request"> Review Comments <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><b>id</b></td>
<td>Integer</td>
<td>The review comment ID.</td>
</tr>
<tr>
<td>updated_at</td>
<td>Date-Time</td>
<td>The time the review comment was last updated.</td>
</tr>
<tr>
<td>body</td>
<td>String</td>
<td>The description of the review comment.</td>
</tr>
<tr>
<td>commit_id</td>
<td>String</td>
<td>The ID of the commit the review comment is associated with.</td>
</tr>
<tr>
<td>created_at</td>
<td>Date-Time</td>
<td>The time the review comment was created.</td>
</tr>
<tr>
<td>diff_url</td>
<td>String</td>
<td>The diff URL associated with the review comment.</td>
</tr>
<tr>
<td>html_url</td>
<td>String</td>
<td>The HTML URL of the review comment.</td>
</tr>
<tr>
<td>in_reply_to_id</td>
<td>Integer</td>
<td>If the review comment is a reply to another review comment.</td>
</tr>
<tr>
<td>issue_url</td>
<td>String</td>
<td>The URL of the issue associated with the review comment.</td>
</tr>
<tr>
<td>node_id</td>
<td>String</td>
<td>The review comment's node ID.</td>
</tr>
<tr>
<td>original_position</td>
<td>Integer</td>
<td>The original position of the review comment.</td>
</tr>
<tr>
<td>original_commit_id</td>
<td>String</td>
<td>The ID of the original comment the review comment is associated with.</td>
</tr>
<tr>
<td>pull_request_review_id</td>
<td>Integer</td>
<td>The ID of the pull request review the comment is a part of.</td>
</tr>
<tr>
<td>path</td>
<td>String</td>
<td>The path of the file the review comment was made on.</td>
</tr>
<tr>
<td>position</td>
<td>Integer</td>
<td>The position of the review comment.</td>
</tr>
<tr>
<td>pull_request_url</td>
<td>String</td>
<td>The URL of the pull request associated with the review comment.</td>
</tr>
<tr>
<td>url</td>
<td>String</td>
<td>The GitHub URL of the review comment.</td>
</tr>
<tr>
<td>user</td>
<td>Object</td>
<td>Details about the user who created the review comment.</td>
</tr>
</table>
</body>
</details>

The review_comments table contains info about comments made on pull request reviews.
Note: In order to replicate this table, you must also set the pull_requests table to replicate.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/activity/starring/#list-stargazers"> Stargazers <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>user_id</td>
<td>Integer</td>
<td>The user id.</td>
</tr>
<tr>
<td>started_at</td>
<td>String</td>
<td>The time the user starred the repository.</td>
</tr>
<tr>
<td>user</td>
<td>Object</td>
<td>Details about the user who starred the repository.</td>
</tr>
</table>
</body>
</details>

The stargazers table contains info about users who have “starred” a repository.

<details>
<summary><b><big>
<a href="https://developer.github.com/v3/pulls/reviews/#list-reviews-on-a-pull-request"> Reviews <a/> </big></b></summary>
<body>
<table style="border-collapse:collapse;">
<tr><th style="width:200px;text-align:left">Column</th>
<th style="width:200px;text-align:left">Data-Type</th>
<th style="width:200px;text-align:left">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>id</td>
<td>Integer</td>
<td>The review id.</td>
</tr>
<tr>
<td>body</td>
<td>String</td>
<td>The description of the review.</td>
</tr>
<tr>
<td>commit_id</td>
<td>String</td>
<td>The ID of the commit the review was performed on.</td>
</tr>
<tr>
<td>html_url</td>
<td>String</td>
<td>The HTML URL to the review.</td>
</tr>
<tr>
<td>pull_request_url</td>
<td>String</td>
<td>The URL to the pull request being reviewed.</td>
</tr>
<tr>
<td>state</td>
<td>String</td>
<td>The state of the review.</td>
</tr>
<tr>
<td>user</td>
<td>Object</td>
<td>Details about the user who starred the repository.</td>
</tr>
</table>
</body>
</details>

The reviews table contains info about pull request reviews. A pull request review is a group of comments on a pull request.
Note: In order to replicate this table, you must also set the pull_requests table to replicate.


<a href="#top">Back to top</a>

Our support team is ready to help