For more background and an overview of Azure Synapse Analytics (what it does and how it can be used), please see the Azure Synapse Analytics documentation.
Most users of Azure Synapse Analytics work within a team, so source control is needed to manage multiple people writing code and building processes in the service. Thankfully, Synapse Studio includes built-in support for two of the most popular Git service providers: GitHub and Azure DevOps.
This video shows how to connect your Synapse Analytics workspace to a GitHub repository using the Azure Synapse Studio tool.
The steps required include:
- You’ll need a repository in your GitHub account, and you’ll need to know your GitHub username and the name of the repository in order to configure the connection in Synapse Studio. You’ll also need to know how your team handles branches, specifically which branch is used for team collaboration.
- From Synapse Studio click the down arrow where by default it says Synapse live (this is viewable when in the Data, Develop, Integrate or Manage hubs [not Home or Monitor]).
- Select Set up code repository and then select the type of repository (for this example, we’re using GitHub).
- Type in the GitHub account name. In this example the account is OrrinEdenfield. You may be prompted here to authenticate to GitHub. This ensures that Synapse, acting on behalf of you (the author using Synapse Studio), has the proper permissions to the repository.
- Click Continue.
- Synapse will present a drop down list of repositories under the account or you can use the repository link. Select the repository that you’d like to use for the Synapse Analytics project.
- Next, set the collaboration branch. This is the branch where you and your team will be collaborating on code. It’s recommended to not allow direct check-ins to the collaboration branch. This restriction can help prevent bugs, as every check-in will go through a pull request review process described in Creating feature branches. If you just initialized your repository in GitHub, you may need to create a collaboration branch. You can do that here in this configuration pane: click the down arrow and select Create new.
- If you’ve already saved some scripts, notebooks, or pipelines in Synapse Studio and you want to make sure they are committed to the repository, be sure that Import existing resources to repository is checked.
- Click Apply.
- Synapse Studio will now ask you which branch you want to work from (and will ask every time it is loaded or opened). In this case I’ll select the collaboration branch, and then I can begin working with my team.
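The values collected in the steps above can be written down before opening Synapse Studio. The following is only a minimal sketch in Python for keeping them in one place and checking that none are missing. The class and field names are hypothetical and are not part of any Synapse API, the repository name synapse-demo is a made-up placeholder, and main as the collaboration branch is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GitHubRepoSettings:
    """Values entered in the Synapse Studio repository dialog (hypothetical names)."""

    account: str                            # GitHub account name
    repository: str                         # repository name under that account
    collaboration_branch: str               # branch the team collaborates on
    import_existing_resources: bool = True  # mirrors the import checkbox

    def __post_init__(self) -> None:
        # Reject blank values so a missing setting is caught early.
        for field_name in ("account", "repository", "collaboration_branch"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")

    def repository_url(self) -> str:
        # The repository link that can be used instead of the drop-down list.
        return f"https://github.com/{self.account}/{self.repository}"


settings = GitHubRepoSettings(
    account="OrrinEdenfield",
    repository="synapse-demo",    # hypothetical repository name
    collaboration_branch="main",  # assumption: the team collaborates on main
)
print(settings.repository_url())
```

The printed link is the kind of value that can be pasted into the repository link option in the dialog.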
Once Synapse Studio publishes changes, you will need to create a pull request (there is a link in the drop-down menu next to the branch name) to merge the changes into the main branch or other branches. At this point you would follow your regular process to review the changes, make any further commits, and complete the pull request.
Additional resources on Synapse and GitHub include:
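The pull request can also be opened outside Synapse Studio. As a rough sketch, the Python below builds (but does not send) a request to the GitHub REST API endpoint for creating a pull request. The function name, the repository name synapse-demo, the branch name feature/new-pipeline, and the token value are all assumptions for illustration. A real token would need permission to open pull requests on the repository.

```python
import json
import urllib.request


def build_pull_request(owner, repo, head, base, title, token):
    """Build a POST request that would open a pull request from head into base."""
    url = f"https://api.github.com/repos/{owner}/{repo}/pulls"
    payload = {"title": title, "head": head, "base": base}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_pull_request(
        owner="OrrinEdenfield",
        repo="synapse-demo",          # hypothetical repository name
        head="feature/new-pipeline",  # hypothetical working branch
        base="main",                  # assumption: main is the target branch
        title="Add new pipeline",
        token="YOUR_TOKEN_HERE",      # placeholder, not a real token
    )
    # Nothing is sent here; passing the request to urllib.request.urlopen
    # would submit it to GitHub.
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to inspect the target branch and title before anything reaches GitHub.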
- Azure Quickstart Template for Synapse Analytics
- Azure Synapse – Import Sample Data in 5 Minutes! – YouTube
- Continuous integration and delivery for Synapse workspace – Microsoft Docs
- Understanding the GitHub flow · GitHub Guides
- About continuous integration – GitHub Docs
- Azure Synapse Analytics Documentation – Microsoft Docs
Happy Synapsing with GitHub!