Getting acquainted with Azure Synapse SQL Serverless

Azure Synapse Analytics SQL Serverless is a distributed query service for data that lives in your Azure Data Lake. The service gives you access to that data through:

  • A familiar T-SQL syntax to query data in place without the need to copy or load data into another data store
  • Integrated connectivity via the MSSQL driver that has wide support amongst business intelligence and other query tools
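
To make the "query data in place" idea a little more concrete, here is a minimal Python sketch that only builds the text of a T-SQL query using OPENROWSET, plus the host name of the serverless endpoint a SQL Server driver would connect to. The workspace name, storage account, and file path are made-up placeholders, and the helper function names are my own, not part of any Synapse SDK. Nothing here connects to Azure; you would hand the resulting strings to whatever SQL Server client or BI tool you already use.

```python
# Sketch only: builds strings, does not connect to anything.


def serverless_endpoint(workspace_name: str) -> str:
    """Host name of the serverless SQL endpoint for a workspace."""
    return f"{workspace_name}-ondemand.sql.azuresynapse.net"


def openrowset_parquet_query(file_url: str, top: int = 10) -> str:
    """T-SQL that reads Parquet files in place via OPENROWSET."""
    safe_url = file_url.replace("'", "''")  # escape single quotes for T-SQL
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


if __name__ == "__main__":
    # Hypothetical workspace, storage account, and path.
    print(serverless_endpoint("myworkspace"))
    print(openrowset_parquet_query(
        "https://mydatalake.dfs.core.windows.net/data/sales/*.parquet"
    ))
```

The point of the sketch is that the data never moves: the query names the files in the lake directly, and the service reads them when the query runs.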

Synapse SQL Serverless was built for large-scale data processing. The service provides fault-tolerance and enables high reliability even for long-running queries involving very large datasets.

In the first video we walk through some of the capabilities and benefits; in the second video we’ll see some examples of how Synapse SQL Serverless can be used.

Video introduction to Synapse SQL Serverless:

Demo of Synapse SQL Serverless:

Be sure to check out documentation on Azure Synapse SQL Serverless:

Using COPY command with Azure Synapse Analytics SQL Dedicated Pool

Azure Synapse Analytics includes many features and capabilities. Among them is the COPY command, which makes copying data into a data warehouse very easy. This video will walk you through using the COPY command to import data into a data warehouse table for use by data consumers.
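
For readers who want to see roughly what a COPY statement looks like before watching, here is a small Python sketch that assembles one for a CSV file with a header row. The table name, storage URL, and the helper function itself are hypothetical, and I've left out authentication (the CREDENTIAL option) entirely, so treat it as a starting shape and not a complete load script. It only produces the statement text; you would run the result against your dedicated SQL pool with your usual client.

```python
# Sketch only: returns the COPY statement as text, does not execute it.


def copy_into_csv(table: str, source_url: str,
                  first_row: int = 2, field_terminator: str = ",") -> str:
    """Build a COPY INTO statement for loading a CSV file.

    first_row=2 skips a single header line in the file.
    """
    safe_url = source_url.replace("'", "''")          # escape quotes for T-SQL
    safe_term = field_terminator.replace("'", "''")
    return (
        f"COPY INTO {table}\n"
        f"FROM '{safe_url}'\n"
        "WITH (\n"
        "    FILE_TYPE = 'CSV',\n"
        f"    FIRSTROW = {int(first_row)},\n"
        f"    FIELDTERMINATOR = '{safe_term}'\n"
        ");"
    )


if __name__ == "__main__":
    # Hypothetical table and storage location.
    print(copy_into_csv(
        "dbo.Sales",
        "https://mystorage.blob.core.windows.net/landing/sales.csv",
    ))
```

The target table has to exist already; the linked design guidance on distributed and replicated tables covers how to choose its layout.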

Additional Resources:
Azure Synapse Analytics SQL Dedicated Pool Architecture | Microsoft Docs
Distributed tables design guidance – Azure Synapse Analytics | Microsoft Docs
Design guidance for replicated tables – Azure Synapse Analytics | Microsoft Docs
COPY Command Reference | Microsoft Docs

Integration with On-Premises Data Sources in Azure Synapse Analytics

Today many organizations run hybrid cloud environments, so they need to read from and write to on-premises data stores, including file systems and relational databases. In this video I show you how to connect to on-premises file systems and relational databases (like Azure SQL Database Edge) using the Integration Pipelines capabilities of Azure Synapse Analytics and the self-hosted integration runtime.

Notes:
Integration Runtime
Self-Hosted Integration Runtime
Service architecture of Self-Hosted Integration Runtime

Adding source control to Azure Synapse Analytics Studio

For more background & overview information on Azure Synapse Analytics (what it does, how it can be used), please see documentation here.

Most users of Azure Synapse Analytics will use it within a team environment, so source control is needed to handle multiple users writing code and building processes in the service. Thankfully, Synapse Analytics Studio includes built-in support for two of the most popular Git service providers: GitHub and Azure DevOps.

This video shows how to connect your Synapse Analytics workspace to a GitHub repository using the Azure Synapse Studio tool.

Video of Synapse Analytics configuration for GitHub

The steps required include:

  1. You’ll need a repository created in your GitHub account, and you’ll need to know your GitHub username and the name of the repository to configure it in Synapse Studio. You’ll also need to know how your team handles branches, in particular which branches are used for team collaboration.
  2. From Synapse Studio click the down arrow where by default it says Synapse live (this is viewable when in the Data, Develop, Integrate or Manage hubs [not Home or Monitor]).
  3. Select Set up code repository and then select the type of repository (for this example, we’re using GitHub).
  4. Type in the GitHub account name; in this example the account is OrrinEdenfield. You may be prompted here to authenticate to GitHub, which ensures that Synapse (acting as you, the author using Synapse Studio) has the proper permissions to the repository.
  5. Click Continue.
  6. Synapse will present a drop down list of repositories under the account or you can use the repository link. Select the repository that you’d like to use for the Synapse Analytics project.
  7. Next set the collaboration branch. This is the branch where you and your team will be collaborating on code. It’s recommended to not allow direct check-ins to the collaboration branch. This restriction can help prevent bugs, as every check-in will go through a pull request review process described in Creating feature branches. If you just initialized your repository in GitHub, you may need to create a collaboration branch. You can do that here in this configuration: click the down arrow and select Create new.
  8. If you’ve already saved some scripts/notebooks/pipelines in Synapse Studio and you want to make sure they are committed to the repository, be sure that Import existing resources to repository is checked.
  9. Click Apply.
  10. Synapse Studio will now ask you which branch you want to work from (and will do so every time it is loaded/opened). In this case I’ll select the collaboration branch and I can begin working with my team.

Once Synapse Studio publishes changes, you will need to create a pull request (link in the drop down menu by the name of the branch) to merge changes into the main branch or other branches. At this point you would follow your regular process to complete the pull request, review changes, and make commits.

Some additional information regarding Synapse & GitHub:

Happy Synapsing with GitHub!