Access S3 Tables with QuickSight
Posted: | Tags: aws til athena cloud s3I’ve been exploring AWS analytics services again so it’s time for another TIL. This post goes over accessing and querying S3 tables from QuickSight, via Athena, as S3 tables don’t natively work with QuickSight.
Here’s the TL;DR on what you need to do:
- Create your S3 Tables bucket, namespace and table.
- Ensure that your S3 Tables bucket has AWS analytics service integration enabled.
- Grant your QuickSight user permission to the S3 table via LakeFormation and your QuickSight service role permisisons through IAM.
- Add Athena as a data source in QuickSight.
- Add a dataset from the previously created data source and use custom SQL to query your S3 Tables.
- Verify you can use your table dataset from an analysis.
Those are a lot of steps and here are some AWS resources that explain them:
Tutorial: Getting started with S3 Tables: This will cover steps 1 and 2, and set you up to test your S3 Tables within Athena. This will come in handy when you want to troubleshoot issues with QuickSight especially if you haven’t used S3 Tables before.
Visualizing table data with Quick Suite: This theoretically should be all you need, but here I am writing this post because I didn’t find it clear enough.
Analyzing Amazon S3 Metadata with Amazon Athena and Amazon QuickSight: S3 Metadata uses S3 Tables and this blog provides images and instructions on setting up the Athena data source, dataset and granting permissions.
If you know your way around these services this should be all you need, but I’ve also described each step below in more detail in case you encounter any issues or future me needs help again.
Creating the S3 Table
My steps don’t differ from the AWS S3 Tables tutorial listed above.
Create an S3 table bucket via the console and ensure Integration with AWS analytics services is enabled. This step could be done via the CLI but you may still need to use the console to enable the analytics integration, as I cannot find that option in the CLI documentation.
Create a namespace in the table bucket.
aws s3tables create-namespace \
--table-bucket-arn arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket \
--namespace my_namespace
- Create a table in the namespace with your definition in a JSON file.
aws s3tables create-table --cli-input-json file://tabledefinition.json
Here’s my tabledefinition.json file:
{
"tableBucketARN": "arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket",
"namespace": "my_namespace",
"name": "my_table",
"format": "ICEBERG",
"metadata": {
"iceberg": {
"schema": {
"fields": [
{"name": "id", "type": "int","required": true},
{"name": "name", "type": "string"},
{"name": "some_int", "type": "int"},
{"name": "a_decimal", "type": "decimal(38,19)"}
]
}
}
}
}
The AWS tutorial now goes into granting permissions using LakeFormation for Athena and running a sample query to test everything works. I’d recommend giving this a go, if you haven’t done so already, since it would help with troubleshooting in case something doesn’t work later on with QuickSight.
Okay, that’s steps 1 and 2 done.
Granting access to QuickSight
There are two sets of access that need to be sorted here, IAM and LakeFormation.
First, the QuickSight service role needs to be able to get Glue catalogs. Find your QuickSight service role by clicking the user icon in the top right in the Quick Suite console. Then click Manage Quick Suite, followed by AWS Resources, note down the IAM role used.
Now, from the AWS IAM console find the role and add a policy to allow
glue:GetCatalogon all resources. This is described in the AWS documentation.
Next, to grant access to your QuickSight user through LakeFormation. This is also described in the AWS blog post.
- Find your QuickSight user using the CLI and then copy the ARN.
aws quicksight list-users --aws-account-id <AWS Account ID> --namespace default --region <AWS Region>
- From the LakeFormation console select Data permissions, then Grant. On the new page select SAML users and groups for the Principal enter the QuickSight user ARN. Next, select Named Data Catalog resources, under Catalog and select your S3 table bucket, for Databases select your namespace and for Tables select your S3 table. Give SELECT and GRANT permissions to your QuickSight user and click Grant.
We’ve now completed step 3.
Setting up QuickSight
In order to use S3 Tables you will first need to set up an Athena data source and then using that create a dataset that queries the table.
Within Amazon QuickSight, click Datasets in the left hand column then Data sources. From here click Create data source.
From the new dialog box enter a name for the data source and pick an Athena workgroup, if you’ve created a new one. You can now validate the connection and click Create data source when complete.
Now click Datasets, then Create dataset.
From the table in the dialog box select the newly created data source and click Create.
In the Choose your table dialog box click Use custom SQL, name your query something, such as the table name and enter the following SQL query:
SELECT * FROM "s3tablescatalog/<table-bucket-name>"."<namespace>"."<table-name>"
- Select Confirm query and in the new dialog box you may select Directly query your data or Import to SPICE for quicker analytics. Keep in mind SPICE may cost you extra depending on your available capacity. I selected to directly query my table and took the performance hit in my experiments. Once you’ve made your decision select Visualize. This will take you to an analysis to begin building your dashboard.
That’s it! We’ve now completed steps 4, 5, and 6. Now go build but don’t forget to watch the bill.