AWS S3

Before deploying the Denodo Lakehouse Accelerator (formerly known as Denodo Embedded MPP) on Amazon EKS, check the Denodo Lakehouse Accelerator AWS Checklist to make sure you have everything you need.

There are two options to deploy a Denodo Lakehouse Accelerator that will access AWS S3 datasets:

  1. Recommended: do not pass any AWS S3 credentials to the kubectl create secret command.

    Use this option when the Denodo Lakehouse Accelerator runs in Amazon EKS and accesses S3 through one of these three methods:

    1. EKS Pod Identities

      To do this, set the Kubernetes serviceAccount name provided by your EKS administrator in values.yaml, e.g.:

      serviceAccount:
        create: true
        name: "pod-identity-service-account"
        annotations: {}
      
    2. IAM Roles for Service Accounts

      To do this, associate an IAM role with the serviceAccount through serviceAccount.annotations in values.yaml, e.g.:

      serviceAccount:
        create: true
        annotations:
          eks.amazonaws.com/role-arn: arn:aws:iam::<awsaccountid>:role/<role>
      

      The Denodo Lakehouse Accelerator will then access S3 using the permissions configured in that IAM role.

    3. IAM EC2 instance profile

  2. Provide the AWS access key ID and secret access key to the kubectl create secret command:

    ENV Variable             Description
    AWS_ACCESS_KEY_ID        AWS access key ID
    AWS_SECRET_ACCESS_KEY    AWS secret key

    kubectl create secret generic mpp-credentials \
      --from-literal=METASTORE_DB_PASSWORD=hive \
      --from-literal=AWS_ACCESS_KEY_ID=awsaccesskeyid \
      --from-literal=AWS_SECRET_ACCESS_KEY=awssecretaccesskey
    

    You must also set the objectStorage.aws.securityCredentials.enabled property to true in values.yaml.

    Then run the helm install command:

    helm install lakehouseaccelerator lakehouseaccelerator/
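
The two options above differ only in what goes into the secret and into values.yaml. The following is a minimal sketch of that difference, not the official procedure: it assumes you keep your changes in separate override files passed to helm with -f instead of editing values.yaml directly. The file names values-irsa.yaml and values-static-keys.yaml are hypothetical, the IRSA variant stands in for option 1, and the account ID and role remain placeholders. The cluster commands are printed rather than executed.

```shell
#!/bin/sh
set -eu

# Option 1 (IRSA variant): no AWS keys in the secret; the IAM role is
# attached through a serviceAccount annotation. Placeholders are kept as-is.
cat > values-irsa.yaml <<'EOF'
serviceAccount:
  create: true
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::<awsaccountid>:role/<role>
EOF

# Option 2: static keys live in the secret, so the chart is told to use them.
cat > values-static-keys.yaml <<'EOF'
objectStorage:
  aws:
    securityCredentials:
      enabled: true
EOF

# Print the commands for each option; run the ones you need by hand.
cat <<'EOF'
# Option 1: secret without AWS credentials
kubectl create secret generic mpp-credentials \
  --from-literal=METASTORE_DB_PASSWORD=hive
helm install lakehouseaccelerator lakehouseaccelerator/ -f values-irsa.yaml

# Option 2: secret with AWS credentials
kubectl create secret generic mpp-credentials \
  --from-literal=METASTORE_DB_PASSWORD=hive \
  --from-literal=AWS_ACCESS_KEY_ID=awsaccesskeyid \
  --from-literal=AWS_SECRET_ACCESS_KEY=awssecretaccesskey
helm install lakehouseaccelerator lakehouseaccelerator/ -f values-static-keys.yaml
EOF
```

Keeping the overrides in small files makes it easy to see which settings belong to which option; Helm merges them over the chart's own values.yaml.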
    

AWS Privileges for AWS S3

The AWS privileges required by the Denodo Lakehouse Accelerator when accessing the AWS S3 buckets are:

  • Reading from AWS S3:

    • s3:GetObject

    • s3:ListBucket

  • Writing to AWS S3: the same privileges as for reading, plus:

    • s3:PutObject

    • s3:DeleteObject

Important

AWS credentials provider.

The Denodo Lakehouse Accelerator ships with a default credentials provider chain, DenodoAWSCredentialsProviderChain.

This chain looks for AWS credentials in this order:

  • SimpleAWSCredentialsProvider: Loads credentials from fs.s3a.access.key and fs.s3a.secret.key properties in Hadoop configuration files.

  • EnvironmentVariableCredentialsProvider: Loads credentials from the environment variables AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY).

  • SystemPropertiesCredentialsProvider: Loads credentials from Java system properties aws.accessKeyId and aws.secretKey.

  • WebIdentityTokenCredentialsProvider: Loads Web Identity Token credentials from the environment or container.

  • ProfileCredentialsProvider: Loads credentials from the profiles file at the default location, ~/.aws/credentials.

  • EC2ContainerCredentialsProviderWrapper: Loads credentials from EC2, typically using the InstanceProfileCredentialsProvider.
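
Because the first provider in the chain that yields credentials wins, it can help to check which source is populated in a given environment. The sketch below is a rough diagnostic only, not part of the product: it inspects just the sources visible from a shell (environment variables, the web identity token variables AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN, and the profiles file) and cannot see Hadoop configuration files or Java system properties. The function name and the labels it prints are hypothetical.

```shell
#!/bin/sh

# Report the first shell-visible credential source, following the chain order.
# Hadoop configuration and Java system properties are not checked.
first_credential_source() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] &&
     [ -n "${AWS_SECRET_KEY:-${AWS_SECRET_ACCESS_KEY:-}}" ]; then
    echo "environment-variables"
  elif [ -n "${AWS_WEB_IDENTITY_TOKEN_FILE:-}" ] && [ -n "${AWS_ROLE_ARN:-}" ]; then
    echo "web-identity-token"
  elif [ -f "${HOME:-/nonexistent}/.aws/credentials" ]; then
    echo "profile-file"
  else
    # Nothing found earlier: the chain would fall through to the
    # container or EC2 instance profile credentials.
    echo "container-or-instance-profile"
  fi
}

first_credential_source
```

Running it inside a pod of the cluster gives a quick hint about whether static keys from the secret would shadow an IAM role you expected to be used.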

If none of these providers fits your needs, change the credentials provider configured in:

  1. presto/conf/catalog/core-site.xml:

    1. Replace the value of the presto.s3.credentials-provider property with the AWS credentials provider of your choice.

    2. Include any other properties required by this credentials provider.

  2. hive-metastore/conf/core-site.xml:

    1. Replace the value of the fs.s3a.aws.credentials.provider property with the AWS credentials provider of your choice.

    2. Include any other properties required by this credentials provider.
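
To make the second case concrete, the sketch below writes a fragment for hive-metastore/conf/core-site.xml. It assumes, purely as an example, that you switch to Hadoop's org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider, which needs a session token in addition to the access and secret keys. The output file name core-site-fragment.xml and all values are placeholders; merge the properties into the existing file by hand.

```shell
#!/bin/sh
set -eu

# Write the properties to a scratch file (hypothetical name) for manual merging.
cat > core-site-fragment.xml <<'EOF'
<!-- Merge into the <configuration> element of hive-metastore/conf/core-site.xml -->
<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider</value>
</property>
<!-- Additional properties required by this particular provider (placeholders) -->
<property>
  <name>fs.s3a.access.key</name>
  <value>ACCESS_KEY_PLACEHOLDER</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>SECRET_KEY_PLACEHOLDER</value>
</property>
<property>
  <name>fs.s3a.session.token</name>
  <value>SESSION_TOKEN_PLACEHOLDER</value>
</property>
EOF

echo "Wrote core-site-fragment.xml"
```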
