Can AWS S3 be used as a SQL/NoSQL table? Yes, by using tags

Maurya Allimuthu
2 min read · Dec 26, 2019

AWS S3 object / file

AWS S3: short intro

AWS S3 is the ‘Simple Storage Service’, where data is stored as files, which S3 calls objects. The two familiar terms are 1) buckets and 2) objects. Please refer to the links below for additional information: a) https://medium.com/faun/what-is-amazon-s3-91b0480dedcc b) https://medium.com/@me.sanjeev3d/amazon-s3-4b2ae15f6c4d c) https://medium.com/@yjhyjhyjh0/aws-s3-overview-38bca96047b0

Data format of S3 objects

S3 objects are usually json, jl, csv, text, zip, jpeg, png, xlsx, and similar files. Please refer to the Data Format section in https://streamsets.com/documentation/datacollector/latest/help/datacollector/UserGuide/Destinations/AmazonS3.html for more information.

Concept of using S3 objects as a table

Whatever the kind of file or object, the key idea is to use the tag names (keys) of an S3 object as the table columns and the tag values as the column values.
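To make the mapping concrete, here is a minimal sketch that flattens one object's tags into a table-like row. The `TagSet` shape (a list of `Key`/`Value` pairs) is how S3 returns object tags; the function name `tagset_to_row`, the `object_key` column, and the sample tags are assumptions made up for illustration.

```python
from typing import Dict, List


def tagset_to_row(object_key: str, tag_set: List[Dict[str, str]]) -> Dict[str, str]:
    """Flatten an S3 TagSet into one row: tag keys become columns, tag values become cells."""
    row = {"object_key": object_key}  # hypothetical extra column identifying the object
    for tag in tag_set:
        row[tag["Key"]] = tag["Value"]
    return row


# Hypothetical tags on a hypothetical object, in the shape S3 uses for a TagSet.
example_tag_set = [
    {"Key": "customer", "Value": "acme"},
    {"Key": "status", "Value": "processed"},
]

row = tagset_to_row("invoices/2019/001.json", example_tag_set)
print(row)
```

Each object in a bucket then corresponds to one row, and the set of tag keys in use corresponds to the columns.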

Tags of an AWS S3 object

Here you can see the keys, which serve as the table columns, and the values, which act as the respective column values. Both the keys and the values can be changed dynamically. While parsing or reading the S3 objects inside a bucket, a program can refer to just the metadata attributes such as tags and need not open or download the file from the AWS cloud.
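A rough sketch of such a tag-only read, assuming boto3 and already configured AWS credentials, could look like the following. `iter_tag_rows` and `where` are hypothetical helper names, and the sample rows stand in for real bucket contents so that the filter can be tried without an AWS account.

```python
from typing import Dict, Iterable, Iterator, List


def iter_tag_rows(bucket: str, prefix: str = "") -> Iterator[Dict[str, str]]:
    """Yield one row per object, built only from its tags (the object body is never downloaded)."""
    import boto3  # lazy import: the filter below also works on plain dicts

    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            tag_set = s3.get_object_tagging(Bucket=bucket, Key=obj["Key"])["TagSet"]
            row = {tag["Key"]: tag["Value"] for tag in tag_set}
            row["object_key"] = obj["Key"]
            yield row


def where(rows: Iterable[Dict[str, str]], **conditions: str) -> List[Dict[str, str]]:
    """Keep rows whose columns equal the given values, like a simple SQL WHERE ... AND ..."""
    return [
        row
        for row in rows
        if all(row.get(col) == val for col, val in conditions.items())
    ]


# Sample rows in the shape iter_tag_rows() yields (made-up data).
sample_rows = [
    {"object_key": "a.json", "customer": "acme", "status": "processed"},
    {"object_key": "b.json", "customer": "acme", "status": "pending"},
    {"object_key": "c.json", "customer": "globex", "status": "processed"},
]
processed = where(sample_rows, status="processed")
print([row["object_key"] for row in processed])
```

Against a real bucket, the call would be along the lines of `where(iter_tag_rows("my-bucket"), status="processed")`, where `my-bucket` is a placeholder. Note that this sketch issues one tagging request per object, so a scan over a large bucket costs many requests even though no file is downloaded.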

Code to insert / add tags

Please refer to the code below to add / insert tags and create an object in S3
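The following is a minimal sketch of how this can be done with boto3; it is not the original listing. The helper names, bucket, key, and tag values are assumptions for illustration. In boto3, `put_object` takes the tags as a URL-encoded query string, while `put_object_tagging` takes a `TagSet` list and replaces all existing tags on the object.

```python
from typing import Dict
from urllib.parse import urlencode


def encode_tags(tags: Dict[str, str]) -> str:
    """Encode tags as the URL query string that put_object expects in Tagging."""
    return urlencode(tags)


def create_tagged_object(bucket: str, key: str, body: bytes, tags: Dict[str, str]) -> None:
    """Create an object and attach its tags in the same request."""
    import boto3  # lazy import so encode_tags can be used without boto3 installed

    s3 = boto3.client("s3")
    s3.put_object(Bucket=bucket, Key=key, Body=body, Tagging=encode_tags(tags))


def replace_tags(bucket: str, key: str, tags: Dict[str, str]) -> None:
    """Overwrite the whole tag set of an existing object (like updating a row)."""
    import boto3

    s3 = boto3.client("s3")
    tag_set = [{"Key": k, "Value": v} for k, v in tags.items()]
    s3.put_object_tagging(Bucket=bucket, Key=key, Tagging={"TagSet": tag_set})


# Hypothetical tags; printing the encoded form needs no AWS access.
tagging = encode_tags({"customer": "acme", "status": "processed"})
print(tagging)
```

With credentials in place, a call such as `create_tagged_object("my-bucket", "invoices/001.json", b"{}", {"customer": "acme"})` would insert one "row"; the bucket and key here are placeholders.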

Known caveats

  1. An S3 object cannot have more than 10 tags
  2. An S3 bucket cannot have more than 50 tags
  3. An S3 bucket name cannot contain underscores “_”
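These limits can be checked on the client side before calling S3. The sketch below covers the first and third caveats only; the function names and error messages are assumptions made for illustration, not part of any AWS SDK.

```python
from typing import Dict

MAX_TAGS_PER_OBJECT = 10  # caveat 1: at most 10 tags, i.e. 10 "columns", per object


def check_object_tags(tags: Dict[str, str]) -> None:
    """Raise ValueError if the tag set would exceed the per-object limit."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(
            f"{len(tags)} tags given, but an S3 object allows at most {MAX_TAGS_PER_OBJECT}"
        )


def check_bucket_name(name: str) -> None:
    """Raise ValueError if the bucket name contains an underscore (caveat 3)."""
    if "_" in name:
        raise ValueError(f"bucket name {name!r} must not contain underscores")


# Ten tags and a hyphenated name pass; neither call raises.
check_object_tags({f"col{i}": "x" for i in range(10)})
check_bucket_name("my-table-bucket")
```

In practice the 10-tag limit means a "table" built this way can have at most 10 columns per object.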
