Experience optimizing ETL workflows for large, unruly datasets.
Developed MapReduce programs to parse raw data, populate staging tables, and store the refined data in partitioned tables in the data warehouse (DW).
Hands-on experience with Linux systems.
Big Data Ecosystems: Hadoop, MapReduce, HDFS, Spark, Cascading, Hive, Sqoop
Programming Languages: Java, Scala, C
Databases: MySQL, Oracle
Services: Azure