Job Description
Hadoop JD:
Minimum 5 years of experience, including at least 4 years in real-time Big Data implementation.
Responsibilities:
• Design and implement end-to-end Big Data solutions using Hive, Spark, Scala, and Python
• Build libraries, user-defined functions, and frameworks around Hadoop (a minimal UDF sketch follows the lists below)
• Research, evaluate, and adopt new technologies, tools, and frameworks in the Hadoop ecosystem
• Develop user-defined functions to provide custom Hive, HDFS, Kafka, and Spark capabilities
• Develop pipelines using PrestoSQL and/or Dremio
• Define and build data acquisition and consumption strategies
• Define and develop best practices
• Work with support teams to resolve operational and performance issues
• Work with architecture/engineering leads and other teams on capacity planning
• Work with the Site-Operations team on configuration and upgrades of the cluster

Requirements:
• Strong understanding of Hadoop internals
• Experience with Big Data ingestion tools such as Apache NiFi, MiNiFi, Azure Data Factory, StreamSets, or equivalent
• Experience with relational databases such as SQL Server, Oracle, and DB2
• Experience with performance and scalability tuning, algorithms, and computational complexity
• Experience with (or at least familiarity with) data warehousing, dimensional modeling, and ETL development
• Ability to read ERDs and relational database schemas
• Experience with open-source NoSQL technologies such as HBase and Cassandra
• Experience with messaging and complex event processing systems such as Kafka and Storm (a streaming sketch also follows below)
• Familiarity with machine learning frameworks (nice to have)
• Exposure to a cloud platform such as AWS, Azure, or equivalent is desired
• Proven ability to work with cross-functional teams to deliver appropriate resolutions
• Excellent communication skills
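As a concrete illustration of the UDF work described above, here is a minimal sketch of a Spark user-defined function in Scala, assuming Spark 3.x on the classpath. The object, function, and column names (UdfSketch, domainOf, email) are hypothetical, chosen only for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

object UdfSketch {
  def main(args: Array[String]): Unit = {
    // Local session for the sketch; a real job would run on the cluster.
    val spark = SparkSession.builder()
      .appName("udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Toy input; in production this would be a Hive table or an HDFS path.
    val df = Seq("alice@example.com", "bob@example.org").toDF("email")

    // A user-defined function that extracts the mail domain from an address.
    val domainOf = udf((email: String) => email.split("@").last)

    df.withColumn("domain", domainOf(col("email"))).show(truncate = false)

    spark.stop()
  }
}
```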
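For the real-time Kafka requirement, a minimal Spark Structured Streaming read might look like the sketch below. It assumes the spark-sql-kafka-0-10 connector is available; the broker address (localhost:9092), topic name (events), and checkpoint path are placeholders, not part of the role's actual environment.

```scala
import org.apache.spark.sql.SparkSession

object KafkaStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-stream-sketch")
      .master("local[*]")
      .getOrCreate()

    // Subscribe to a Kafka topic; broker and topic names are placeholders.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()

    // Kafka records arrive as binary key/value; cast the payload to text.
    val values = stream.selectExpr("CAST(value AS STRING) AS payload")

    // Console sink for the sketch; a production pipeline would land in
    // HDFS, Hive, or another sink with a durable checkpoint location.
    val query = values.writeStream
      .format("console")
      .option("checkpointLocation", "/tmp/kafka-stream-sketch")
      .start()

    query.awaitTermination()
  }
}
```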