Data Architect Associate
Req #: 190103990
Location: Bangalore East, KA, IN
Job Category: Operations
Other responsibilities will include the following:
- Collaborate with model developers to understand and prioritize their book of work, identify data requirements, ensure source data availability in the appropriate data environment and engineer “model ready” datasets
- Maintain a customer business focus on outcomes, work across business and functional boundaries, produce exceptional solutions at scale and drive the success of peers and stakeholders
- Participate in ongoing product backlog refinement planning with technology partners (may serve as an area product owner) to drive progress of modeling data needs spanning the full modeling life-cycle
- Manage data across multiple platforms and deployment paths, differentiate between legacy and target state platforms, manage and track “tech debt”
- Contribute to strategic roadmaps for the Business Modeling data domain that describe a sequence of projects to improve management and utility of the data for the business
- Develop data subject matter expertise corresponding to a particular sub-LOB or business function and be familiar with a broad range of cross domain data
- Develop skills and knowledge by learning from fellow team members and reciprocate by sharing knowledge
- Adhere to data use and controls requirements, complete and submit necessary requests for data intake
- Communicate project status, issues, or escalations with manager and team members
- Support legacy processes and platforms while driving toward target state
- Enable the management of data as a corporate asset: define data (metadata), identify systems of record and authoritative sources, create data quality rules, define security requirements, create data flow diagrams, and administer firm-wide principles, standards, and controls
- Create conceptual and logical models to describe a particular domain of data and use these models to inform the physical design of data-related projects
- Identify areas for efficiency across data domains, such as the elimination of duplicate data or platforms
- Profile, wrangle, and prepare data from diverse sources to support modeling efforts in languages such as Python or PySpark
- Conduct business process analysis and identify data needed to support the processes and determine whether the firm’s data is fit for use within a given process
- Conduct research and development with emerging technologies, determine their applicability to business use cases, document and communicate their recommended use in the firm
Qualifications:
- BS or MS in STEM field with quantitative background or equivalent experience/knowledge
- 4+ years of experience with data infrastructure-related efforts
- Knowledge of languages such as Python, PySpark, R, or Java/Scala, as well as database design and query optimization
- Strong knowledge of data structures, algorithms and big data tools (Spark, Hive, HDFS, etc.)
- Understanding of Hadoop-related technologies & their applications
- Excellent command of the SQL language
- Strong design, coding, debugging and analytical skills, especially across the big data ecosystem
- Advanced analytical thinking and problem solving skills
- Deep technical knowledge of data infrastructure practices and tools
- Knowledge of version control tools and processes (e.g. Subversion, Git)
- Technical understanding of common RDBMS platforms (e.g. Teradata, Oracle)
- Ability to deliver high-quality results under tight deadlines and comfort with manipulating and summarizing large quantities of data
- Exceptional communication skills, customer focus and necessary expertise in formal and informal testing techniques as well as quality measurement