Senior Data Engineer | Remote-Friendly
At Velotio, we embrace a remote-friendly work culture where everyone has the flexibility to work either remotely or from our office in Pune.
Join us and work from wherever you feel most productive!
About Velotio:
Velotio Technologies is a product engineering company working with innovative startups and enterprises. We are a certified Great Place to Work® and recognized as one of the best companies to work for in India. We have provided full-stack product development for 110+ startups across the globe, building products in the cloud-native, data engineering, B2B SaaS, IoT, and Machine Learning space. Our team of 325+ elite software engineers solves hard technical problems while transforming customer ideas into successful products.
Requirements
- Design and build scalable data infrastructure with efficiency, reliability, and consistency to meet rapidly growing data needs
- Build the applications required for optimal extraction, cleaning, transformation, and loading data from disparate data sources and formats using the latest big data technologies
- Build ETL/ELT pipelines and work with other data infrastructure components, like Data Lakes, Data Warehouses, and BI/reporting/analytics tools
- Work with various cloud services like AWS, GCP, and Azure to implement highly available, horizontally scalable data processing and storage systems, and automate manual processes and workflows
- Implement processes and systems to monitor data quality, ensuring data is always accurate, reliable, and available to the stakeholders and other business processes that depend on it
- Work closely with different business units and engineering teams to develop a long-term data platform architecture strategy and thus foster data-driven decision-making practices across the organization
- Help establish and maintain a high level of operational excellence in data engineering
- Evaluate, integrate, and build tools to accelerate Data Engineering, Data Science, Business Intelligence, Reporting, and Analytics as needed
- Practice test-driven development by writing unit and integration tests
- Contribute to design documents and the engineering wiki
You will enjoy this role if you…
- Like building elegant, well-architected software products with enterprise customers
- Want to learn to leverage public cloud services & cutting-edge big data technologies, like Spark, Airflow, Hadoop, Snowflake, and Redshift
- Work collaboratively as part of a close-knit team of geeks, architects, and leads
Desired Skills & Experience:
- 3-5 years of data engineering experience, or equivalent knowledge and ability
- 3 years of software engineering experience, or equivalent knowledge and ability
- Strong proficiency in at least one of the following programming languages: Python, Scala, or Java
- Experience designing and maintaining at least one type of database (Object Store, Columnar, In-memory, Relational, Tabular, Key-Value Store, Triple-store, Tuple-store, Graph, and other related database types)
- Good understanding of star/snowflake schema designs
- Extensive experience working with big data technologies like Spark, Hadoop, and Hive
- Experience building ETL/ELT pipelines and working on other data infrastructure components like BI/reporting/analytics tools
- Experience working with workflow orchestration tools like Apache Airflow, Oozie, Azkaban, NiFi, Airbyte, etc.
- Experience building production-grade data backup/restore strategies and disaster recovery solutions
- Hands-on experience implementing batch and stream data processing applications using technologies like AWS DMS, Apache Flink, Apache Spark, AWS Kinesis, Kafka, etc.
- Hands-on experience implementing big data solutions with tools like Snowflake, Databricks, etc.
- Knowledge of best practices in developing and deploying applications that are highly available and scalable
- Experience with or knowledge of Agile Software Development methodologies
- Excellent problem-solving and troubleshooting skills
- Process-oriented with excellent documentation skills
Bonus points if you:
- Have hands-on experience using one or more cloud service providers like AWS, GCP, and Azure, and have worked with specific products like EMR, Glue, Dataproc, Databricks, Snowpark, Data Studio, etc.
- Have hands-on experience working with one of Redshift, Snowflake, BigQuery, Databricks, Azure Synapse, or Athena, and understand the inner workings of these cloud data platforms
- Have experience building data lakes, scalable data warehouses, and data marts
- Are familiar with tools like Jupyter Notebooks, Pandas, NumPy, SciPy, scikit-learn, Seaborn, SparkML, etc.
- Have experience building and deploying Machine Learning models to production at scale
- Have experience with System Design and Architecture
- Possess excellent cross-functional collaboration and communication skills
Benefits
Our Culture:
- We have an autonomous and empowered work culture encouraging individuals to take ownership and grow quickly
- Flat hierarchy with fast decision making and a startup-oriented “get things done” culture
- A strong, fun & positive environment with regular celebrations of our success. We pride ourselves on creating an inclusive, diverse & authentic environment
Note: Currently, all interviews and onboarding processes at Velotio are being carried out remotely through virtual meetings.