Job Description:
1. Develop, operate, and monitor ETL pipelines for data collection, cleansing, processing, storage, and analytics (see the sketch after this list)
2. Develop data visualizations that track data usage and server status and surface data insights
3. Automate data-processing tasks and support data analysis tasks
4. Apply existing algorithms and implement data-mining strategies to achieve our end goal
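
Below is a minimal sketch of the kind of Airflow ETL pipeline this role involves, assuming Airflow ≥ 2.4; the DAG id, task logic, and sample records are hypothetical placeholders, not part of any actual production pipeline.

```python
# Illustrative ETL DAG: extract raw records, then cleanse them in a
# downstream task. All names and data here are hypothetical.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract(**context):
    # Placeholder: pull raw records from a source system.
    return [{"user_id": 1, "events": 42}, {"user_id": 2, "events": 0}]


def transform(ti, **context):
    # Placeholder: cleanse records produced by the extract task.
    rows = ti.xcom_pull(task_ids="extract")
    return [r for r in rows if r["events"] > 0]


with DAG(
    dag_id="daily_usage_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)

    # Run extract before transform.
    extract_task >> transform_task
```
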
Required Experience & Skills:
1. At least 2 years of experience building and operating large-scale distributed systems or applications
2. At least 2 years of experience processing data with Python (scikit-learn); see the sketch after this list
3. Experience with implementing highly scalable systems on top of cloud infrastructures, such as AWS or GCP
4. Expertise in Linux/Unix environments and familiar with shell scripting
5. Hands-on expertise in ETL job design and in developing data-processing components and modules with Airflow
6. Great communication skills and a can-do attitude
7. Desired skills: Python (scikit-learn), Linux/Unix, AWS, MongoDB/PostgreSQL, Airflow, BigQuery
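
As a flavor of the Python (scikit-learn) data-processing experience listed above, here is a minimal sketch; the toy dataset and model choice are illustrative assumptions, not a prescribed stack.

```python
# Illustrative scikit-learn workflow: split data, fit a classifier,
# and report held-out accuracy. The dataset is a built-in toy set
# standing in for production data.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load a toy dataset in place of real production data.
X, y = load_iris(return_X_y=True)

# Hold out a test split to estimate generalization.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit a simple classifier and evaluate on the held-out split.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
print(f"Accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```
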
Bonus:
1. Experience with data mining and machine learning algorithms is a big plus