Job Description
- Develop and maintain data pipelines (primary responsibility)
- Maintain, expand and improve our ETL process and computational infrastructure
- Work with different data sources (web crawling and data feeds), processing and transforming them during the ETL process
- Ensure and sustain data quality through error detection and report generation
- Be responsible for initiatives related to building, maintaining and orchestrating all components of the data platform
Requirements
- Familiar with search engines (Elasticsearch or Solr)
- Familiar with web crawling (Scrapy, etc.)
- Experience with workflow management tools (Airflow, NiFi, etc.)
- Passion for software development and commitment to delivering outstanding work
- Familiar with the Linux operating system
- Good programming skills in Python
Preferred Qualifications
- Experience with massive data processing platforms (Hadoop, Spark, etc.)
Employee Benefits
Statutory Benefits
Other Benefits
Salary Range