Sergey Dmitriev, Developer in Seattle, United States

Sergey Dmitriev

Verified Expert in Engineering

Software Architect and Developer

Location
Seattle, United States
Toptal Member Since
June 18, 2020

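Sergey is a senior data management professional, solution architect, and cloud architect with over 20 years of experience developing data-intensive applications and building and leading technical teams to deliver challenging software development and migration projects. Sergey is proficient in all aspects of software design and development and has demonstrated expertise in application delivery planning, design, and development.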

Portfolio

Dropbox
SQL, Python, Apache Airflow, Spark, Spark SQL, Apache Hive...
Facebook
SQL, Python, Apache Hive, Presto, Spark SQL, Spark, Database Development, ETL...
Gradient
Google Cloud, Apache Airflow, Python, Google BigQuery...

Experience

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Google Cloud Platform (GCP), Spark, PySpark, Python, Apache Hive, Databricks, Snowflake, Apache Airflow, Fivetran

The most amazing...

...project I've completed involved anomaly detection for an authorization service (Facebook, data infrastructure security).

Work Experience

Staff Data Engineer

2021 - PRESENT
Dropbox
  • Implemented a reliable integration with Google Ads Services.
  • Migrated data pipelines from a legacy platform to Spark and Airflow.
  • Consolidated legacy data platform elements into a strategic one.
Technologies: SQL, Python, Apache Airflow, Spark, Spark SQL, Apache Hive, Database Development, ETL, Database Design, Amazon S3 (AWS S3), Amazon EC2, Git, Databases, Data Modeling, Software Architecture, Shell Scripting, PySpark, Linux, Data Integration

Staff Data Engineer

2019 - 2021
Facebook
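  • Built an anomaly detection platform for a data platform authorization service.
  • Implemented infrastructure analytics for real-time communication between Instagram and Messenger.
  • Created a unified analytics platform for monitoring the health of real-time communication products.
Technologies: SQL, Python, Apache Hive, Presto, Spark SQL, Spark, Database Development, ETL, Database Design, Databases, Data Modeling, Software Design, Shell Scripting, PySpark, Linux, Data Integration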

Senior Back-end Engineer

2018 - 2019
Gradient
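  • Built a data platform to ingest and process data from a variety of sources.
  • Created data pipelines for Power BI analytics and ML algorithms.
  • Built monitoring and alerting tools to simplify data platform operations.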
Technologies: Google Cloud, Apache Airflow, Python, Google BigQuery, Amazon Web Services (AWS), Database Development, ETL, Database Design, Amazon S3 (AWS S3), Amazon EC2, Git, Databases, Data Modeling, Software Design, Software Architecture, Shell Scripting, Linux, Data Integration

Lead Solution Architect of Data Platforms

2017 - 2018
Amazon Web Services
  • Planned and implemented relational database migrations to AWS.
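  • Designed and implemented data warehouses, data lakes, and operational data stores in AWS.
  • Designed and implemented data pipelines on AWS using SQL and Python.
  • Created data models and database components for databases (Oracle) hosted on AWS RDS and EC2.
  • Optimized the performance of reports and SQL queries for databases (Oracle) hosted in AWS.
  • Created labs, sales demos, and conference events for database migrations to AWS.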
Technologies: Python, SQL, Amazon Web Services (AWS), Relational Databases, Database Development, ETL, Oracle RDBMS, Database Design, Amazon EC2, AWS CloudFormation, Amazon Aurora, Redshift, Pandas, Hadoop, Spark, Git, Databases, Data Modeling, Software Design, Software Architecture, Shell Scripting, PySpark, Linux, Amazon Elastic MapReduce (EMR), Data Integration

Lead Data Architect

2005 - 2017
Deutsche Bank
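  • Consolidated legacy Oracle databases (up to 100TB) on Exadata, including merging models and data, migrating data, and modifying PL/SQL, Shell, and Java code.
  • Optimized the performance of reporting components on Exadata from hours of runtime to seconds.
  • Created data models and database components for an application managing the trade lifecycle of listed derivatives (Oracle, ...,000 transactions per second).
  • Designed and implemented data models and database code for a risk management platform, capturing the risk model parameters of every risk calculation for compliance reporting.
  • Designed and implemented a dynamically configured reporting engine (in PL/SQL) for processing 30TB datasets.
  • Designed and implemented data models and database code for a real-time warehouse for the sales IT department, receiving information from 150+ feeds and applying complicated logic to calculate sales commissions.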
Technologies: Shell Scripting, Databases, SQL, Oracle, Python, Database Development, ETL, Oracle RDBMS, Database Design, Data Modeling, Software Design, Software Architecture, Linux, Data Integration

Senior Database Developer | Database Administrator

2000 - 2005
INIT-S
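  • Designed and developed a document management system and a resource management system for nuclear power plants, managing the entire document workflow of each plant.
  • Enhanced and automated resource management, performed database migrations between platforms (Sybase ASE, MS SQL Server, Oracle), administered database servers, created deployment packages, and consulted customers.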
  • Migrated critical databases from Sybase ASE to Oracle.
Technologies: SQL, Erwin, Oracle, Database Development, ETL, Oracle RDBMS, Database Design, Databases, Data Modeling, Shell Scripting, Data Integration

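Conversion Program for a Core Banking Equity Settlement Application

I rebuilt a platform that used COBOL, C++, and EJB 1.0 components and six databases (100TB on Oracle) into a Java application hosted on an on-premises cloud platform, with a consolidated database on an Oracle Exadata cluster.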

Data Pipelines on Google Compute Cloud

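I built data pipelines that load data from JSON-formatted files in Google Cloud Storage into BigQuery using complex SQL transformation logic, then aggregate the data and load it into a data mart in PostgreSQL (Google SQL) that the application's UI consumes.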

Data Pipelines on AWS

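I used Hadoop as a source for data pipelines and as an execution platform for running Pig, Hive, and Presto. I used Spark in data pipelines to do ETL in batch mode.

My Python experience includes building data pipelines for data warehouse and data science projects. I have also built on-premises and cloud data pipelines and back-end serverless cloud APIs on AWS.

I have also used Spark and Spark SQL for compute-intensive data pipelines in batch mode, so I am very comfortable with Spark and can quickly pick up new use cases.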

Anomaly Detection for Data Platform Access Authorization

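The solution monitors and validates the decisions of a data platform authorization service and identifies suspicious behavior. Such behavior may be caused by service errors or security configuration issues.

Infrastructure Analytics for a Real-time Communication Platform

The project involved collecting logs from infrastructure services and client applications and building data pipelines to create core datasets, metrics, and dashboards. I also built ML pipelines to calculate the results of A/B tests and implemented a tool that attributes metric movements using root cause analysis ML algorithms.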

Languages

SQL, Snowflake, Python, COBOL

Frameworks

Presto, Spark, Hadoop, Apache Thrift

Libraries/APIs

PySpark, Pandas

Tools

Apache Airflow, Erwin, Oracle Exadata, Spark SQL, Amazon Elastic MapReduce (EMR), AWS CloudFormation, Git, Tableau

Paradigms

Database Development, ETL, Database Design

Platforms

Oracle, Amazon Web Services (AWS), Databricks, Google Cloud Platform (GCP), Amazon EC2, Linux, Apache Pig

Storage

Relational Databases, Databases, Oracle RDBMS, Redshift, Amazon S3 (AWS S3), Data Integration, Apache Hive, Amazon Aurora, Google Cloud, PostgreSQL

Other

Data Modeling, Shell Scripting, Google BigQuery, Software Design, Software Architecture, Amazon RDS

1997 - 2003

Master's Degree in Computer Science

Moscow Power Engineering Institute - Moscow, Russia

JULY 2017 - JULY 2019

AWS Certified Solution Architect - Associate

AWS

MARCH 2005 - PRESENT

Oracle Certified Professional (DBA)

Oracle
