Updated 30.11.2025

**** ******** ****
Verified
Premium customer
100% available

Lead Data Engineer

Berlin, Germany
Germany +6
info: Germany, Austria, Switzerland, Belgium, Ireland, Luxembourg, Netherlands
Bachelor of Actuarial Science

Profile attachments

Dylan McCullough CV.pdf

About me

Lead Data Engineer with 6+ years building enterprise platforms. Led teams of 8 engineers delivering lakehouse migrations and multi-tenant architectures. Expert in Databricks, Snowflake, AWS, Azure, and GCP. Track record: 10x performance gains, greenfield CI/CD, and GDPR/HIPAA compliance.

Skills

JavaScript, Artificial Intelligence, Apache Airflow, Aviation, Amazon Web Services, Amazon Elastic Compute Cloud, Amazon S3, Data Analysis, Architecture, Microsoft Azure, Business Intelligence, Google BigQuery, Strategic Management, Cloud Computing, Cloud Storage, Computer Programming, Continuous Integration, Market Segmentation, Data Architecture, Information Engineering, ETL, Data Quality, Data Warehousing, DevOps, dbt, Dimensional Modeling, Finance, GitHub, R (programming language), Healthcare, HIPAA, Scalability, Organizational Structure, Python, Machine Learning, Natural Language Processing, Pharmaceuticals, Visualisation, Query Optimization, Microsoft Power BI, Scala, SOX Compliance, SQL, Stakeholder Management, Streaming, Salesforce Tableau, TypeScript, Google Cloud, Azure Data Factory, Large Language Models, Snowflake, Apache Spark, Amazon RDS, Data Lake, PySpark, scikit-learn, QlikView, Team Management, Apache Kafka, Spark Streaming, Data Management, GDPR, API Design, Terraform, Azure Synapse Analytics, Looker Analytics, Jenkins, Amazon Redshift, Databricks, Programming Languages
Lead Data Engineer / Data Architect
Lead Data Engineer with 6+ years building enterprise platforms for Johnson & Johnson, Aer Lingus, and Australian Football League. Led teams of 8 engineers delivering lakehouse migrations and multi-tenant architectures. Expert in Databricks, Snowflake, AWS, Azure, and GCP. Track record: 10x performance gains, greenfield CI/CD, and GDPR/HIPAA compliance.
Skills
Platforms: Databricks, Snowflake, Delta Lake, Azure Synapse, BigQuery
Cloud: AWS, Azure, GCP
Programming: Python, PySpark, SQL, Scala
Orchestration: Airflow, Delta Live Tables, Azure Data Factory, Matillion
DevOps: GitHub Actions, Jenkins, Terraform, Databricks Asset Bundles
Data Quality: Great Expectations, dbt, DLT Expectations
AI/ML: OpenAI API, LangChain, Vertex AI, Prophet, scikit-learn
Visualisation: Power BI, Tableau, Looker
Languages
English (Native), German (A2)

Languages

German: good
English: native

Project history

Lead Data Engineer

Aer Lingus

Automotive and vehicle manufacturing

5,000-10,000 employees

- Led 8 engineers migrating fragmented Oracle, Informatica, Snowflake, and Airflow landscape to unified Databricks medallion architecture
- Ingesting 100M+ daily events from reservations, departure control, baggage handling, flight telemetry, finance, and HR systems
- Built parameterized ingestion framework with Delta Live Tables cutting pipeline delivery from 3 weeks to 3 days
- Migrated 60+ Airflow DAGs to Databricks Workflows, eliminating orchestration tool sprawl and reducing maintenance overhead
- Implemented CI/CD framework from scratch using Databricks Asset Bundles and GitHub Actions, enabling automated testing and deployment across all environments
- Established standardized data quality and testing framework with unit tests, integration tests, and DLT expectations embedded into CI/CD pipeline
- Hackathon Winner 2025: Built AI-powered crew recovery system using LLMs and MCP protocols, reducing disruption response time from hours to minutes
- Reduced platform costs by €400K/year through platform consolidation, spot instances, auto-scaling clusters, and retiring legacy jobs
- Enforced GDPR compliance via Unity Catalog row/column security and dynamic PII masking on 50+ tables
- Tech: Databricks, Unity Catalog, Delta Live Tables, PySpark, AWS S3, Terraform, GitHub Actions

Senior Data Engineer

Johnson & Johnson

Pharma and medical technology

>10,000 employees

- Led 6 engineers replacing legacy SQL Server and SSIS pipelines with Azure Databricks lakehouse architecture
- Processing batch and near-real-time data from 25+ manufacturing lines across pharmaceutical and medical device production
- Built automated data quality framework with Great Expectations and DLT expectations, catching defects before they impact €45M+ product lines
- Architected fault-tolerant pipelines with circuit breaker patterns, dead-letter handling, and PagerDuty integration
- Implemented Jenkins CI/CD pipeline from scratch with Terraform IaC, eliminating manual deployments and reducing release cycles from days to hours
- Cut production incidents by 70% through automated testing gates and environment promotion workflows
- Achieved 35% average query performance improvement through Delta Lake migration, Z-ORDER clustering, and adaptive query execution
- Reduced annual platform costs by €280K through reserved instances, spot pools, and automated lifecycle policies
- Enforced HIPAA compliance via row-level security, dynamic column masking, and immutable audit logging
- Tech: Azure Databricks, Delta Lake, Azure Data Factory, PySpark, SQL Server, Jenkins, Terraform

Lead Data Scientist

FGS Global

Marketing, PR and design

1,000-5,000 employees

- Led 4 engineers building company's first centralized data platform on GCP, replacing fragmented spreadsheets and siloed manual processes
- Ingesting data from 20+ sources including Twitter/X API, NewsAPI, Salesforce CRM, and Google Analytics
- Designed dimensional model in BigQuery supporting campaign analytics, media monitoring, and sentiment tracking across 50+ Fortune 500 client accounts
- Built Airflow orchestration (Cloud Composer) managing 150+ daily DAGs across ingestion, dbt transformations, and ML inference
- Implemented dbt transformation layer with 200+ models, automated schema tests, and freshness SLAs
- Built NLP pipeline using GPT-3.5-turbo for sentiment and topic classification at scale and GPT-4 for high-value client deliverables, processing 50K+ documents daily with a 40% accuracy improvement over keyword baseline
- Implemented LangChain orchestration for prompt chaining, context management, and structured output parsing
- Deployed inference endpoints on Vertex AI with cost controls, rate limiting, and response caching
- Reduced analyst reporting workload by 60% through self-service Looker dashboards with row-level client isolation
- Tech: GCP, BigQuery, Airflow, dbt, Python, OpenAI API, Vertex AI, Looker

Senior Data Engineer

Australian Football League (AFL)

Other

5,000-10,000 employees

- Led 5 engineers building multi-tenant Snowflake lakehouse serving all 18 AFL clubs and league headquarters
- Ingesting data from GPS player tracking, Champion Data match statistics, injury management systems, and club finance/HR systems
- Designed Kimball dimensional model with 50+ tables spanning player performance, recruitment, salary cap, and medical analytics
- Implemented Snowflake row-level security and column masking, providing full club data isolation with shared league-wide aggregation layer
- Achieved 25% query performance gain through clustering keys, search optimization service, and query result caching
- Built 120+ Matillion ELT pipelines with incremental loading, schema drift detection, and freshness SLAs
- Implemented dbt transformation layer enforcing standardized KPIs and automated testing across all 18 club data marts
- Delivered parameterized Tableau dashboards enabling club-level self-service, reducing ad-hoc reporting by 70%
- Established league-wide data governance with metric definitions, data contracts, and full lineage documentation
- Tech: Snowflake, Matillion, dbt, Python, AWS S3, Tableau

Senior Data Engineer

Sydney Water

Energy, water and environment

1,000-5,000 employees

- Led 4 engineers refactoring legacy PySpark pipelines processing 10B+ records across metering, billing, and asset management domains
- Built ingestion layer extracting data from SAP ECC via SAP BODS, Oracle databases, and IoT metering systems into Delta Lake
- Delivered 10x batch performance improvement (4-5 h down to 20-30 min) through partition pruning, predicate pushdown, and broadcast joins on skewed datasets
- Tuned Spark cluster configurations, shuffle partitions, executor memory, serialization, and adaptive query execution
- Architected reusable SCD Type 2 framework using Delta Lake MERGE across 30+ dimensional tables with full audit history
- Replaced full-table refreshes with incremental processing patterns, reducing daily compute hours by 60%
- Built Great Expectations data quality suite with 500+ validations, automated profiling, and alerting
- Reduced annual platform costs by €180K through cluster rightsizing, spot instance pools, and off-peak job scheduling
- Tech: Azure Databricks, Delta Lake, PySpark, Azure Data Factory, SAP BODS, Great Expectations

Data Architect

Kinetic

Transport and logistics

1,000-5,000 employees

- Led 4 engineers migrating legacy SQL Server data warehouse to Azure Synapse for Melbourne Bus Franchise contract
- Ingesting 2M+ daily events from Myki ticketing gateways, GPS fleet telemetry, and driver rostering systems
- Designed Kimball dimensional model with 40+ tables spanning passenger journeys, route performance, driver compliance, and fleet maintenance
- Built Azure Data Factory pipelines with watermark-based incremental loading, reducing daily ETL runtime by 70%
- Implemented hash distribution and columnstore indexing in Synapse, achieving 50% query performance gain on billion-row fact tables
- Reduced infrastructure costs by $165K annually through dedicated/serverless workload balancing and reserved capacity
- Built Azure DevOps CI/CD framework with YAML pipelines, automated dacpac deployments, and gated environment promotion
- Reduced release cycles from weekly to daily, cutting production incidents by 60% through automated pre-deployment testing
- Delivered Power BI operational dashboards with near-real-time passenger loads, route delays, and driver shift compliance
- Company: Australia's largest private bus operator managing 3,000+ vehicles across Melbourne, Sydney, and Brisbane

Data Engineer

Mirvac

Banking and financial services

1,000-5,000 employees

- Led data migration from legacy on-premise Yardi Voyager to Snowflake for $2B residential and commercial property portfolio
- Ingesting data from Yardi property management, Salesforce CRM, HubSpot marketing automation, and building management systems
- Designed Kimball dimensional model with 30+ tables spanning property financials, tenant lifecycle, lease performance, and customer engagement
- Built Matillion ELT pipelines with 80+ transformations, CDC patterns, and automated schema drift detection
- Achieved 22% query performance improvement through clustering keys, search optimization service, and warehouse auto-scaling
- Replaced manual Excel reporting with automated Snowflake views, reducing finance team month-end workload by 40%
- Built customer segmentation model using k-means clustering and RFM scoring across 50K+ residential buyer and renter profiles
- Delivered targeted marketing segments increasing campaign conversion rates by 35% across email and digital channels
- Tech: Snowflake, Matillion, Python, Yardi Voyager, Salesforce, scikit-learn

Data Engineer

Longtail UX

Marketing, PR and design

50-250 employees

- Designed and built company's first centralized data warehouse on AWS S3 with Athena serverless query layer
- Ingesting data from 100+ Google Analytics properties, Google Search Console, SEMrush, and Ahrefs via scheduled API extractions
- Built serverless ETL pipelines using Lambda and Step Functions with CloudWatch triggers, eliminating manual data collection
- Designed star schema with 20+ tables spanning marketing attribution, channel performance, and keyword ranking analytics
- Implemented date-partitioned S3 storage with incremental loading, reducing Athena query costs by 40%
- Developed organic traffic forecasting models using Prophet and ARIMA, achieving 23% MAPE improvement over seasonal naive baseline
- Automated weekly client reporting via Lambda and SES, replacing 10+ hours of manual Excel consolidation per week

SAP Business Analyst

Bingo Industries

Energy, water and environment

1,000-5,000 employees

- Led data migration for SAP SuccessFactors full-suite implementation across Employee Central, Payroll, Compensation, and Recruiting
- Migrated 2,000+ employee records from legacy HRIS, SAP ECC HR, and Excel-based tracking systems with zero data loss
- Built SQL validation framework with 100+ rules ensuring referential integrity across employee, position, and org hierarchy data
- Reduced off-cycle payroll frequency from 30% to under 1% through upstream data quality controls and error prevention
- Designed data mapping and transformation specifications across 5+ legacy sources to SuccessFactors target schema
- Developed Excel/VBA automation for data cleansing and pre-migration validation, saving 20+ hours per migration cycle
- Conducted requirements workshops with HR, payroll, and finance stakeholders across 8 functional areas
- Managed payroll parallel runs validating 3 months of historical accuracy prior to go-live
- Delivered training for 50+ end users covering Employee Central, compensation workflows, and manager self-service
