talk-advanced-data-engineering-patterns-with-airflow

https://youtu.be/23_1WlxGGM4

At 9:00 configuration as code

At 12:20 A/B testing

At 14:00 there is a thing called AutoDAG, which allowed people at Airbnb to set up SQL that would run every day very easily, without having to create their own pipelines. It is a plugin for Airflow.
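
A rough sketch of the kind of DAG an AutoDAG-style plugin might generate (my own illustration, not the actual plugin; the DAG id, table names and the user's SQL are made up, and the exact DAG arguments differ between Airflow versions):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

# Hypothetical user-supplied SQL; with AutoDAG this would come from a form or
# config, not from hand-written pipeline code.
USER_SQL = (
    "INSERT OVERWRITE TABLE metrics.daily_signups "
    "SELECT ds, count(*) FROM core.signups GROUP BY ds"
)

with DAG(
    dag_id="autodag_daily_signups",   # generated per user request
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                # "schedule_interval" in older Airflow versions
    catchup=False,
) as dag:
    # Run the query through the Hive CLI; a real setup would use a Hive operator/hook.
    run_user_sql = BashOperator(
        task_id="run_user_sql",
        bash_command=f"hive -e '{USER_SQL}'",
    )
```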

At 22:50 he talks about a metrics repository, confidence intervals and p-values, and how the data can be exported into MySQL.
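
To make that concrete, a small sketch (not Airbnb's code) of computing a lift estimate with a 95% confidence interval and a two-sided p-value for one experiment metric, then exporting the row to MySQL; the counts, table name and connection string are placeholders:

```python
import math

import pandas as pd
from scipy import stats
from sqlalchemy import create_engine

# Hypothetical experiment counts: conversions and sample sizes per group.
control_conversions, control_n = 480, 10_000
treatment_conversions, treatment_n = 540, 10_000

p_c = control_conversions / control_n
p_t = treatment_conversions / treatment_n
diff = p_t - p_c

# Standard error of the difference in proportions, 95% CI, two-sided p-value.
se = math.sqrt(p_c * (1 - p_c) / control_n + p_t * (1 - p_t) / treatment_n)
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se
z = diff / se
p_value = 2 * (1 - stats.norm.cdf(abs(z)))

result = pd.DataFrame([{
    "metric": "conversion_rate",
    "lift": diff,
    "ci_low": ci_low,
    "ci_high": ci_high,
    "p_value": p_value,
}])

# Export the result into MySQL (placeholder connection string).
engine = create_engine("mysql+pymysql://user:password@localhost/experiments")
result.to_sql("metric_results", engine, if_exists="append", index=False)
```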

At 23:10 defining metrics. Each metric has owners.
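
One way such metric definitions with owners could look as configuration-as-code (illustrative only; the field names and the example metric are my assumptions, not Airbnb's actual schema):

```python
from dataclasses import dataclass

@dataclass
class MetricDefinition:
    name: str               # canonical metric name
    sql: str                # query or expression that defines the metric
    owner: str              # person or team accountable for the definition
    description: str = ""

METRICS = [
    MetricDefinition(
        name="nights_booked",
        sql="SELECT ds, count(*) AS nights_booked FROM core.bookings GROUP BY ds",
        owner="data-platform@company.example",
        description="Total nights booked per day.",
    ),
]
```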

At 26:50 the ML feature repository is the centralized place for storing all features that are used for machine learning experiments. It is comparable to how the data warehouse is the centralized place for data used in analytics in an organization.

At 27:30 a given version of an ML model has been trained on a specific version of the data in the ML feature repository.

At 29:40 the stats daemon is an Airbnb project where, when the metadata database associated with Hive indicates that a new partition has been loaded, the stats daemon runs and gathers statistics like min/max values per column, cardinality, etc.
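
A toy sketch of the kind of query such a daemon could fire off for a newly loaded partition, gathering min/max and cardinality per column in a single pass (table, partition and column names are hypothetical; this is not the actual Airbnb project):

```python
def build_stats_query(table: str, partition: str, columns: list[str]) -> str:
    """Build one aggregate query that gathers per-column statistics for a partition."""
    exprs = []
    for col in columns:
        exprs.append(f"MIN({col}) AS {col}_min")
        exprs.append(f"MAX({col}) AS {col}_max")
        exprs.append(f"COUNT(DISTINCT {col}) AS {col}_cardinality")
    return (
        f"SELECT COUNT(*) AS row_count, {', '.join(exprs)} "
        f"FROM {table} WHERE ds = '{partition}'"
    )

print(build_stats_query("core.bookings", "2024-01-01", ["price", "listing_id"]))
```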

At 30:30 the statistics turn out to be very useful for capacity planning, data quality monitoring, and all sorts of debugging. It is useful to have these metrics ahead of time, rather than needing to run a bunch of big queries to gather them. They are also very useful for anomaly detection and that sort of thing.

At 38:15 a Google project called Facets that helps you explore your data.

People

person-maxime-beauchemin