Centrifuge : Data quality in Spark without the costs! (CurryOn 2017 - Curry On Talks)

Sun 18 - Fri 23 June 2017 Barcelona, Spain

co-located with PLDI, ECOOP, Curry On, DEBS, LCTES and ISMM

Track

CurryOn 2017 Curry On Talks

Time Zone

The program is currently displayed in (GMT+02:00) Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna.
When

Mon 19 Jun 2017, 11:20 - 12:00, at Sala Agora (session: Monday, 10:30 - 12:50, Sala Agora)

Abstract

Data quality is a growing concern in Big Data, as more and more bugs stem from poor-quality data. However, data quality efforts tend to come second, and often too late. In this talk, we will apply algebraic abstraction to the composition of data pipelines, resulting in inlined, unified, and performant data quality checks. We will see how these techniques can be used to find different classes of bugs in pipelines and make “same day delivery” possible in production-critical projects.
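
To make the idea of composing pipeline steps with inlined quality checks more concrete, here is a minimal sketch in plain Python. It is not the Centrifuge API and does not use Spark; the names `Checked`, `and_then`, `parse_age`, `clamp_age` and `process` are hypothetical and chosen only for illustration. The assumption is that each step returns its result together with the quality issues it noticed, and that issues combine by concatenation (a monoid), so composing steps also composes their checks.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Optional, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass(frozen=True)
class Checked(Generic[A]):
    """A value (possibly missing) plus the quality issues recorded so far."""
    value: Optional[A]
    issues: Tuple[str, ...] = ()

    def and_then(self, step: Callable[[A], "Checked[B]"]) -> "Checked[B]":
        # With no value left, skip the step but keep the recorded issues.
        if self.value is None:
            return Checked(None, self.issues)
        nxt = step(self.value)
        # Issues accumulate by concatenation, in pipeline order.
        return Checked(nxt.value, self.issues + nxt.issues)


def parse_age(raw: str) -> Checked[int]:
    # Hypothetical step: parsing doubles as a quality check.
    try:
        return Checked(int(raw.strip()))
    except ValueError:
        return Checked(None, (f"age not an integer: {raw!r}",))


def clamp_age(age: int) -> Checked[int]:
    # Hypothetical step: repair the value and record why it was repaired.
    if age < 0:
        return Checked(0, (f"negative age {age} replaced by 0",))
    return Checked(age)


def process(raw: str) -> Checked[int]:
    # The pipeline is ordinary composition; checks travel with the data.
    return Checked(raw).and_then(parse_age).and_then(clamp_age)


rows = ["42", " -3", "abc"]
results = [process(r) for r in rows]
for raw, res in zip(rows, results):
    print(repr(raw), res.value, list(res.issues))
```

In this sketch a bad row does not abort the pipeline: it carries its own explanation along, so the issues can be counted or reported afterwards without a separate validation pass over the data.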

Session Program

Mon 19 Jun
Displayed time zone: Amsterdam, Berlin, Bern, Rome, Stockholm, Vienna

10:30 - 12:50	Curry On Talks at Sala Agora

10:30	40m	Talk	Impromptu: A Lightweight, dependently-typed async framework for Scala	Jon Pretty (Propensive Ltd)
11:20	40m	Talk	Centrifuge : Data quality in Spark without the costs!	Jonathan Winandy (Univalence)
12:10	40m	Talk	Angelina Ballerina Learns About Memory Allocation	Allison McMillan (Collective Idea)

Jonathan Winandy

Univalence