Lessons from a Big Data Machine Learning Startup
I co-founded H2O and left it about 5 years later, roughly 3 years ago; it is still a going concern. Along the way we pivoted and rebuilt the entire code base from scratch 3 times. We delivered the fastest K/V store on the planet which is also exact & consistent, parallelized & distributed, with ML algorithms and a clicky-clicky GUI for doing hard-core data science. We ran R on TBs of data and Logistic Regression on 7 TB in minutes. We broke stuff in amazing ways: crash-on-1st-use in front of Netflix. We raised money at least 3 times, hired and fired (your first fire is always traumatic!). We discovered it’s really hard to make a platform company, much much harder than making a platform. This is less a talk and more war stories from my H2O days.
Cliff Click was the CTO of Neurensic, and CTO and Co-Founder of h2o.ai (formerly 0xdata), a firm dedicated to creating a new way to think about web-scale data storage and real-time analytics. I wrote my first compiler when I was 15 (Pascal to TRS Z-80!), although my most famous compiler is the HotSpot Server Compiler (the Sea of Nodes IR). I helped Azul Systems build an 864-core pure-Java mainframe that keeps GC pauses on 500GB heaps to under 10ms, and worked on all aspects of that JVM. Before that I worked on HotSpot at Sun Microsystems, and am at least partially responsible for bringing Java into the mainstream.
Previously I was with Motorola, where I helped deliver industry-leading SpecInt2000 scores on PowerPC chips, and before that I researched compiler technology at HP Labs. I am regularly invited to speak at industry and academic conferences including JavaOne, JVM, ECOOP and VEE; I’ve served on the Program Committee of many conferences (including PLDI and OOPSLA); and I have published many papers about HotSpot technology. I hold a PhD in Computer Science from Rice University and about 20 patents.