Keynote: Kyle Kingsbury (Call me maybe project)

Kyle Kingsbury is the author of Riemann, Tesser, and Jepsen. At Stripe, he evaluates distributed systems safety by bashing them with network partitions and seeing what data falls out.


Jepsen V

We'll explore the safety and performance characteristics of distributed datastores under duress. When the network partitions, when nodes fail, when resources are scarce, how do databases recover? I've been running some new experiments, and would like to share my findings.

Salvatore Sanfilippo (Inventor of Redis)

Salvatore Sanfilippo is the lead developer of the Redis and Disque projects. He lives and codes in Sicily. He works for Pivotal, which sponsors Redis development.


Disque: a detailed overview of the distributed implementation

In this talk, Salvatore Sanfilippo will discuss the internals of his distributed queue, Disque. The talk will cover the general architecture of the system, the delivery guarantees it provides, its data replication, and finally the best-effort protocols it employs to avoid unnecessary multiple deliveries of the same message.

Charity Majors (Parse / Facebook)

Charity is a Production Engineering Manager at Facebook, building out the next generation mobile app platform at Parse. She likes free speech, free software and single-malt scotch.


Upgrade your database: without losing your data, your perf or your mind

Upgrading databases can be terrifying and perilous, and for good reason: you can totally screw yourself! Every workload is unique, and standardized test suites will never give you enough information to evaluate how an upgrade will perform for your query set. We will talk about how paranoid you should be about various types of workloads and upgrades, how to balance risk vs. engineering effort, and how to safely execute the most challenging upgrades by capturing and replaying real production workloads. The principles apply to any database, but we’ll go particularly deep into war stories and tooling options for MongoDB and MySQL.

Stefan Siprell (codecentric AG)

Stefan Siprell is the branch manager of codecentric Karlsruhe and responsible for Big Data, especially Spark, Cassandra and Elasticsearch, as well as Agile Software Engineering. He is a frequent speaker at various conferences such as Javaland, Data2Day and Continuous-Lifecycle.


Stream based textanalytics with Spark and Elasticsearch

This talk covers Apache Spark-based machine learning combined with Elasticsearch's search capabilities to analyse data and train a model. As a concrete example, a history of tweets will be analyzed to predict whether the tweeter is a soccer fan and which team is their favorite. Attendees will hear about the concepts of machine learning and Naive Bayes, text classification, and stream processing in particular.

co-speaker: Hendrik Saly (codecentric AG)

Hendrik Saly works for codecentric AG as an IT consultant and does a lot of Elasticsearch consulting and development. He is also involved in the Java Community Process (JCP) as an Expert Group Member of JSR 367 and JSR 374, and is an active Apache committer. He has already spoken at IX Conference, local meetups and customer events. Hendrik has been an IT professional since 2001 and has worked for Pixelpark AG, akquinet AG and PTA GmbH, and as a freelancer.


Arnaud Cogoluègnes (Zenika)

Arnaud Cogoluègnes is a software developer and author with deep expertise in middleware, software architecture, and Spring technologies. Arnaud spent a number of years developing complex business applications and integrating Java-based products. A Pivotal certified trainer for the Spring framework and RabbitMQ courses, Arnaud has trained hundreds of people around the world on these technologies and the Java platform. He has also co-authored several books, including Spring Batch in Action.


Microservices with Netflix OSS and Spring Cloud

Decomposing a system into microservices is not for the faint of heart. It’s not only about technical skills and tools, but, hey, that can help. Netflix OSS provides a bunch of battle-tested microservices-oriented components, but how can you use them easily, especially together, in harmony? This is where Spring Cloud comes in. It brings all the productivity and ease of the Spring stack to the world of microservices: configuration server, circuit breaker, service registry, client load balancing, all of this in a few lines of code. Come to this presentation to discover how to leverage microservices practices and tools in a couple of minutes!

Joe Nash (Braintree/PayPal)

Joe is a Developer Advocate at Braintree Payments, a PayPal Company. Having learnt the dark arts of FP at the University of Nottingham, Joe is passionate about functional techniques and their benefits to developers. He believes in the educational benefits of hackathons and hack culture, and supports student hackathons as part of the European team at Major League Hacking.


Clojure at Braintree: Real-time Data Pipeline with Kafka

Braintree is a payment processor, serving clients such as Uber, Airbnb and Mojang (Minecraft). Mostly a Ruby shop, Braintree chose Clojure as the core technology for the construction of a real-time data pipeline, built upon Kafka. This pipeline moves data from a short-term SQL-based storage solution to a long-term data warehouse. This talk will cover the reasons for choosing the technologies involved (Clojure, Kafka, Redshift), knowledge gained from building this particular part of the stack, and also the implications this somewhat radical technology shift has for future technology choices at the company.

Kai Davenport (ClusterHQ)

Kai works on the developer relations team at ClusterHQ - the creators of Flocker. He has been busy working on the Docker Volume Plugin and previously was hard at work on Powerstrip - a prototyping tool for Docker extensions. In a previous life Kai was developing educational software and has been developing web-based software for 15 years.


Running database containers using Marathon and Flocker

As microservices become more and more popular, we are encouraged to choose the right database for the job, resulting in an increase in the number of database processes in the cluster. Wouldn't it be great if we could use a Marathon manifest for our entire application, including these stateful database processes? The problem is that when a database process writes to disk, it turns that server into a pet where it was cattle before. This talk will introduce Flocker, talk about Docker plugins and finally demonstrate the two working together to achieve the seamless scheduling and migration of stateful database containers using Marathon.

Lena Wiese (Georg August University Göttingen)

Dr. Lena Wiese is head of the research group Knowledge Engineering and lecturer at the Georg August University Göttingen. She has been teaching advanced courses on data management and database technology for several years at both graduate and undergraduate level.


Replication and Synchronization Algorithms for Distributed Databases

This talk will provide in-depth background on strategies for replication and synchronization as implemented in modern distributed databases. The following topics will be covered: master-slave vs multi-master replication; epidemic protocols; two-phase commit vs Paxos; multiversion concurrency control; read and write quorums. A concise overview of implementations in current NoSQL databases will be presented.
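The read and write quorums mentioned in the abstract can be illustrated with a small sketch. This is not taken from the talk; it assumes the textbook setting of N replicas where a write is acknowledged by W of them and a read consults R of them, and the function name `every_read_sees_latest_write` is made up for this example. It checks by brute force that every possible read set shares at least one replica with every possible write set, which is what the rule R + W > N is meant to guarantee.

```python
from itertools import combinations


def every_read_sees_latest_write(n: int, r: int, w: int) -> bool:
    """Return True if every set of r replicas overlaps every set of w replicas.

    Hypothetical helper for illustration: n replicas, writes acknowledged by
    w of them, reads consulting r of them. An overlap means at least one
    replica in the read set holds the most recent acknowledged write.
    """
    replicas = range(n)
    for write_set in combinations(replicas, w):
        for read_set in combinations(replicas, r):
            # A disjoint pair means a read could miss the latest write.
            if not set(write_set) & set(read_set):
                return False
    return True


if __name__ == "__main__":
    for n, r, w in [(3, 2, 2), (3, 1, 1), (5, 3, 3), (5, 2, 3)]:
        print(n, r, w, every_read_sees_latest_write(n, r, w))
```

With three replicas, R = W = 2 always overlaps, while R = W = 1 does not; the exhaustive check agrees with R + W > N for every small configuration.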

Marcos Placona (Twilio)

Marcos Placona is a developer evangelist at Twilio, a company founded to disrupt telecommunications. He spends most of his time working with Java and .NET open source projects while equipping and inspiring developers to build killer applications. He’s also a great API enthusiast and believes they bring peace to the Software Engineering world.


Just queue it!

How many times have you had to get two different APIs to communicate with each other and been left wondering what the best way to get them talking was? XML? JSON? HTTP? Sound familiar? You have used service oriented architecture but your projects turned out “speaking different languages” and you’re now faced with the arduous task of being the translator. Life’s too short and #yolo! Although you may have opted for the most appropriate technology, the correct design pattern and the optimal algorithms, if you don’t get your applications talking correctly they will be as good as a plate of spaghetti. In this presentation I’ll show you the secret many companies have been using for years to be able to scale and respond to requests faster. I’ll show you a demo of a life-like application that consumes messages added to a queue and deals with them in a different process. I will tell you about some of the things you should look for when choosing your messaging system, and what to look out for when you start developing with it.

Mark Nadal (GUN)

Mark is a mathematician turned programmer, and has created two startups, one of which is a VC-backed open source company. He has traveled to over 20 countries and fallen in love with the unique cultures and food from around the world. Part of his passion for open source is related to the wonderful diversity he has experienced: without sharing and learning from others, progress cannot be ascertained.


Conflict Resolution with Guns

How do you survive failure? You distribute. But distributed systems are distressingly complicated, everything from Paxos to Raft. Can we do better? Can we simplify things while still achieving reasonable results? This talk will explore a new method for conflict resolution that has been in the works for 5 years of R&D, and promises to mathematically ease the pain. All work is open source and has been implemented in the GUN database, a VC-backed startup. We'll demonstrate some amazing recoveries from harsh failures and go into detail about how the fully peer-to-peer distributed algorithm works. We'll also discuss its weaknesses, where it fits in the CAP Theorem, and how it can be used as a building block for higher-level consistency guarantees. You'll come out of the talk with a fresh perspective on concurrency that will help you make better decisions for your company's tough distributed problems. Ones based on mathematics and science, not the hype of buzzwords and marketing.

Martin Esmann (Couchbase)

Before joining Couchbase in January 2015, Martin worked as a freelance consultant. Martin has broad knowledge of Microsoft technologies from his six years as a Developer Evangelist at Microsoft Denmark, where he worked with both academic and professional developers in areas such as Research, Azure, Mobile etc.


NoSQL's biggest lie: SQL never went away

NoSQL databases threw out SQL for querying, while their authors focused on solving problems of scale, speed and availability. The trouble is, the need for rich query never went away. Neither did SQL; it was only resting. Today, non-relational databases are bringing back SQL-like languages and other query mechanisms to help them integrate with existing data query layers (e.g. Hibernate) and to fit in with the overwhelming weight of database query practice of the past 40 years. In this talk I’ll cover how to query data in Couchbase using a SQL-like query language called N1QL (Nickel) from .NET. I will also touch on topics like installing, setup, hosting, cloud, Azure and performance. Demos will be in C#, but there are SDKs and APIs for all the major languages, so if you don’t have a .NET background you can still use what you learn in your favourite language.

Matti Palosuo (EA tracktwenty)

Matti Palosuo has a history with mobile game development since the early 2000s. Currently he's leading a server engineering team at EA tracktwenty studio in Helsinki, Finland - designing and developing highly scalable backend systems used by all the studio game releases. The first published title with this backend was SimCity BuildIt, a popular mobile version of the classic SimCity franchise. Before joining Electronic Arts, Matti was developing highly scalable backend architecture at Digital Chocolate.


SimCity BuildIt - Building Highly Scalable and Cost Efficient Server Architecture

This session will focus on how the SimCity BuildIt engineering team tackled the challenges of scalability, service availability, performance and cost efficiency. The speaker will present a high level view of the game server architecture and share learnings from the project. The SimCity BuildIt backend is based on a unique approach using Redis and MongoDB in liaison as the data storage. We'll take a look at how these NoSQL databases were used and how they work together to solve different scalability and availability related problems.

Michael Hackstein (ArangoDB)

Michael is a JavaScript and NoSQL enthusiast. In his spare time he organises colognejs, the JavaScript user group in Cologne, Germany, which meets every second month. Michael holds a master's degree in Computer Science. As Front End and Graph Specialist, he is a member of the ArangoDB core team. There he is a real full-stack developer, working on the web frontend including graph visualisation, the Foxx micro-service framework and core graph features.


NoSQL meets Microservices

Just a few years ago all software systems were designed to be monoliths running on a single big and powerful machine. But nowadays most companies desire to scale out instead of scaling up, because it is much easier to buy or rent a large cluster of commodity hardware than to get a single machine that is powerful enough. In the database area, scaling out is realized by utilizing a combination of polyglot persistence and sharding of data. On the application level, scaling out is realized by microservices. In this talk I will briefly introduce the concepts and ideas of microservices and discuss their benefits and drawbacks. Afterwards I will focus on the point of intersection of a microservice-based application talking to one or many NoSQL databases. We will try to find answers to these questions: Are there differences compared to a monolithic application? How to scale the whole system properly? What about polyglot persistence? Is there a data-centric way to split microservices?

Michael Hausenblas (Mesosphere Inc.)

Michael is a Datacenter Application Architect with Mesosphere. He helps devops to build and operate scalable & elastic distributed applications. His background is in large-scale data integration, Hadoop & NoSQL, IoT, as well as Web applications and he's experienced in advocacy and standardization. Michael is contributing to open source software at Apache (Myriad, Drill) and shares his experience with the Datacenter OS and large-scale data processing through blog posts and public speaking engagements.


Containers! Containers! Containers! And Now?

Containers are all the hype, and rightly so. But what do you do after you've built your Docker images? Are you going all-in concerning microservices? What about existing workloads you can't or don't want to containerize? We will discuss options and tooling to address these and related questions.

Phil Calcado (SoundCloud)

Phil is the Director of Core Engineering at SoundCloud, and his team is responsible for “keeping the trains running” in our microservices architecture. Before that he was the Director of Product Engineering at SoundCloud, and before joining SoundCloud he was a Lead Consultant for ThoughtWorks in Australia and the UK.


No Free Lunch, Indeed: Three Years of Microservices at SoundCloud

SoundCloud is the largest repository of audio on the web, used by more than 200 million people every month, who upload more than 11 hours of audio every minute. Like so many others, we have migrated from a typical monolithic architecture to microservices. While the benefits brought by this style of SOA to our productivity and reliability are clear, the architecture required some non-obvious changes in the way we operate systems, and a way to tackle the overhead associated with having hundreds of small moving parts to serve every request. In this talk we’ll share the toolkit and strategy SoundCloud uses to keep its microservices explosion manageable. What do we do about the operations overhead? How to spread devops skills across teams to support the “you build it, you run it” vision? How to deal with breaking changes and asynchronous behaviours? How to deal with chatty interactions? Which protocol? How do I even get a diagram telling me how all this stuff is put together?

Philipp Krenn (ecosio)

Philipp Krenn runs everything database related and the cloud infrastructure of the Vienna-based B2B startup ecosio. When not fighting MongoDB, MySQL, Jenkins, or AWS, he gives NoSQL and cloud computing trainings or organizes his meetups ViennaDB and Papers We Love Vienna.


A tale of queues — from ActiveMQ over Hazelcast to Disque

After all the attention databases have been getting over the last years, it is high time to give some thought to queues. We will kick off with some considerations why you need queues in distributed systems and what their limitations are, in particular the at-least-once and at-most-once decision. Next we discuss our specific use case and why
* we started off with ActiveMQ,
* it's working ok for us,
* we are looking for a better solution.
While looking for a better solution, we considered Amazon SQS and RabbitMQ, but finally selected Hazelcast, which seemed to do everything for us. After the initial phase of enchantment, we came to realize that Hazelcast is actually not the right tool for us and why we do not want to fully rely on it. Luckily, Disque has just been released and looks really promising. And we have already started migrating to it, even though it's currently marked as alpha code.
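The at-least-once versus at-most-once decision mentioned above comes down to whether a consumer acknowledges a message before or after doing the work. The following is a minimal sketch under stated assumptions: `InMemoryQueue` and `consume` are hypothetical names for a toy in-memory queue, not the API of ActiveMQ, Hazelcast, Disque or any other product, and the crash is simulated by skipping the second step once.

```python
class InMemoryQueue:
    """Toy queue (illustrative only): a message stays pending until acked."""

    def __init__(self, messages):
        self.pending = list(messages)

    def peek(self):
        # Deliver the oldest unacknowledged message, or None if empty.
        return self.pending[0] if self.pending else None

    def ack(self, message):
        # Acknowledge: the queue forgets the message for good.
        self.pending.remove(message)


def consume(queue, ack_first, crash_on):
    """Consume all messages, crashing once between the two steps on `crash_on`.

    ack_first=True  -> ack, then process (at-most-once: work can be lost)
    ack_first=False -> process, then ack (at-least-once: work can repeat)
    Returns the list of messages whose processing ran.
    """
    processed = []
    crashed = False
    while (message := queue.peek()) is not None:
        steps = [queue.ack, processed.append]
        if not ack_first:
            steps.reverse()
        steps[0](message)
        if message == crash_on and not crashed:
            crashed = True
            continue  # simulated crash; the consumer restarts and polls again
        steps[1](message)
    return processed


if __name__ == "__main__":
    print(consume(InMemoryQueue(["a", "b", "c"]), ack_first=True, crash_on="b"))
    print(consume(InMemoryQueue(["a", "b", "c"]), ack_first=False, crash_on="b"))
```

Acknowledging first loses message "b" when the consumer crashes; acknowledging last processes "b" twice, which is why at-least-once consumers are usually written to be idempotent.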

Uwe Friedrichsen (codecentric AG)

Uwe Friedrichsen has travelled the IT world for many years. As a fellow of codecentric AG he is always in search of innovative ideas and concepts. His current focus areas are resilience, scalability and the IT of (the day after) tomorrow. Often, you can find him at conferences sharing his ideas, or as author of articles, blog posts, tweets and more.


Microservices - stress-free and without increased heart-attack risk

A microservice is written quickly: Reasonable scope, a small REST interface, nice and easy and way cooler than those fat web applications we did before. But, is it really that easy? Well - no, not really! A single service is quite easy to manage, but that does not make the overall complexity go away. Instead of a few big web applications we now have lots of microservices - and to make sure that integration, operations and maintenance will not become a lottery game with increased heart-attack risk, it is crucial to consider a few things that were not (so) important for traditional web applications. Should I use REST or would event driven be the better choice? How can I make sure the service collaboration works as desired? With GUI or better without GUI? How can I guarantee availability and scalability in production? How to deploy best? How can I make sure that services are easily replaceable? How can I avoid service spaghetti? Those and many more questions will be answered in this session - to make sure the encounter with microservices will not become a health risk.

Program Committee

Stefan Edlich

Prof. Dr. Stefan Edlich is a senior lecturer at Beuth University of Applied Sciences Berlin. He wrote the world’s first NoSQL books and twelve other IT books for publishers such as Apress, O’Reilly, Spektrum/Elsevier, Hanser and others. Additionally he runs nosql-database.org, organizes Big Data and Clojure related events, and is a founding member of the Data Science Lab at Beuth University. Since 2014 he has been a member of BBDC, the national competence center for Big Data. The variety of topics that surrounds the work of Stefan Edlich makes him the perfect candidate to chair the distributed matters conference program committee.

Carl Azoury

A graduate of the National Agronomic Institute Paris-Grignon (NAI-PG), Carl Azoury specialized in computer science in his third year and in 1996 joined Ingenia, a leader in object-oriented programming and Smalltalk. In 1999 he joined Sysdeo, where he spent seven years as a technical architect working on the new Java technologies. In 2006, he founded Zenika with the ambition of creating a company in which he would like to work as an IT consultant. After a first year in London as a technical architect, he took on a succession of roles as Zenika grew (consultant, trainer, HR, finance, marketing, sales, and strategic partnerships) and introduced in France innovative concepts he had discovered during his year in London. Today CEO of Zenika (Chief Enabler Officer, by his own definition of a president), Carl Azoury is deeply interested in entrepreneurship and corporate culture. As such, he meets other entrepreneurs via startup programs, the CCIP and the competitiveness cluster System@tic.

Frank Celler

As head of Dr. Celler Cologne Lectures, Frank Celler is the host of the NoSQL matters conference series as well as of the NoSQL Cologne User Group. He has been working in the software business for 20 years and entered the world of NoSQL more than 13 years ago. Working for different companies, he discovered the potential of high-performance databases early on. Today he is passionate about promoting the importance of NoSQL to the world. Together with Stefan Edlich and Marc Planagumà, he chairs the NoSQL matters 2014 program committee to select the finest talks for the conference’s agenda.

Olaf Bachmann

Olaf Bachmann is Engineering Director at Google, responsible for Ads Traffic Quality. He has learned the hard way that efficient analysis of really BIG data is key to staying ahead in the arms race with the spammers. Unfortunately, the standard Google tools were not powerful enough for those needs. So, over the years, his teams set out to build their own highly distributed data-analysis tools that pushed the boundaries of efficiency and usability by orders of magnitude.