Marek Grzenkowicz | Devoxx

Marek Grzenkowicz

From Roche Global IT Solutions

Marek Grzenkowicz uses Hadoop and Python to build BI solutions for Big Data. He works with datasets ranging from hyper-structured machine data to unstructured text, and he adapts his toolkit to the use case: he is comfortable with the Cloudera Hadoop ecosystem, but he also develops custom solutions to run natural language processing algorithms at scale.

He started working for Roche Global IT Solutions 5 years ago as an ETL specialist and became a full-stack Business Intelligence developer over time.

All in all, he has more than 10 years of IT experience, with different technologies (VB6, .NET, SharePoint, SQL Server, PowerCenter, Tableau) and in different positions (developer, administrator, team lead).

Blog: http://it.roche.pl/

Big Data & Analytics

Data warehousing on Hadoop - after a few months in production

BOF (Birds of a Feather)

Medical laboratory instruments produce immense volumes of log files. Until recently, these logs were used only for troubleshooting and maintenance. However, hidden inside are insights that could allow laboratory managers to streamline and optimize the diagnostic process.

The goal of the StraDa project is to gather the log files from hundreds of Roche diagnostic instruments spread around the world, transform these terabytes of data into actionable information, and make it available to business users.

Yet another data warehouse? Sounds reasonably easy? Well, so we thought. And we were wrong.

Please come to learn about some of the mistakes we made and problems we encountered, so you can avoid them.

I will have slides, but I don’t need to follow them. Ask questions! Challenge me! Share your own experience!

Big Data & Analytics

Data warehousing on Hadoop - one important DON'T and a few DOs

Quickie

At the same time, you can have your lunch and learn why you should attend my evening BOF session. Does it sound like a good deal?

The brave new world of Big Data has been around for a while and its tools have been successfully applied to solve different problems. But is it a silver bullet? Is it really completely new? Can you forget the old truths of design, architecture and project management?

The goal of the StraDa project is to gather the log files from hundreds of Roche diagnostic instruments spread around the world and load these terabytes of data into a data warehouse.

We built it, but it wasn't a straightforward task.

Please come to learn what our biggest mistake was, and then come back at 19:30 for a full-blown session.