I'm Philipp Grulich

Computer Scientist - Big Data Software Engineer


I’m Philipp Grulich, a second-year computer science Master’s student at the Technische Universität Berlin, specializing in big data analytics systems. Currently, I’m spending two quarters as an exchange student at UC Santa Cruz. Alongside my studies, I have worked for several companies and gained experience in frontend and backend software development. In my bachelor thesis, I evaluated Apache Spark Streaming as the foundation of an IoT platform. At the TUB, I joined a research project on streaming systems involving Apache Flink as a research assistant at the German Research Center for Artificial Intelligence (DFKI).


Apache Flink
Distributed Systems
Database Systems
Apache Spark Streaming
Spring MVC
Apache Calcite
Apache Zeppelin
Deep Learning

Work Experience:

Research Assistant
Oct. 2016 - Now
Data Engineer Student Trainee
Apr. 2016 - July 2016
  • Data analytics with Python and pandas
  • Evaluated Apache Spark for several transaction analysis use cases
Software Developer Student Trainee
June 2014 - June 2015
  • Contributed to an AngularJS application for low-end embedded devices
  • Performance profiling with Chrome DevTools
Apprenticeship as a Computer Science Expert in Software Development
Aug. 2010 - Jan. 2013
  • Developed several internal applications with Java RCP and Spring
  • Developed a crowdsourcing platform to improve product data quality based on Amazon Mechanical Turk


Efficient real time Object Detection for Edge-Cloud systems
I carried out this project together with Prof. Faisal Nawab during my exchange at UC Santa Cruz. We evaluated different techniques to minimize the bandwidth consumption between the Edge and Cloud nodes in the context of object detection with deep neural networks. For this, I used the YOLO model and modified its Darknet implementation.
Deep Shirt
For the CruzHacks Hackathon, we created a novel way to express creativity in designing unique t-shirts by intelligently transforming custom pictures using machine learning. We used deep-learning-based style creation to generate stylish t-shirts that are unique to each user. To build this, we used TensorFlow, React and Firebase. See our YouTube clip or GitHub for more details.
1st place for Google Cloud Platform usage, 2nd place in the category Innovation, 2nd place Project YX Fashion Prize, 1st place from the Santa Cruz Accelerator
Efficient Window Aggregation for out-of-order Stream Processing on Apache Flink
In this project, we implemented a prototype of slicing window aggregations for out-of-order data streams on Apache Flink. Our implementation achieves a higher and more consistent throughput in comparison to the current state of the art and to Flink's own implementation. For further details, see our publication [3].
I²: Interactive Real-Time Visualization for Streaming Data
With I², we developed an interactive development environment based on Apache Zeppelin that coordinates running cluster applications and corresponding visualizations. With this, we can offload computation to the cluster and enable stable, high frame rates in the browser. For further details, check out the GitHub project, our presentation from Flink Forward 2017 or the EDBT 2017 publication [4].
EDBT 2017 Best Demo Award
Bachelor Thesis: Scalable real-time processing with Spark Streaming
In my bachelor thesis, I evaluated Spark Streaming as a foundation for a smart car platform. I mainly evaluated its scalability in combination with the Kappa architecture. For further details, see the thesis (in German).
I created an Android multiplayer version of the popular German pen-and-paper game "Stadt-Land-Fluss". Within a short time, it was downloaded more than 50,000 times in the Google Play Store. The app is no longer maintained.


Master Computer Science
Apr. 2016 - Now
Exchange Student - GPA 4.0
Sep. 2017 - Mar. 2018
Bachelor Computer Science - Very Good - 1.28
Mar. 2013 - Mar. 2016


  1. VLDB 2018
    Collaborative Edge and Cloud Neural Networks for Real-Time Video Processing
    Philipp Marian Grulich, Faisal Nawab
    In 44th International Conference on Very Large Data Bases (VLDB), 2018, to appear.
  2. EDBT 2018
    Scalable Detection of Concept Drifts on Data Streams with Parallel Adaptive Windowing
    Philipp Marian Grulich, René Saitenmacher, Jonas Traub, Sebastian Breß, Tilmann Rabl, Volker Markl.
    In International Conference on Extending Database Technology (EDBT), 2018.
  3. ICDE 2018
    Scotty: Efficient Window Aggregation for out-of-order Stream Processing
    Jonas Traub, Philipp Marian Grulich, Alejandro Rodríguez Cuéllar, Sebastian Breß, Asterios Katsifodimos, Tilmann Rabl, Volker Markl.
    In 34th IEEE International Conference on Data Engineering (ICDE), 2018.
  4. EDBT 2017
    I²: Interactive Real-Time Visualization for Streaming Data.
    Jonas Traub, Nikolaas Steenbergen, Philipp M Grulich, Tilmann Rabl, and Volker Markl.
    Proceedings of the 20th International Conference on Extending Database Technology (EDBT'17), March 21-24, 2017, Venice, Italy.
  5. EDBT 2017
    STREAMLINE-Streamlined analysis of data at rest and data in motion
    Philipp Marian Grulich, Tilmann Rabl, Volker Markl, Csaba István Sidló, Andras Benczur
    Proceedings of the 20th International Conference on Extending Database Technology (EDBT'17), March 21-24, 2017, Venice, Italy.
  6. SmartData 2017
    Smart Stream-Based Car Information Systems that Scale: An Experimental Evaluation.
    Philipp Marian Grulich, Olaf Zukunft.
    Proceedings of the 3rd IEEE International Conference on Smart Data (SmartData 2017), June 21-23, 2017, Exeter, UK.
  7. Innovate-Data 2017
    Bringing Big Data into the Car: Does it scale?
    Philipp Marian Grulich, Olaf Zukunft.
    Proceedings of the 3rd International Conference on Big Data Innovations and Applications (Innovate-Data 2017), Aug. 21-23, 2017, Prague, Czech Republic.