Research

This page is dedicated to my research and that of my group. My academic profile, including a CV and a list of grants (projects), can also be found via my ZHAW profile page. A preserved version of my research website at the University of Marburg can be found at my Philipps University profile page.

TOC

  1. The group
  2. Research examples
  3. Collaborations
  4. Publications

The group

Team

The Computer Vision, Perception and Cognition Group at the ZHAW Centre for Artificial Intelligence, 2020 (top-left to bottom-right): Pascal Sager, Thilo Stadelmann, Claude Lehmann, Javier Montoya, Lukas Tuggener, Raphael Emberger, Ricardo Chavarriaga, Frank-Peter Schilling and Mohammadreza Amirian.

My Computer Vision, Perception and Cognition Group performs pattern recognition research, working on a wide variety of tasks on image, audio or generally “signal” data. As a group, we focus on deep learning and reinforcement learning methodology, inspired by biological learning. Each task that we study has its own learning target (e.g., detection, classification, clustering, segmentation, novelty detection, control) and use case (e.g., predictive maintenance, speaker diarization for multimedia indexing, document analysis, optical music recognition, computer vision for industrial quality control, automated machine learning, deep reinforcement learning for automated game play or building control), which in turn sheds light on different aspects of the learning process.

The group has very diverse backgrounds, which helps us complement each other’s skills:

  • Prof. Dr. Thilo Stadelmann, team lead: Diplom (THM) & PhD (U. Marburg) in computer science; I come from the tradition of the field of artificial intelligence, and my tool is machine learning. My research interest is to discover and understand how smart behaviour can be evoked through self-learning, including practical (robustness, generalization) and ethical (algorithmic bias) side effects.

  • Dr. Ricardo Chavarriaga, senior researcher & head of CLAIRE office Zurich: B.Sc. electronics engineering (Pontificia Universidad Javeriana, Colombia), graduate school in computer/communication/information science (EPFL), PhD neuroscience (EPFL), Postdoc neuroscience/brain-computer interfaces (EPFL & IDIAP)

  • Lukas Tuggener, PhD student: B.Sc. industrial engineering (ZHAW), M.Sc. statistics (ETH), doctorate in conjunction with Juergen Schmidhuber of the University of Lugano / Switzerland

  • Mohammadreza Amirian, PhD student: B.Sc. (U. Shiraz) & M.Sc. (U. Ulm) in electrical engineering, doctorate in conjunction with Friedhelm Schwenker of Ulm University / Germany

  • Waqar Ali, PhD student: M.Sc. (CU Islamabad) in computer science, doctorate in conjunction with Marcello Pelillo of the University of Venice / Italy

  • Peng Yan, PhD student: B.Sc. measuring & control technology (Jianghan U.) & M.Sc. control engineering (Xiamen U.), doctorate in conjunction with Benjamin F. Grewe of the University of Zurich / Switzerland

  • Pascal Sager, research assistant & M.Sc. student: B.Sc. computer science (ZHAW)

  • Raphael Emberger, research assistant & M.Sc. student: B.Sc. computer science (ZHAW)

  • Paul Luley, research assistant & M.Sc. student at UZH: B.Sc. biomedical engineering (Technikum Vienna)

  • Benjamin Meyer, research assistant & M.Sc. student at U Basel: B.Sc. Information Technology (FHNW)

  • Sebastian Salzmann, M.Sc. student: B.Sc. biology (ETH) & M.Sc. molecular health sciences (ETH)

  • Sydney M. Nguyen, M.Sc. student: B.Sc. in computer science (ZHAW)

  • Livia Luescher, M.Sc. student: B.Sc. in business administration with honors (HWZ Zurich)

Our Alumni are:

  • Adhiraj Ghosh, B.Tech. (Manipal Institute of Technology), research intern 2021/22 (ZHAW), now M.Sc. student at Tübingen University

  • Claude Olivier Lehmann, M.Sc. and B.Sc. (ZHAW), now PhD student with Prof. Kurt Stockinger of ZHAW InIT

  • Dr. Frank-Peter Schilling, Diplom & PhD (U. Heidelberg) in physics, PostDoc at CERN, now senior lecturer at ZHAW Centre for AI

  • Dr. Javier Montoya, B.Sc. computer engineering (UCSP, Arequipa, Peru), M.Sc. computer science (Unicamp, Sao Paulo, Brazil), M.Sc. computer science (ENSIMAG, Grenoble, France), PhD computer vision and machine learning (ETHZ), now senior lecturer with Lucerne University of Applied Sciences

  • Stefan Huschauer: B.Sc. computer science (ETH & ZHAW), M.Sc. (ZHAW), now with Noimos, an AXA insuretech company

  • Daniel Neururer, B.Sc. in systems engineering (ZHAW), M.Sc. 2020 (ZHAW), now software/ML engineer with Prof. Cieliebak of ZHAW

  • Dr. Ismail Elezi, B.Sc. (U. Pristina), M.Sc. (Ca’Foscari) & PhD (Ca’Foscari & ZHAW, July 2020) in computer science, now PostDoc with Prof. Laura Leal-Taixé of TU Munich

  • Yvan Satyawan, B.Sc. computer science (U. Freiburg, Germany), now pursuing a M.Sc. degree in Denmark

  • Katharina Rombach, B.Sc. & M.Sc. (ETH Zurich; admitted to EDIC and ERDS doctoral schools at EPFL), now PhD student with Prof. Olga Fink of ETH Zurich

  • Reza Kakooee, M.Sc. (U. Mashhad), now PhD student with Prof. Marc Pouly of Lucerne University of Applied Sciences

  • Benjamin Bruno Meier, M.Sc. (ZHAW; Hirschmann scholarship holder & Dr. Waldemar Jucker laureate), now Software Engineer at Argus Data Insights Schweiz AG

  • Gabriel Eyyi, M.Sc. (ZHAW), now Software Engineer / Machine Learning Engineer at dizmo AG

  • Thierry Musy, B.Sc. (ZHAW), now Partner / Senior Data Scientist at Foursight Digital AG

  • Jan Stampfli, B.Sc. (ZHAW), now Big Data Engineer at Migros-Genossenschafts-Bund

 

 

Research examples

  1. Robust and practical deep learning
  2. Industrial Computer Vision
  3. Medical Imaging
  4. More General AI
  5. Trustworthy AI
  6. Learning to learn
  7. Document Recognition
  8. Voice recognition
  9. Data science

Robust and practical deep learning

Examples of automatic segmentation

Newspaper article segmentation architecture

Deep learning has long reached the point of practical applicability in solving day-to-day tasks in many non-AI businesses, for instance manufacturing SMEs. Specific challenges arise and are tackled in our applied research projects, ranging from data quality and quantity issues to higher requirements on robustness and resilience of the models. For instance, segmenting newspaper pages into articles that semantically belong together is a necessary prerequisite for article-based information retrieval on print media collections such as archives and libraries. It is challenging due to vastly differing layouts of papers, various content types and different languages, but commercially very relevant, e.g. for media monitoring.

Here, we have developed a semantic segmentation approach based on the visual appearance of each page. We apply a fully convolutional neural network (FCN) that we train in an end-to-end fashion to transform the input image into a segmentation mask in one pass. We show experimentally that the FCN performs very well: it outperforms a deep learning-based commercial solution by a large margin in terms of segmentation quality while in addition being computationally two orders of magnitude more efficient. The whole system is trained with only 5,500 images, of which fewer than 500 are fully labeled.

Detecting adversarial examples using local spatial entropy on feature response maps

At the same time, the existence of adversarial attacks on convolutional neural networks (CNN) questions the fitness of such models for serious applications. Such attacks manipulate an input image such that misclassification is evoked while still looking normal to a human observer - they are thus not easily detectable. In a different context, backpropagated activations of CNN hidden layers - “feature responses” to a given input - have been helpful to visualize for a human “debugger” what the CNN “looks at” while computing its output. We have proposed a novel detection method for adversarial examples to prevent attacks. We do so by tracking adversarial perturbations in feature responses, allowing for automatic detection using average local spatial entropy. The method does not alter the original network architecture and is fully human-interpretable. Experiments confirm the validity of our approach for state-of-the-art attacks on large-scale models trained on ImageNet.
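To make the notion of average local spatial entropy more concrete, the following is a minimal sketch in plain Python. It is not the implementation from our paper: the patch size, the number of histogram bins, the function names and the toy maps are all assumptions chosen for illustration. The sketch computes the Shannon entropy of the value histogram in every small window of a 2D response map and averages these values, so that a focused response yields a lower score than a scattered one.

```python
import math
from collections import Counter


def patch_entropy(values, bins=8):
    """Shannon entropy (in bits) of a histogram over values in [0, 1]."""
    counts = Counter(min(int(v * bins), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def average_local_spatial_entropy(fmap, patch=3, bins=8):
    """Mean entropy over all patch x patch windows of a 2D map with values in [0, 1]."""
    h, w = len(fmap), len(fmap[0])
    entropies = []
    for i in range(h - patch + 1):
        for j in range(w - patch + 1):
            window = [fmap[i + di][j + dj]
                      for di in range(patch) for dj in range(patch)]
            entropies.append(patch_entropy(window, bins))
    return sum(entropies) / len(entropies)


# Hypothetical toy maps: one focused response, one scattered response
focused = [[0.0] * 6 for _ in range(6)]
focused[2][2] = focused[2][3] = focused[3][2] = focused[3][3] = 1.0
scattered = [[((i * 7 + j * 3) % 8) / 8 for j in range(6)] for i in range(6)]

score_focused = average_local_spatial_entropy(focused)
score_scattered = average_local_spatial_entropy(scattered)
```

In such a setup, a detector could flag an input when its score exceeds a threshold calibrated on clean images; how that threshold is chosen is left open here.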

Selected references (see also below)

 

Industrial Computer Vision

Semantic segmentation pipeline to detect food waste

A rich source for open research questions in deep-learning-based pattern recognition is found in various industrial processes like engineering and production. For example, we developed methods to reliably classify and quantify food waste in large kitchens through a new semantic segmentation pipeline, and we have worked extensively on automatic quality control. Other work in this area is going to address the problem of transferability of learnt knowledge from one tool to the next. As labelled data is usually scarce in industrial settings, a particular focus of our work is to make approaches more sample (or label) efficient than the usual benchmark-beating models from the literature.

Selected references (see also below)

 

Medical Imaging

Vertebrae detection by deep learning model trained with unsupervised domain adaptation

Health applications of deep learning started to become a major topic in the group in 2019. For example, in a collaboration with the AI and Data Science CoE of the Kantonsspital Aarau, we have developed a novel approach to reliably identify vertebrae in 3D CT scans. Our primary contribution is a new Domain Sanity Loss (DSL) function for unsupervised domain adaptation. We achieve results that are on par with the current state-of-the-art algorithms for fully supervised learning while using about 20 times fewer labels. This is a very interesting instance of a data-centric approach to be pursued further in our research.

Additional applications include the reduction of motion artifacts in CT images, which can significantly reduce the risk for patients to develop secondary cancer during radiation therapy; the automated monitoring of patients in intensive care; or the homogenization of data to train large, bias-free medical imaging models, as we showed in the context of the COVID-19 spread.

Selected references (see also below)

 

More General AI

Illustration of Kolmogorov complexity: Perception and efficient learning are possible by reducing the flood of sensory signals produced by the environment to an underlying low-complexity description

Deep learning has propelled the surge in AI in the last decade and will continue to find new applications and improvements. However, it is foreseeable that the methodology in itself will not produce higher-level cognition. Inspiration comes from neuroscientific research on understanding brain functionality. Here, we find the principle of self-organizing net fragments to be the inductive bias that fits models (developed brains) to the natural environment. We seek ways of implementing related ideas in deep learning frameworks to increase the generalizability and robustness of the approaches, which for example means moving away from purely error-driven learning by backpropagation.

Selected references (see also below)

 

Trustworthy AI

t-SNE visualization of the embedding space created by two face recognition models, coloured by ethnicity and gender

How a powerful technology like artificial intelligence engages with the world around us and impacts our societies must concern us as engineers. Assessing the impact of the technology on society is a first step to shape AI for good. In this direction, we have for example analyzed how the effect of bias in face recognition systems can be quantified and mitigated. As our study shows, AI systems are very different from us humans: while concealing information on sensitive attributes from humans keeps us fairer and less biased, doing the same for a machine learning system (blinding it to attributes such as gender or skin colour) does not result in less bias. Thus, bias does not equal awareness.

We also follow this thread of trustworthiness through projects on AI verification and certification as well as collaborations with colleagues from the humanities in projects and committees.

Selected references (see also below)

 

Learning to learn

Learning to cluster model architecture

Example clusterings

We have for instance built a novel end-to-end neural network architecture that, once trained, directly outputs a probabilistic clustering of a batch of input examples in one pass. It estimates a distribution over the number of clusters and, for each number of clusters up to a maximum, distributions over the respective data partitioning. The neural network is trained in a supervised fashion to group data by any perceptual similarity criterion based on pairwise labels (same/different group). It does not expect to have seen any of the groups that appear during model application already during training. We demonstrate promising performance on high-dimensional data like images (COIL-100) and speech (TIMIT). We call this learning to cluster. We have also produced a survey and some novel results on the more general topic of learning to learn.
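The two interfaces described above, pairwise supervision and a probabilistic clustering as output, can be illustrated with a small sketch in plain Python. It does not reproduce our network; the function names, the dictionary layout of the outputs and the toy numbers are assumptions made for illustration only. The first function turns group annotations into same/different pair labels, the second reads the most probable clustering out of a distribution over cluster counts and per-count assignment distributions.

```python
from itertools import combinations


def pairwise_labels(group_ids):
    """Map each index pair (i, j), i < j, to 1 if both examples share a group, else 0."""
    return {(i, j): int(group_ids[i] == group_ids[j])
            for i, j in combinations(range(len(group_ids)), 2)}


def most_likely_clustering(p_k, assignments):
    """Pick the most probable cluster count k, then the arg-max cluster per example.

    p_k: dict mapping k -> probability that the batch contains k clusters
    assignments: dict mapping k -> list of per-example probability lists of length k
    """
    k = max(p_k, key=p_k.get)
    return k, [max(range(k), key=lambda c: probs[c]) for probs in assignments[k]]


# Hypothetical batch of four examples drawn from two groups
labels = pairwise_labels(["a", "a", "b", "a"])

# Hypothetical model output for the same batch (maximum of three clusters)
p_k = {1: 0.1, 2: 0.8, 3: 0.1}
assignments = {
    1: [[1.0]] * 4,
    2: [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.7, 0.3]],
    3: [[0.5, 0.3, 0.2]] * 4,
}
k, clusters = most_likely_clustering(p_k, assignments)
```

Because only same/different information is used for supervision, the group identities themselves never have to be known to the model, which is what allows unseen groups at application time.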

Selected references (see also below)

 

Document Recognition

Deep Watershed Detector architecture

We have applied our skills in pattern recognition frequently to use cases in document recognition, for example to convert music scores to machine-readable form. Written music is a large and important part of cultural heritage worldwide. While there are many archives containing thousands of music scores, they are paper-based, so public access is cumbersome or even impossible. Digitization of these scores has for a long time been impossible due to the non-availability of scanning software that can convert hand-written scores to machine-readable format (Optical Music Recognition, OMR). Our DeepScore and RealScore projects aimed at bringing bleeding-edge technology from computer vision to the field of OMR. The impact of OMR on how we curate, preserve and access music manuscripts cannot be overstated. Fully functional OMR would lead to a democratization of the musical cultural heritage by enabling cheap and efficient access by everyone. It would also enable more efficient music training, and enable orchestras to run cheaper and rehearse more efficiently.

To facilitate deep learning for OMR, we built the DeepScores dataset with the goal of advancing the state-of-the-art in small object recognition by placing the question of object recognition in the context of scene understanding. DeepScores contains high quality images of musical scores, partitioned into 300’000 sheets of written music that contain symbols of different shapes and sizes. With close to a hundred million small objects, this makes our dataset not only unique, but also the largest public dataset. DeepScores comes with ground truth for object classification, detection and semantic segmentation. We provide baseline performances for object classification and intuition for the inherent difficulty that DeepScores poses to state-of-the-art object detectors like YOLO or R-CNN.

We introduced a novel object detection method, based on synthetic energy maps and the watershed transform, called Deep Watershed Detector (DWD). Our method is specifically tailored to deal with high resolution images that contain a large number of very small objects and is therefore able to process full pages of written music. We present state-of-the-art detection results of common music symbols and show DWD’s ability to work on synthetic scores as well as on handwritten music. Further results in making OMR more robust through domain adaptation can also be found here.
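The post-processing idea behind detection on an energy map can be sketched in a few lines of plain Python. This is a simplified stand-in, not the DWD implementation: it assumes the energy map is already given (in DWD it is predicted by a network), and it replaces the watershed transform with a cut at a fixed energy level followed by 4-connected component labelling. The function name, the cut level and the toy map are hypothetical.

```python
from collections import deque


def detect_objects(energy, level):
    """Return one (row, col) centre per 4-connected region with energy >= level."""
    h, w = len(energy), len(energy[0])
    seen = [[False] * w for _ in range(h)]
    centres = []
    for r in range(h):
        for c in range(w):
            if energy[r][c] < level or seen[r][c]:
                continue
            # Breadth-first search collects all pixels of this region
            queue, region = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and energy[ny][nx] >= level):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # The region's centroid serves as the detected object centre
            cy = sum(y for y, _ in region) / len(region)
            cx = sum(x for _, x in region) / len(region)
            centres.append((cy, cx))
    return centres


# Hypothetical 5x7 energy map with two peaks (two small objects)
energy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 2, 1, 0, 0, 0],
    [0, 2, 3, 2, 0, 1, 0],
    [0, 1, 2, 1, 0, 3, 0],
    [0, 0, 0, 0, 0, 1, 0],
]
centres = detect_objects(energy, level=2)
```

Since each object is reduced to a compact peak in the map, many small and densely packed objects can be separated without per-object region proposals, which is the property that matters for full pages of written music.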

Selected references (see also below)

 

Voice recognition

Architecture of the successful RNN model for speaker clustering

My PhD research focused on the task of speaker clustering: grouping speech segments by speaker identity without prior knowledge of the number or identity of speakers (a prerequisite for e.g. content-based media indexing). While speaker identification usually achieved accuracy percentages in their high nineties, the state of the art for the more complex task of clustering performed an order of magnitude worse.

My 2009 ACM Multimedia paper on Unfolding Speaker Clustering Potential – a Biomimetic Approach (see also the code) not only analyzed this fact, but also identified deficiencies in modeling the sequence of speech features as the bottleneck responsible for the slump in performance. The prediction of potentially raising speaker clustering performance by an order of magnitude by better sequence modeling has led to exciting discoveries so far. We successively built deep learning models with more clustering capability to exploit the sequence information: a simple CNN, a CNN with optimized clustering loss and finally an RNN to improve the capturing of prosodic voice information, to reduce the error rate for pure voice comparison by the predicted rate (see code). Additionally, using a different clustering algorithm on top of the simple CNN feature embeddings also proved valuable.

This line of research has also led to work on other audio processing tasks like media segmentation and classification, musical instrument recognition, audio fingerprinting, or voice transfer, mainly driven forward in student thesis projects.

Selected references (see also below)

 

Data science

The data science skill set map

I helped in creating one of Europe’s first dedicated research centers for data science, the ZHAW Datalab, and led it until 2018. Subsequently, my colleagues and I created one of Switzerland’s first continuing education programs in data science, the MAS Data Science, where I teach machine learning. In 2015, we started rolling out the successful Datalab collaboration model country-wide in founding the Data Innovation Alliance, a network of industrial and academic partner institutions that also furthered the Swiss Conference on Data Science series of events that started in Winterthur. The experience gained in these activities, together with the feedback from the applied research projects described above, led to a book I co-edited together with my colleagues Martin Braschler and Kurt Stockinger.

In 2021, I co-chaired the 1st International Symposium on the Science of Data Science. In the course of reviewing the development of the field in the past decade, we came to the conclusion that it is data centrism – the reliance on data itself, in mindset, methods and products – that makes data science more than the sum of its parts, as this is not done in any other discipline.

Selected references (see also below)

 

 

Collaborations

I frequently collaborate with industry to work on novel pattern recognition applications. Partners come from start-ups, SMEs and multi-national enterprises alike.

In academia, I collaborate e.g. with the Machine Learning and Optimization Lab of Martin Jaggi at EPFL, Marcello Pelillo of the Ca’Foscari University of Venice, Juergen Schmidhuber’s group at IDSIA, Friedhelm Schwenker of Ulm University, Institute of Neural Information Processing, Boi Faltings of EPFL, Benjamin F. Grewe’s Neural Learning and Intelligent Systems Group at UZH/ETH Zurich’s Institute of Neuroinformatics and our joint visiting professor Christoph von der Malsburg of the Frankfurt Institute of Advanced Studies. We have joint research projects, publications and/or co-supervise PhD students.

If you are interested in a collaboration, please contact me.

 

 

Publications

Compare bibliometrics on Google Scholar and ResearchGate.

2023

Thilo Stadelmann. KI als Chance für die angewandten Wissenschaften im Wettbewerb der Hochschulen. Workshop (“Atelier”) at the Bürgenstock-Konferenz der Schweizer Fachhochschulen und Pädagogischen Hochschulen 2023, Lucerne, Switzerland, January 20, 2023.

2022

Lukas Tuggener, Jürgen Schmidhuber, and Thilo Stadelmann. Is it enough to optimize CNN architectures on ImageNet?. Computer Vision - Frontiers in Computer Science, DOI 10.3389/fcomp.2022.1041703, 15 November 2022.

Felix M. Schmitt-Koopmann, Elaine M. Huang, Hans-Peter Hutter, Thilo Stadelmann, and Alireza Darvishy. FormulaNet: A Benchmark Dataset for Mathematical Formula Detection. IEEE Access 2022, 10, pp. 91588-91596, DOI 10.1109/ACCESS.2022.3202639, August 2022.

Pascal Sager, Sebastian Salzmann, Felice Burn, and Thilo Stadelmann. Unsupervised Domain Adaptation for Vertebrae Detection and Identification in 3D CT Volumes Using a Domain Sanity Loss. J. Imaging 2022, 8(8), 222, MDPI, Basel, Switzerland.

Christoph von der Malsburg, Benjamin F. Grewe, and Thilo Stadelmann. Making Sense of the Natural Environment. Proceedings of the KogWis 2022 - Understanding Minds Biannual Conference of the German Cognitive Science Society, Freiburg, Germany, September 5-7, 2022.

Christoph von der Malsburg, Thilo Stadelmann, and Benjamin F. Grewe. A Theory of Natural Intelligence. arXiv preprint, arXiv:2205.00002, April 2022.

Frank-Peter Schilling, Dandolo Flumini, Rudolf M. Füchslin, Elena Gavagnin, Armando Geller, Silvia Quarteroni and Thilo Stadelmann. Foundations of Data Science: A Comprehensive Overview Formed at the 1st International Symposium on the Science of Data Science. Archives of Data Science, Series A 8(2), accepted for publication, 2022.

Ivo Herzig, Pascal Paysan, Stefan Scheib, Frank-Peter Schilling, Javier Montoya, Mohammadreza Amirian, Thilo Stadelmann, Peter Eggenberger, Rudolf M. Fuechslin, and Lukas Lichtensteiger. “Deep Learning-Based Simultaneous Multi-Phase Deformable Image Registration of Sparse 4D-CBCT”. In: Proceedings of the American Association of Physicists in Medicine Annual Meeting (AAPM’22), Washington, DC, USA, July 10-14, 2022.

Thilo Stadelmann, Tino Klamt, and Philipp H. Merkt. Data Centrism and the Core of Data Science as a Scientific Discipline. Archives of Data Science, Series A 8(2), pp.1-16, April 2022.

Andreas Geyer-Schulz and Thilo Stadelmann (Editors). Archives of Data Science, Series A 8(2), April 2022. Special Issue.

Frank-Peter Schilling and Thilo Stadelmann (Editors). Special Issue “Advances in Deep Neural Networks for Visual Pattern Recognition”, J. Imaging, MDPI, March 2022.

2021

Samuel Wehrli, Corinna Hertweck, Mohammadreza Amirian, Stefan Glüge, and Thilo Stadelmann. Bias, awareness and ignorance in deep-learning-based face recognition. AI and Ethics, DOI 10.1007/s43681-021-00108-6, Springer, October 27, 2021.

Mohammadreza Amirian, Javier A. Montoya-Zegarra, Jonathan Gruss, Yves D. Stebler, Ahmet Selman Bozkir, Marco Calandri, Friedhelm Schwenker, and Thilo Stadelmann. PrepNet: A Convolutional Auto-Encoder to Homogenize CT Scans for Cross-Dataset Medical Image Analysis. In: Proceedings of the 14th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI’21), Shanghai, China, 2021.

Thilo Stadelmann, Julian Keuzenkamp, Helmut Grabner, and Christoph Würsch. The AI Atlas: Didactics for Teaching AI and Machine Learning On-Site, Online, and hybrid. Educ. Sci. 2021, 11, 318, MDPI, Basel, Switzerland, June 25, 2021.

Evelyne Knapp, Mattia Battaglia, Thilo Stadelmann, Sandra Jenatsch, and Beat Ruhstaller. XGBoost Trained on Synthetic Data to Extract Material Parameters of Organic Semiconductors. In: Proceedings of the 8th Swiss Conference on Data Science (SDS’21), Lucerne, Switzerland, 2021. Best paper award.

Niclas Simmler, Pascal Sager, Philipp Andermatt, Ricardo Chavarriaga, Frank-Peter Schilling, Matthias Rosenthal, and Thilo Stadelmann. A Survey of Un-, Weakly-, and Semi-Supervised Learning Methods for Noisy, Missing and Partial Labels in Industrial Vision Applications. In: Proceedings of the 8th Swiss Conference on Data Science (SDS’21), Lucerne, Switzerland, 2021.

2020

Thilo Stadelmann, and Christoph Würsch. Maps for an Uncertain Future: Teaching AI and Machine Learning Using the ATLAS Concept. Technical report (didactic concept), ZHAW, Winterthur, Switzerland, 2020.

Lukas Tuggener, Mohammadreza Amirian, Fernando Benites, Pius von Däniken, Prakhar Gupta, Frank-Peter Schilling, and Thilo Stadelmann. Design Patterns for Resource-Constrained Automated Deep-Learning Methods. AI section “Intelligent Systems: Theory and Applications” 1(4):510-538, MDPI, Basel, Switzerland, November 06, 2020.

Lukas Tuggener, Yvan Putra Satyawan, Alexander Pacha, Jürgen Schmidhuber, and Thilo Stadelmann. The DeepScoresV2 Dataset and Benchmark for Music Object Detection. In: Proceedings of the 25th International Conference on Pattern Recognition (ICPR’20), IAPR, Milan, Italy, January 10-15 (online), 2021.

Frank-Peter Schilling, and Thilo Stadelmann (Eds.). “Artificial Neural Networks in Pattern Recognition - 9th IAPR TC3 Workshop, ANNPR 2020, Winterthur, Switzerland, September 2–4, 2020, Proceedings”. Lecture Notes in Artificial Intelligence 12294, Springer, September 02, 2020.

Dano Roost, Ralph Meier, Giovanni Toffetti Carughi, and Thilo Stadelmann. Combining Reinforcement Learning with Supervised Deep Learning for Neural Active Scene Understanding. In: Proceedings of the Active Vision and Perception in Human(-Robot) Collaboration Workshop at IEEE RO-MAN 2020 (AVHRC’20), online, August 31, 2020. Dr. Waldemar Jucker award 2020.

Stefan Glüge, Mohammadreza Amirian, Dandolo Flumini, and Thilo Stadelmann. How (Not) to Measure Bias in Face Recognition Networks. In: Proceedings of the 9th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’20), Springer, LNAI, Winterthur, Switzerland, September 02-04, 2020. Top-5 paper, invitation for extended journal paper.

Mohammadreza Amirian, Lukas Tuggener, Ricardo Chavarriaga, Yvan Putra Satyawan, Frank-Peter Schilling, Friedhelm Schwenker, and Thilo Stadelmann. Two to Trust: AutoML for Safe Modelling and Interpretable Deep Learning for Robustness. In: Proceedings of the 1st TAILOR Workshop on Trustworthy AI at ECAI 2020, Santiago de Compostela, Spain, September 04-06, 2020. Springer.

Dano Roost, Ralph Meier, Stephan Huschauer, Erik Nygren, Adrian Egli, Andreas Weiler, and Thilo Stadelmann. Improving Sample Efficiency and Multi-Agent Communication in RL-based Train Rescheduling. In: Proceedings of the 7th Swiss Conference on Data Science (SDS’20), Lucerne, Switzerland, June 26, 2020. IEEE. Best poster presentation award.

2019

Mohammadreza Amirian, Katharina Rombach, Lukas Tuggener, Frank-Peter Schilling, and Thilo Stadelmann. Efficient Deep CNNs for Cross-Modal Automated Computer Vision under Time and Space Constraints. In: AutoCV2 Workshop at European Conference on Machine Learning / European Conference on Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD), Wuerzburg, Germany, September 16-19, 2019.

Kurt Stockinger, Martin Braschler, and Thilo Stadelmann. Lessons Learned from Challenging Data Science Case Studies. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Lukas Hollenstein, Lukas Lichtensteiger, Thilo Stadelmann, Mohammadreza Amirian, Lukas Budde, Jürg Meierhofer, Rudolf M. Füchslin, and Thomas Friedli. Unsupervised Learning and Simulation for Complexity Management in Business Operations. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Thilo Stadelmann, Vasily Tolkachev, Beate Sick, Jan Stampfli, and Oliver Dürr. Beyond ImageNet - Deep Learning in Industrial Practice. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Jürg Meierhofer, Thilo Stadelmann, and Mark Cieliebak. Data Products. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Thilo Stadelmann, Kurt Stockinger, Gundula Heinatz-Bürki, and Martin Braschler. Data Scientists. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Martin Braschler, Thilo Stadelmann, and Kurt Stockinger. Data Science. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Thilo Stadelmann, Martin Braschler, and Kurt Stockinger. Introduction to Applied Data Science. In: Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). “Applied Data Science - Lessons Learned for the Data-Driven Business”. Springer, 2019.

Martin Braschler, Thilo Stadelmann, and Kurt Stockinger (Editors). Applied Data Science - Lessons Learned for the Data-Driven Business. Springer, 2019.

Thilo Stadelmann. Wie maschinelles Lernen den Markt verändert. In: Reinhard Haupt, Stephan Schmitz (Editors), “Digitalisierung: Datenhype mit Werteverlust? Ethische Perspektiven für eine Schlüsseltechnologie”, pp. 67-79, ISBN 377516040X, SCM Hänssler, 2019.

Lukas Tuggener, Mohammadreza Amirian, Katharina Rombach, Stefan Lörwald, Anastasia Varlet, Christian Westermann, and Thilo Stadelmann. Automated Machine Learning in Practice: State of the Art and Recent Results. In: Proceedings of the 6th Swiss Conference on Data Science (SDS’19), Bern, Switzerland, June 14, 2019. IEEE.

2018

Ismail Elezi, Lukas Tuggener, Marcello Pelillo, and Thilo Stadelmann. DeepScores and Deep Watershed Detection: current state and open issues. In: Proceedings of the 1st International Workshop on Reading Music Systems (WoRMS’18), Paris, France, September 20, 2018.

Thilo Stadelmann, Mohammadreza Amirian, Ismail Arabaci, Marek Arnold, Gilbert François Duivesteijn, Ismail Elezi, Melanie Geiger, Stefan Lörwald, Benjamin Bruno Meier, Katharina Rombach, and Lukas Tuggener. Deep Learning in the Wild. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 17-38, Siena, Italy, September 19-21, 2018. Invited paper.

Mohammadreza Amirian, Friedhelm Schwenker, and Thilo Stadelmann. Trace and Detect Adversarial Attacks on CNNs using Feature Response Maps. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 346-358, Siena, Italy, September 19-21, 2018.

Thilo Stadelmann, Sebastian Glinski-Haefeli, Patrick Gerber, and Oliver Dürr. Capturing Suprasegmental Features of a Voice with RNNs for Improved Speaker Clustering. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 333-345, Siena, Italy, September 19-21, 2018.

Benjamin Bruno Meier, Ismail Elezi, Mohammadreza Amirian, Oliver Dürr, and Thilo Stadelmann. Learning Neural Models for End-to-End Clustering. In: Proceedings of the 8th IAPR TC 3 Workshop on Artificial Neural Networks for Pattern Recognition (ANNPR’18), Springer, LNAI 11081, pp. 126-138, Siena, Italy, September 19-21, 2018.

Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, and Thilo Stadelmann. Deep watershed detector for music object recognition. In: Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR’18), Paris, France, September 23-27, 2018. Paris: Society for Music Information Retrieval. DOI 10.21256/zhaw-3760.

Feliks Hibraj, Sebastiano Vascon, Thilo Stadelmann, and Marcello Pelillo. Speaker clustering using dominant sets. In: Proceedings of the 24th International Conference on Pattern Recognition (ICPR 2018). 24th International Conference on Pattern Recognition (ICPR’18), Beijing, China, 20-28 August 2018. Beijing: IAPR. DOI 10.21256/zhaw-4254.

Lukas Tuggener, Ismail Elezi, Jürgen Schmidhuber, Marcello Pelillo, and Thilo Stadelmann. DeepScores: a dataset for segmentation, detection and classification of tiny objects. In: Proceedings of the 24th International Conference on Pattern Recognition. 24th International Conference on Pattern Recognition (ICPR’18), Beijing, China, 20-28 August 2018. Beijing: IAPR. 1-6. DOI 10.21256/zhaw-4255.

2017

Benjamin Meier, Thilo Stadelmann, Jan Stampfli, Marek Arnold, and Mark Cieliebak. Fully convolutional neural networks for newspaper article segmentation. In: Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR’17), Kyoto, Japan, November 13-15, 2017. Kyoto, Japan: CPS. DOI 10.21256/zhaw-1533.

Yanick X. Lukic, Carlo Vogt, Oliver Dürr, and Thilo Stadelmann. Learning Embeddings for Speaker Clustering Based on Voice Equality. In: Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing (MLSP’17). Roppongi, Tokyo, Japan: IEEE. DOI 10.21256/zhaw-3762.

2016

Yanick Lukic, Carlo Vogt, Oliver Dürr, and Thilo Stadelmann. Speaker Identification and Clustering using Convolutional Neural Networks. In: Proceedings of IEEE International Workshop on Machine Learning for Signal Processing (MLSP’16). Salerno: IEEE. DOI 10.21256/zhaw-3761.

Kurt Stockinger, Thilo Stadelmann, and Andreas Ruckstuhl. Data Scientist als Beruf. In: Big Data – Grundlagen, Systeme und Nutzungspotenziale, Springer Verlag, Edition HMD, pp. 59-81, 2016. DOI 10.1007/978-3-658-11589-0_4.

2015

Jean-Daniel Dessimoz, Jana Koehler, and Thilo Stadelmann. AI in Switzerland. AI Magazine. 36(2), pp. 102-105, 2015. DOI 10.21256/zhaw-3642. Invited paper.

Thilo Stadelmann, Mark Cieliebak, and Kurt Stockinger. Toward automatic data curation for open data. ERCIM News. 2015(100), pp. 32-33. DOI 10.21256/zhaw-3643.

2014

Kurt Stockinger and Thilo Stadelmann. Data Science für Lehre, Forschung und Praxis. HMD Praxis der Wirtschaftsinformatik. 51(4), pp. 469-479, 2014. DOI 10.21256/zhaw-3759.

2013

Thilo Stadelmann, Kurt Stockinger, Martin Braschler, Mark Cieliebak, Gerold Baudinot, Oliver Dürr, and Andreas Ruckstuhl. Applied data science in Europe: challenges for academia in keeping up with a highly demanded topic. In: Proceedings of the 9th European Computer Science Summit (ECSS’13), Amsterdam, October 8–9, 2013.

2012

Thilo Stadelmann, Sven Johr, Michael Ditze, Florian Dittman, and Viktor Fässler. FABELHAFT - Fahrerablenkung: Entwicklung eines Meta-Fahrerassistenzsystems durch Echtzeit-Audioklassifikation. In Proceedings of the 28th VDI-VW Gemeinschaftstagung Fahrerassistenzsysteme und Integrierte Sicherheit ‘12, Wolfsburg, Germany, October 10-11, 2012. VDI Wissensforum.

2010

Thilo Stadelmann. Voice Modeling Methods for Automatic Speaker Recognition. Dissertation, Philipps-Universität Marburg. Available online, 2010.

Thilo Stadelmann and Bernd Freisleben. On the MixMax Model and Cepstral Features for Noise-Robust Voice Recognition. Technical report, Philipps-Universität Marburg, April 2010.

Christian Beecks, Thilo Stadelmann, Bernd Freisleben, and Thomas Seidl. Visual Speaker Model Exploration. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME’10), pages 727-728, Singapore, July 19-23, 2010. IEEE.

Thilo Stadelmann, Yinghui Wang, Matthew Smith, Ralph Ewerth, and Bernd Freisleben. Rethinking Algorithm Development and Design in Speech Processing. In Proceedings of the 20th International Conference on Pattern Recognition (ICPR’10), pages 4476-4479, Istanbul, Turkey, August 2010. IAPR.

Thilo Stadelmann and Bernd Freisleben. Dimension-Decoupled Gaussian Mixture Model for Short Utterance Speaker Recognition. In Proceedings of the 20th International Conference on Pattern Recognition (ICPR’10), pages 1602-1605, Istanbul, Turkey, August 2010. IAPR.

2009

Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, and Bernd Freisleben. University of Marburg at TRECVID 2009: High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’09).

Ernst Juhnke, Dominik Seiler, Thilo Stadelmann, Tim Dörnemann, and Bernd Freisleben. LCDL: An Extensible Framework for Wrapping Legacy Code. In Proceedings of International Workshop on @WAS Emerging Research Projects, Applications and Services (ERPAS’09), pages 638-642, Kuala Lumpur, Malaysia, December 2009.

Dominik Seiler, Ralph Ewerth, Steffen Heinzl, Thilo Stadelmann, Markus Mühling, Bernd Freisleben, and Manfred Grauer. Eine Service-Orientierte Grid-Infrastruktur zur Unterstützung Medienwissenschaftlicher Filmanalyse. In Proceedings of the Workshop on Gemeinschaften in Neuen Medien (GeNeMe’09), pages 79-89, Dresden, Germany, September 2009.

Thilo Stadelmann and Bernd Freisleben. Unfolding Speaker Clustering Potential: A Biomimetic Approach. In Proceedings of the ACM International Conference on Multimedia (ACMMM’09), pages 185-194, Beijing, China, October 2009. ACM.

Thilo Stadelmann, Steffen Heinzl, Markus Unterberger, and Bernd Freisleben. WebVoice: A Toolkit for Perceptual Insights into Speech Processing. In Proceedings of the 2nd International Congress on Image and Signal Processing (CISP’09), pages 4358-4362, Tianjin, China, October 2009.

Steffen Heinzl, Markus Mathes, Thilo Stadelmann, Dominik Seiler, Marcel Diegelmann, Helmut Dohmann, and Bernd Freisleben. The Web Service Browser: Automatic Client Generation and Efficient Data Transfer for Web Services. In Proceedings of the 7th IEEE International Conference on Web Services (ICWS’09), pages 743-750, Los Angeles, CA, USA, July 2009. IEEE Press.

Steffen Heinzl, Dominik Seiler, Ernst Juhnke, Thilo Stadelmann, Ralph Ewerth, Manfred Grauer, and Bernd Freisleben. A Scalable Service-Oriented Architecture for Multimedia Analysis, Synthesis, and Consumption. International Journal of Web and Grid Services, 5(3):219-260, 2009. Inderscience Publishers.

2008

Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, and Bernd Freisleben. University of Marburg at TRECVID 2008: High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’08).

2007

Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bing Shi, Christian Zöfel, and Bernd Freisleben. University of Marburg at TRECVID 2007: Shot Boundary Detection and High-Level Feature Extraction. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’07).

Ralph Ewerth, Markus Mühling, Thilo Stadelmann, Julinda Gllavata, Manfred Grauer, and Bernd Freisleben. Videana: A Software Toolkit for Scientific Film Studies. In Proceedings of the International Workshop on Digital Tools in Film Studies ‘07, pages 1-16, Siegen, Germany, 2007. Transcript Verlag.

Markus Mühling, Ralph Ewerth, Thilo Stadelmann, Bernd Freisleben, Rene Weber, and Klaus Mathiak. Semantic Video Analysis for Psychological Research on Violence in Computer Games. In Proceedings of the ACM International Conference on Image and Video Retrieval (CIVR’07), pages 611-618, Amsterdam, The Netherlands, July 2007. ACM.

2006

Ralph Ewerth, Markus Mühling, Thilo Stadelmann, Ermir Qeli, Björn Agel, Dominik Seiler, and Bernd Freisleben. University of Marburg at TRECVID 2006: Shot Boundary Detection and Rushes Task Results. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’06).

Thilo Stadelmann and Bernd Freisleben. Fast and Robust Speaker Clustering Using the Earth Mover’s Distance and MixMax Models. In Proceedings of the 31st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’06), volume 1, pages 989-992, Toulouse, France, April 2006. IEEE.

2005

Ralph Ewerth, Christian Behringer, Tobias Kopp, Michael Niebergall, Thilo Stadelmann, and Bernd Freisleben. University of Marburg at TRECVID 2005: Shot Boundary Detection and Camera Motion Estimation Results. In Proceedings of TREC Video Retrieval Evaluation Workshop (TRECVid’05).

2004

Thilo Stadelmann. Sprechererkennung in Videos. Diploma thesis, Fachhochschule Giessen-Friedberg, 2004.
